Deducing Decision Rules using the Categorization of the Numerical Features

Hanan Mokhtar Abdelaziz Fahmy Elhilbawi

Abstract


Discretizing continuous attributes is an essential data preprocessing step in data mining, since many data mining techniques are designed to be applied to discrete attributes. Considerable effort has gone into proposing discretization techniques with different characteristics. This thesis first presents an overview of the discretization process and the existing taxonomies, and proposes a taxonomy based on the presence of class information and on the relationships between attributes in the analyzed dataset. It then reviews different discretization techniques. Next, it gives an overview of the machine learning techniques used, with their most important advantages and disadvantages, and examines the importance of discretization as a preprocessing step and how it helps achieve better classification performance than using continuous attributes. The performance of multiple parametric and non-parametric discretization methods, in conjunction with a number of machine learning classifiers, is assessed on the problem of predicting Intensive Care Unit (ICU) mortality. The medical dataset used to predict the mortality of ICU patients was obtained from the PhysioNet Computing in Cardiology. The dataset was discretized using the following techniques: Equal Width, Equal Frequency, K-means, Minimum Description Length Principle (MDLP), Class-Attribute Interdependence Maximization (CAIM), and Iterative Dichotomiser 3 (ID3). The thesis compares the prediction accuracy obtained with each of these methods across different machine learning classifiers, and compares it with the accuracy obtained using the continuous attributes without discretization. The results demonstrate that, for this problem, discretization improves the accuracy of machine learning models compared to working with continuous attributes.
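
To make the two simplest techniques named in the abstract concrete, the following is a minimal sketch of Equal Width and Equal Frequency discretization in plain Python. It is not the thesis's implementation: the function names, the choice of three bins, and the heart-rate readings are all assumptions invented for illustration and are not taken from the PhysioNet dataset.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Interior cut points that split [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior cut points that place roughly the same number of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def discretize(values, edges):
    """Map each value to the index of the bin it falls into (0-based)."""
    return [bisect_right(edges, v) for v in values]


# Hypothetical heart-rate readings, made up for this example only.
heart_rate = [58, 62, 65, 70, 72, 75, 80, 88, 110, 148]

width_bins = discretize(heart_rate, equal_width_edges(heart_rate, 3))
freq_bins = discretize(heart_rate, equal_frequency_edges(heart_rate, 3))

print(width_bins)  # [0, 0, 0, 0, 0, 0, 0, 1, 1, 2]
print(freq_bins)   # [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
```

On this skewed sample, equal-width binning puts seven of the ten readings in the first bin, while equal-frequency binning spreads them 3/3/4. Both are unsupervised: neither uses class labels, unlike supervised methods such as MDLP and CAIM.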


Other data

Title: Deducing Decision Rules using the Categorization of the Numerical Features
Other Titles: استنتاج قواعد القرار بتصنيف السمات العددية
Authors: Hanan Mokhtar Abdelaziz Fahmy Elhilbawi
Issue Date: 2021

Attached Files

File: BB7691.pdf
Size: 968.31 kB
Format: Adobe PDF


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.