A Data Mining Tool Using a Modified Ordered Attribute Trees Algorithm for Handling Missing Values
Mona Farouk Ahmed Kamal El Deen
Abstract
Data mining techniques are becoming increasingly useful in a wide range of environments as a source of business intelligence. A wide range of companies has deployed successful applications of data mining. When attempting to learn the concepts embedded in data, it is common to find that information is missing from the data. Such missing information can diminish confidence in the concepts learned from the data. When missing values occur in the data, the learning algorithm fails to find an accurate representation of the concept. Properly filling missing values in data helps reduce the error rate of the learned concepts. This work aims at solving the problem of missing values in data presented to a data mining decision tree learner. The missing values handling technique proposed in this work is a modification of the Ordered Attribute Trees method, which is a machine learning approach to the missing values problem. A decision tree is constructed to determine the missing values of each attribute by using information contained in other attributes. Also, an ordering for the construction of the decision trees for the attributes is formulated.
This work is presented together with an implementation of two other missing values handling techniques, namely Unordered Attribute Trees, which is also a machine learning approach, and the Probabilistic method, which is a good example of the statistical approach for handling missing values. The three methods are tested on the same data sets, and performance results and evaluation are presented. Results show that the proposed modification has an advantage over the other two techniques.
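The attribute-tree idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the small ID3-style tree, the function names (`build_tree`, `predict`, `impute`), the restriction of each tree to attributes that were already processed, and the ordering rule used here (fewest missing values first) are all assumptions made for illustration, and the ordering formulated in the thesis may differ.

```python
from collections import Counter
import math

MISSING = None  # assumption: missing values are encoded as None


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def build_tree(rows, target, features):
    """Small ID3-style tree for categorical data; predicts `target` from `features`."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: the most common value of the target

    def remainder(f):
        # Weighted entropy of the target after splitting on feature f.
        groups = {}
        for r in rows:
            groups.setdefault(r[f], []).append(r[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)  # lowest remainder = highest information gain
    rest = [f for f in features if f != best]
    branches = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, target, rest)
    return (best, branches, majority)


def predict(tree, row):
    while isinstance(tree, tuple):
        feature, branches, majority = tree
        if row[feature] not in branches:
            return majority  # unseen value: fall back to the node's majority
        tree = branches[row[feature]]
    return tree


def impute(rows, attributes):
    """Fill missing values one attribute at a time, using a tree per attribute."""
    filled = [dict(r) for r in rows]  # work on a copy; the input is left untouched
    # Hypothetical ordering: attributes with the fewest missing values come first.
    order = sorted(attributes, key=lambda a: sum(r[a] is MISSING for r in rows))
    done = []  # attributes already processed; only these feed later trees
    for attr in order:
        train = [r for r in filled if r[attr] is not MISSING]
        if train:
            tree = build_tree(train, attr, done)
            for r in filled:
                if r[attr] is MISSING:
                    r[attr] = predict(tree, r)
        done.append(attr)
    return filled


rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": MISSING, "play": "yes"},
    {"outlook": MISSING, "windy": "yes", "play": "no"},
]
filled = impute(rows, ["outlook", "windy", "play"])
print(filled[4], filled[5])
```

In this toy table the tree for each attribute is built only from attributes handled earlier in the order, so the first attribute in the order falls back to its most common value and later attributes benefit from values that have already been filled.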
Other data
| Field | Value |
|---|---|
| Title | A Data Mining Tool Using a Modified Ordered Attribute Trees Algorithm for Handling Missing Values |
| Other Titles | اداه للتنقيب عن البيانات باستخدام خوارزم شجرات الصفات مع وجود بيانات ناقصة (English: A data mining tool using the attribute trees algorithm in the presence of missing data) |
| Authors | Mona Farouk Ahmed Kamal El Deen |
| Issue Date | 2006 |
Attached Files
| File | Size | Format | |
|---|---|---|---|
| B12764.pdf | 941.91 kB | Adobe PDF | View/Open |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.