MLRMUD: A MULTI LINEAR REGRESSION APPROACH FOR MISSING VALUES PREDICTION WITH UNKNOWN DEPENDENT VARIABLE
Ahmed Karama Mahboab Alhebshi
Abstract
The Missing Value (MV) problem is the task of accurately predicting the missing values in a data set. This work imposes an additional constraint on the problem: the dependent variable is unknown.
In this work, a new approach, MLRMUD, based on Multiple Linear Regression, is used to predict missing values in a data set with an unknown dependent variable, provided that at least 20% of the rows are complete. If fewer than 20% of the rows are complete, the Mean method is first used to fill some rows until the complete rows reach 20%, after which MLRMUD can be applied normally. The approach is composed of three algorithms: a splitting algorithm, a dependent variable selection algorithm, and a multiple linear regression algorithm.
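The abstract does not spell out the three algorithms, so the following is only a minimal Python sketch of how such a pipeline might fit together, not the author's implementation. Several details are assumptions made for illustration: rows with the fewest gaps are mean-filled first, the dependent variable is taken to be the column best explained by the others (highest R² on the complete rows), and only the missing entries of that column are predicted. All function names are hypothetical.

```python
import numpy as np

MIN_COMPLETE = 0.20  # threshold stated in the abstract


def complete_row_mask(X):
    """Splitting step: True for rows with no missing (NaN) values."""
    return ~np.isnan(X).any(axis=1)


def mean_fill_until_threshold(X, min_fraction=MIN_COMPLETE):
    """Mean-fill incomplete rows until the complete rows reach min_fraction."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    complete = complete_row_mask(X)
    needed = int(np.ceil(min_fraction * len(X))) - int(complete.sum())
    if needed > 0:
        # Assumption: fill the rows with the fewest gaps first.
        idx = np.flatnonzero(~complete)
        gaps = np.isnan(X[idx]).sum(axis=1)
        for i in idx[np.argsort(gaps, kind="stable")][:needed]:
            miss = np.isnan(X[i])
            X[i, miss] = col_means[miss]
    return X


def fit_linear(A, y):
    """Ordinary least squares with an intercept term."""
    A1 = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(A1, y, rcond=None)
    return coef


def predict_linear(coef, A):
    return coef[0] + A @ coef[1:]


def select_dependent(train):
    """Assumed rule: pick the column with the highest R^2 given the others."""
    best, best_r2 = 0, -np.inf
    for j in range(train.shape[1]):
        y = train[:, j]
        A = np.delete(train, j, axis=1)
        ss_tot = ((y - y.mean()) ** 2).sum()
        if ss_tot == 0:
            continue  # a constant column cannot be a useful target
        resid = y - predict_linear(fit_linear(A, y), A)
        r2 = 1.0 - (resid ** 2).sum() / ss_tot
        if r2 > best_r2:
            best, best_r2 = j, r2
    return best


def mlrmud_sketch(X, dep=None):
    """Fill to the threshold, select the dependent column, regress, predict."""
    X = mean_fill_until_threshold(np.asarray(X, dtype=float))
    train = X[complete_row_mask(X)]
    if dep is None:
        dep = select_dependent(train)
    predictors = [j for j in range(X.shape[1]) if j != dep]
    coef = fit_linear(train[:, predictors], train[:, dep])
    # Predict only where the target is missing and all predictors are present.
    target = np.isnan(X[:, dep]) & ~np.isnan(X[:, predictors]).any(axis=1)
    X[target, dep] = predict_linear(coef, X[target][:, predictors])
    return X, dep
```

Calling `mlrmud_sketch(X)` on a NumPy array with `NaN` marking missing cells returns the filled array and the index of the selected column. Passing `dep` explicitly skips the selection step, which is useful for checking the regression step in isolation.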
MLRMUD is compared with its counterparts in the literature and is shown to outperform all of them in the accuracy of the computed missing values, measured by the Root Mean Square Error (RMSE) and the Mean Squared Error (MSE). A method for determining the unknown dependent variable from the training set is also proposed. The results show that this method selects the dependent variable correctly with an accuracy of 83% over all the data sets examined.
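For reference, the two error measures can be computed as in the short Python sketch below. It assumes MSE is used in its usual sense, the mean of the squared differences between true and predicted values, with RMSE as its square root. The function names and the sample numbers are illustrative only and are not results from the thesis.

```python
import numpy as np


def mse(true_values, predicted_values):
    """Mean of squared differences between true and imputed values."""
    t = np.asarray(true_values, dtype=float)
    p = np.asarray(predicted_values, dtype=float)
    return float(np.mean((t - p) ** 2))


def rmse(true_values, predicted_values):
    """Square root of the MSE, in the same units as the data."""
    return float(np.sqrt(mse(true_values, predicted_values)))


# Illustrative values: the errors are 0, 0 and 2, so MSE = 4/3.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Lower values of either measure indicate imputed values closer to the true ones. RMSE is often easier to interpret because it is expressed in the units of the attribute being imputed.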
Other data
| Field | Value |
| --- | --- |
| Title | MLRMUD: A MULTI LINEAR REGRESSION APPROACH FOR MISSING VALUES PREDICTION WITH UNKNOWN DEPENDENT VARIABLE |
| Other Titles | طريقة الانحدار الخطي للتنبؤ بالقيم المفقودة مع المتغير المعتمد المجهول (A linear regression method for predicting missing values with an unknown dependent variable) |
| Authors | Ahmed Karama Mahboab Alhebshi |
| Issue Date | 2019 |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.