FEATURES SELECTION USING PARAMETRIC AND NON-PARAMETRIC METHODS: TAG SNPs SELECTION USING GA-SVM AND GA-KNN
Elatraby, Amr I. A.; Rashad R. T. Wahba
Abstract
The study of genetic variations in the human genome, especially Single Nucleotide Polymorphisms (SNPs), can lead to the discovery of new methods to prevent, diagnose and treat diseases. Examining all the SNPs of the human genome is too expensive, so a small subset of informative SNPs, called tag SNPs, must be selected. In this study, two methods for the selection of tag SNPs are presented. The first method, GA-SVM, integrates the Support Vector Machine (SVM), as a parametric technique, with the Genetic Algorithm (GA). The second method, GA-KNN, integrates the K-Nearest Neighbor (KNN), as a non-parametric technique, with the GA. The two methods are tested on a group of genes that are known to be related to the natural clearance of Hepatitis C Virus (HCV). The SNP data for these genes were extracted from the HapMap site (http://hapmap.org). The prediction accuracy of each method was evaluated using 10-Fold Cross Validation (10-FCV). Our results show that, although GA-SVM achieves higher prediction accuracy than GA-KNN when a very small number of tag SNPs is selected, GA-KNN achieves higher prediction accuracy than GA-SVM in all other cases. In addition, our results indicate that GA-KNN requires more computing time than GA-SVM.
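The GA-KNN idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fitness definition (mean 10-fold accuracy of predicting every non-tag SNP from the tag SNPs), the GA operators and parameters, the Hamming distance, and the toy haplotype data are all assumptions made for illustration. Only the Python standard library is used.

```python
import random

def toy_haplotypes(n=40):
    """Hypothetical binary haplotypes: 6 SNPs driven by two hidden alleles a and b."""
    rows = []
    for i in range(n):
        a, b = i % 2, (i // 2) % 2
        rows.append([a, b, a, b, a ^ b, 1 - a])
    return rows

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows nearest to x (Hamming distance)."""
    nearest = sorted(
        (sum(p != q for p, q in zip(row, x)), i) for i, row in enumerate(train_X)
    )[:k]
    votes = [train_y[i] for _, i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def cv_accuracy(data, tags, folds=10):
    """Assumed fitness: k-fold accuracy of predicting each non-tag SNP from the tags."""
    targets = [j for j in range(len(data[0])) if j not in tags]
    if not tags or not targets:
        return 0.0
    correct = total = 0
    for f in range(folds):
        test = [r for i, r in enumerate(data) if i % folds == f]
        train = [r for i, r in enumerate(data) if i % folds != f]
        train_X = [[r[j] for j in tags] for r in train]
        for t in targets:
            train_y = [r[t] for r in train]
            for r in test:
                pred = knn_predict(train_X, train_y, [r[j] for j in tags])
                correct += pred == r[t]
                total += 1
    return correct / total

def ga_select_tags(data, n_tags, pop_size=12, generations=15, seed=0):
    """Simple elitist GA over fixed-size sets of tag SNP indices."""
    rng = random.Random(seed)
    n_snps = len(data[0])
    pop = [tuple(sorted(rng.sample(range(n_snps), n_tags))) for _ in range(pop_size)]

    def fitness(ind):
        return cv_accuracy(data, ind)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = list(set(a) | set(b))        # crossover: draw from both parents
            child = set(rng.sample(pool, n_tags))
            if rng.random() < 0.3:              # mutation: swap one tag SNP
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([j for j in range(n_snps) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    tags, acc = ga_select_tags(toy_haplotypes(), n_tags=2)
    print("selected tag SNPs:", tags, "10-fold accuracy:", acc)
```

Replacing `knn_predict` with an SVM classifier inside the same fitness function would give the corresponding GA-SVM wrapper; the GA loop itself would not need to change.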
Other data
Title | FEATURES SELECTION USING PARAMETRIC AND NON-PARAMETRIC METHODS: TAG SNPs SELECTION USING GA-SVM AND GA-KNN
Authors | Elatraby, Amr I. A.; Rashad R. T. Wahba
Issue Date | 2015
Journal | Advances and Applications in Statistics
DOI | 2 105 45 10.17654/ADASMay2015_105_123
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.