International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 43 - Number 3
Year of Publication: 2012
Authors: Amal Tamer, Amr Badr
DOI: 10.5120/6081-8219
Amal Tamer, Amr Badr. A Comparative Study on Bioinformatics Feature Selection and Classification. International Journal of Computer Applications. 43, 3 (April 2012), 5-8. DOI=10.5120/6081-8219
This paper applies supervised machine learning to the classification of colon cancer gene expression data. Established feature selection techniques based on principal component analysis (PCA), independent component analysis (ICA), genetic algorithms (GA), and support vector machines (SVM) are, for the first time, applied to this data set to support learning and classification. Different classifiers are implemented to investigate the impact of combining feature selection and classification methods; the classifiers include K-Nearest Neighbors (KNN) and SVM. Comparative results demonstrate that effective feature selection is essential to developing classifiers for high-dimensional domains, and that it increases computational efficiency while improving classification accuracy.
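One of the combinations named in the abstract, GA-based feature selection wrapped around a KNN classifier, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data is synthetic (it is not the colon cancer data set), the function names (`make_data`, `loo_knn_accuracy`, `fitness`, `ga_select`) are hypothetical, and the population size, mutation rate, number of generations, and the use of leave-one-out accuracy as the fitness are all assumptions chosen to keep the example small.

```python
import random

def make_data(n=40, d=30, informative=3, seed=0):
    """Synthetic two-class data (assumption): only the first
    `informative` features carry class signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(d)]
        for j in range(informative):
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def loo_knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN using only features with mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = []
        for t in range(len(X)):
            if t == i:
                continue  # hold out sample i
            d2 = sum((X[i][j] - X[t][j]) ** 2 for j in cols)
            dists.append((d2, y[t]))
        dists.sort()
        votes = sum(label for _, label in dists[:k])
        pred = 1 if votes * 2 > k else 0  # majority vote among k neighbors
        if pred == y[i]:
            correct += 1
    return correct / len(X)

def fitness(X, y, mask):
    """Higher accuracy is better; ties are broken by fewer selected features."""
    return (loo_knn_accuracy(X, y, mask), -sum(mask))

def ga_select(X, y, pop_size=12, generations=10, mut_rate=0.05, seed=1):
    """Simple elitist genetic algorithm over binary feature masks."""
    rng = random.Random(seed)
    d = len(X[0])
    # Seed the population with the full feature set as a baseline individual.
    pop = [[1] * d]
    pop += [[rng.randint(0, 1) for _ in range(d)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        elite = ranked[: pop_size // 2]  # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, d)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(X, y, m))

X, y = make_data()
all_mask = [1] * len(X[0])
best = ga_select(X, y)
acc_all = loo_knn_accuracy(X, y, all_mask)
acc_best = loo_knn_accuracy(X, y, best)
print(f"all features: {sum(all_mask)} used, LOO accuracy {acc_all:.2f}")
print(f"GA-selected:  {sum(best)} used, LOO accuracy {acc_best:.2f}")
```

Because the full feature set is placed in the initial population and the best individuals survive each generation unchanged, the selected mask never scores below the all-features baseline on this fitness measure. Note that the same samples are used both to select features and to report accuracy here, so the printed figure is optimistic; a held-out test set would be needed for a fair comparison.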