Research Article

Feature Subset Selection using Cascaded GA & CFS: A Filter Approach in Supervised Learning

by Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 23 - Number 2
Year of Publication: 2011
Authors: Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath
DOI: 10.5120/2865-3711

Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath. Feature Subset Selection using Cascaded GA & CFS: A Filter Approach in Supervised Learning. International Journal of Computer Applications. 23, 2 (June 2011), 1-10. DOI=10.5120/2865-3711

@article{ 10.5120/2865-3711,
author = { Asha Gowda Karegowda and M.A. Jayaram and A.S. Manjunath },
title = { Feature Subset Selection using Cascaded GA \& CFS: A Filter Approach in Supervised Learning },
journal = { International Journal of Computer Applications },
issue_date = { June 2011 },
volume = { 23 },
number = { 2 },
month = { June },
year = { 2011 },
issn = { 0975-8887 },
pages = { 1-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume23/number2/2865-3711/ },
doi = { 10.5120/2865-3711 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:09:06.749684+05:30
%A Asha Gowda Karegowda
%A M.A. Jayaram
%A A.S. Manjunath
%T Feature Subset Selection using Cascaded GA & CFS: A Filter Approach in Supervised Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 23
%N 2
%P 1-10
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Medical data mining has considerable potential for uncovering hidden patterns in medical data sets, and physicians can use these patterns to improve clinical diagnosis. Feature subset selection is a data preprocessing step of great importance in data mining. For this step, a filter approach is used that cascades a genetic algorithm (GA) with correlation-based feature selection (CFS): the GA performs a global search over attribute subsets, and CFS evaluates the fitness of each candidate subset. Experimental results indicate that the feature subset identified by the proposed GA+CFS filter, when given as input to five classifiers (decision tree, Naïve Bayes, Bayesian, radial basis function, and k-nearest neighbor), yielded improved classification accuracy. The experiments were carried out on four medical data sets publicly available from the UCI repository.
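The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameter settings, so the population size, generation count, mutation rate, tournament selection, one-point crossover, and elitism below are all assumptions. The merit function follows the general CFS heuristic, k·r_cf / sqrt(k + k(k−1)·r_ff), with absolute Pearson correlation assumed as the correlation measure; the toy data and the names `pearson`, `cfs_merit`, and `ga_cfs` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))

def cfs_merit(columns, target, subset):
    """CFS-style merit: rewards feature-class correlation (r_cf) and
    penalises feature-feature correlation (r_ff) within the subset."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(pearson(columns[i], target) for i in subset) / k
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = (sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def ga_cfs(columns, target, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """GA over bit masks (1 = feature kept), with CFS merit as fitness.
    Requires at least two features. All settings are illustrative."""
    rng = random.Random(seed)
    n = len(columns)

    def fitness(mask):
        return cfs_merit(columns, target, [i for i, bit in enumerate(mask) if bit])

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        def pick():
            # Binary tournament selection (assumed, not from the source).
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n - 1)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit], fitness(best)

# Hypothetical toy data: one list per feature, plus binary class labels.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 1, 2, 5, 6, 5, 6],      # informative
    [2, 4, 2, 4, 10, 12, 10, 12],  # exact scaled copy of feature 0 (redundant)
    [3, 1, 4, 1, 5, 9, 2, 6],      # weakly informative
    [0, 1, 0, 1, 0, 1, 0, 1],      # uncorrelated with the class
]
selected, score = ga_cfs(columns, target)
print("selected feature indices:", selected, "merit:", round(score, 3))
```

Because the fitness depends only on correlations within the data, no classifier is trained during the search, which is what makes this a filter rather than a wrapper approach. The selected indices would then be used to reduce the data set before it is passed to any classifier.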

Index Terms

Computer Science
Information Sciences

Keywords

Feature selection, filters, Genetic Algorithm, Correlation based feature selection, Decision tree, Naïve Bayes, Bayesian Classifier, Radial Basis Function, K-Nearest Neighbor