Research Article

A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms

by Mehdi Naseriparsa, Amir-masoud Bidgoli, Touraj Varaee
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 69 - Number 17
Year of Publication: 2013
DOI: 10.5120/12065-8172

Mehdi Naseriparsa, Amir-masoud Bidgoli, Touraj Varaee. A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms. International Journal of Computer Applications. 69, 17 (May 2013), 28-35. DOI=10.5120/12065-8172

@article{ 10.5120/12065-8172,
author = { Mehdi Naseriparsa and Amir-masoud Bidgoli and Touraj Varaee },
title = { A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 69 },
number = { 17 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 28-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume69/number17/12065-8172/ },
doi = { 10.5120/12065-8172 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:30:32.555241+05:30
%A Mehdi Naseriparsa
%A Amir-masoud Bidgoli
%A Touraj Varaee
%T A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms
%J International Journal of Computer Applications
%@ 0975-8887
%V 69
%N 17
%P 28-35
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes a hybrid feature selection method that takes advantage of wrapper subset evaluation at a lower cost and improves the performance of a group of classifiers. The method combines sample domain filtering and resampling, which refine the sample domain, with two feature subset evaluation methods, which select reliable features, so it exploits both the feature space and the sample domain in two phases. The first phase filters and resamples the sample domain. The second phase adopts a hybrid procedure that uses information gain, wrapper subset evaluation and genetic search to find the optimal feature space. Experiments were carried out on different types of datasets from the UCI Repository of Machine Learning Databases. The results show a simultaneous rise in the average performance of five classifiers (Naïve Bayes, Logistic, Multilayer Perceptron, Best First Decision Tree and JRIP) and a considerable decrease in their classification error. The experiments also show that this method outperforms other feature selection methods at a lower cost.
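
The second phase described above (an information gain filter followed by wrapper subset evaluation driven by a genetic search) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes discrete features, uses leave-one-out 1-nearest-neighbour accuracy as a stand-in wrapper classifier, and adds a small size penalty to the fitness. The function names, parameters and toy data are hypothetical, and the first phase (sample filtering and resampling) is omitted.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(X, y, j):
    """H(y) minus the expected entropy of y after splitting on discrete feature j."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(row[j], []).append(label)
    conditional = sum(len(g) / len(y) * entropy(g) for g in groups.values())
    return entropy(y) - conditional


def wrapper_fitness(X, y, feats, penalty=0.01):
    """Leave-one-out 1-NN accuracy on `feats`, minus a size penalty (assumption)."""
    if not feats:
        return 0.0
    hits = 0
    for i, row in enumerate(X):
        others = (k for k in range(len(X)) if k != i)
        nearest = min(others, key=lambda k: sum(X[k][f] != row[f] for f in feats))
        hits += y[nearest] == y[i]
    return hits / len(X) - penalty * len(feats)


def genetic_search(X, y, candidates, generations=10, pop_size=8, seed=0):
    """Search bit masks over `candidates`; the top half survives each generation."""
    if not candidates:
        return []
    rng = random.Random(seed)
    n = len(candidates)

    def decode(mask):
        return [f for f, bit in zip(candidates, mask) if bit]

    def score(mask):
        return wrapper_fitness(X, y, decode(mask))

    # Start from the full candidate set plus random masks.
    population = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                              for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(0, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:           # occasional single-bit mutation
                child[rng.randrange(n)] ^= 1
            children.append(child)
        population = parents + children
    return decode(max(population, key=score))


# Toy data: feature 0 equals the label, feature 1 is noise, feature 2 is constant.
X = [[0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 0],
     [1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 1, 0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ranked = sorted(range(3), key=lambda j: information_gain(X, y, j), reverse=True)
candidates = ranked[:2]                      # filter step: keep the top-k by gain
selected = genetic_search(X, y, candidates)
print(ranked, selected)
```

In this sketch the filter step narrows the search space cheaply, so the more expensive wrapper evaluation only runs on the surviving candidates. That is one plausible reading of the "lower cost" claim in the abstract. On the toy data, the informative feature 0 ranks first and is kept by the search.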

Index Terms

Computer Science
Information Sciences

Keywords

Feature Selection, Resampling, Information Gain, Wrapper Subset Evaluation