Research Article

A Feature Subset Selection Method based on Conditional Mutual Information and Ant Colony Optimization

by Syed Imran Ali, Waseem Shahzad
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 11
Year of Publication: 2012
Authors: Syed Imran Ali, Waseem Shahzad
DOI: 10.5120/9734-3389

Syed Imran Ali, Waseem Shahzad. A Feature Subset Selection Method based on Conditional Mutual Information and Ant Colony Optimization. International Journal of Computer Applications. 60, 11 (December 2012), 5-10. DOI=10.5120/9734-3389

@article{ 10.5120/9734-3389,
author = { Syed Imran Ali and Waseem Shahzad },
title = { A Feature Subset Selection Method based on Conditional Mutual Information and Ant Colony Optimization },
journal = { International Journal of Computer Applications },
issue_date = { December 2012 },
volume = { 60 },
number = { 11 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 5-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume60/number11/9734-3389/ },
doi = { 10.5120/9734-3389 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:07:47.422657+05:30
%A Syed Imran Ali
%A Waseem Shahzad
%T A Feature Subset Selection Method based on Conditional Mutual Information and Ant Colony Optimization
%J International Journal of Computer Applications
%@ 0975-8887
%V 60
%N 11
%P 5-10
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Feature subset selection is one of the key problems in pattern recognition and machine learning. It refers to selecting only those features that are useful in predicting a target concept, i.e., the class. Most data acquired from different sources are not screened for any specific task, e.g., classification, clustering, or anomaly detection. When such data is fed to a learning algorithm, its results deteriorate. The proposed method is a pure filter-based feature subset selection technique that incurs less computational cost and is highly efficient in terms of classification accuracy. Moreover, along with high accuracy, the proposed method requires fewer features in most cases. The method addresses the issues of feature ranking and threshold value selection: it adaptively selects the number of features according to the worth of each individual feature in the dataset. Extensive experiments are performed on a number of benchmark datasets using three well-known classification algorithms. Empirical results endorse the efficiency and effectiveness of the proposed method.
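
To make the idea of filter-style feature ranking concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state the scoring function; the sketch assumes symmetric uncertainty (listed among the keywords) as the relevance measure for discrete features, and it leaves out the ant colony search and the adaptive threshold entirely. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    # I(X; Y) = H(X) + H(Y) - H(X, Y), with the joint built from value pairs.
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def rank_features(columns, labels):
    """Return (name, score) pairs sorted by decreasing relevance to the class."""
    scores = {name: symmetric_uncertainty(col, labels)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): feature "a" copies the class, "b" is independent of it.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
ranking = rank_features(columns, labels)
print(ranking)
```

In this toy case the feature that mirrors the class is ranked first and the independent feature last. A pure filter method of this kind scores features without training a classifier, which is where the lower computational cost mentioned in the abstract comes from.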

Index Terms

Computer Science
Information Sciences

Keywords

Feature Subset Selection
Symmetric Uncertainty
Ant Colony Optimization
Classification