Research Article

Learning and Optimizing the Features with Genetic Algorithms

by Dr. E. Chandra, K. Nandhini
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 9 - Number 6
Year of Publication: 2010
Authors: Dr. E. Chandra, K. Nandhini
DOI: 10.5120/1393-1877

Dr. E. Chandra, K. Nandhini. Learning and Optimizing the Features with Genetic Algorithms. International Journal of Computer Applications. 9, 6 (November 2010), 1-5. DOI=10.5120/1393-1877

@article{ 10.5120/1393-1877,
author = { Dr. E. Chandra and K. Nandhini },
title = { Learning and Optimizing the Features with Genetic Algorithms },
journal = { International Journal of Computer Applications },
issue_date = { November 2010 },
volume = { 9 },
number = { 6 },
month = { November },
year = { 2010 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume9/number6/1393-1877/ },
doi = { 10.5120/1393-1877 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:57:53.490123+05:30
%A Dr.E.Chandra
%A K. Nandhini
%T Learning and Optimizing the Features with Genetic Algorithms
%J International Journal of Computer Applications
%@ 0975-8887
%V 9
%N 6
%P 1-5
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The quality of the data being analyzed is a critical factor in the accuracy of data mining algorithms. Two aspects of data quality matter most: relevance and redundancy. Including irrelevant or redundant features in a data mining model results in poor predictions and high computational overhead. Feature extraction derives new features from the original ones, with the aim of reducing the computational cost of feature measurement, increasing classifier efficiency, and allowing greater classification accuracy. This paper presents an approach for classifying students in order to predict their final grades based on features extracted from logged data in an educational web-based system. A combination of multiple classifiers leads to a significant improvement in classification performance. By weighting the feature vectors according to feature importance using a Genetic Algorithm (GA), we can optimize the prediction accuracy and obtain a marked improvement over raw classification. We further show that when the number of features is small, feature weighting and transformation into a new space works efficiently compared to feature subset selection. This approach is easily adaptable to different types of courses and different population sizes, and allows different features to be analyzed.
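
The idea of weighting features with a GA can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 1-nearest-neighbour classifier, the toy data generator `make_toy_data` (one informative feature plus two noise features, standing in for the student log features), the function names, and the GA settings (population size, truncation selection, uniform crossover, Gaussian mutation). The fitness of a weight vector is simply the classification accuracy it yields on a validation set.

```python
import random


def weighted_1nn_accuracy(weights, train, test):
    """Accuracy of a 1-nearest-neighbour classifier under a weighted squared distance."""
    correct = 0
    for x, label in test:
        nearest = min(
            train,
            key=lambda item: sum(
                w * (a - b) ** 2 for w, a, b in zip(weights, x, item[0])
            ),
        )
        correct += nearest[1] == label
    return correct / len(test)


def evolve_weights(train, valid, n_features, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Search for feature weights in [0, 1] that maximize validation accuracy."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    # Seed the population with uniform weights (the unweighted baseline).
    population[0] = [1.0] * n_features

    def fitness(w):
        return weighted_1nn_accuracy(w, train, valid)

    for _ in range(generations):
        # Truncation selection with elitism: the better half survives unchanged.
        elite = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            # Gaussian mutation, clipped back into [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)


def make_toy_data(n, rng):
    """Hypothetical data: feature 0 tracks the label, features 1-2 are noise."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = label + rng.gauss(0, 0.3)
        noise = [rng.uniform(-3, 3) for _ in range(2)]
        data.append(([informative] + noise, label))
    return data


rng = random.Random(42)
train, valid = make_toy_data(40, rng), make_toy_data(40, rng)
baseline = weighted_1nn_accuracy([1.0, 1.0, 1.0], train, valid)
best = evolve_weights(train, valid, n_features=3)
tuned = weighted_1nn_accuracy(best, train, valid)
print("unweighted accuracy:", baseline)
print("GA-weighted accuracy:", tuned)
print("weights:", [round(w, 2) for w in best])
```

Because the uniform weight vector is part of the initial population and the best individuals are carried over unchanged, the tuned validation accuracy in this sketch can never fall below the unweighted baseline. An honest estimate of the improvement would still require a separate held-out test set, since the weights are selected on the validation data.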

Index Terms

Computer Science
Information Sciences

Keywords

Feature Subset Selection, Student repository, Classification, Rule Generations