Research Article

Mining Comprehensible and Interesting Rules: A Genetic Algorithm Approach

by Jyoti Vashishtha, Dharminder Kumar, Saroj Ratnoo, Kapila Kundu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 31 - Number 1
Year of Publication: 2011
10.5120/3792-5221

Jyoti Vashishtha, Dharminder Kumar, Saroj Ratnoo, Kapila Kundu. Mining Comprehensible and Interesting Rules: A Genetic Algorithm Approach. International Journal of Computer Applications. 31, 1 (October 2011), 39-47. DOI=10.5120/3792-5221

@article{ 10.5120/3792-5221,
author = { Jyoti Vashishtha and Dharminder Kumar and Saroj Ratnoo and Kapila Kundu },
title = { Mining Comprehensible and Interesting Rules: A Genetic Algorithm Approach },
journal = { International Journal of Computer Applications },
issue_date = { October 2011 },
volume = { 31 },
number = { 1 },
month = { October },
year = { 2011 },
issn = { 0975-8887 },
pages = { 39-47 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume31/number1/3792-5221/ },
doi = { 10.5120/3792-5221 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:17:01.826775+05:30
%A Jyoti Vashishtha
%A Dharminder Kumar
%A Saroj Ratnoo
%A Kapila Kundu
%T Mining Comprehensible and Interesting Rules: A Genetic Algorithm Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 31
%N 1
%P 39-47
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most contributions in the domain of rule mining overemphasize maximizing the predictive accuracy of the discovered patterns. User-oriented criteria such as comprehensibility and interestingness have been given secondary importance. Recently, it has been widely acknowledged that even highly accurate discovered knowledge might be worthless if it scores low on the qualitative parameters of comprehensibility and interestingness. This paper presents a classification algorithm based on an evolutionary approach that discovers comprehensible and interesting rules in CNF form, in which the attributes are joined by conjunction and the values of each attribute are joined by disjunction. A flexible encoding scheme, genetic operators with appropriate syntactic constraints, and a suitable fitness function to measure the goodness of rules are proposed for effective evolution of rule sets. The proposed genetic algorithm is validated on several datasets from the UCI repository, and the experimental results indicate lower error rates and greater comprehensibility across a range of datasets. Some of the rules reveal interesting and valuable nuggets of knowledge discovered from small disjuncts of high accuracy and low support, which are very difficult to capture otherwise.
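
To make the CNF rule form concrete, the following is a minimal Python sketch of how such a rule could be represented and scored. It is an illustration only and not the paper's encoding or fitness function, which the abstract does not spell out: the dictionary representation, the names `covers` and `support_and_confidence`, and the toy weather records are all assumptions made for this example. Support and confidence are used here simply as familiar rule-quality measures.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation of a CNF antecedent: attribute -> allowed values.
# Attributes are ANDed together; the values listed for one attribute are ORed.
Antecedent = Dict[str, Set[str]]
Record = Dict[str, str]


def covers(antecedent: Antecedent, record: Record) -> bool:
    """Return True if every listed attribute takes one of its allowed values."""
    return all(record.get(attr) in allowed for attr, allowed in antecedent.items())


def support_and_confidence(
    antecedent: Antecedent,
    target_class: str,
    records: List[Record],
    class_key: str = "class",
) -> Tuple[float, float]:
    """Compute support and confidence of the rule 'antecedent -> target_class'."""
    covered = [r for r in records if covers(antecedent, r)]
    correct = [r for r in covered if r[class_key] == target_class]
    support = len(correct) / len(records) if records else 0.0
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# Toy records, made up for illustration (not taken from any UCI dataset).
records = [
    {"outlook": "sunny", "wind": "weak", "class": "play"},
    {"outlook": "overcast", "wind": "weak", "class": "play"},
    {"outlook": "rain", "wind": "strong", "class": "stay"},
    {"outlook": "sunny", "wind": "strong", "class": "stay"},
]

# IF (outlook = sunny OR overcast) AND (wind = weak) THEN class = play
rule = {"outlook": {"sunny", "overcast"}, "wind": {"weak"}}
support, confidence = support_and_confidence(rule, "play", records)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Allowing several values per attribute lets one rule stand in for what would otherwise be several purely conjunctive rules, which is one plausible reading of why the CNF form helps comprehensibility.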

Index Terms

Computer Science
Information Sciences

Keywords

Comprehensibility, interestingness, classification rules, genetic algorithm