Research Article

Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA

by Nikita Patel, Saurabh Upadhyay
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 12
Year of Publication: 2012
Authors: Nikita Patel, Saurabh Upadhyay
DOI: 10.5120/9744-4304

Nikita Patel, Saurabh Upadhyay. Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA. International Journal of Computer Applications. 60, 12 (December 2012), 20-25. DOI=10.5120/9744-4304

@article{ 10.5120/9744-4304,
author = { Nikita Patel and Saurabh Upadhyay },
title = { Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA },
journal = { International Journal of Computer Applications },
issue_date = { December 2012 },
volume = { 60 },
number = { 12 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 20-25 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume60/number12/9744-4304/ },
doi = { 10.5120/9744-4304 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:07:50.389687+05:30
%A Nikita Patel
%A Saurabh Upadhyay
%T Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA
%J International Journal of Computer Applications
%@ 0975-8887
%V 60
%N 12
%P 20-25
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Classification is an important problem in data mining. Given a data set, a classifier generates a meaningful description for each class. Decision trees are among the most effective and widely used classification methods, and several algorithms exist for inducing them. A tree is first induced and then, in a subsequent pruning phase, subtrees are removed to improve accuracy and prevent overfitting. This paper discusses various pruning methods and their features, and evaluates the effectiveness of pruning. Accuracy and tree size are measured on the diabetes and glass datasets under various pruning factors.
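
To make the idea of a post-pruning phase concrete, the following is a minimal sketch of reduced-error pruning, one well-known post-pruning method. It is an illustration only: the abstract does not state which methods the paper covers or how WEKA implements them, and the `Node` class, the hand-built tree, and the validation rows are all hypothetical. The sketch reports tree size and accuracy before and after pruning, the two quantities the abstract says are measured.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Row = Tuple[List[float], int]  # (feature vector, class label)


@dataclass
class Node:
    # Hypothetical tree node: a leaf when `feature` is None, otherwise an
    # internal node that sends x left when x[feature] <= threshold.
    label: int  # majority class of the training rows that reached this node
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node: Node, x: List[float]) -> int:
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def size(node: Node) -> int:
    # Total number of nodes (internal nodes plus leaves).
    if node.feature is None:
        return 1
    return 1 + size(node.left) + size(node.right)


def accuracy(node: Node, rows: List[Row]) -> float:
    return sum(predict(node, x) == y for x, y in rows) / len(rows)


def prune(node: Node, rows: List[Row]) -> Node:
    # Reduced-error pruning, bottom-up: replace a subtree with a leaf when
    # the leaf makes no more errors than the subtree on the validation rows
    # that reach it. A subtree reached by no validation rows is also pruned.
    if node.feature is None:
        return node
    left_rows = [(x, y) for x, y in rows if x[node.feature] <= node.threshold]
    right_rows = [(x, y) for x, y in rows if x[node.feature] > node.threshold]
    node.left = prune(node.left, left_rows)
    node.right = prune(node.right, right_rows)
    subtree_errors = sum(predict(node, x) != y for x, y in rows)
    leaf_errors = sum(node.label != y for _, y in rows)
    if leaf_errors <= subtree_errors:
        return Node(label=node.label)
    return node


def demo() -> Tuple[int, float, int, float]:
    # Hand-built, deliberately overfitted tree: the split on feature 1 under
    # the left branch is assumed to have been fitted to noise.
    tree = Node(
        label=0, feature=0, threshold=5.0,
        left=Node(label=0, feature=1, threshold=2.0,
                  left=Node(label=0), right=Node(label=1)),
        right=Node(label=1),
    )
    # Hypothetical held-out validation rows.
    validation = [
        ([1, 1], 0), ([2, 3], 0), ([3, 4], 0), ([4, 1], 0),
        ([6, 1], 1), ([7, 3], 1), ([8, 2], 1), ([9, 5], 1),
    ]
    before = (size(tree), accuracy(tree, validation))
    pruned = prune(tree, validation)
    after = (size(pruned), accuracy(pruned, validation))
    return before + after


print(demo())
```

In this toy case the spurious subtree is collapsed into a single leaf, so the tree shrinks while validation accuracy rises. On real data such as the diabetes and glass datasets the trade-off between size and accuracy is less clear-cut, which is what an empirical comparison is meant to quantify.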

Index Terms

Computer Science
Information Sciences

Keywords

Attribute Selection Measures
Decision tree
Post pruning
Pre pruning