Research Article

Comparison of Performance of Decision Tree Algorithms and Random Forest: An Application on OECD Countries Health Expenditures

by Songul Cinaroglu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 138 - Number 1
Year of Publication: 2016
Authors: Songul Cinaroglu
DOI: 10.5120/ijca2016908704

Songul Cinaroglu. Comparison of Performance of Decision Tree Algorithms and Random Forest: An Application on OECD Countries Health Expenditures. International Journal of Computer Applications. 138, 1 (March 2016), 37-41. DOI=10.5120/ijca2016908704

@article{ 10.5120/ijca2016908704,
author = { Songul Cinaroglu },
title = { Comparison of Performance of Decision Tree Algorithms and Random Forest: An Application on OECD Countries Health Expenditures },
journal = { International Journal of Computer Applications },
issue_date = { March 2016 },
volume = { 138 },
number = { 1 },
month = { March },
year = { 2016 },
issn = { 0975-8887 },
pages = { 37-41 },
numpages = { 5 },
url = { https://ijcaonline.org/archives/volume138/number1/24346-2016908704/ },
doi = { 10.5120/ijca2016908704 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

Decision trees and Random Forest are among the most popular machine learning techniques. C4.5, an extension of the ID3 algorithm, and CART are two of the most commonly used algorithms for generating decision trees. Random Forest, which constructs a large number of trees, is another useful technique for solving both classification and regression problems. This study compares the classification performance of two decision tree algorithms (C4.5, CART) and a Random Forest built with 50 trees. The data come from OECD countries' health expenditures for the year 2011. AUC and ROC curve graphs were used for the performance comparison. Experimental results show that Random Forest achieved the best classification accuracy (AUC = 0.98), compared with CART (AUC = 0.95) and C4.5 (AUC = 0.90). Future studies should focus more on performance comparisons of different machine learning techniques using several datasets and different hyperparameter optimization techniques.
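The comparison described in the abstract can be illustrated with a minimal sketch. This is not the author's code or data: a synthetic binary-labelled dataset stands in for the 2011 OECD health-expenditure data, and scikit-learn's CART-style DecisionTreeClassifier is used in place of the original C4.5/CART implementations; only the 50-tree forest and the AUC-based comparison follow the abstract.

# Minimal sketch (assumptions noted above): compare a single decision tree with a
# 50-tree Random Forest by test-set AUC, mirroring the abstract's evaluation setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the OECD health-expenditure features and binary class label.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Decision tree (CART-style)": DecisionTreeClassifier(random_state=0),
    "Random Forest (50 trees)": RandomForestClassifier(n_estimators=50, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    # AUC computed from predicted class-1 probabilities, as in an ROC/AUC comparison.
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")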

Index Terms

Computer Science
Information Sciences

Keywords

Pattern Recognition, Machine Learning, Decision Trees, Health Expenditures