Research Article

Analysis of Classification Techniques for Efficient Disease Prediction

by N. Sandhya, M. M. Sharanya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 155 - Number 8
Year of Publication: 2016
Authors: N. Sandhya, M. M. Sharanya
10.5120/ijca2016912388

N. Sandhya, M. M. Sharanya. Analysis of Classification Techniques for Efficient Disease Prediction. International Journal of Computer Applications. 155, 8 (Dec 2016), 20-24. DOI=10.5120/ijca2016912388

@article{ 10.5120/ijca2016912388,
author = { N. Sandhya and M. M. Sharanya },
title = { Analysis of Classification Techniques for Efficient Disease Prediction },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2016 },
volume = { 155 },
number = { 8 },
month = { Dec },
year = { 2016 },
issn = { 0975-8887 },
pages = { 20-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume155/number8/26625-2016912388/ },
doi = { 10.5120/ijca2016912388 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:00:44.294620+05:30
%A N. Sandhya
%A M. M. Sharanya
%T Analysis of Classification Techniques for Efficient Disease Prediction
%J International Journal of Computer Applications
%@ 0975-8887
%V 155
%N 8
%P 20-24
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining plays an important role in processing large volumes of data. It refers to the process of obtaining knowledge from raw data. Classification is one of the most widely used data mining techniques; it employs a set of pre-classified samples to develop a model called a classifier. The major goal of classification is to predict the target class accurately for each case in the data. Several studies have shown that the C4.5 algorithm needs to be improved to maximize accuracy and handle large amounts of data; C5.0 is the improved version. The main objective of this research work is to predict diseases using classification algorithms such as decision trees, C5.0 and Bayesian networks. The performance of the classification algorithms is compared on two datasets, Breast cancer and Heart disease. The experimental results are compared on performance parameters such as dataset scalability, accuracy and error rate. The results show that, in terms of scalability, the Bayesian network algorithm has a higher accuracy rate and a lower error rate than the C5.0 algorithm.
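The two headline metrics named in the abstract, accuracy and error rate, can be illustrated with a minimal Python sketch. The function name `accuracy_and_error_rate`, the labels and the two "classifiers" are hypothetical and invented for illustration only; they are not the paper's data, models or results. The sketch assumes that error rate is defined as one minus accuracy.

```python
from typing import Sequence, Tuple


def accuracy_and_error_rate(y_true: Sequence[str], y_pred: Sequence[str]) -> Tuple[float, float]:
    """Return (accuracy, error_rate) for paired true and predicted labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    # Count the cases where the predicted class matches the true class.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    # Assumption: error rate is the complement of accuracy.
    return accuracy, 1.0 - accuracy


# Hypothetical labels, invented for illustration (not results from the paper).
y_true = ["sick", "healthy", "sick", "healthy", "sick"]
pred_a = ["sick", "healthy", "healthy", "healthy", "sick"]  # hypothetical classifier A
pred_b = ["sick", "sick", "healthy", "healthy", "sick"]     # hypothetical classifier B

for name, pred in [("A", pred_a), ("B", pred_b)]:
    acc, err = accuracy_and_error_rate(y_true, pred)
    print(f"classifier {name}: accuracy={acc:.2f}, error rate={err:.2f}")
```

Evaluating both classifiers on the same held-out labels, as in the loop above, is what makes their accuracy and error rate directly comparable.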

Index Terms

Computer Science
Information Sciences

Keywords

Classification, C5.0, Bayesian Networks, Decision tree, Disease rules, Disease Prediction