Research Article

Comparative of Data Mining Classification Algorithm (CDMCA) in Diabetes Disease Prediction

by V. Karthikeyani, I. Parvin Begum, K. Tajudin, I. Shahina Begam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 12
Year of Publication: 2012
10.5120/9745-4307

V. Karthikeyani, I. Parvin Begum, K. Tajudin, I. Shahina Begam. Comparative of Data Mining Classification Algorithm (CDMCA) in Diabetes Disease Prediction. International Journal of Computer Applications. 60, 12 (December 2012), 26-31. DOI=10.5120/9745-4307

@article{ 10.5120/9745-4307,
author = { V. Karthikeyani and I. Parvin Begum and K. Tajudin and I. Shahina Begam },
title = { Comparative of Data Mining Classification Algorithm (CDMCA) in Diabetes Disease Prediction },
journal = { International Journal of Computer Applications },
issue_date = { December 2012 },
volume = { 60 },
number = { 12 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 26-31 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume60/number12/9745-4307/ },
doi = { 10.5120/9745-4307 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:07:20.266790+05:30
%A V. Karthikeyani
%A I. Parvin Begum
%A K. Tajudin
%A I. Shahina Begam
%T Comparative of Data Mining Classification Algorithm (CDMCA) in Diabetes Disease Prediction
%J International Journal of Computer Applications
%@ 0975-8887
%V 60
%N 12
%P 26-31
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is an iterative process in which progress is defined by discovery, through either automatic or manual methods. This paper applies data mining concepts in CDMCA, which distinguishes two types of classification: supervised and unsupervised. It illustrates supervised data mining classification algorithms on a diabetes disease dataset, covering cases in which plasma glucose is at least a specified value. The research presents an algorithmic discussion of C4.5, SVM, K-NN, PNN, BLR, MLR, CRT, CS-CRT, PLS-DA and PLS-LDA. The algorithms are compared on computing time, precision value, and error rate under 10-fold cross-validation; the error analysis considers true positives, true negatives, false positives, false negatives, and accuracy. The CS-CRT algorithm gives the best outcome. The best results are achieved using the Tanagra tool, a data mining software suite. Accuracy is calculated as the sum of true positives and true negatives divided by the total number of cases.
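
The accuracy definition in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' Tanagra workflow: the function names and the small label lists are hypothetical, and the precision formula TP / (TP + FP) is the standard definition, assumed here because the abstract does not spell it out.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted diabetic, actually diabetic
            else:
                fp += 1  # predicted diabetic, actually healthy
        else:
            if actual == positive:
                fn += 1  # predicted healthy, actually diabetic
            else:
                tn += 1  # predicted healthy, actually healthy
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """(TP + TN) divided by all cases, as described in the abstract."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Standard precision, TP / (TP + FP); an assumption, see lead-in."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def error_rate(tp, tn, fp, fn):
    """Share of misclassified cases, i.e. 1 - accuracy."""
    return 1.0 - accuracy(tp, tn, fp, fn)


# Hypothetical labels for illustration (1 = diabetic, 0 = not diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)               # 3 3 1 1
print(accuracy(tp, tn, fp, fn))     # 0.75
print(precision(tp, fp))            # 0.75
print(error_rate(tp, tn, fp, fn))   # 0.25
```

Under 10-fold cross-validation, the same counts would be accumulated over the ten held-out folds before the ratios are taken.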

Index Terms

Computer Science
Information Sciences

Keywords

C4.5, SVM, K-NN, PNN, BLR, MLR, CRT, CS-CRT, PLS-DA, PLS-LDA, Classification based on CT, Precision value, CV error rate, Accuracy