International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 45 - Number 12
Year of Publication: 2012
Authors: Asha Gowda Karegowda, Punya V, M. A. Jayaram, A. S. Manjunath
DOI: 10.5120/6836-9460
Asha Gowda Karegowda, Punya V, M. A. Jayaram, A. S. Manjunath. Rule based Classification for Diabetic Patients using Cascaded K-Means and Decision Tree C4.5. International Journal of Computer Applications. 45, 12 (May 2012), 45-50. DOI=10.5120/6836-9460
Medical data mining is the process of extracting hidden patterns from medical data. This paper presents a hybrid model for classifying the Pima Indian diabetic database (PIDD). The model consists of two stages. In the first stage, K-means clustering is used to identify and eliminate incorrectly classified instances. The continuous data are converted to categorical form using approximate interval widths chosen on the advice of a medical expert. In the second stage, a fine-tuned classification is performed with the decision tree C4.5 on the correctly clustered instances from the first stage. Experimental results indicate that cascading K-means clustering with decision tree C4.5 improves on the classification accuracy of C4.5 alone. Further, the rules generated by the cascaded C4.5 tree with categorical data are fewer in number and easier to interpret than the rules generated by C4.5 alone with continuous data. The proposed cascaded model with categorical data achieved a classification accuracy of 93.33%, compared with 73.62% for C4.5 alone on the Pima Indian diabetic dataset.
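
The first stage and the discretization step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that an instance counts as "correctly clustered" when its class label matches the majority label of its K-means cluster, which is one plausible reading of the abstract. The function names, the toy records, and the cut points are all hypothetical; they are not taken from PIDD or from the medical expert's intervals. The second stage (training C4.5 on the retained, discretized instances) is not shown.

```python
from bisect import bisect_right
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Farthest-first initialisation is an assumption made here only to
    keep the sketch deterministic.
    """
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(tuple(far))
    assign = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assign == assign:
            break  # converged: assignments no longer change
        assign = new_assign
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def correctly_clustered(points, labels, k=2):
    """Indices of instances whose label equals their cluster's majority label.

    Assumed interpretation of 'correctly clustered'; the abstract does not
    spell out the criterion.
    """
    assign = kmeans(points, k)
    majority = {
        c: Counter(l for l, a in zip(labels, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if l == majority[a]]


def to_category(value, cut_points, names):
    """Map a continuous value to a named interval (len(names) == len(cut_points) + 1)."""
    return names[bisect_right(cut_points, value)]


# Made-up two-feature records and labels (NOT PIDD data); the last two
# records carry a label that disagrees with their cluster's majority.
points = [(85, 22), (90, 24), (95, 23), (100, 25), (160, 35),
          (170, 36), (165, 34), (155, 33), (92, 23), (162, 35)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

keep = correctly_clustered(points, labels)
print(keep)  # [0, 1, 2, 3, 4, 5, 6, 7]

# Hypothetical cut points, chosen only for illustration.
print([to_category(points[i][0], [100, 140], ["low", "medium", "high"]) for i in keep])
```

In this sketch the features are clustered unscaled; with real attributes on very different ranges, some normalization before K-means would likely be needed. The retained, categorized instances are what a second-stage C4.5 learner would be trained on.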