Research Article

Improved K-mean Clustering Algorithm for Prediction Analysis using Classification Technique in Data Mining

by Arpit Bansal, Mayur Sharma, Shalini Goel
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 157 - Number 6
Year of Publication: 2017
DOI: 10.5120/ijca2017912719

Arpit Bansal, Mayur Sharma, Shalini Goel. Improved K-mean Clustering Algorithm for Prediction Analysis using Classification Technique in Data Mining. International Journal of Computer Applications. 157, 6 (Jan 2017), 35-40. DOI=10.5120/ijca2017912719

@article{ 10.5120/ijca2017912719,
author = { Arpit Bansal and Mayur Sharma and Shalini Goel },
title = { Improved K-mean Clustering Algorithm for Prediction Analysis using Classification Technique in Data Mining },
journal = { International Journal of Computer Applications },
issue_date = { Jan 2017 },
volume = { 157 },
number = { 6 },
month = { Jan },
year = { 2017 },
issn = { 0975-8887 },
pages = { 35-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume157/number6/26838-2017912719/ },
doi = { 10.5120/ijca2017912719 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:03:15.049454+05:30
%A Arpit Bansal
%A Mayur Sharma
%A Shalini Goel
%T Improved K-mean Clustering Algorithm for Prediction Analysis using Classification Technique in Data Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 157
%N 6
%P 35-40
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering is a technique used to analyze data efficiently and extract the required information. One clustering technique, k-means, is based on the selection of central points and the calculation of Euclidean distance. In k-means, the dataset is loaded, central points are selected from it using the Euclidean distance formula, and points are assigned to clusters on the basis of their Euclidean distance. The main disadvantage of k-means is accuracy, because the user must define the number of clusters. Because the number of clusters is user-defined, some points of the dataset remain unclustered. This work proposes an improvement to the k-means clustering algorithm that defines the number of clusters automatically and assigns the required cluster to unclustered points. The proposed improvement is expected to increase accuracy and reduce clustering time, with the members assigned to each cluster used to predict cancer.
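To make the baseline concrete, the following is a minimal sketch of standard k-means (Lloyd's iteration) with Euclidean-distance assignment. It is not the improved algorithm proposed in the paper, whose details are not given in this abstract. The function name `kmeans`, the parameters `n_iter` and `seed`, the random choice of initial central points, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Standard k-means sketch; the caller must supply k (the baseline's limitation)."""
    rng = np.random.default_rng(seed)
    # Assumption: initial central points are k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Euclidean distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the centers that are returned.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Toy data (hypothetical): two well-separated groups of three points each.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

Note that `k` is an input here: the sketch has no way to choose the number of clusters itself, which is the limitation the abstract identifies in the baseline.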

Index Terms

Computer Science
Information Sciences

Keywords

K-mean clustering, Prediction clustering, Classification, Hierarchical clustering