A New Efficient Approach towards k-means Clustering Algorithm

Pallavi Purohit; Ritesh Joshi

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 65 - Number 11

Year of Publication: 2013

Authors: Pallavi Purohit, Ritesh Joshi

10.5120/10966-6097

Pallavi Purohit, Ritesh Joshi. A New Efficient Approach towards k-means Clustering Algorithm. International Journal of Computer Applications. 65, 11 (March 2013), 7-10. DOI=10.5120/10966-6097

@article{10.5120/10966-6097,
  author = {Pallavi Purohit and Ritesh Joshi},
  title = {A New Efficient Approach towards k-means Clustering Algorithm},
  journal = {International Journal of Computer Applications},
  issue_date = {March 2013},
  volume = {65},
  number = {11},
  month = {March},
  year = {2013},
  issn = {0975-8887},
  pages = {7-10},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume65/number11/10966-6097/},
  doi = {10.5120/10966-6097},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:18:33.803131+05:30
%A Pallavi Purohit
%A Ritesh Joshi
%T A New Efficient Approach towards k-means Clustering Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 11
%P 7-10
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

K-means clustering algorithms are widely used in many practical applications. The original k-means algorithm selects its initial centroids and medoids randomly, which affects the quality of the resulting clusters and sometimes produces unstable or empty clusters that are meaningless. The original k-means algorithm is also computationally expensive: its running time is proportional to the product of the number of data items, the number of clusters, and the number of iterations. The new approach addresses these deficiencies of the existing k-means algorithm. It first computes the k initial centroids, with k chosen according to the user's requirements, and then produces better, more effective clusters without sacrificing accuracy. It generates stable clusters to improve accuracy. It also reduces the mean square error and improves the quality of clustering. We also applied our algorithm to the evaluation of students' academic performance, to help student counselors make effective decisions.
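To make the baseline concrete, the following is a minimal sketch of the original k-means procedure with random initial centroids, not the approach proposed in the paper (the abstract does not specify how the new initial centroids are computed). It counts distance evaluations to illustrate that the cost grows with the product of data items, clusters, and iterations, and it reports the mean square error. The function names `squared_distance` and `kmeans`, the `seed` parameter, and the toy data are assumptions made for illustration only.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Baseline k-means with randomly chosen initial centroids (illustrative)."""
    rng = random.Random(seed)
    points = [tuple(p) for p in points]
    centroids = rng.sample(points, k)  # random initialization: the step the paper criticizes
    distance_evals = 0
    iterations = 0
    labels = []
    for _ in range(max_iter):
        iterations += 1
        # Assignment step: n * k distance evaluations per iteration.
        labels = []
        for p in points:
            dists = [squared_distance(p, c) for c in centroids]
            distance_evals += k
            labels.append(dists.index(min(dists)))
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Empty cluster: keep the old centroid (one simple convention).
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    # Mean square error of points to their assigned centroids.
    mse = sum(
        squared_distance(p, centroids[lab]) for p, lab in zip(points, labels)
    ) / len(points)
    return centroids, labels, mse, iterations, distance_evals


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for seed in range(3):
        _, _, mse, iters, evals = kmeans(data, k=2, seed=seed)
        print(f"seed={seed} iterations={iters} distance_evals={evals} mse={mse:.3f}")
```

Running the sketch with several seeds shows how the number of iterations, and on harder data the final error, can depend on the random starting centroids, which is the instability the abstract describes.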

Index Terms

Computer Science

Information Sciences

Keywords

Cluster analysis, Centroids, K-mean