Research Article

An Alternative Technique of Selecting the Initial Cluster Centers in the k-means Algorithm for Better Clustering

by Sisir Kumar Rajbongshi, Anjana Kakoti Mahanta
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 67 - Number 7
Year of Publication: 2013
DOI: 10.5120/11409-6736

Sisir Kumar Rajbongshi, Anjana Kakoti Mahanta. An Alternative Technique of Selecting the Initial Cluster Centers in the k-means Algorithm for Better Clustering. International Journal of Computer Applications. 67, 7 (April 2013), 28-31. DOI=10.5120/11409-6736

@article{ 10.5120/11409-6736,
author = { Sisir Kumar Rajbongshi and Anjana Kakoti Mahanta },
title = { An Alternative Technique of Selecting the Initial Cluster Centers in the k-means Algorithm for Better Clustering },
journal = { International Journal of Computer Applications },
issue_date = { April 2013 },
volume = { 67 },
number = { 7 },
month = { April },
year = { 2013 },
issn = { 0975-8887 },
pages = { 28-31 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume67/number7/11409-6736/ },
doi = { 10.5120/11409-6736 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:26:31.982905+05:30
%A Sisir Kumar Rajbongshi
%A Anjana Kakoti Mahanta
%T An Alternative Technique of Selecting the Initial Cluster Centers in the k-means Algorithm for Better Clustering
%J International Journal of Computer Applications
%@ 0975-8887
%V 67
%N 7
%P 28-31
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Although k-means works well in many cases, it offers no accuracy guarantee and provides no principled way to select good initial cluster representatives. This article presents a technique in which the initial cluster representatives for the standard k-means algorithm are chosen intelligently. The quality of the clusters produced by standard k-means, k-means with Furthest-First initialization, and k-means with the proposed initialization technique is compared. Experimental results show that the proposed technique improves cluster quality in most cases.
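
To make the baseline concrete, here is a minimal sketch of one of the compared methods: k-means seeded by Furthest-First (farthest-first traversal). It is not the authors' proposed technique, which the abstract does not describe. The function names (`furthest_first_centers`, `kmeans`, `sse`), the random choice of the first center, and the use of the sum of squared errors as the quality measure are all assumptions made for illustration.

```python
import math
import random


def furthest_first_centers(points, k, seed=0):
    """Pick k initial centers by farthest-first traversal (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: the first center is a uniformly random data point.
    centers = [points[rng.randrange(len(points))]]
    # Distance from every point to its nearest center chosen so far.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx])) for p, d in zip(points, nearest)]
    return centers


def kmeans(points, centers, max_iter=100):
    """Standard k-means (Lloyd) iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in points
    ]
    return centers, labels


def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))


# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
init = furthest_first_centers(points, k=3)
centers, labels = kmeans(points, init)
quality = sse(points, centers, labels)
```

On well-separated data like this toy set, farthest-first seeding places one initial center in each group, so the iterations converge quickly. On data with outliers it tends to pick the outliers as centers, which is one reason alternative initialization schemes are studied.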

Index Terms

Computer Science
Information Sciences

Keywords

Cluster representative, cluster quality, Furthest-First technique, centroid