A Deterministic K-means Algorithm based on Nearest Neighbor Search

Omar Kettani; Benaissa Tadili; Faycal Ramdani

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 63 - Number 15

Year of Publication: 2013

Authors: Omar Kettani, Benaissa Tadili, Faycal Ramdani

10.5120/10544-5541

Omar Kettani, Benaissa Tadili, Faycal Ramdani. A Deterministic K-means Algorithm based on Nearest Neighbor Search. International Journal of Computer Applications. 63, 15 (February 2013), 33-37. DOI=10.5120/10544-5541

@article{ 10.5120/10544-5541,
author = { Omar Kettani and Benaissa Tadili and Faycal Ramdani },
title = { A Deterministic K-means Algorithm based on Nearest Neighbor Search },
journal = { International Journal of Computer Applications },
issue_date = { February 2013 },
volume = { 63 },
number = { 15 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 33-37 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume63/number15/10544-5541/ },
doi = { 10.5120/10544-5541 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:14:25.739406+05:30
%A Omar Kettani
%A Benaissa Tadili
%A Faycal Ramdani
%T A Deterministic K-means Algorithm based on Nearest Neighbor Search
%J International Journal of Computer Applications
%@ 0975-8887
%V 63
%N 15
%P 33-37
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

In data mining, the k-means algorithm is among the most widely used methods for solving clustering problems because of its simplicity and performance. However, one of its main drawbacks is that its accuracy and performance are sensitive to the initial choice of cluster centers, which are generated randomly. To overcome this drawback, we propose a simple deterministic method based on nearest neighbor search and the k-means procedure in order to improve clustering results. Experimental results on various data sets reveal that the proposed method is more accurate than the standard k-means algorithm.
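The abstract does not spell out the proposed procedure, so the following is only a minimal sketch of the problem it addresses, not the authors' method. It runs standard k-means (Lloyd iterations) from several random initializations and from one deterministic initialization, and compares the resulting sum of squared errors (SSE). The deterministic seeding used here is farthest-first (maximin) selection, a swapped-in technique that relies on nearest-center distance queries. The function names (`lloyd`, `random_init`, `maximin_init`), the synthetic three-blob data set, and all parameter values are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Standard k-means (Lloyd) iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def random_init(points, k, seed):
    """Random seeding: k distinct data points chosen by a seeded generator."""
    return random.Random(seed).sample(points, k)


def maximin_init(points, k):
    """Deterministic farthest-first seeding (a stand-in, NOT the paper's method)."""
    mean = tuple(sum(x) / len(points) for x in zip(*points))
    centers = [min(points, key=lambda p: sq_dist(p, mean))]
    while len(centers) < k:
        # Next seed: the point whose nearest chosen center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


# Hypothetical data: three well-separated 2-D blobs of 20 points each.
rng = random.Random(0)
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in [(0, 0), (10, 0), (5, 9)] for _ in range(20)]

random_sses = [lloyd(points, random_init(points, 3, seed))[1] for seed in range(10)]
det_centers, det_sse = lloyd(points, maximin_init(points, 3))

print("random init SSE range:", min(random_sses), "to", max(random_sses))
print("deterministic init SSE:", det_sse)
```

Because the deterministic seeding involves no random draws, repeated runs on the same data give the same clustering, whereas the randomly seeded runs may end in different local optima depending on the seed.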

Index Terms

Computer Science

Information Sciences

Keywords

Nearest Neighbor Search Initial Centroid K-means Clustering Algorithm