International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 83 - Number 15
Year of Publication: 2013
Authors: Geet Singhal, Shipra Panwar, Kanika Jain, Devender Banga
DOI: 10.5120/14528-2927
Geet Singhal, Shipra Panwar, Kanika Jain, Devender Banga. A Comparative Study of Data Clustering Algorithms. International Journal of Computer Applications. 83, 15 (December 2013), 41-46. DOI=10.5120/14528-2927
Data clustering is the process of partitioning data points into meaningful clusters such that points within a cluster are similar to one another and points in different clusters are dissimilar. It is an unsupervised approach to grouping data into patterns. In general, clustering algorithms fall into two categories: hard clustering, where each data object belongs to exactly one cluster, and soft clustering, where a data object can belong to several clusters. In this report we present a comparative study of three major data clustering algorithms, highlighting their merits and demerits: k-means, fuzzy c-means, and the K-NN clustering algorithm. Choosing an appropriate clustering algorithm depends on several factors; one, for example, is the size of the data to be partitioned.
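The distinction between hard and soft clustering can be illustrated with a minimal sketch. It is not taken from the paper: the function names `hard_assign` and `soft_memberships`, the fixed centroids, and the fuzzifier value `m = 2` are all assumptions chosen for illustration. The hard assignment picks the nearest centroid, as in the assignment step of k-means, while the soft assignment uses the standard fuzzy c-means membership formula.

```python
import math

def hard_assign(point, centroids):
    """Hard clustering: return the index of the single nearest centroid."""
    return min(range(len(centroids)),
               key=lambda j: math.dist(point, centroids[j]))

def soft_memberships(point, centroids, m=2.0):
    """Soft clustering: return fuzzy c-means style membership degrees.

    Each degree lies in [0, 1] and the degrees sum to 1.
    m is the fuzzifier (assumed 2.0 here); it must be greater than 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that centroid.
    if any(d == 0.0 for d in dists):
        zero = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [z / sum(zero) for z in zero]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exp for d_k in dists) for d_j in dists]

# Hypothetical data: two fixed centroids and one point between them.
centroids = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)

label = hard_assign(point, centroids)             # 0: one distinct cluster
memberships = soft_memberships(point, centroids)  # approximately [0.9, 0.1]
```

In the hard case the point is assigned to cluster 0 only. In the soft case it belongs to both clusters, with most of its membership in the nearer one.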