International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 3
Year of Publication: 2010
Authors: Srinivasulu Asadi, Dr Ch D V Subba Rao, V Saikrishna
DOI: 10.5120/1148-1503
Srinivasulu Asadi, Dr Ch D V Subba Rao, V Saikrishna. Finding the Number of Clusters in Unlabeled Datasets using Extended Dark Block Extraction. International Journal of Computer Applications. 7, 3 (September 2010), 1-4. DOI=10.5120/1148-1503
Cluster analysis is the problem of partitioning a set of objects O = {o1, …, on} into c self-similar subsets based on available data. In general, clustering of unlabeled data poses three major problems: 1) assessing cluster tendency, i.e., how many clusters to seek; 2) partitioning the data into c meaningful groups; and 3) validating the c clusters that are discovered. We address the first problem, i.e., determining the number of clusters c prior to clustering. Many clustering algorithms require the number of clusters as an input parameter, so the quality of the resulting clusters depends heavily on this value. Most existing methods are post-clustering measures of cluster validity, i.e., they attempt to choose the best partition from a set of alternative partitions.
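To make the contrast concrete, the post-clustering approach mentioned above can be sketched as follows: cluster the data once for each candidate c, score every resulting partition with a validity index, and keep the best-scoring c. This is only an illustration and not the method proposed in the paper. The choice of k-means, the silhouette index, the quantile-based initialization, the toy data, and all function names (`kmeans_1d`, `silhouette`, `choose_c`) are assumptions made for this example.

```python
from statistics import mean


def kmeans_1d(points, c, iters=50):
    """Lloyd's k-means on 1-D data (assumed clustering algorithm for this sketch)."""
    s = sorted(points)
    # Assumption: deterministic init, centers spread evenly over the sorted data.
    centers = [s[(2 * j + 1) * len(s) // (2 * c)] for j in range(c)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(c), key=lambda j: abs(p - centers[j])) for p in points]
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = []
        for j in range(c):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette width (assumed validity index); higher is better."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        same = [abs(p - q) for k, q in enumerate(points)
                if labels[k] == labels[i] and k != i]
        if not same:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = mean(same)  # mean distance to the point's own cluster
        b = min(        # mean distance to the nearest other cluster
            mean([abs(p - q) for q, lab in zip(points, labels) if lab == other])
            for other in clusters if other != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def choose_c(points, candidates):
    """Post-clustering selection: cluster for every c, keep the best-scoring c."""
    scores = {c: silhouette(points, kmeans_1d(points, c)) for c in candidates}
    return max(scores, key=scores.get), scores


# Toy data with three visually obvious groups (illustrative only).
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
best_c, scores = choose_c(points, range(2, 6))
print(best_c, scores)
```

Note that this strategy needs one full clustering run per candidate value of c, which is the cost that a pre-clustering estimate of c, as pursued in this paper, aims to avoid.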