International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 92 - Number 4
Year of Publication: 2014
Authors: Neelam Singh, Neha Garg, Janmejay Pant
DOI: 10.5120/15995-4844
Neelam Singh, Neha Garg, Janmejay Pant. A Comprehensive Study of Challenges and Approaches for Clustering High Dimensional Data. International Journal of Computer Applications. 92, 4 (April 2014), 7-10. DOI=10.5120/15995-4844
Clustering is one of the most effective methods for summarizing and analyzing datasets, which are collections of data objects that may be similar or dissimilar in nature. Clustering aims to find groups, or clusters, of objects with similar attributes. Most clustering methods work well for low-dimensional data because they rely on distance measures to quantify the dissimilarity between objects. High-dimensional data, however, may contain attributes that are not needed to define the clusters, and these irrelevant dimensions introduce noise that can hide the clusters to be found. Discovering groups of objects that are highly similar within some subset of relevant attributes therefore becomes an important but challenging task. In this paper we provide a short introduction to the challenges of, and various approaches to, clustering high-dimensional data.
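
The masking effect of irrelevant dimensions can be illustrated with a minimal sketch. It is not taken from the paper: the function name `separation_ratio`, the cluster centers, the noise ranges, and the sample sizes are all assumptions chosen for illustration. Two clusters are separated along a single relevant attribute, and a growing number of uniformly random attributes is appended. The ratio of the mean between-cluster distance to the mean within-cluster distance should fall toward 1 as noise dimensions are added, which is one way to see why distance-based methods struggle on the full attribute space.

```python
import math
import random


def separation_ratio(n_noise_dims, n_per_cluster=50, seed=0):
    """Mean between-cluster distance divided by mean within-cluster distance.

    Hypothetical setup: two clusters differ only in one relevant attribute
    (centers 0.0 and 1.0, standard deviation 0.1); every other attribute is
    uniform noise on [0, 1] and carries no cluster information.
    """
    rng = random.Random(seed)

    def make_point(center):
        relevant = [rng.gauss(center, 0.1)]
        noise = [rng.uniform(0.0, 1.0) for _ in range(n_noise_dims)]
        return relevant + noise

    cluster_a = [make_point(0.0) for _ in range(n_per_cluster)]
    cluster_b = [make_point(1.0) for _ in range(n_per_cluster)]

    # Euclidean distances between all pairs inside each cluster.
    within = []
    for cluster in (cluster_a, cluster_b):
        for i, p in enumerate(cluster):
            for q in cluster[i + 1:]:
                within.append(math.dist(p, q))

    # Euclidean distances between all pairs drawn from different clusters.
    between = [math.dist(p, q) for p in cluster_a for q in cluster_b]

    return (sum(between) / len(between)) / (sum(within) / len(within))


if __name__ == "__main__":
    for dims in (0, 10, 100):
        print(dims, round(separation_ratio(dims), 3))
```

A ratio well above 1 means the clusters are easy to tell apart by distance alone; a ratio close to 1 means that points in the other cluster are, on average, almost as near as points in the same cluster. Restricting the distance computation to the relevant attribute recovers the separation in this toy setting.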