International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 13 - Number 7
Year of Publication: 2011
Authors: D.Napoleon, S.Pavalakodi
DOI: 10.5120/1789-2471
D.Napoleon, S.Pavalakodi. A New Method for Dimensionality Reduction using K-Means Clustering Algorithm for High Dimensional Data Set. International Journal of Computer Applications. 13, 7 (January 2011), 41-46. DOI=10.5120/1789-2471
Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. Dimensionality reduction is the transformation of high-dimensional data into a meaningful representation of reduced dimensionality, ideally one that corresponds to the intrinsic dimensionality of the data. The K-Means clustering algorithm often does not work well on high-dimensional data. To improve its efficiency, PCA is applied to the original data set to obtain a reduced data set of linearly uncorrelated variables. In this paper, principal component analysis and a linear transformation are used for dimensionality reduction, initial centroids are computed, and these are then used in the K-Means clustering algorithm.
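
The pipeline described above (reduce with PCA, compute initial centroids, then run K-Means) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the initial centroids are computed, so the rule used here (sort the points by their first principal component, split them into k equal groups, and take each group's mean) is an assumption made for illustration. The function names `pca_reduce`, `initial_centroids`, `assign`, and `kmeans`, and the synthetic data, are likewise hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

def initial_centroids(Z, k):
    """ASSUMED heuristic (not taken from the paper): sort points by their
    first principal component, split into k groups, use each group's mean."""
    order = np.argsort(Z[:, 0])
    groups = np.array_split(order, k)
    return np.array([Z[idx].mean(axis=0) for idx in groups])

def assign(Z, centroids):
    """Return the index of the nearest centroid for every point."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(Z, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(max_iter):
        labels = assign(Z, centroids)
        # Keep the old centroid if a cluster happens to become empty.
        updated = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(Z, centroids), centroids

# Synthetic example: two well-separated groups in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 10)),
    rng.normal(5.0, 0.5, size=(50, 10)),
])
Z = pca_reduce(X, n_components=2)       # reduced data set
start = initial_centroids(Z, k=2)       # deterministic starting centroids
labels, centroids = kmeans(Z, start)
```

Starting from deterministic centroids instead of random ones makes the clustering reproducible across runs, and running the distance computations on the reduced data `Z` rather than on `X` lowers the cost of each iteration.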