A New Method for Dimensionality Reduction using K-Means Clustering Algorithm for High Dimensional Data Set

D.Napoleon; S.Pavalakodi

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 13 - Number 7

Year of Publication: 2011

Authors: D.Napoleon, S.Pavalakodi

DOI: 10.5120/1789-2471

D.Napoleon, S.Pavalakodi. A New Method for Dimensionality Reduction using K-Means Clustering Algorithm for High Dimensional Data Set. International Journal of Computer Applications. 13, 7 (January 2011), 41-46. DOI=10.5120/1789-2471

@article{10.5120/1789-2471,
  author = {D.Napoleon and S.Pavalakodi},
  title = {A New Method for Dimensionality Reduction using K-Means Clustering Algorithm for High Dimensional Data Set},
  journal = {International Journal of Computer Applications},
  issue_date = {January 2011},
  volume = {13},
  number = {7},
  month = {January},
  year = {2011},
  issn = {0975-8887},
  pages = {41-46},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume13/number7/1789-2471/},
  doi = {10.5120/1789-2471},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:02:08.172849+05:30
%A D.Napoleon
%A S.Pavalakodi
%T A New Method for Dimensionality Reduction using K-Means Clustering Algorithm for High Dimensional Data Set
%J International Journal of Computer Applications
%@ 0975-8887
%V 13
%N 7
%P 41-46
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. Dimensionality reduction is the transformation of high-dimensional data into a meaningful representation of reduced dimensionality that corresponds to the intrinsic dimensionality of the data. The K-means clustering algorithm often does not work well on high-dimensional data; to improve its efficiency, PCA is applied to the original data set to obtain a reduced data set containing possibly uncorrelated variables. In this paper, principal component analysis and a linear transformation are used for dimensionality reduction, the initial centroids are computed, and these are then supplied to the K-means clustering algorithm.
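The pipeline outlined in the abstract (reduce the data with PCA, compute initial centroids, run K-means) can be sketched roughly as follows. This is a minimal, dependency-free Python illustration and not the authors' implementation. The abstract does not say how the initial centroids are computed, so the sketch assumes a simple rule: sort the reduced points along the first principal component, split them into k nearly equal groups, and use each group's mean. The power iteration is only a stand-in for a library eigen-solver, and all function names are hypothetical.

```python
import math
import random


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    return [sum(col) / len(points) for col in zip(*points)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def pca_reduce(data, n_components, iters=500):
    """Project data (at least two rows) onto its leading principal components.

    Uses power iteration with deflation on the sample covariance matrix,
    a simple stand-in for a library eigen-solver.
    """
    n, d = len(data), len(data[0])
    mu = mean_point(data)
    centered = [[x - m for x, m in zip(row, mu)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)  # fixed seed keeps the sketch deterministic
    components = []
    for _ in range(n_components):
        v = [rng.random() - 0.5 for _ in range(d)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate: remove the found direction so the next one can emerge.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    # Linear transformation of each centered row into the reduced space.
    return [[sum(a * b for a, b in zip(row, c)) for c in components]
            for row in centered]


def initial_centroids(points, k):
    """ASSUMED rule (not taken from the paper): sort by the first coordinate,
    split into k nearly equal groups, and take each group's mean.
    Requires len(points) >= k."""
    ordered = sorted(points, key=lambda p: p[0])
    n = len(ordered)
    return [mean_point(ordered[i * n // k:(i + 1) * n // k]) for i in range(k)]


def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm started from the given centroids.

    Returns the labels from the last assignment step and the final centroids.
    """
    labels = []
    for _ in range(max_iters):
        labels = [min(range(len(centroids)),
                      key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(mean_point(members) if members else old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def pca_kmeans(data, n_components, k):
    """Reduce with PCA, seed centroids in the reduced space, then cluster."""
    reduced = pca_reduce(data, n_components)
    return kmeans(reduced, initial_centroids(reduced, k))
```

For example, `pca_kmeans(rows, n_components=2, k=3)` returns one cluster label per row together with the final centroids in the reduced space, where `rows` stands for any list of equal-length numeric lists with at least k rows. Seeding the centroids deterministically, rather than at random, makes repeated runs give the same clustering.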

Index Terms

Computer Science

Information Sciences

Keywords

Clustering, Dimensionality Reduction, Principal component analysis, k-means algorithm, Amalgamation