International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 11
Year of Publication: 2020
Authors: Sathyendranath Malli, Nagesh H. R., B. Dinesh Rao
DOI: 10.5120/ijca2020920605
Sathyendranath Malli, Nagesh H. R., B. Dinesh Rao. Approximation to the K-Means Clustering Algorithm using PCA. International Journal of Computer Applications. 175, 11 (Aug 2020), 43-46. DOI=10.5120/ijca2020920605
Healthcare is an emerging domain in which the volume of data grows exponentially. These massive datasets contain a wide variety of fields, which makes the information difficult to analyze. Clustering is a popular method for analyzing such data: the data is split into smaller clusters with similar properties, and each cluster is then analyzed. The K-means algorithm [1] is a well-known clustering technique. This paper introduces an efficient approximation to the K-means problem, targeted at large datasets, that reduces the number of features to one through Principal Component Analysis (PCA). The reduced data is then clustered in one dimension using the K-means algorithm. The intra-cluster RMS error of the modified algorithm is compared with that of the K-means algorithm in m dimensions and is found to be reasonable. The time taken by the modified algorithm is significantly less than that of the K-means algorithm.
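The approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes NumPy, a plain Lloyd-style K-means with random initialization, and an intra-cluster RMS error measured in the original m-dimensional space. The function names (project_to_first_pc, kmeans, intra_cluster_rms) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_to_first_pc(X):
    """Reduce m-dimensional rows of X to 1-D scores on the first principal component."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means (assumed variant); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).reshape(len(X), -1)  # accept 1-D input too
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def intra_cluster_rms(X, labels, k):
    """RMS distance of points to their own cluster mean, in the original space."""
    sq = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members):
            sq += ((members - members.mean(axis=0)) ** 2).sum()
    return float(np.sqrt(sq / len(X)))

# Synthetic data (an assumption, not from the paper): three blobs in m = 5 dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(200, 5)) for c in (0.0, 5.0, 10.0)])

labels_1d, _ = kmeans(project_to_first_pc(X), k=3)  # modified: PCA to 1-D, then K-means
labels_md, _ = kmeans(X, k=3)                        # baseline: K-means in m dimensions

print("RMS (PCA + 1-D K-means):", intra_cluster_rms(X, labels_1d, 3))
print("RMS (m-D K-means):      ", intra_cluster_rms(X, labels_md, 3))
```

Both label sets are scored against the original features, so the two RMS values are directly comparable. The 1-D variant computes each point-to-centroid distance over a single coordinate instead of m, which is one plausible source of the reported time saving, although the one-off cost of the PCA step has to be counted as well.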