International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 133 - Number 15
Year of Publication: 2016
Authors: Juhi Gupta, Aakanksha Mahajan
10.5120/ijca2016907945 |
Juhi Gupta, Aakanksha Mahajan. BPSO Optimized K-means Clustering Approach for Data Analysis. International Journal of Computer Applications. 133, 15 (January 2016), 9-14. DOI=10.5120/ijca2016907945
In data mining, K-means (KM) clustering is well known for its efficiency in clustering large data sets. The aim of grouping data points into clusters is to place similar items in the same cluster, so that objects within a cluster are as close to each other as possible (homogeneity) and objects in different clusters are as far apart as possible. However, the classical K-means algorithm has some flaws. First, it is sensitive to the choice of initial centroids and can easily become trapped in a local minimum of its objective, the sum of squared errors. Second, finding a global minimum of the sum of squared errors is NP-hard even when the number of clusters is 2 or the number of attributes per data point is 2, so finding the optimal clustering is believed to be computationally intractable. In this dissertation, the KM clustering problem is addressed with an optimized KM algorithm. The proposed algorithm, named BPSO, considers how to derive an optimization model for the minimum sum of squared errors of a given data set. Two evolutionary optimization algorithms, bacterial foraging optimization (BFO) and particle swarm optimization (PSO), are combined to optimize the KM algorithm so that the clustering result is more accurate than that of the basic KM algorithm. The F-measure is used to compare the basic K-means and BPSO algorithms.
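The sensitivity of basic K-means to its initial centroids can be illustrated with a minimal sketch. This is plain K-means only, not the BPSO algorithm proposed in the paper; the function names (`sse`, `assign`, `update`, `kmeans`) and the four-point data set are assumptions chosen for illustration.

```python
def sse(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centroids[label]))
        for point, label in zip(points, labels)
    )


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for point in points:
        dists = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in centroids
        ]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; an empty cluster keeps its old centroid."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans(points, init_centroids, max_iter=100):
    """Basic K-means: alternate assignment and update until the labels stop changing."""
    centroids = list(init_centroids)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels, sse(points, centroids, labels)


# Illustrative data (an assumption): the corners of a 4 x 1 rectangle, with k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Initial centroids that separate left from right reach the global minimum.
_, _, good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])

# Initial centroids that separate bottom from top are already a fixed point,
# so the algorithm stops at a much worse local minimum.
_, _, bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(good, bad)  # 1.0 16.0
```

Both runs converge, yet the second stops at a sum of squared errors sixteen times larger than the first. An optimizer that searches over centroid positions, such as the BFO and PSO combination described above, is aimed at avoiding this kind of outcome.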