International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 101 - Number 1
Year of Publication: 2014
Authors: Mugdha Jain, Chakradhar Verma |
DOI: 10.5120/17652-8457
Mugdha Jain, Chakradhar Verma. Adapting k-means for Clustering in Big Data. International Journal of Computer Applications. 101, 1 (September 2014), 19-24. DOI=10.5120/17652-8457
Big data, if used properly, can bring huge benefits to business, science and humanity. Its defining properties (volume, velocity, variety, variation and veracity) render existing data analysis techniques ineffective. Big data analysis therefore needs a fusion of data mining and machine learning techniques. The k-means algorithm is one algorithm that is used in both fields. This paper describes an approximate algorithm based on k-means. It is a novel method for big data analysis that is very fast, scalable and highly accurate. It overcomes one drawback of k-means, the uncertain number of iterations, by fixing the number of iterations without losing precision.
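To make the idea of a fixed iteration count concrete, the following is a minimal sketch of standard Lloyd-style k-means that runs a preset number of passes instead of looping until convergence. It is not the approximate algorithm proposed in the paper, which the abstract does not spell out. The function name `kmeans_fixed_iters`, its parameters and the toy data are assumptions made purely for illustration.

```python
import random


def kmeans_fixed_iters(points, k, n_iters=10, seed=0):
    """Plain k-means that performs exactly n_iters assign/update passes.

    Illustrative only: a generic baseline with a fixed iteration budget,
    not the method described in the paper.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)

    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centroids, labels = kmeans_fixed_iters(points, k=2, n_iters=5)
```

With a fixed `n_iters`, the running time is bounded in advance at roughly `n_iters * len(points) * k` distance computations. The trade-off is that the result may not have fully converged when the budget is too small for the data.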