International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 153 - Number 3
Year of Publication: 2016
Authors: Anju, Preeti Gulia
10.5120/ijca2016911994
Anju, Preeti Gulia. Clustering in Big Data: A Review. International Journal of Computer Applications. 153, 3 (Nov 2016), 44-47. DOI=10.5120/ijca2016911994
Big data[1] is a term for data sets that are so large or complex that traditional data processing[4] applications are inadequate. Accuracy in big data may lead to more confident decision making, and better decisions can result in greater operational efficiency, cost reduction and reduced risk. Various algorithms and techniques, such as classification, clustering, regression, artificial intelligence, neural networks, association rules, decision trees, genetic algorithms and the nearest neighbor method, are used for knowledge discovery from databases. A cluster is a group of objects that belong to the same class. In other words, similar objects are grouped in one cluster and dissimilar objects are grouped in another cluster. Clustering methods can be classified into partitioning methods, hierarchical methods and density-based methods. Cluster analysis is used in several applications, such as market research, pattern recognition and data analysis. K-means clustering is a well-known partitioning method, but it suffers from the empty-cluster problem: a cluster can end up with no objects assigned to it. The problems with existing systems[6] are analysis, capture, search, sharing, storage, transfer, visualization, and querying and updating. These problems can be reduced by using the proposed algorithm. This paper discusses clustering and the proposed algorithm.
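The empty-cluster problem mentioned in the abstract can be illustrated with a minimal sketch. The code below is not the algorithm proposed in the paper, which the abstract does not describe. It is a plain K-means loop with one common workaround, assumed here for illustration only: an empty cluster is given the point that lies farthest from its own centroid. The function names (`assign`, `kmeans`) and the sample data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def mean(members: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(coord) / n for coord in zip(*members))


def kmeans(points: Sequence[Point], init_centroids: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """K-means with a simple empty-cluster repair step (illustrative only)."""
    k = len(init_centroids)
    if len(points) < k:
        raise ValueError("need at least as many points as clusters")
    centroids = list(init_centroids)
    labels: List[int] = []
    for _ in range(max_iter):
        labels = assign(points, centroids)
        # Repair step (an assumption, not the paper's method): move the point
        # farthest from its own centroid into each empty cluster. Only points
        # in clusters with more than one member are eligible, so the repair
        # never empties another cluster.
        for j in range(k):
            if j not in labels:
                sizes = Counter(labels)
                candidates = [i for i in range(len(points)) if sizes[labels[i]] > 1]
                far = max(candidates,
                          key=lambda i: sq_dist(points[i], centroids[labels[i]]))
                labels[far] = j
        new_centroids = [
            mean([p for p, l in zip(points, labels) if l == j]) for j in range(k)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, labels


# Two tight groups, but three initial centroids; the third starts far away.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
init = [(0.0, 0.5), (10.0, 0.5), (100.0, 100.0)]

naive_labels = assign(points, init)   # cluster 2 receives no points
centroids, labels = kmeans(points, init)
print(naive_labels, labels)
```

With this initialization the plain assignment step leaves the third cluster empty, whereas the repaired loop returns three non-empty clusters. Other remedies exist, for example re-running with different initial centroids or reducing the number of clusters; the choice made here is only one option.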