Research Article

Clustering Algorithms for Huge Datasets: A Mathematical Approach

by Shyam Mohan J. S., Shanmugapriya P.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 181 - Number 49
Year of Publication: 2019
Authors: Shyam Mohan J. S., Shanmugapriya P.
10.5120/ijca2019918724

Shyam Mohan J. S., Shanmugapriya P. Clustering Algorithms for Huge Datasets: A Mathematical Approach. International Journal of Computer Applications. 181, 49 (Apr 2019), 58-62. DOI=10.5120/ijca2019918724

@article{ 10.5120/ijca2019918724,
author = { Shyam Mohan J. S. and Shanmugapriya P. },
title = { Clustering Algorithms for Huge Datasets: A Mathematical Approach },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2019 },
volume = { 181 },
number = { 49 },
month = { Apr },
year = { 2019 },
issn = { 0975-8887 },
pages = { 58-62 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume181/number49/30494-2019918724/ },
doi = { 10.5120/ijca2019918724 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:09:38.293789+05:30
%A Shyam Mohan J. S.
%A Shanmugapriya P.
%T Clustering Algorithms for Huge Datasets: A Mathematical Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 181
%N 49
%P 58-62
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Identifying clusters in huge datasets is useful for discovering the attributes of a dataset and thereby providing insights for effective decision making. In our previous work, we demonstrated the concept of clustering algorithms for huge datasets theoretically by applying small computations to the available datasets. In this paper, we extend that work by applying mathematical calculations to the datasets in order to prove the correctness of the earlier results. The proposed method is applied to various datasets, and the K-Means algorithm is proved mathematically. The experimental calculations performed on various clustering algorithms show that our approach offers a new view of clustering techniques that can be applied to any number of huge and complex datasets.
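The paper's own calculations are not reproduced on this page. As a point of reference only, the following is a minimal sketch of the standard K-Means procedure (Lloyd's algorithm) that the abstract refers to, assuming Euclidean distance and centroids initialised from randomly chosen data points. The function names `kmeans` and `sse` and the six toy points are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initial centroids drawn from the data
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the procedure has converged
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


def sse(points, centroids, assignment):
    """Within-cluster sum of squared Euclidean distances (the K-Means objective)."""
    return sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignment))


# Hypothetical toy data: two visually separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment, sse(points, centroids, assignment))
```

Each iteration can only lower or preserve the sum of squared distances, which is why the loop terminates once the assignments stop changing. On huge datasets the assignment step is the part usually distributed, for example with MapReduce as in several of the works listed in the references.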

References
  1. Jain, Anil K., M. Narasimha Murty, and Patrick J. Flynn. "Data clustering: a review." ACM computing surveys (CSUR) 31, no. 3 (1999): 264-323.
  2. Senthilnath, J., S. N. Omkar, and V. Mani. "Clustering using firefly algorithm: performance study." Swarm and Evolutionary Computation 1, no. 3 (2011): 164-171.
  3. Kanungo, Tapas, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu. "An efficient k-means clustering algorithm: Analysis and implementation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24, no. 7 (2002): 881-892.
  4. Shyam Mohan J. S. and Shanmugapriya P. "Clustering of Huge Datasets using Machine Intelligence Techniques." IJCA, Vol. 181, No. 18, September 2018.
  5. Robson L. F. Cordeiro et al. "Clustering Very Large Multi-dimensional Datasets with MapReduce." ACM KDD'11, August 21–24, 2011, San Diego, California, USA.
  6. Dongkuan Xu et al. "A Comprehensive Survey of Clustering Algorithms." Springer, Ann. Data. Sci. DOI 10.1007/s40745-015-0040-1.
  7. Max Bodoia. "MapReduce Algorithms for k-means Clustering."
  8. Nivranshu Hans et al. "Big Data Clustering Using Genetic Algorithm On Hadoop MapReduce." International Journal of Scientific & Technology Research, Volume 4, Issue 04, April 2015, ISSN 2277-8616.
  9. Sreedhar et al. "Clustering large datasets using K means modified inter and intra clustering (KMI2C) in Hadoop." Journal of Big Data, DOI 10.1186/s40537-017-0087-2, Springer 2017.
Index Terms

Computer Science
Information Sciences

Keywords

Machine Intelligence, Clustering Algorithms