K-Means Clustering of Cloud Data using Weka and R Language

Banshidhar Choudhary; Vipin Saxena

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 184 - Number 49

Year of Publication: 2023

Authors: Banshidhar Choudhary, Vipin Saxena

10.5120/ijca2023922613

Banshidhar Choudhary, Vipin Saxena. K-Means Clustering of Cloud Data using Weka and R Language. International Journal of Computer Applications. 184, 49 (Mar 2023), 33-39. DOI=10.5120/ijca2023922613

@article{ 10.5120/ijca2023922613,

author = { Banshidhar Choudhary, Vipin Saxena },

title = { K-Means Clustering of Cloud Data using Weka and R Language },

journal = { International Journal of Computer Applications },

issue_date = { Mar 2023 },

volume = { 184 },

number = { 49 },

month = { Mar },

year = { 2023 },

issn = { 0975-8887 },

pages = { 33-39 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume184/number49/32637-2023922613/ },

doi = { 10.5120/ijca2023922613 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-07T01:24:25.210530+05:30

%A Banshidhar Choudhary

%A Vipin Saxena

%T K-Means Clustering of Cloud Data using Weka and R Language

%J International Journal of Computer Applications

%@ 0975-8887

%V 184

%N 49

%P 33-39

%D 2023

%I Foundation of Computer Science (FCS), NY, USA

Abstract

The literature shows that cloud data is growing rapidly, at a rate described as exponential. Cloud data includes large files in text, audio, and video formats. Clustering the data is therefore needed to reduce the time taken to search for such files. In the present work, K-means clustering is applied to a large banking-sector dataset using Weka and the R language, which yield optimized results for searching the desired information. The computed results are presented in figures and tables.
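The K-means procedure named in the abstract can be illustrated with a minimal sketch. This is not the authors' Weka or R workflow: it is a plain-Python version of the standard Lloyd iteration (assign each record to its nearest centroid, then recompute each centroid as the mean of its members). The function name `kmeans`, the `seed` parameter, and the card-usage records below are hypothetical and chosen only for illustration.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Standard Lloyd iteration; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct records picked at random (seeded for repeatability).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach every record to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no record changed cluster, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave a centroid in place if its cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical records: (monthly card spend, transactions per month).
records = [
    (120.0, 4), (135.0, 5), (110.0, 3),
    (980.0, 22), (1010.0, 25), (955.0, 20),
]

centroids, labels = kmeans(records, k=2)
for record, label in zip(records, labels):
    print(record, "-> cluster", label)
```

The features here are left unscaled, so the spend column dominates the Euclidean distance; with real data, standardizing the columns first is usually advisable.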

References

Ahmad, A., and Dey, L. (2007), “A K-Mean Clustering Algorithm for Mixed Numeric and Categorical Data”. Data and Knowledge Engineering, 63(2), 503–527. https://doi.org/10.1016/j.datak.2007.03.016.
Rujasiri, P., and Chomtee, B. (2009), “Comparison of Clustering Techniques for Cluster Analysis”. In Nat. Sci. 43, 378-388.
Bharti, K. K., Shukla, S., and Jain, S. (2010), “Intrusion Detection using Clustering”, International Journal of Computer and Communication Technology, 248–255. https://doi.org/10.47893/ijcct.2010.1052.
Shrivastava, R., Upadhyay, K., Bhati, R., and Mishra, D. K. (2010), “Comparison between K-Mean and C-Mean Clustering for CBIR”, Proceedings of 2nd International Conference on Computational Intelligence, Modelling and Simulation, CIMSim 2010, 117–118. https://doi.org/10.1109/CIMSiM.2010.66.
Ahmad, A. (2010), “Data Clustering Using K-Mean Algorithm for Network Intrusion Detection”, M.Tech. Report in Lovely Professional University, Chandigarh, India.
Aggarwal, N. and Aggarwal, K. (2012), “A Mid-Point based K-mean Clustering Algorithm for Data mining”. International Journal on Computer Science and Engineering, 4(6), 1174-1180, Springer Singapore, ISSN: 0975-3397.
Chathurvedi, N. and Rajavat, A. (2013), “An Improvement in K-mean Clustering Algorithm using Better Time and Accuracy”, International Journal of Programming Language and Application, 3(4), 13-19. https://doi.org/10.5121/ijpla.2013.3407.
Sharma, I. (2014), “Comparison of Different Clustering Algorithms using WEKA Tool”, 1(2), 20-22. ISSN: 2349-7173. www.ijartes.org.
Yuan, D., Cuan, Y. and Liu, Y. (2014), “An Effective Clustering Algorithm for Transaction Databases Based on K-Mean”. Journal of Computers, 9(4). https://doi.org/10.4304/jcp.9.4.812-816.
Garg, D. and Trivedi, K. (2014), “Fuzzy K-Mean Clustering in Map Reduce on Cloud Based Hadoop”, ICACCCT: Proceedings of 2014 IEEE International Conference on Advanced Communication Control & Computing Technologies: May 08-10, 2014. ISBN: 978-1-4799-3914-5/14.
Asnani, R. (2015), “A Distributed K-mean Clustering Algorithm for Cloud Data Mining”. International Journal of Engineering Trends and Technology, 30(7), ISSN: 2231-5381. http://www.ijettjournal.org.
Shaaban, H. R., AbdulkaremHabib, A., and Abbas Obaid, F. (2015), “Performance Evaluation of K-Mean and Fuzzy C-Mean Image Segmentation Based Clustering Classifier”, International Journal of Advanced Computer Science and Applications, 6(12), 176-183. www.ijacsa.thesai.org.
Rathore, P. (2016), “Analysis and Performance Improvement of K-means clustering in Big Data Environment”. International Conference of Communication Network, https://doi.org/10.1109/ICCN.2015.9.
Abualigah, L. M., Khader, A. T., and Al-Betar, M. A. (2016), “Multi Objectives Based Text Clustering Technique using K-mean Algorithm”, Proceedings - CSIT 2016: 7th International Conference on Computer Science and Information Technology. https://doi.org/10.1109/CSIT.2016.7549464.
Verma, V., Bhardwaj, S., and Singh, H. (2016), “A Hybrid K-Mean Clustering Algorithm for Prediction Analysis”. Indian Journal of Science and Technology, 9(28). https://doi.org/10.17485/ijst/2016/v9i28/98392.
K. Chahal, J., and Kaur, A. (2016). “A Hybrid Approach based on Classification and Clustering for Intrusion Detection System”. International Journal of Mathematical Sciences and Computing, 2(4), 34–40, https://doi.org/10.5815/ijmsc.2016.04.04.
Patel, Archana K. M. and Thakral, Pratel (2016), “The Best Clustering Algorithm in Data Mining”, Adhiparasakthi Engineering College. Department of Electronics and Communication Engineering, Institute of Electrical and Electronics Engineers. Madras Section, & Institute of Electrical and Electronics Engineers (ICCSP), ISBN: 9781509003969. https://doi.org/978-1-5090-0396-9/16/$31.00@2016IEEE.
Kumar, S., Mishra, S., and Asthana, P. (2017), “Automated Detection of Acute Leukemia using K-mean Clustering Algorithm”. Advance in Computer and Computation Science, 655-670.
Bansal, A., Sharma, M., and Goel, S. (2017), “Improved K-mean Clustering Algorithm for Prediction Analysis using Classification Technique in Data Mining”. International Journal of Computer Applications, 157(6), 35–40, https://doi.org/10.5120/ijca2017912719.
Khan, A., Baseer, S., and Javed, S. (2017), “Perception of Students on Usage of Mobile Data by K-mean Clustering Algorithm”, International Journal of Advanced and Applied Sciences, 4(2), 17–21. https://doi.org/10.21833/ijaas.2017.02.003.
Kalra, M., Lal, N., and Qamar, S. (2018), “K-Mean Clustering Algorithm Approach for Data Mining of Heterogeneous Data”. In Lecture Notes in Networks and Systems (Vol. 10, pp. 61–70), Springer. https://doi.org/10.1007/978-981-10-3920-1_7.
Kuraria, A., Jharbade, N., and Soni, M. (2018), “Centroid Selection Process Using WCSS and Elbow Method for K-Mean Clustering Algorithm in Data Mining”. International Journal of Scientific Research in Science, Engineering and Technology, 190–195, https://doi.org/10.32628/ijsrset21841122.
Rashid, M., Singh, H., and Goyal, V. (2019), “Cloud storage privacy in health care systems based on IP and geo-location validation using k-mean clustering technique”. International Journal of E-Health and Medical Communications, 10(4), 54–65, https://doi.org/10.4018/IJEHMC.2019100105.
Vats, S., and Sagar, B. B. (2019), “Performance Evaluation of K-means Clustering on Hadoop Infrastructure”, Journal of Discrete Mathematical Sciences and Cryptography, 22(8), 1349–1363, https://doi.org/10.1080/09720529.2019.1692444.
Wye, K. F. P., Kanagaraj, E., Zakaria, S. M. M. S., Kamarudin, L. M., Zakaria, A., Kamarudin, K., and Ahmad, N. (2019), “RSSI-based Localization Zoning using K-Mean Clustering”, IOP Conference Series: Materials Science and Engineering, 705(1), https://doi.org/10.1088/1757-899X/705/1/012038.
Vaigai College of Engineering, and Institute of Electrical and Electronics Engineers, Proceedings of the International Conference on Intelligent Computing and Control Systems (ICICCS 2020):13-15 May 2020.
Shang, R., Ara, B., Zada, I., Nazir, S., Ullah, Z., and Khan, S. U. (2021), “Analysis of Simple K- Mean and Parallel K- Mean Clustering for Software Products and Organizational Performance Using Education Sector Dataset”. Scientific Programming, 2021. https://doi.org/10.1155/2021/9988318.
Zada, I., Ali, S., Khan, I., Hadjouni, M., Elmannai, H., Zeeshan, M., Serat, A. M., and Jameel, A. (2022), “Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept”, Mobile Information Systems, 2022. https://doi.org/10.1155/2022/1277765.
Omer, A. S., Yemer, T. A., and Woldegebreal, D. H. (2022), “Hybrid K-Mean Clustering and Markov Chain for Mobile Network”, Accessibility and Retain ability Prediction. 9. https://doi.org/10.3390/engproc2022018009.
7. AUTHORS’ PROFILES
Banshidhar Choudhary received a postgraduate degree in Computer Applications (M.C.A.) from Dr. Indira Gandhi National Open University, New Delhi, in 2002 and an M.Phil. degree from Madurai Kamraj University in 2007, and is currently a research scholar in the Department of Computer Science, Babasaheb Bhimrao Ambedkar University. He has 15 years of teaching experience in computer science at various Indian universities and 03 years at Al-Jabal Al-Garbi University, Libya. Currently, he is solvi
Prof. Vipin Saxena received his Ph.D. degree from Indian Institute of Technology, Roorkee, Uttarakhand, India. Presently, he is working as Professor in Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India. He has published more than 190 research articles in the International and National Journals and Conferences, authored 05 books in the field of Computer Science and Scientific Computing, attended 55 International and National Conferences and received three Natio

Index Terms

Computer Science

Information Sciences

Keywords

Cloud data, File Formats, Clustering, K-Means, Search, Credit/Debit Cards