Research Article

Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset

by Richa Agrawal, Jitendra Agrawal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 168 - Number 13
Year of Publication: 2017
Authors: Richa Agrawal, Jitendra Agrawal
10.5120/ijca2017914522

Richa Agrawal, Jitendra Agrawal. Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset. International Journal of Computer Applications. 168, 13 (Jun 2017), 1-5. DOI=10.5120/ijca2017914522

@article{ 10.5120/ijca2017914522,
author = { Richa Agrawal and Jitendra Agrawal },
title = { Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2017 },
volume = { 168 },
number = { 13 },
month = { Jun },
year = { 2017 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume168/number13/27940-2017914522/ },
doi = { 10.5120/ijca2017914522 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:16:01.559640+05:30
%A Richa Agrawal
%A Jitendra Agrawal
%T Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 168
%N 13
%P 1-5
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is the process of extracting knowledge from large amounts of data, which can be stored in databases and information repositories. Data mining tasks can be divided into two models: descriptive and predictive. The predictive model predicts values from a different set of sample data and is divided into three types: classification, regression, and time series. The descriptive model determines patterns in sample data and is subdivided into clustering, summarization, and association rules. Clustering creates groups of classes based on the patterns and relationships in the data. There are different types of clustering algorithms, such as partition-based and density-based algorithms. This paper analyzes and compares various clustering algorithms using the WEKA tool to find out which algorithm is most convenient for users performing clustering. It also presents applications of the WEKA data mining tool, which can cluster large data sets, and notes that clustering can help in optimizing search engines.
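
As a rough illustration of what a partition-based clustering algorithm does, the sketch below implements a plain k-means loop in Python. It is not WEKA's implementation and not the procedure used in the paper. The function names (`kmeans`, `sq_dist`) and the two-dimensional `toy_points` values are hypothetical and chosen only for illustration; they are not taken from the air pollution dataset.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means sketch: returns (labels, centroids).

    Assumption: points are tuples of floats with the same dimension,
    and k is no larger than the number of points.
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Centroids stopped moving, so the assignment is stable.
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
toy_points = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
    (8.0, 8.2), (7.9, 8.1), (8.3, 7.8),
]

labels, centroids = kmeans(toy_points, k=2)
print(labels)
print(centroids)
```

On data like this, the first three points should end up in one cluster and the last three in the other. Which cluster receives label 0 depends on the random initialization.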

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Clustering algorithms, K-means, LVQ, SOM, Cobweb, WEKA