Research Article

Error Evaluation on K- Means and Hierarchical Clustering with Effect of Distance Functions for Iris Dataset

by Harish Kumar Sagar, Varsha Sharma
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 86 - Number 16
Year of Publication: 2014
Authors: Harish Kumar Sagar, Varsha Sharma
DOI: 10.5120/15066-3429

Harish Kumar Sagar, Varsha Sharma. Error Evaluation on K- Means and Hierarchical Clustering with Effect of Distance Functions for Iris Dataset. International Journal of Computer Applications. 86, 16 (January 2014), 1-5. DOI=10.5120/15066-3429

@article{ 10.5120/15066-3429,
author = { Harish Kumar Sagar and Varsha Sharma },
title = { Error Evaluation on K- Means and Hierarchical Clustering with Effect of Distance Functions for Iris Dataset },
journal = { International Journal of Computer Applications },
issue_date = { January 2014 },
volume = { 86 },
number = { 16 },
month = { January },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume86/number16/15066-3429/ },
doi = { 10.5120/15066-3429 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:04:20.866818+05:30
%A Harish Kumar Sagar
%A Varsha Sharma
%T Error Evaluation on K- Means and Hierarchical Clustering with Effect of Distance Functions for Iris Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 86
%N 16
%P 1-5
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In data clustering (a subfield of data mining), k-means and hierarchical clustering algorithms are popular because of their excellent performance on large data sets. This paper presents two comparative studies covering various data clustering algorithms, with the aim of identifying the one with the minimum clustering error. The foremost objective is to divide the data objects into k clusters such that each cluster is internally homogeneous and the clusters are heterogeneous with respect to one another. However, neither algorithm (K-Means or hierarchical) is free of errors. In this paper, various distance functions are first considered for these two algorithms, and they are compared and analyzed to find the best distance method for addressing the existing problems.
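
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the error measure, so clustering error is assumed here to be the fraction of points that end up in the wrong cluster under the best matching of cluster labels to true classes. The function names, the single-linkage merge rule, the deterministic centroid initialisation, and the nine toy points (which stand in for the Iris data) are all hypothetical choices made for illustration.

```python
from itertools import permutations

def euclidean(a, b):
    """Straight-line (L2) distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    """City-block (L1) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans(points, k, dist, max_iter=100):
    """K-means with a pluggable distance for the assignment step.

    Assumption: centroids start at evenly spaced input points and are
    always updated as the mean, whichever distance is used.
    """
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def agglomerative(points, k, dist):
    """Single-linkage hierarchical clustering, merged down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def clustering_error(labels, truth, k):
    """Fraction of misassigned points under the best label permutation."""
    best_hits = max(
        sum(1 for lab, t in zip(labels, truth) if perm[lab] == t)
        for perm in permutations(range(k))
    )
    return 1 - best_hits / len(truth)

# Hypothetical toy data: three well-separated groups (not the Iris data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]

results = {}
for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
    results["k-means", name] = clustering_error(kmeans(points, 3, dist), truth, 3)
    results["hierarchical", name] = clustering_error(
        agglomerative(points, 3, dist), truth, 3)

for key, err in sorted(results.items()):
    print(key, err)
```

Because the toy groups are cleanly separable, every combination of algorithm and distance reports an error of zero here. Differences between distance functions would only be expected to show up on data with overlapping classes, which is the situation the paper studies.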

Index Terms

Computer Science
Information Sciences

Keywords

K-Means, Hierarchical, Euclidean Distance, Manhattan Distance, Filtering cluster, Density Based clustering algorithms on clustering error