International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 45 - Number 4
Year of Publication: 2012
Authors: Rajesh N. Phursule, P. C. Bhaskar
10.5120/6770-9056
Rajesh N. Phursule, P. C. Bhaskar. Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques. International Journal of Computer Applications. 45, 4 (May 2012), 40-44. DOI=10.5120/6770-9056
Clustering documents by word similarity and then searching the text is a major search procedure, widely used for large sets of documents. Documents can be clustered using many clustering algorithms, such as Nearest Neighbor, K-Means, Hierarchical, Graph Theoretic, etc. [4] [5] [7]. Measuring the performance of these algorithms in terms of space complexity and execution time, and the quality of their search output in terms of accuracy and redundancy, is a needed study [3]. This paper focuses on measuring the performance of the Nearest Neighbor, K-Means and Hierarchical agglomerative clustering algorithms on text documents, and compares them in terms of space complexity, execution time, accuracy and redundancy. In particular, the input text document is preprocessed and converted into a document graph represented in the form of a matrix. That document graph is then converted into a relation matrix, which gives the relation (similarity score), ranging from 0 to 1, among all the nodes [2]. The implementation and results of the applied clustering algorithms (Nearest Neighbor, K-Means and Hierarchical agglomerative) on documents are discussed.
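The pipeline outlined in the abstract (preprocessing, a relation matrix with scores from 0 to 1, then clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the similarity score is computed, so cosine similarity over term-frequency vectors is assumed here, each text unit is treated as one node, and the stop-word list, the function names, and the `threshold` parameter are all hypothetical. Only the Nearest Neighbor step is shown.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def relation_matrix(docs):
    """Return pairwise cosine similarities of term-frequency vectors.

    Term counts are non-negative, so every entry lies between 0 and 1.
    Cosine similarity is an assumed stand-in for the paper's score.
    """
    counts = [Counter(preprocess(d)) for d in docs]
    norms = [math.sqrt(sum(v * v for v in c.values())) for c in counts]
    n = len(docs)
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] == 0 or norms[j] == 0:
                continue  # an empty document is similar to nothing
            dot = sum(counts[i][t] * counts[j][t] for t in counts[i])
            rel[i][j] = dot / (norms[i] * norms[j])
    return rel


def nearest_neighbor_clusters(rel, threshold=0.3):
    """Threshold-based nearest neighbor clustering over a relation matrix.

    Each node joins the cluster of its most similar earlier node when that
    similarity reaches `threshold`; otherwise it starts a new cluster.
    """
    labels = []
    for i in range(len(rel)):
        best_j, best_sim = None, threshold
        for j in range(i):
            if rel[i][j] >= best_sim:
                best_j, best_sim = j, rel[i][j]
        if best_j is None:
            labels.append(max(labels, default=-1) + 1)
        else:
            labels.append(labels[best_j])
    return labels
```

Calling `nearest_neighbor_clusters(relation_matrix(docs))` on a list of strings returns one integer cluster label per document. The same relation matrix could be passed to K-Means or hierarchical agglomerative clustering, which is what makes it a convenient shared input for comparing the algorithms.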