Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques

Rajesh N. Phursule; P. C. Bhaskar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 45 - Number 4

Year of Publication: 2012

Authors: Rajesh N. Phursule, P. C. Bhaskar

10.5120/6770-9056

Rajesh N. Phursule, P. C. Bhaskar. Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques. International Journal of Computer Applications. 45, 4 (May 2012), 40-44. DOI=10.5120/6770-9056

@article{ 10.5120/6770-9056,
author = { Rajesh N. Phursule and P. C. Bhaskar },
title = { Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques },
journal = { International Journal of Computer Applications },
issue_date = { May 2012 },
volume = { 45 },
number = { 4 },
month = { May },
year = { 2012 },
issn = { 0975-8887 },
pages = { 40-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume45/number4/6770-9056/ },
doi = { 10.5120/6770-9056 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T20:36:45.643487+05:30

%A Rajesh N. Phursule

%A P. C. Bhaskar

%T Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques

%J International Journal of Computer Applications

%@ 0975-8887

%V 45

%N 4

%P 40-44

%D 2012

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Clustering documents by word similarity and then searching the text is a major search procedure, widely used for large document sets. Documents can be clustered with many algorithms, such as Nearest Neighbor, K-Means, Hierarchical, and Graph Theoretic clustering [4] [5] [7]. Measuring the performance of these algorithms, both in space complexity and execution time and in the accuracy and redundancy of the search output, is a needed study [3]. This paper measures the performance of the Nearest Neighbor, K-Means, and Hierarchical agglomerative clustering algorithms on text documents and compares them in terms of space complexity, execution time, accuracy, and redundancy. In particular, the input text document is preprocessed and converted into a document graph represented as a matrix. That document graph is then converted into a relation matrix, which gives the relation (similarity score) between every pair of nodes on a scale from 0 to 1 [2]. The implementation and results of the applied clustering algorithms (Nearest Neighbor, K-Means, and Hierarchical agglomerative) on documents are discussed.
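The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the similarity score is computed, so this example assumes cosine similarity over word counts as a stand-in; the function names (`relation_matrix`, `nearest_neighbor_clusters`), the sample texts, and the threshold value are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity of two word-count vectors (0 to 1 for counts)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def relation_matrix(docs):
    """Build an n x n matrix of pairwise similarity scores (assumed measure)."""
    vectors = [Counter(tokenize(d)) for d in docs]
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def nearest_neighbor_clusters(sim, threshold):
    """Threshold-based nearest neighbor clustering.

    Each item joins the cluster of its most similar previously placed item
    when that similarity reaches the threshold; otherwise it starts a new
    cluster.
    """
    labels = []
    next_label = 0
    for i in range(len(sim)):
        best_j, best_s = None, -1.0
        for j in range(i):
            if sim[i][j] > best_s:
                best_j, best_s = j, sim[i][j]
        if best_j is not None and best_s >= threshold:
            labels.append(labels[best_j])
        else:
            labels.append(next_label)
            next_label += 1
    return labels


# Hypothetical sample texts, used only to exercise the sketch.
docs = [
    "clustering groups similar documents",
    "similar documents form clustering groups",
    "stock prices fell sharply today",
]
sim = relation_matrix(docs)
labels = nearest_neighbor_clusters(sim, threshold=0.5)
print(labels)
```

With these sample texts the first two items share most of their words and fall into one cluster, while the third shares none and starts its own. The same relation matrix could in principle feed K-Means or agglomerative clustering, which is the kind of comparison the paper reports.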

References

Sholom Weiss, Brian White and Chidanand Apte, "A Lightweight Document Clustering", IBM T. J. Watson Research Centre, NY 10598, USA.
Ramkrishna Varadrajan, Vagelis Hristidis, "A System for Query Specific Document Summarization", Florida International University.
Michael Steinbach, George Karypis, Vipin Kumar, "A Comparison of Document Clustering Techniques", University of Minnesota, Technical Report #00-034.
A. K. Jain (Michigan State University), M. N. Murty (Indian Institute of Science) and P. J. Flynn (The Ohio State University), "Data Clustering: A Review".
B. King, "Step-wise Clustering Procedures", J. Am. Stat. Assoc. 69, 86–101, 1967.
M. R. Anderberg, "Cluster Analysis for Applications", Academic Press, Inc., New York, NY, 1973. Augustson, J.
Abracos and G. Pereira-Lopes, "Statistical methods for retrieving most significant paragraphs in newspaper articles", ACL/EACL Workshop on Intelligent Scalable Text Summarization, 1997.
S. Agrawal, S. Chaudhuri, and G. Das, "DBXplorer: A System For Keyword-Based Search Over Relational Databases", ICDE, 2002.
E. Amitay, C. Paris, "Automatically Summarizing Web Sites - Is there any way around it?", CIKM, 2000.
H. H. Chen, J. J. Kuo, and T. C. Su, "Clustering and Visualization in a Multi-Lingual Multi-Document Summarization System", ECIR, 2003.
G. Erkan and D. R. Radev, "LexRank: Graph-based centrality as salience in text summarization", JAIR, 2004.
J. Goldstein, M. Kantrowitz, V. Mittal, J. Carbonell, "Summarizing text documents: Sentence selection and evaluation metrics", ACM SIGIR, 1999.
C. Y. Lin, "Improving Summarization Performance by Sentence Compression - A Pilot Study", IRAL, 2003.
D. Cutting, D. Karger, J. Pedersen, and J. Tukey, "Scatter/Gather: a Cluster-based Approach to Browsing Large Document Collections", ACM SIGIR, 1992.
J. Hartigan and M. Wong, "A k-means clustering algorithm", Applied Statistics, 1979.
A. El-Hamdouchi and P. Willet, "Comparison of Hierarchic Agglomerative Clustering Methods for Document Retrieval", The Computer Journal, Vol. 32, No. 3, 1989.

Index Terms

Computer Science

Information Sciences

Keywords

Analysis and Comparison, K-means, Nearest Neighbor, Agglomerative Hierarchical, Document Graph, Clustering Algorithm