International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 63 - Number 10
Year of Publication: 2013
Authors: Anoop Kumar Jain, Satyam Maheshwari
DOI: 10.5120/10504-5273
Anoop Kumar Jain, Satyam Maheshwari. Phrase based Clustering Scheme of Suffix Tree Document Clustering Model. International Journal of Computer Applications. 63, 10 (February 2013), 30-37. DOI=10.5120/10504-5273
Document clustering is a difficult and active research area in search engine research. Most existing document clustering techniques use a group of keywords from each document to cluster the documents. Document clustering arises from the information retrieval domain: it seeks a grouping of a set of documents such that documents in the same cluster are similar and documents in different clusters are dissimilar. Information retrieval plays an important role in data mining by extracting the information relevant to a user request. It examines file contents and identifies their similarity, and its performance is measured using precision and recall. In this paper we propose a phrase-based clustering scheme based on the Suffix Tree Document Clustering (STDC) model. The proposed algorithm uses the STDC model to represent documents accurately and to measure the similarity between them. This method of clustering reduces grouping time and improves similarity accuracy compared to other existing methods.
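The abstract names precision and recall as its evaluation measures without stating how they are computed. As a minimal sketch, assuming a cluster is compared against a single reference class of relevant documents, the standard definitions can be written as follows. The function name `precision_recall` and the document identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall(cluster, relevant):
    """Return (precision, recall) of a cluster against a reference class.

    Both arguments are collections of document identifiers. This uses the
    standard information-retrieval definitions; it is an illustration and
    not the evaluation code of the paper.
    """
    cluster, relevant = set(cluster), set(relevant)
    hits = len(cluster & relevant)  # documents that are both clustered and relevant
    precision = hits / len(cluster) if cluster else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: a cluster of four documents, three relevant documents
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
```

Here two of the four clustered documents are relevant, so precision is 2/4, and two of the three relevant documents were retrieved, so recall is 2/3.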