National Conference on Advanced Computing and Communications 2012 |
Foundation of Computer Science USA |
NCACC - Number 1 |
August 2012 |
Authors: M. Hanumanthappa, B R Prakash |
M. Hanumanthappa, B R Prakash. An Efficient Technique to Improve Snippet Clustering and Labeling using Modified FPF Algorithm. National Conference on Advanced Computing and Communications 2012. NCACC, 1 (August 2012), 38-42.
Document clustering is an effective tool for managing information overload. Grouping similar documents together enables a human observer to browse large document collections quickly and to grasp their distinct topics and subtopics easily. In this paper we survey the most important problems and techniques related to text information retrieval: document pre-processing and filtering, and word sense disambiguation. We then present text clustering using a Modified FPF algorithm and compare our clustering algorithms against FPF, which is the most used algorithm in the text clustering context. Finally, we introduce the problem of cluster labeling: labels are obtained by combining intra-cluster and inter-cluster term extraction based on a variant of the information gain measure.
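For readers unfamiliar with FPF, the following is a minimal sketch of the classical furthest-point-first heuristic that serves as the baseline here. It is not the Modified FPF algorithm of this paper, whose changes the abstract does not describe. The representation of documents as term-weight vectors, the use of cosine distance, the choice of the first document as the initial center, and the names `cosine_distance` and `fpf_cluster` are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two term-weight vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def fpf_cluster(points, k, distance=cosine_distance):
    """Classical furthest-point-first clustering (illustrative sketch).

    Returns (centers, assignment): the indices of the k chosen centers and,
    for every point, the position in `centers` of its closest center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [0]  # assumption: start from the first document
    nearest = [distance(p, points[0]) for p in points]
    assignment = [0] * len(points)

    while len(centers) < k:
        # The next center is the point furthest from all current centers.
        furthest = max(range(len(points)), key=lambda i: nearest[i])
        centers.append(furthest)
        # Reassign every point that is closer to the new center.
        for i, p in enumerate(points):
            d = distance(p, points[furthest])
            if d < nearest[i]:
                nearest[i] = d
                assignment[i] = len(centers) - 1

    return centers, assignment


# Toy usage: four "snippets" over a two-term vocabulary.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centers, assignment = fpf_cluster(docs, k=2)
```

Each round costs one distance computation per document, so choosing k centers takes on the order of n·k distance evaluations for n documents.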