International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 41 - Number 7
Year of Publication: 2012
Authors: Vibekananda Dutta, Krishna Kumar Sharma, Deepti Gahalot
10.5120/5557-7632 |
Vibekananda Dutta, Krishna Kumar Sharma, Deepti Gahalot. Performance Comparison of Hard and Soft Approaches for Document Clustering. International Journal of Computer Applications. 41, 7 (March 2012), 44-48. DOI=10.5120/5557-7632
The amount of information available through large shared information sources such as search engines has grown tremendously. Fast, high-quality document clustering algorithms play an important role in helping users work effectively with vertical search engines and the World Wide Web, and in summarizing and organizing information. Recent surveys have shown that partitional clustering algorithms are more suitable for clustering large datasets such as the World Wide Web. Among partitional clustering algorithms, K-means is the most commonly used because it is easy to implement and efficient in terms of execution time. In this paper we present a short overview of soft approaches, based on an optimal fuzzy document clustering algorithm, in comparison with hard approaches. In our experiments, we applied a hard approach (K-means) and a soft approach (Fuzzy c-means) to different text document datasets. The number of documents in the datasets ranges from 1500 to 2600, and the number of terms ranges from 6000 to over 7500, for both the hard and soft approaches. The results show that the soft approach can generate slightly better results than the hard approach.
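The difference between the hard and soft approaches can be illustrated with the assignment step alone. The following is a minimal sketch, not the implementation used in the paper: it assumes Euclidean distance, the commonly used fuzzifier m = 2, and toy two-term document vectors. The function names `hard_assign` and `soft_memberships` and all data values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length term vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_assign(doc, centers):
    """K-means style assignment: index of the single nearest center."""
    distances = [euclidean(doc, c) for c in centers]
    return distances.index(min(distances))


def soft_memberships(doc, centers, m=2.0):
    """Fuzzy c-means style assignment: one membership degree per center.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)),
    so the degrees sum to 1. The fuzzifier m > 1 is an assumed parameter.
    """
    distances = [euclidean(doc, c) for c in centers]
    # A document lying exactly on a center belongs fully to that center.
    if 0.0 in distances:
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Toy data (assumed): two cluster centers and one document in term space.
centers = [(1.0, 0.0), (0.0, 1.0)]
doc = (0.6, 0.4)

print("hard cluster index:", hard_assign(doc, centers))
print("soft memberships:", soft_memberships(doc, centers))
```

In the hard case the document is placed entirely in its nearest cluster, while in the soft case it keeps a graded membership in every cluster, which is the distinction between the two families of approaches being compared.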