National Conference on Emerging Trends in Computer Technology |
Foundation of Computer Science USA |
NCETCT - Number 2 |
December 2014 |
Authors: P. H. Govardhan, K. P. Wagh, P. N. Chatur |
P. H. Govardhan, K. P. Wagh, P. N. Chatur. Web Document Clustering using Proposed Similarity Measure. National Conference on Emerging Trends in Computer Technology. NCETCT, 2 (December 2014), 15-18.
Recent advances in data warehousing and data mining have given rise to many types of information sources. Web documents are among the most useful information resources of this era, and using them efficiently is important for knowledge discovery. Documents that provide related information should be grouped into the same cluster, but measuring the similarity between documents is a difficult task. Various similarity measures have been introduced to address clustering problems. The motivation for this paper is to propose a new similarity measure that gives better clustering results. Previous research does not consider both the present and the absent features of documents. The proposed similarity measure takes both present and absent features into account. Improving the similarity measure in this way benefits the mining techniques that rely on it.
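To make the distinction between present and absent features concrete, the following is a minimal sketch and not the authors' proposed measure, which the abstract does not specify. It assumes documents are reduced to sets of whitespace-separated terms and contrasts the Jaccard coefficient, which looks only at terms present in at least one document, with the simple matching coefficient, which also credits terms absent from both. The function names `term_set`, `jaccard`, and `simple_matching`, and the toy documents, are illustrative assumptions.

```python
def term_set(text):
    """Split a document into a set of lowercase terms (assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Similarity from present features only: |A and B| / |A or B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def simple_matching(a, b, vocabulary):
    """Similarity that also counts features absent from both documents."""
    if not vocabulary:
        return 1.0
    both_present = len(a & b & vocabulary)
    both_absent = len(vocabulary - (a | b))
    return (both_present + both_absent) / len(vocabulary)


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "web document clustering",
    "web document similarity",
    "data mining warehouse",
]
sets = [term_set(d) for d in docs]
vocabulary = set().union(*sets)  # every term seen in the corpus

jaccard_01 = jaccard(sets[0], sets[1])
matching_01 = simple_matching(sets[0], sets[1], vocabulary)
jaccard_02 = jaccard(sets[0], sets[2])
matching_02 = simple_matching(sets[0], sets[2], vocabulary)

print(jaccard_01, matching_01)
print(jaccard_02, matching_02)
```

In this sketch the first two documents share two terms and jointly lack three of the seven vocabulary terms, so the simple matching score is higher than the Jaccard score. Whether counting joint absence helps clustering depends on the vocabulary and the corpus, which is the kind of question the paper sets out to address.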