International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 87 - Number 3
Year of Publication: 2014
Authors: Kulkarni A. H., Patil B. M.
DOI: 10.5120/15186-3546
Kulkarni A. H., Patil B. M. Template Extraction from Heterogeneous Web Pages with Cosine Similarity. International Journal of Computer Applications. 87, 3 (February 2014), 4-8. DOI=10.5120/15186-3546
The detection of templates across large collections of web pages has recently received considerable attention. Template detection improves the performance of clustering, classification, and search engines. In this work we propose a novel template extraction algorithm based on cosine similarity. We use cosine similarity to cluster the web documents and then, using the underlying structure of the documents, derive the template for each cluster. Our experimental evaluation shows that the approach is effective in terms of computing time and clustering cost.
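
As a rough illustration of the idea, the sketch below clusters pages by cosine similarity and then takes the tokens shared by every page in a cluster as that cluster's template. It is not the authors' algorithm: the abstract does not specify the page representation, the similarity threshold, or the template rule, so the bag-of-tokens vectors, the greedy single-pass clustering, the `threshold` value, and the function names (`tokenize`, `cluster_pages`, `extract_template`) are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(html):
    """Split a page into lowercase alphanumeric tokens (tag names and words).

    Assumption: a flat bag of tokens stands in for the page structure.
    """
    return re.findall(r"[a-z0-9]+", html.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_pages(pages, threshold=0.5):
    """Greedy single-pass clustering (an assumed strategy).

    Each page joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of page indices.
    """
    vectors = [Counter(tokenize(p)) for p in pages]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def extract_template(pages, cluster):
    """Assumed template rule: tokens that occur in every page of the cluster."""
    token_sets = [set(tokenize(pages[i])) for i in cluster]
    return set.intersection(*token_sets)
```

Given two pages that share navigation markup and a third with an unrelated layout, `cluster_pages` would group the first two together, and `extract_template` would keep their shared tokens while dropping page-specific content. A real system would likely compare tag paths or DOM structure rather than flat tokens.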