International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 119 - Number 7
Year of Publication: 2015
Authors: Poonam Chahal, Manjeet Singh, Suresh Kumar
DOI: 10.5120/21081-3762
Poonam Chahal, Manjeet Singh, Suresh Kumar. Relation based Measuring of Semantic Similarity for Web Documents. International Journal of Computer Applications. 119, 7 (June 2015), 26-29. DOI=10.5120/21081-3762
The World Wide Web (WWW) is an information resource in which information is organized as interlinked web pages. Because of the huge amount of information on the WWW, it is difficult to extract the information relevant to a user's query. One reason is that information on the web is written in natural language. The layered architecture of the semantic web was proposed by Tim Berners-Lee to overcome these information retrieval issues. In recent years, numerous semantic web search engines, such as Ontolook and Swoogle, have been developed to assist in searching for significant documents on the semantic web. Several attempts have been made to measure the similarity of semantic web pages, but the results of these semantic similarity techniques are neither appropriate nor up to users' expectations. This paper proposes an approach for finding the semantic similarity between web documents that considers both the concepts and the relationships that exist between those concepts. In our approach, documents are processed by extracting concepts, and the relationships between them, using a base ontology and a dictionary containing words along with their synonyms. Finally, any two documents are compared to find their semantic similarity on the basis of the relationships that exist in the documents. We discover all relevant relationships between the words that provide the core information of a document, and then compute the similarity of these relationships on each web page to determine their significance.
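The pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the synonym dictionary `SYNONYMS`, the relation vocabulary `RELATIONS`, the nearest-concept extraction heuristic, and the use of Jaccard overlap as the similarity score are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical synonym dictionary: surface word -> canonical concept.
# In the approach described, this role is played by the base ontology
# together with a dictionary of words and their synonyms.
SYNONYMS = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "motor": "engine",
    "engine": "engine",
    "wheel": "wheel",
    "wheels": "wheel",
}

# Hypothetical relation vocabulary: cue word -> canonical relation name.
RELATIONS = {
    "has": "has_part",
    "contains": "has_part",
    "uses": "uses",
}


def extract_relations(text):
    """Return the set of (concept, relation, concept) triples found in text.

    Assumed heuristic: for each relation cue word in a sentence, link the
    nearest known concept on its left to the nearest known concept on its right.
    """
    triples = set()
    for sentence in re.split(r"[.!?]", text):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i, token in enumerate(tokens):
            relation = RELATIONS.get(token)
            if relation is None:
                continue
            left = next(
                (SYNONYMS[t] for t in reversed(tokens[:i]) if t in SYNONYMS), None
            )
            right = next(
                (SYNONYMS[t] for t in tokens[i + 1:] if t in SYNONYMS), None
            )
            if left and right:
                triples.add((left, relation, right))
    return triples


def relation_similarity(doc_a, doc_b):
    """Jaccard overlap of the two documents' relation sets (an assumed measure)."""
    rel_a = extract_relations(doc_a)
    rel_b = extract_relations(doc_b)
    if not rel_a and not rel_b:
        return 0.0
    return len(rel_a & rel_b) / len(rel_a | rel_b)


doc_1 = "A car has an engine. The car uses wheels."
doc_2 = "The auto contains a motor."
score = relation_similarity(doc_1, doc_2)
```

In this sketch the two documents share the relation (automobile, has_part, engine) even though they use different words for it, which is the effect the synonym dictionary is meant to achieve. A comparison based on concepts alone would not distinguish documents that mention the same concepts but relate them differently.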