International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 106 - Number 17
Year of Publication: 2014
Authors: Rahul S. Khokale, Mohammad Atique
DOI: 10.5120/18615-9908
Rahul S. Khokale, Mohammad Atique. A Framework for Automatic Document Understanding for Web Information Retrieval. International Journal of Computer Applications. 106, 17 (November 2014), 32-36. DOI=10.5120/18615-9908
Most web search engines use a keyword-based approach to find information on the web. When a user submits a query, the search engine tries to match its keywords against the file names, URLs, or meta tags of documents. As a result, the user may receive many non-relevant documents along with the relevant ones, which can frustrate information seekers. This problem can be alleviated if the search is based on the content and intent of documents rather than on keywords alone. Automatic document understanding focuses on representing a document in summarized form, as a gist that captures its important content and the author's intention. This paper presents a framework for a system that performs automatic document understanding for web information retrieval. The purpose of this work is to make information search on the internet more effective.