Research Article

A Framework for Automatic Document Understanding for Web Information Retrieval

by Rahul S. Khokale, Mohammad Atique
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 106 - Number 17
Year of Publication: 2014
Authors: Rahul S. Khokale, Mohammad Atique
DOI: 10.5120/18615-9908

Rahul S. Khokale, Mohammad Atique. A Framework for Automatic Document Understanding for Web Information Retrieval. International Journal of Computer Applications. 106, 17 (November 2014), 32-36. DOI=10.5120/18615-9908

@article{ 10.5120/18615-9908,
author = { Rahul S. Khokale, Mohammad Atique },
title = { A Framework for Automatic Document Understanding for Web Information Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { November 2014 },
volume = { 106 },
number = { 17 },
month = { November },
year = { 2014 },
issn = { 0975-8887 },
pages = { 32-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume106/number17/18615-9908/ },
doi = { 10.5120/18615-9908 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:39:41.862822+05:30
%A Rahul S. Khokale
%A Mohammad Atique
%T A Framework for Automatic Document Understanding for Web Information Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 106
%N 17
%P 32-36
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most web search engines use a keyword-based approach to find information on the web. When a user submits a query to the search engine, the web crawler tries to match the keywords against the file name, URL, or meta tags of each document. As a result, the user may receive many non-relevant documents along with the relevant ones, which can frustrate information seekers. This problem can be alleviated if the search is based on content and intent rather than on keywords alone. Automatic document understanding represents a document in summarized form, as a gist that captures its important content and the author's intention. This paper presents a framework for a system that performs automatic document understanding for web information retrieval. The aim of this work is to improve the effectiveness of information search on the internet.
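The contrast drawn in the abstract can be illustrated with a small sketch. It is not the authors' framework, which the abstract does not specify. It assumes a toy document with `title`, `url`, `meta`, and `body` fields, a tiny hand-picked stopword list, and a simple word-frequency sentence score standing in for real summarization. All function names here are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its non-stopword alphanumeric tokens."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def keyword_match(query, doc):
    """Keyword-style search: look for query terms only in title, URL, and meta tags."""
    metadata = set(tokenize(" ".join([doc["title"], doc["url"], doc["meta"]])))
    return any(term in metadata for term in tokenize(query))


def gist(body, n_sentences=1):
    """Pick the sentences whose words are most frequent in the body (extractive gist)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
    freq = Counter(tokenize(body))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]  # keep original sentence order


def content_match(query, doc):
    """Fraction of query terms that appear in the gist of the document body."""
    query_terms = set(tokenize(query))
    gist_terms = set(tokenize(" ".join(gist(doc["body"]))))
    return len(query_terms & gist_terms) / len(query_terms) if query_terms else 0.0


# Hypothetical page whose metadata advertises a topic the body never discusses.
doc = {
    "title": "Cheap flights deals",
    "url": "http://example.org/cheap-flights",
    "meta": "cheap flights, travel",
    "body": "This page sells kitchen knives. The knives are made of steel. "
            "Steel knives last long.",
}

print(keyword_match("cheap flights", doc))  # metadata matches the query
print(content_match("cheap flights", doc))  # the gist shares no terms with it
print(content_match("steel knives", doc))   # the gist covers this query
```

In this toy case the metadata-only match accepts the page for "cheap flights", while the gist-based score rejects it, which is the kind of non-relevant result the abstract attributes to keyword-only search.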

Index Terms

Computer Science
Information Sciences

Keywords

Automatic Multi-document Summarization, Web Information Retrieval, Document Understanding