Research Article

Query Optimization: A Solution for Low Recall Problem in Hindi Language Information Retrieval

by Kumar Sourabh, Vibhakar Mansotra
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 55 - Number 17
Year of Publication: 2012
Authors: Kumar Sourabh, Vibhakar Mansotra
DOI: 10.5120/8845-2987

Kumar Sourabh, Vibhakar Mansotra. Query Optimization: A Solution for Low Recall Problem in Hindi Language Information Retrieval. International Journal of Computer Applications. 55, 17 (October 2012), 6-17. DOI=10.5120/8845-2987

@article{ 10.5120/8845-2987,
author = { Kumar Sourabh and Vibhakar Mansotra },
title = { Query Optimization: A Solution for Low Recall Problem in Hindi Language Information Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { October 2012 },
volume = { 55 },
number = { 17 },
month = { October },
year = { 2012 },
issn = { 0975-8887 },
pages = { 6-17 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume55/number17/8845-2987/ },
doi = { 10.5120/8845-2987 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:57:29.416602+05:30
%A Kumar Sourabh
%A Vibhakar Mansotra
%T Query Optimization: A Solution for Low Recall Problem in Hindi Language Information Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 55
%N 17
%P 6-17
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Information retrieval (IR) has been an active field of research for decades, but for much of its history it has been strongly biased towards English as the language of research and evaluation. Whatever the original motivations for this almost exclusive focus on English, many of them have lost their validity over the years. The Internet is no longer monolingual, and non-English content is growing rapidly. Hindi is the third most widely spoken language in the world (after English and Mandarin), with an estimated 500-600 million speakers. Information retrieval in Hindi is gaining popularity, but IR systems suffer from low recall when existing systems are used as-is. Certain characteristics of Indian languages prevent existing algorithms from matching relevant keywords in the documents to be retrieved. The major characteristics that affect Indian-language IR are language morphology, compound word formation, spelling variation, ambiguity, synonymy, foreign-language influence, and the lack of spelling standards. Taking these issues into consideration, we introduce a Hindi Query Optimization technique (design and development) that solves the low-recall problem to a great extent.
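
The effect of spelling variation on recall can be illustrated with a small sketch. This is not the authors' technique, which the abstract does not detail; it is a hypothetical example that assumes two common Devanagari variations (optional nukta, and chandrabindu written as anusvara) and a toy three-document collection. The names `normalize`, `search`, `recall`, `DOCS`, `RELEVANT`, and `QUERY` are invented for illustration.

```python
import unicodedata

# Devanagari signs involved in two common spelling variations (assumed here).
NUKTA = "\u093C"         # dot below, often omitted in practice
CHANDRABINDU = "\u0901"  # frequently written as anusvara
ANUSVARA = "\u0902"


def normalize(token: str) -> str:
    """Map spelling variants of a token to one canonical form."""
    # NFD splits precomposed nukta letters into base letter + nukta.
    decomposed = unicodedata.normalize("NFD", token)
    return decomposed.replace(NUKTA, "").replace(CHANDRABINDU, ANUSVARA)


def search(query: str, docs: dict, normalizer=None) -> set:
    """Return ids of documents sharing at least one term with the query."""
    norm = normalizer or (lambda t: t)
    query_terms = {norm(t) for t in query.split()}
    hits = set()
    for doc_id, text in docs.items():
        doc_terms = {norm(t) for t in text.split()}
        if query_terms & doc_terms:
            hits.add(doc_id)
    return hits


def recall(retrieved: set, relevant: set) -> float:
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)


# Toy collection: documents 1 and 2 spell the same word differently.
DOCS = {
    1: "आँख की जाँच",
    2: "आंख का इलाज",
    3: "मौसम की खबर",
}
RELEVANT = {1, 2}  # hypothetical relevance judgments for the query
QUERY = "आंख"

if __name__ == "__main__":
    print(recall(search(QUERY, DOCS), RELEVANT))             # exact matching
    print(recall(search(QUERY, DOCS, normalize), RELEVANT))  # normalized matching
```

With exact matching the query misses document 1 because of a single differing sign; applying the same normalization to query and documents retrieves both relevant documents. A real system would also need to address morphology, synonyms, and transliterated English words, which this sketch does not attempt.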

Index Terms

Computer Science
Information Sciences

Keywords

Information retrieval, Hindi, Monolingual, Query optimization, Interface, Hindi WordNet