Research Article

Transliterated Search of Hindi Lyrics

by Pallavi Verulkar, Rakesh Chandra Balabantray, Rohit Arvind Chakrapani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 121 - Number 1
Year of Publication: 2015
10.5120/21506-4390

Pallavi Verulkar, Rakesh Chandra Balabantray, Rohit Arvind Chakrapani. Transliterated Search of Hindi Lyrics. International Journal of Computer Applications. 121, 1 (July 2015), 32-37. DOI=10.5120/21506-4390

@article{ 10.5120/21506-4390,
author = { Pallavi Verulkar and Rakesh Chandra Balabantray and Rohit Arvind Chakrapani },
title = { Transliterated Search of Hindi Lyrics },
journal = { International Journal of Computer Applications },
issue_date = { July 2015 },
volume = { 121 },
number = { 1 },
month = { July },
year = { 2015 },
issn = { 0975-8887 },
pages = { 32-37 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume121/number1/21506-4390/ },
doi = { 10.5120/21506-4390 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:07:20.615484+05:30
%A Pallavi Verulkar
%A Rakesh Chandra Balabantray
%A Rohit Arvind Chakrapani
%T Transliterated Search of Hindi Lyrics
%J International Journal of Computer Applications
%@ 0975-8887
%V 121
%N 1
%P 32-37
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Many Indian languages are written in native scripts. However, websites and user-generated content (such as tweets, chats and blogs) in these languages are often written in Roman script, because users find it comfortable to type their native language phonetically. Romanized transliteration is used widely on the web, not only in documents but also in the user queries that search those documents. A challenge that search engines face when processing transliterated queries and documents is extensive spelling variation. The aim of this work is to systematically formalize several research problems that must be solved to handle this situation, which is prevalent in Web search for users of many languages around the world. We address the problem of language identification when Hindi words are written in Roman script. We then transliterate the Roman-script Hindi words into Devanagari. Finally, when a search query is given, results are retrieved from Hindi song lyrics.
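To make the spelling-variation problem concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical set of normalization rules (collapsing repeated letters and mapping the digraphs "ee" and "oo" to single vowels) so that variants such as "pyaar" and "pyar" share one index key. The function names, the rules, and the toy lyric lines are all illustrative assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def normalize(word: str) -> str:
    """Map a Roman-script word to a crude canonical key (hypothetical rules)."""
    w = re.sub(r"[^a-z]", "", word.lower())      # keep ASCII letters only
    w = w.replace("ee", "i").replace("oo", "u")  # long-vowel digraphs
    w = re.sub(r"(.)\1+", r"\1", w)              # collapse repeated letters
    return w


def build_index(lyrics: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index from normalized token to song identifiers."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for song_id, text in lyrics.items():
        for token in text.split():
            key = normalize(token)
            if key:
                index[key].add(song_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return songs that contain every normalized query token."""
    keys = [k for k in (normalize(t) for t in query.split()) if k]
    if not keys:
        return set()
    result = set(index.get(keys[0], set()))
    for key in keys[1:]:
        result &= index.get(key, set())
    return result


# Toy data for illustration only; identifiers and lines are assumptions.
LYRICS = {
    "song1": "tum hi ho",
    "song2": "pyaar kiya to darna kya",
    "song3": "dil to pagal hai",
}
INDEX = build_index(LYRICS)
print(search(INDEX, "pyar kiya"))
```

Rules this simple both over-merge and under-merge real variants, so they only illustrate why a query and a document written with different Roman spellings need a shared representation before matching. Transliterating both sides into Devanagari, as the abstract describes, is one way to obtain such a representation.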

Index Terms

Computer Science
Information Sciences

Keywords

Language Identification, Transliteration, Indexing, Search query