Research Article

Kannada Word Sense Disambiguation for Machine Translation

by S. Parameswarappa, V.N.Narayana
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 34 - Number 10
Year of Publication: 2011
Authors: S. Parameswarappa, V.N.Narayana
10.5120/4133-5954

S. Parameswarappa, V.N. Narayana. Kannada Word Sense Disambiguation for Machine Translation. International Journal of Computer Applications. 34, 10 (November 2011), 1-8. DOI=10.5120/4133-5954

@article{ 10.5120/4133-5954,
author = { S. Parameswarappa and V.N. Narayana },
title = { Kannada Word Sense Disambiguation for Machine Translation },
journal = { International Journal of Computer Applications },
issue_date = { November 2011 },
volume = { 34 },
number = { 10 },
month = { November },
year = { 2011 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume34/number10/4133-5954/ },
doi = { 10.5120/4133-5954 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

Polysemous words have more than one distinct meaning. Word sense disambiguation (WSD) is the task of computationally identifying the intended meaning of such a word in context. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in Artificial Intelligence. In this paper, we propose an integrated Kannada Word Sense Disambiguation system comprising a suite of high-performance Natural Language Processing (NLP) modules implemented in Perl (Practical Extraction and Report Language). The corpus builder module constructs raw Kannada corpora from the web, and the system uses randomly selected sentences from these corpora as a test bed for disambiguation. The dictionary builder module uses the corpora to build an electronic machine-readable dictionary. The Target Word Sense Disambiguation module disambiguates potentially ambiguous target words in a sentence, and the Verb Sense Disambiguation module disambiguates polysemous verbs. The rule-based disambiguator disambiguates all ambiguous words of different lexical categories. We describe the experiments conducted and the results obtained, which indicate that the system is reliable and extensible.
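
As a rough illustration of dictionary-based target word disambiguation, the sketch below picks the sense whose dictionary gloss shares the most words with the surrounding sentence (a simplified gloss-overlap heuristic in the spirit of Lesk). It is not the method described in the paper, whose modules are written in Perl and operate on Kannada text. The dictionary entries, the stopword list, and the names `TOY_DICTIONARY`, `tokenize`, and `disambiguate` are all hypothetical, and English is used only to keep the example readable.

```python
from typing import Dict, List

# Hypothetical toy machine-readable dictionary: word -> {sense label: gloss}.
# These English entries are illustrative assumptions, not data from the paper.
TOY_DICTIONARY: Dict[str, Dict[str, str]] = {
    "bank": {
        "financial": "an institution that accepts deposits and lends money",
        "river": "the sloping land beside a river or stream of water",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return [w for w in words if w and w not in STOPWORDS]


def disambiguate(
    target: str,
    sentence: str,
    dictionary: Dict[str, Dict[str, str]] = TOY_DICTIONARY,
) -> str:
    """Return the sense label whose gloss overlaps most with the sentence."""
    context = set(tokenize(sentence)) - {target}
    senses = dictionary[target]
    # Score each sense by gloss/context word overlap; ties go to the first sense.
    return max(senses, key=lambda s: len(context & set(tokenize(senses[s]))))


if __name__ == "__main__":
    print(disambiguate("bank", "She sat on the bank of the river and watched the water"))
    print(disambiguate("bank", "He went to the bank to deposit money"))
```

A real system for a morphologically rich language such as Kannada would likely need stemming or morphological analysis before overlap counts become meaningful; this sketch matches surface forms only.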

Index Terms

Computer Science
Information Sciences

Keywords

Kannada Word Sense Disambiguation, Kannada Corpus, Kannada machine readable dictionary, Target Word, Verb Sense Disambiguation, Verbalizer