National Conference on Recent Advances in Information Technology
Foundation of Computer Science USA
NCRAIT - Number 4
February 2014
Authors: H. B. Patil, A. S. Patil, B. V. Pawar
H. B. Patil, A. S. Patil, B. V. Pawar. Part-of-Speech Tagger for Marathi Language using Limited Training Corpora. National Conference on Recent Advances in Information Technology. NCRAIT, 4 (February 2014), 33-37.
Part-of-speech tagging for Marathi is a complex task because Marathi is highly inflectional and has free word order. In this paper we present a rule-based part-of-speech tagger for the Marathi language. The tagger uses hand-constructed rules: some derived from a corpus, and others added manually after studying Marathi grammar. Disambiguation is done by analyzing the linguistic features of the word, its preceding word, its following word, etc. Tested on three different data sets, the system achieved an average accuracy of 78.82%.
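The disambiguation idea described above (look at the word itself, then at its neighbours) can be illustrated with a minimal sketch. It is not the authors' rule set: the tiny lexicon, the tag names (PRON, DEM, NOUN, VERB), the default tag for unknown words, and both context rules are assumptions made up for this example.

```python
# Toy lexicon: each word maps to its set of candidate tags.
# Entries and tag names are illustrative assumptions, not the paper's tagset.
LEXICON = {
    "तो": {"PRON", "DEM"},   # "he" or "that": ambiguous without context
    "मुलगा": {"NOUN"},        # "boy"
    "जातो": {"VERB"},         # "goes"
}


def candidates(word):
    """Return the candidate tags for a word.

    Unknown words default to NOUN, which is an assumption of this sketch.
    """
    return LEXICON.get(word, {"NOUN"})


def disambiguate(prev_tag, cands, next_cands):
    """Pick one tag using the previous word's tag and the next word's candidates."""
    if len(cands) == 1:
        return next(iter(cands))
    # Hypothetical rule 1: a PRON/DEM word is a demonstrative when the
    # following word can be a noun, otherwise a pronoun.
    if cands == {"PRON", "DEM"}:
        return "DEM" if "NOUN" in next_cands else "PRON"
    # Hypothetical rule 2: after a demonstrative, prefer the noun reading.
    if prev_tag == "DEM" and "NOUN" in cands:
        return "NOUN"
    # Fallback: deterministic choice when no rule applies.
    return sorted(cands)[0]


def tag(sentence):
    """Tag a whitespace-tokenized sentence, left to right."""
    words = sentence.split()
    cand_list = [candidates(w) for w in words]
    tags = []
    for i, cands in enumerate(cand_list):
        prev_tag = tags[i - 1] if i > 0 else None
        next_cands = cand_list[i + 1] if i + 1 < len(words) else set()
        tags.append(disambiguate(prev_tag, cands, next_cands))
    return list(zip(words, tags))


if __name__ == "__main__":
    print(tag("तो मुलगा जातो"))  # "that boy goes"
    print(tag("तो जातो"))        # "he goes"
```

In this sketch the same word receives a different tag depending on whether the following word can be a noun. A real tagger for an inflectional language would also need morphological analysis of suffixes, which is omitted here.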