Research Article

Part-of-Speech Tagger for Marathi Language using Limited Training Corpora

Published on February 2014 by H. B. Patil, A. S. Patil, B. V. Pawar
National Conference on Recent Advances in Information Technology
Foundation of Computer Science USA
NCRAIT - Number 4

Part-of-speech tagging for Marathi is a complex task because Marathi is highly inflectional and has free word order. In this paper we demonstrate a rule-based part-of-speech tagger for the Marathi language. The tagger is built on hand-constructed rules, some learned from a corpus and others added manually after studying Marathi grammar. Disambiguation is done by analyzing the linguistic features of the word, its preceding word, its following word, etc. Testing the system on three different data sets gave encouraging results, with an average accuracy of 78.82%.
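The context-based disambiguation described above can be illustrated with a minimal sketch. It is not the authors' rule set: the lexicon entries, the tag names (`PRON`, `DEM`, `NOUN`, `VERB`, `UNK`), and the single following-word rule are all assumptions chosen only to show the general shape of a rule-based tagger.

```python
# Toy lexicon (hypothetical): each word maps to its set of candidate tags.
LEXICON = {
    "तो": {"PRON", "DEM"},   # ambiguous: "he" or "that"
    "मुलगा": {"NOUN"},        # "boy"
    "आंबा": {"NOUN"},         # "mango"
    "खातो": {"VERB"},         # "eats"
}


def candidates(word):
    """Return the candidate tags for a word, or {"UNK"} if it is not in the lexicon."""
    return LEXICON.get(word, {"UNK"})


def disambiguate(words, i):
    """Pick one tag for words[i], using the following word as context."""
    cands = candidates(words[i])
    if len(cands) == 1:
        return next(iter(cands))
    following = candidates(words[i + 1]) if i + 1 < len(words) else set()
    # Assumed rule: a PRON/DEM word directly before an unambiguous noun
    # is read as a demonstrative; otherwise it is read as a pronoun.
    if cands == {"PRON", "DEM"}:
        return "DEM" if following == {"NOUN"} else "PRON"
    # Fallback for ambiguities no rule covers: a deterministic arbitrary choice.
    return sorted(cands)[0]


def tag(sentence):
    """Tag a whitespace-tokenized sentence, returning (word, tag) pairs."""
    words = sentence.split()
    return [(w, disambiguate(words, i)) for i, w in enumerate(words)]
```

Calling `tag("तो मुलगा आंबा खातो")` tags the first word as `DEM` because a noun follows it, while `tag("तो खातो")` tags it as `PRON`. A single rule like this misfires easily (for example when the pronoun is followed by an object noun), which is why a practical tagger would combine many such rules with morphological analysis.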

Index Terms

Computer Science
Information Sciences


POS Tagger, Morphological Analysis, Rule-based