International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 84 - Number 3
Year of Publication: 2013
Authors: Arshdeep Singh, Jyoti Rani, Amandeep Kaur
10.5120/14553-2650 |
Arshdeep Singh, Jyoti Rani, Amandeep Kaur. Maximum Entropy Approach based Named Entity Recognition in Punjabi Language. International Journal of Computer Applications. 84, 3 (December 2013), 1-5. DOI=10.5120/14553-2650
Named Entity Recognition (NER) is the task of identifying named entities and classifying them into predefined categories such as person, location and organization. NER is used in many applications, including text summarization, text classification, question answering and machine translation. A lot of work has already been done on NER for English, where capitalization is a major cue for rules; Indian languages have no such feature, which makes the task more difficult for them. This work reports the evaluation of an NER system for the Punjabi language using the Maximum Entropy approach (MAXENT). The evaluation uses a manually tagged Punjabi news corpus developed from Punjabi newspapers available online. The training set is annotated with an NE tagset of 12 tags. The MAXENT-based NER system for Punjabi achieves overall Precision, Recall and F-Score values of 90.92%, 72.30% and 80.55% respectively, using a feature set consisting of the context word, Part of Speech (POS) information, the NE tag of the previous word and a first-name gazetteer list.
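The feature set and the reported scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the feature keys, the `<BOS>`/`<EOS>` padding symbols and the context window of one word are all assumptions made for illustration, since the abstract does not specify them. The F-score is assumed to be the usual harmonic mean of precision and recall, which is consistent with the reported figures.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F-Score)."""
    return 2 * precision * recall / (precision + recall)


def token_features(words, pos_tags, prev_ne_tag, i, first_names, window=1):
    """Build a feature dict for token i from the four feature types
    named in the abstract. The window size and key names are hypothetical."""
    feats = {
        "word": words[i],                                # current word
        "pos": pos_tags[i],                              # POS information
        "prev_ne": prev_ne_tag,                          # NE tag of previous word
        "in_first_name_gazetteer": words[i] in first_names,
    }
    # Context words on either side, padded at sentence boundaries.
    for off in range(1, window + 1):
        left, right = i - off, i + off
        feats[f"word-{off}"] = words[left] if left >= 0 else "<BOS>"
        feats[f"word+{off}"] = words[right] if right < len(words) else "<EOS>"
    return feats


# Check the reported F-Score against the reported Precision and Recall.
print(round(f_score(90.92, 72.30), 2))  # 80.55
```

Feature dicts of this shape could be passed to any maximum entropy (multinomial logistic regression) classifier; the choice of classifier library is left open here.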