International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 33 - Number 3
Year of Publication: 2011
Authors: Vishal Gupta, Gurpreet Singh Lehal
10.5120/4001-5668
Vishal Gupta, Gurpreet Singh Lehal. Named Entity Recognition for Punjabi Language Text Summarization. International Journal of Computer Applications. 33, 3 (November 2011), 28-32. DOI=10.5120/4001-5668
Named Entity Recognition (NER) locates and classifies atomic elements in text into predetermined classes such as the names of persons, organizations, locations and concepts. NER is used in many applications, including text summarization, text classification, question answering and machine translation. For English, a lot of work has already been done in the field of NER, where capitalization is a major clue for rules. Indian languages have no such feature, which makes the task more difficult for them. This paper explains the Named Entity Recognition system for Punjabi language text summarization. A condition-based approach has been used to develop the NER system for Punjabi. Various rules have been developed: a prefix rule, a suffix rule, a proper name rule, a middle name rule and a last name rule. To implement NER, various Punjabi resources have been developed: lists of prefix names, suffix names, proper names, middle names and last names. The Precision, Recall and F-Score for the condition-based NER approach are 89.32%, 83.4% and 86.25% respectively.
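As a rough illustration of how a list-driven, condition-based tagger of this kind might work, the following Python sketch applies the five rules named in the abstract to a tokenized sentence. It is not the authors' implementation. The word lists are small romanized placeholders rather than the Punjabi resources built for the paper, and the exact behavior of each rule (for example, that a prefix marks the following word and a middle name marks the preceding word) is an assumption made for this example. The function names `tag_named_entities` and `f_score` are likewise hypothetical. The last function assumes that the F-Score is the usual harmonic mean of Precision and Recall.

```python
from typing import List, Set

# Hypothetical, romanized placeholder lists. The paper's actual Punjabi
# resources (prefix, suffix, proper, middle and last name lists) are not
# reproduced here.
PREFIXES: Set[str] = {"dr.", "sri", "prof."}
SUFFIXES: Set[str] = {"ji", "sahib"}
PROPER_NAMES: Set[str] = {"punjab", "amritsar"}
MIDDLE_NAMES: Set[str] = {"singh", "kaur", "kumar"}
LAST_NAMES: Set[str] = {"gupta", "lehal"}


def tag_named_entities(tokens: List[str]) -> List[bool]:
    """Return one flag per token: True if some rule marks it as a named entity."""
    flags = [False] * len(tokens)
    for i, token in enumerate(tokens):
        word = token.lower()
        # Proper name, middle name and last name rules: direct list lookup.
        if word in PROPER_NAMES or word in MIDDLE_NAMES or word in LAST_NAMES:
            flags[i] = True
        # Middle name rule (assumed): the word before a middle name is a first name.
        if word in MIDDLE_NAMES and i > 0:
            flags[i - 1] = True
        # Prefix rule (assumed): the word after a prefix such as a title is a name.
        if word in PREFIXES and i + 1 < len(tokens):
            flags[i + 1] = True
        # Suffix rule (assumed): the word before an honorific suffix is a name.
        if word in SUFFIXES and i > 0:
            flags[i - 1] = True
    return flags


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F-Score)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "Dr. Vishal Gupta met Gurpreet Singh Lehal in Amritsar".split()
    flags = tag_named_entities(sentence)
    print([tok for tok, is_ne in zip(sentence, flags) if is_ne])
    print(round(f_score(89.32, 83.4), 2))
```

Under the harmonic-mean assumption, the reported Precision of 89.32% and Recall of 83.4% give an F-Score of about 86.26%, which agrees with the reported 86.25% up to rounding. A list-driven design like this depends entirely on the coverage of its word lists, which is presumably why the paper lists resource construction alongside rule development.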