International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 22 - Number 8
Year of Publication: 2011
Authors: B. Sasidhar, P. M. Yohan, Dr. A. Vinaya Babu, Dr. A. Govardhan
DOI: 10.5120/2602-3628
B. Sasidhar, P. M. Yohan, Dr. A. Vinaya Babu, Dr. A. Govardhan. Named Entity Recognition in Telugu Language using Language Dependent Features and Rule based Approach. International Journal of Computer Applications. 22, 8 (May 2011), 30-34. DOI=10.5120/2602-3628
The objective of Named Entity Recognition (NER) is to categorize all named entities in a document into predefined classes such as person, organization, location, brand name and others. NER is difficult in Indian languages such as Telugu, Hindi, Bengali and Urdu, for which far fewer gazetteers and annotated corpora are available than for English. A rule-based system is also very difficult to implement, because languages such as Telugu lack the grammatical and linguistic analysis needed to formulate rules. In this paper we describe the identification of named entities in Telugu using gazetteer lists, language-dependent features and rule-based approaches. We describe a two-phase approach to Named Entity Recognition. The first phase identifies nouns using Telugu dictionaries, a noun morphological stemmer and noun suffixes. The second phase identifies the named entities using transliterated gazetteer lists for the different named entity tags, together with named entity suffix features, context features and morphological features.
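The two phases summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary, gazetteer entries, suffix lists, tag names and function names below are all hypothetical placeholders, written in Latin transliteration, and the "stemmer" is reduced to stripping a single case or plural suffix. Context features and fuller morphological features are omitted.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical, illustrative resources (NOT the lists used in the paper).
NOUN_DICTIONARY: Set[str] = {"hyderabad", "ramarao", "pustakam"}
NOUN_SUFFIXES: Tuple[str, ...] = ("lo", "ki", "ni", "lu")
GAZETTEERS: Dict[str, Set[str]] = {
    "LOCATION": {"hyderabad"},
    "PERSON": {"ramarao"},
}
NE_SUFFIXES: Dict[str, Tuple[str, ...]] = {
    "LOCATION": ("puram", "palli", "bad"),
    "PERSON": ("rao", "reddy"),
}


def strip_noun_suffix(token: str) -> str:
    """Crude stand-in for a noun stemmer: remove one known noun suffix."""
    for suffix in NOUN_SUFFIXES:
        # Require a few characters of stem so short words are left alone.
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def is_noun(token: str) -> bool:
    """Phase 1: dictionary lookup, stemmed lookup, or noun-suffix evidence."""
    stem = strip_noun_suffix(token)
    return token in NOUN_DICTIONARY or stem in NOUN_DICTIONARY or stem != token


def tag_named_entity(token: str) -> Optional[str]:
    """Phase 2: gazetteer match first, then named-entity suffix rules."""
    stem = strip_noun_suffix(token)
    for tag, names in GAZETTEERS.items():
        if token in names or stem in names:
            return tag
    for tag, suffixes in NE_SUFFIXES.items():
        if stem.endswith(suffixes):
            return tag
    return None


def recognize(tokens: List[str]) -> List[Tuple[str, str]]:
    """Run both phases and return (token, tag) pairs for tagged nouns."""
    results: List[Tuple[str, str]] = []
    for token in tokens:
        lowered = token.lower()
        if not is_noun(lowered):
            continue
        tag = tag_named_entity(lowered)
        if tag is not None:
            results.append((token, tag))
    return results
```

In this sketch the gazetteer lookup runs before the suffix rules, so an exact list match takes priority over a suffix guess; that ordering is a design assumption made here, not something the abstract specifies.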