Research Article

Persian Named Entity Recognition based with Local Filters

by Morteza Kolali Khormuji, Mehrnoosh Bazrafkan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 100 - Number 4
Year of Publication: 2014
DOI: 10.5120/17510-8062

Morteza Kolali Khormuji, Mehrnoosh Bazrafkan. Persian Named Entity Recognition based with Local Filters. International Journal of Computer Applications. 100, 4 (August 2014), 1-6. DOI=10.5120/17510-8062

@article{ 10.5120/17510-8062,
author = { Morteza Kolali Khormuji and Mehrnoosh Bazrafkan },
title = { Persian Named Entity Recognition based with Local Filters },
journal = { International Journal of Computer Applications },
issue_date = { August 2014 },
volume = { 100 },
number = { 4 },
month = { August },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume100/number4/17510-8062/ },
doi = { 10.5120/17510-8062 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:29:03.489979+05:30
%A Morteza Kolali Khormuji
%A Mehrnoosh Bazrafkan
%T Persian Named Entity Recognition based with Local Filters
%J International Journal of Computer Applications
%@ 0975-8887
%V 100
%N 4
%P 1-6
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Named entity recognition for the Persian (Farsi) language is a challenging yet important task in natural language processing. This paper presents an approach based on a Local Filters model to recognize Persian named entities. It uses multiple dictionaries that are freely available on the Web, where a dictionary is a collection of phrases that describe named entities. The framework is composed of two stages: (1) detection of named entity candidates using dictionary lookups and (2) filtering of false positives. Dictionary lookups are performed using an efficient prefix-tree data structure. Our dictionary-based recognizer achieves up to 88.95% precision, 79.65% recall, and an 82.73% F1 score on Persian text using ASEM.
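
As a rough illustration of stage (1), the sketch below shows one way a token-level prefix tree could support dictionary lookups with longest-match candidate detection. It is not the authors' implementation: the class names `TrieNode` and `PhraseTrie`, the method names, the entity labels, and the sample dictionary entries are all assumptions made for this example. The filtering stage is not shown, since the abstract does not describe how the local filters are defined.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of the prefix tree; edges are whole tokens, not characters."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Entity label if a dictionary phrase ends at this node, else None.
        self.label: Optional[str] = None


class PhraseTrie:
    """Hypothetical dictionary index: phrases sharing leading tokens share a path."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, phrase: str, label: str) -> None:
        """Insert a whitespace-tokenized dictionary phrase with its entity label."""
        node = self.root
        for token in phrase.split():
            node = node.children.setdefault(token, TrieNode())
        node.label = label

    def find_candidates(self, tokens: List[str]) -> List[Tuple[int, int, str]]:
        """Return (start, end, label) spans, keeping the longest match at each start."""
        spans: List[Tuple[int, int, str]] = []
        i = 0
        while i < len(tokens):
            node = self.root
            best: Optional[Tuple[int, int, str]] = None
            j = i
            # Walk the trie as far as the token sequence allows.
            while j < len(tokens) and tokens[j] in node.children:
                node = node.children[tokens[j]]
                j += 1
                if node.label is not None:
                    best = (i, j, node.label)
            if best is not None:
                spans.append(best)
                i = best[1]  # resume after the matched phrase
            else:
                i += 1
        return spans


# Illustrative entries only; a real system would load the Web dictionaries.
trie = PhraseTrie()
trie.add("تهران", "LOCATION")
trie.add("دانشگاه تهران", "ORGANIZATION")

tokens = "او در دانشگاه تهران درس می‌خواند".split()
candidates = trie.find_candidates(tokens)
print(candidates)
```

Keying the tree on tokens rather than characters keeps each lookup proportional to the length of the longest matching phrase, and the longest-match rule prefers "دانشگاه تهران" as one candidate over the shorter "تهران" nested inside it. Candidates produced this way would still need the second, filtering stage to discard false positives.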

Index Terms

Computer Science
Information Sciences

Keywords

Persian, Natural language processing, Named Entity Recognition, Local Filters