Research Article

Conditional Random Field Based Named Entity Recognition in Geological text

by Sobhana N.V, Pabitra Mitra, S.K. Ghosh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 3
Year of Publication: 2010
Authors: Sobhana N.V, Pabitra Mitra, S.K. Ghosh
10.5120/72-166

Sobhana N.V, Pabitra Mitra, S.K. Ghosh. Conditional Random Field Based Named Entity Recognition in Geological text. International Journal of Computer Applications. 1, 3 (February 2010), 119-125. DOI=10.5120/72-166

@article{ 10.5120/72-166,
author = { Sobhana N.V and Pabitra Mitra and S.K. Ghosh },
title = { Conditional Random Field Based Named Entity Recognition in Geological text },
journal = { International Journal of Computer Applications },
issue_date = { February 2010 },
volume = { 1 },
number = { 3 },
month = { February },
year = { 2010 },
issn = { 0975-8887 },
pages = { 119-125 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume1/number3/72-166/ },
doi = { 10.5120/72-166 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:44:06.923550+05:30
%A Sobhana N.V
%A Pabitra Mitra
%A S.K. Ghosh
%T Conditional Random Field Based Named Entity Recognition in Geological text
%J International Journal of Computer Applications
%@ 0975-8887
%V 1
%N 3
%P 119-125
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper describes the development of a Named Entity Recognition (NER) system for geological text using Conditional Random Fields (CRFs). The system uses contextual information about each word, along with a variety of other features that help predict the named entity (NE) classes. An NE-tagged geological corpus, developed from a collection of scientific reports and articles on the geology of the Indian subcontinent, was used to build the system. The training set consists of more than 2 lakh (200,000) words and was manually annotated with a tag set of seventeen NE tags. The system recognizes 17 classes of NEs with an F-measure of 75.8%.
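As a rough illustration of the two ideas in the abstract, the sketch below builds window-based contextual features for one token (the kind of input a CRF tagger typically consumes) and computes the F-measure as the harmonic mean of precision and recall. The feature names, the window size of 2, the `<PAD>` marker, and the example sentence are all assumptions made for illustration; the abstract does not list the paper's actual feature set.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, object]:
    """Build a feature dict for tokens[i] from the word and its neighbours.

    All feature names here are hypothetical, not the paper's feature set.
    """
    word = tokens[i]
    feats: Dict[str, object] = {
        "word.lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "suffix3": word[-3:].lower(),
    }
    # Contextual information: surrounding words within the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        # Positions outside the sentence get a padding marker.
        feats[key] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feats


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up example sentence, not taken from the paper's corpus.
    sentence = ["The", "Deccan", "Traps", "formed", "in", "the", "Cretaceous"]
    print(token_features(sentence, 1))
    print(round(f_measure(0.8, 0.7), 3))
```

Feature dictionaries like these, one per token, would be passed to a CRF toolkit together with the gold NE tags during training. The precision and recall values in the usage example are arbitrary and unrelated to the reported 75.8%.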

Index Terms

Computer Science
Information Sciences

Keywords

Geological Corpus, Named Entity Recognition, Precision, Recall, F-measure, Geographic references