Research Article

Sentence Boundary Detection in Kannada Language

by Deepamala. N, Ramakanth Kumar. P
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 39 - Number 9
Year of Publication: 2012
Authors: Deepamala. N, Ramakanth Kumar. P
10.5120/4852-7124

Deepamala. N, Ramakanth Kumar. P. Sentence Boundary Detection in Kannada Language. International Journal of Computer Applications. 39, 9 (February 2012), 38-41. DOI=10.5120/4852-7124

@article{ 10.5120/4852-7124,
author = { Deepamala. N and Ramakanth Kumar. P },
title = { Sentence Boundary Detection in Kannada Language },
journal = { International Journal of Computer Applications },
issue_date = { February 2012 },
volume = { 39 },
number = { 9 },
month = { February },
year = { 2012 },
issn = { 0975-8887 },
pages = { 38-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume39/number9/4852-7124/ },
doi = { 10.5120/4852-7124 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:26:02.777794+05:30
%A Deepamala. N
%A Ramakanth Kumar. P
%T Sentence Boundary Detection in Kannada Language
%J International Journal of Computer Applications
%@ 0975-8887
%V 39
%N 9
%P 38-41
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Sentence Boundary Detection is a pre-processing step for any Natural Language Processing application. Various algorithms have been used to achieve Sentence Boundary Detection or Disambiguation in different languages. In this paper, a rule-based method is proposed and tested for Sentence Boundary Detection in the Kannada language. Kannada, a grammatically rich Indian language, is analyzed based on semantics, and the method is tested on a 227 KB corpus. The code is written in C using wide characters, with support for Unicode. Results showed 99.2% success in detecting sentence boundaries.
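
As an illustration of what a rule-based detector over wide characters can look like in C, the following is a minimal sketch and not the authors' implementation. The function name find_sentence_boundaries, the two rules (decimal point, known abbreviation) and the abbreviation list are assumptions made for this example; the paper's actual rules, including its handling of verb suffixes, are not reproduced here. Kannada text is written with universal character names (\uXXXX) so that the source file stays ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Illustrative abbreviation list (an assumption, not the paper's list).
   L"\u0CA1\u0CBE" is the Kannada abbreviation corresponding to "Dr". */
static const wchar_t *const ABBREVIATIONS[] = { L"\u0CA1\u0CBE", L"Dr", L"Mr" };
#define N_ABBREVIATIONS (sizeof ABBREVIATIONS / sizeof ABBREVIATIONS[0])

static int is_space(wchar_t c)
{
    return c == L' ' || c == L'\t' || c == L'\n' || c == L'\r';
}

static int is_digit(wchar_t c)
{
    return c >= L'0' && c <= L'9';
}

/* Returns 1 if the token text[start..end) equals a listed abbreviation. */
static int is_abbreviation(const wchar_t *text, size_t start, size_t end)
{
    size_t len = end - start;
    for (size_t i = 0; i < N_ABBREVIATIONS; i++) {
        if (wcslen(ABBREVIATIONS[i]) == len &&
            wcsncmp(text + start, ABBREVIATIONS[i], len) == 0)
            return 1;
    }
    return 0;
}

/* Scans text for '.', '?' and '!' and stores the index of each character
   judged to end a sentence in out (at most max entries).
   Returns the number of indices stored. */
size_t find_sentence_boundaries(const wchar_t *text, size_t *out, size_t max)
{
    assert(text != NULL && out != NULL);

    size_t count = 0;
    size_t token_start = 0; /* start of the whitespace-delimited token */

    for (size_t i = 0; text[i] != L'\0'; i++) {
        wchar_t c = text[i];

        if (is_space(c)) {
            token_start = i + 1;
            continue;
        }
        if (c != L'.' && c != L'?' && c != L'!')
            continue;

        if (c == L'.') {
            /* Rule 1: a period between two digits is a decimal point. */
            if (i > 0 && is_digit(text[i - 1]) && is_digit(text[i + 1]))
                continue;
            /* Rule 2: a period directly after a known abbreviation
               does not end the sentence. */
            if (is_abbreviation(text, token_start, i))
                continue;
        }

        if (count < max)
            out[count++] = i;
    }
    return count;
}
```

Called on a wide string such as L"\u0CA1\u0CBE. A 3.5 B. C?", the sketch skips the period after the abbreviation and the decimal point, and reports only the final period and the question mark. A fuller system would add further rules on top of this skeleton, for example checks on the word preceding the period.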

Index Terms

Computer Science
Information Sciences

Keywords

Sentence Boundary Detection
Verb Suffix
Abbreviation