Research Article

Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques

by N. Sridevi, P. Subashini
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 52 - Number 14
Year of Publication: 2012
Authors: N. Sridevi, P. Subashini
10.5120/8268-1826

N. Sridevi, P. Subashini. Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques. International Journal of Computer Applications. 52, 14 (August 2012), 7-12. DOI=10.5120/8268-1826

@article{ 10.5120/8268-1826,
author = { N. Sridevi and P. Subashini },
title = { Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques },
journal = { International Journal of Computer Applications },
issue_date = { August 2012 },
volume = { 52 },
number = { 14 },
month = { August },
year = { 2012 },
issn = { 0975-8887 },
pages = { 7-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume52/number14/8268-1826/ },
doi = { 10.5120/8268-1826 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:52:12.515168+05:30
%A N. Sridevi
%A P. Subashini
%T Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 52
%N 14
%P 7-12
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Document image segmentation is one of the critical phases in a handwritten character recognition system, because the correct segmentation of individual characters determines the accuracy of recognition. Segmentation decomposes a document image into text lines, then words, and finally individual characters. Ancient Tamil script documents consist of vowels, consonants and various modifiers, so a proper segmentation algorithm is required. With existing methods, overlapping lines and characters are difficult to segment. To overcome this problem, two methods are proposed: one for line segmentation and one for character segmentation. The first method uses the projection profile and particle swarm optimization (PSO) for line segmentation. The second method combines connected components with the nearest neighborhood method to segment the characters. Experimental results show that these methods give better results than other methods.

Index Terms

Computer Science
Information Sciences

Keywords

Character segmentation, Projection profile, Connected components, Nearest neighborhood, PSO