Research Article

A Study of different Text Line Extraction Techniques for Multi-font and Multi-size Printed Kannada Documents

by R Prajna, Ramya V R, Mamatha H. R
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 119 - Number 11
Year of Publication: 2015
Authors: R Prajna, Ramya V R, Mamatha H. R
DOI: 10.5120/21113-3923

R Prajna, Ramya V R, Mamatha H. R. A Study of different Text Line Extraction Techniques for Multi-font and Multi-size Printed Kannada Documents. International Journal of Computer Applications. 119, 11 (June 2015), 32-38. DOI=10.5120/21113-3923

@article{ 10.5120/21113-3923,
author = { R Prajna and Ramya V R and Mamatha H. R },
title = { A Study of different Text Line Extraction Techniques for Multi-font and Multi-size Printed Kannada Documents },
journal = { International Journal of Computer Applications },
issue_date = { June 2015 },
volume = { 119 },
number = { 11 },
month = { June },
year = { 2015 },
issn = { 0975-8887 },
pages = { 32-38 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume119/number11/21113-3923/ },
doi = { 10.5120/21113-3923 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:05:03.057408+05:30
%A R Prajna
%A Ramya V R
%A Mamatha H. R
%T A Study of different Text Line Extraction Techniques for Multi-font and Multi-size Printed Kannada Documents
%J International Journal of Computer Applications
%@ 0975-8887
%V 119
%N 11
%P 32-38
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Line and word segmentation is one of the important steps in OCR systems. Optical Character Recognition (OCR) systems have been developed effectively for identifying printed characters of non-Indian languages such as English, Japanese, and Chinese. For Indian languages, efforts are under way to develop efficient OCR systems, mainly for Kannada, one of the popular South Indian languages. In this paper we propose a robust method for extracting individual text lines from printed Kannada documents, based on efficient segmentation methodologies such as morphology-operations-based projection profile, horizontal projection profile, and bounding box.
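
As a rough illustration of two of the techniques named in the abstract, the sketch below splits a binarized page into text lines with a horizontal projection profile and then computes a tight bounding box for each line. It is not the authors' implementation: the function names, the `min_pixels` threshold, and the toy `page` array are assumptions made for this example, the image is assumed to be already binarized and deskewed (1 = ink, 0 = background), and the morphology-based preprocessing step is omitted.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = foreground (ink), 0 = background


def horizontal_projection(image: Image) -> List[int]:
    """Count the foreground pixels in each row of the image."""
    return [sum(row) for row in image]


def extract_text_lines(image: Image, min_pixels: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges, one per detected text line.

    A text line is a maximal run of consecutive rows whose foreground
    count is at least `min_pixels` (a hypothetical threshold).
    """
    profile = horizontal_projection(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = y                      # first inked row of a new line
        elif count < min_pixels and start is not None:
            lines.append((start, y - 1))   # a blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


def line_bounding_box(image: Image, top: int, bottom: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing all ink in rows top..bottom."""
    cols = [x for y in range(top, bottom + 1)
            for x, value in enumerate(image[y]) if value]
    return (min(cols), top, max(cols), bottom)


# Toy page with two "text lines" separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]
lines = extract_text_lines(page)
boxes = [line_bounding_box(page, top, bottom) for top, bottom in lines]
print(lines)  # [(1, 2), (4, 5)]
print(boxes)  # [(1, 1, 4, 2), (2, 4, 5, 5)]
```

A plain profile of this kind splits wherever a fully blank row appears, so on real scans a larger `min_pixels` value or some smoothing of the profile would likely be needed to tolerate noise.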

Index Terms

Computer Science
Information Sciences

Keywords

Morphology operations based projection profile, horizontal projection profile, bounding box.