Research Article

Parallel Implementation of Devanagari Text Line and Word Segmentation Approach on GPU

by Brijmohan Singh, Nitin Gupta, Rashi Tyagi, Ankush Mittal, Debashish Ghosh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 24 - Number 9
Year of Publication: 2011
DOI: 10.5120/2987-3988

Brijmohan Singh, Nitin Gupta, Rashi Tyagi, Ankush Mittal, Debashish Ghosh. Parallel Implementation of Devanagari Text Line and Word Segmentation Approach on GPU. International Journal of Computer Applications. 24, 9 (June 2011), 7-14. DOI=10.5120/2987-3988

@article{ 10.5120/2987-3988,
author = { Brijmohan Singh and Nitin Gupta and Rashi Tyagi and Ankush Mittal and Debashish Ghosh },
title = { Parallel Implementation of Devanagari Text Line and Word Segmentation Approach on GPU },
journal = { International Journal of Computer Applications },
issue_date = { June 2011 },
volume = { 24 },
number = { 9 },
month = { June },
year = { 2011 },
issn = { 0975-8887 },
pages = { 7-14 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume24/number9/2987-3988/ },
doi = { 10.5120/2987-3988 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:10:30.842937+05:30
%A Brijmohan Singh
%A Nitin Gupta
%A Rashi Tyagi
%A Ankush Mittal
%A Debashish Ghosh
%T Parallel Implementation of Devanagari Text Line and Word Segmentation Approach on GPU
%J International Journal of Computer Applications
%@ 0975-8887
%V 24
%N 9
%P 7-14
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Optical Character Recognition (OCR) systems need fast and accurate algorithms for operations on document images such as pre-processing, segmentation, feature extraction, training and testing of classifiers, and post-processing. Text line and word segmentation are two important steps in any OCR system, and incorrect segmentation can lower the recognition accuracy. Segmentation is especially challenging in the presence of noise, degradation, and variation in writing and script characteristics. Existing algorithms suffer from a poor tradeoff between accuracy and speed. In this work, Devanagari text line and word segmentation are carried out using a modified version of the standard profiling-based segmentation approach, which is parallelized on a Graphics Processing Unit (GPU). The main goal is to make segmentation faster when processing a large number of document images by implementing the algorithms in parallel on a GPU. GPUs are emerging as powerful parallel systems at low cost. Our work makes extensive use of the highly multithreaded architecture and shared memory of a multi-core GPU. Efficient use of shared memory is required to optimize parallel reduction in Compute Unified Device Architecture (CUDA). Experimental results show that our method achieves a speedup of about 20x-30x over the serial implementation when running on a GeForce 9500 GT GPU with 32 cores.
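
For readers unfamiliar with profiling-based segmentation, the following is a minimal serial sketch of the standard projection-profile idea in Python. It is not the authors' modified method or their CUDA implementation, neither of which is specified in the abstract. The function names, the binary image representation (1 for ink, 0 for background), and the `min_word_gap` parameter are all assumptions made for illustration. Each row sum and column sum is an independent reduction, which is the kind of step a GPU parallel reduction could plausibly accelerate.

```python
from typing import List, Tuple

Image = List[List[int]]          # assumed format: 1 = ink pixel, 0 = background
Span = Tuple[int, int]           # (start, end) with end exclusive


def projection_profile(image: Image, axis: int) -> List[int]:
    """Count ink pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def nonzero_runs(profile: List[int], min_gap: int = 1) -> List[Span]:
    """Return spans of nonzero entries; gaps shorter than min_gap do not split a span."""
    spans: List[Span] = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The span ended just before the current gap began.
                spans.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        spans.append((start, len(profile) - gap))
    return spans


def segment_lines_and_words(image: Image, min_word_gap: int = 2):
    """Split into text lines by row profile, then each line into words by column profile."""
    result = []
    for top, bottom in nonzero_runs(projection_profile(image, axis=0)):
        band = image[top:bottom]
        words = nonzero_runs(projection_profile(band, axis=1), min_word_gap)
        result.append(((top, bottom), words))
    return result


# Hypothetical toy image: two text lines, the first containing two words.
toy_image = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
]
segments = segment_lines_and_words(toy_image, min_word_gap=2)
```

On the toy image, `segments` holds one entry per detected line, each pairing the line's row span with the column spans of its words. In printed Devanagari the headline (shirorekha) usually connects the characters within a word, so blank columns in a line's profile tend to mark word boundaries rather than character boundaries. Real documents would still need noise handling and skew correction that this sketch omits.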

Index Terms

Computer Science
Information Sciences

Keywords

OCR, Segmentation, Profiling, Parallelization, GPU, CUDA