Research Article

Character Recognition using Discrete Curve with the use of Approximate String Matching

Published on January 2013 by Samit Kumar Pradhan, Sujoy Sarkar
International Conference in Distributed Computing and Internet Technology 2013
Foundation of Computer Science USA
ICDCIT - Number 1
January 2013
Authors: Samit Kumar Pradhan, Sujoy Sarkar

Samit Kumar Pradhan, Sujoy Sarkar. Character Recognition using Discrete Curve with the use of Approximate String Matching. International Conference in Distributed Computing and Internet Technology 2013. ICDCIT, 1 (January 2013), 17-22.

@article{
author = { Samit Kumar Pradhan and Sujoy Sarkar },
title = { Character Recognition using Discrete Curve with the use of Approximate String Matching },
journal = { International Conference in Distributed Computing and Internet Technology 2013 },
issue_date = { January 2013 },
volume = { ICDCIT },
number = { 1 },
month = { January },
year = { 2013 },
issn = {0975-8887},
pages = { 17-22 },
numpages = 6,
url = { /proceedings/icdcit/number1/10237-1004/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference in Distributed Computing and Internet Technology 2013
%A Samit Kumar Pradhan
%A Sujoy Sarkar
%T Character Recognition using Discrete Curve with the use of Approximate String Matching
%J International Conference in Distributed Computing and Internet Technology 2013
%@ 0975-8887
%V ICDCIT
%N 1
%P 17-22
%D 2013
%I International Journal of Computer Applications
Abstract

This paper addresses the recognition of printed basic Telugu characters using discrete curves and approximate string matching. Features are extracted from smoothed images obtained after a thinning operation; smoothing is needed because thinning alone can produce spurious branches (spines) that degrade recognition. The features are discrete curves, specified using the 3×3 regions of the connected component representation. We represent each discrete curve as a string, so a set of discrete curves yields a set of strings. A string matching operation then compares the string obtained from the stored character with the string extracted from the input character. Because character images may contain noise that affects the performance of the method, we use approximate string matching instead of exact string matching. The strings representing the extracted features are stored in a trie data structure so that string comparisons take uniform time. For efficient approximate string matching, we use the look-ahead branch and bound scheme with the trie. We apply the method to 42 printed basic Telugu characters for demonstration, and it gives promising results. However, a more extensive study on realistic data is required to improve the approach.
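The matching step described in the abstract can be illustrated with a minimal sketch. It assumes each character has already been reduced to a string over a small symbol alphabet (the digit codes and the labels `ka`, `ga`, `ma` below are hypothetical, not taken from the paper). The sketch stores the strings in a trie and searches it with the Wagner-Fischer edit distance, abandoning any branch whose best possible distance already exceeds the allowed threshold. This is a plain bound-based pruning, a simplified stand-in for the look-ahead branch and bound scheme the paper uses, not a reproduction of it.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set on the node that ends a stored string


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, code: str, label: str) -> None:
        """Store a curve string and the character label it belongs to."""
        node = self.root
        for symbol in code:
            node = node.children.setdefault(symbol, TrieNode())
        node.label = label

    def search(self, query: str, max_dist: int) -> List[Tuple[str, int]]:
        """Return (label, edit distance) for stored strings within max_dist."""
        matches: List[Tuple[str, int]] = []
        # Row for the empty prefix: distance to query[:j] is j insertions.
        first_row = list(range(len(query) + 1))
        for symbol, child in self.root.children.items():
            self._visit(child, symbol, query, first_row, max_dist, matches)
        return sorted(matches, key=lambda m: m[1])

    def _visit(self, node: TrieNode, symbol: str, query: str,
               prev_row: List[int], max_dist: int,
               matches: List[Tuple[str, int]]) -> None:
        # Compute one row of the edit-distance table for this trie prefix.
        row = [prev_row[0] + 1]
        for j in range(1, len(query) + 1):
            cost = 0 if query[j - 1] == symbol else 1
            row.append(min(row[j - 1] + 1,           # insertion
                           prev_row[j] + 1,          # deletion
                           prev_row[j - 1] + cost))  # match or substitution
        if node.label is not None and row[-1] <= max_dist:
            matches.append((node.label, row[-1]))
        # Bound: if every entry exceeds max_dist, no longer string below
        # this node can come back within the threshold, so prune the branch.
        if min(row) <= max_dist:
            for next_symbol, child in node.children.items():
                self._visit(child, next_symbol, query, row, max_dist, matches)


# Hypothetical stored curve strings and a noisy query with one wrong symbol.
trie = Trie()
trie.insert("00112233", "ka")
trie.insert("00772233", "ga")
trie.insert("4455", "ma")
print(trie.search("00112333", max_dist=1))
```

Because stored strings that share a prefix also share the table rows computed for that prefix, the trie avoids recomputing the edit distance from scratch for every stored character, and the pruning test skips whole subtrees that cannot match.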

Index Terms

Computer Science
Information Sciences

Keywords

Discrete Curve, Approximate String Matching, Trie, Look Ahead Branch And Bound