Research Article

A New Approach to Segmentation of Persian Cursive Script based on Adjustment the Fragments

by Mir Mohammad Alipour
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 64 - Number 11
Year of Publication: 2013
Authors: Mir Mohammad Alipour
DOI: 10.5120/10679-5561

Mir Mohammad Alipour. A New Approach to Segmentation of Persian Cursive Script based on Adjustment the Fragments. International Journal of Computer Applications. 64, 11 (February 2013), 21-26. DOI=10.5120/10679-5561

@article{ 10.5120/10679-5561,
author = { Mir Mohammad Alipour },
title = { A New Approach to Segmentation of Persian Cursive Script based on Adjustment the Fragments },
journal = { International Journal of Computer Applications },
issue_date = { February 2013 },
volume = { 64 },
number = { 11 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 21-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume64/number11/10679-5561/ },
doi = { 10.5120/10679-5561 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:17:27.824384+05:30
%A Mir Mohammad Alipour
%T A New Approach to Segmentation of Persian Cursive Script based on Adjustment the Fragments
%J International Journal of Computer Applications
%@ 0975-8887
%V 64
%N 11
%P 21-26
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Optical Character Recognition (OCR) is a long-established topic of great interest in the pattern recognition field. Recognizing cursive scripts such as Persian and Arabic is difficult because segmentation in these languages suffers from serious problems. Segmentation is the process of dividing cursive words into smaller parts in order to decrease the complexity and increase the accuracy of the recognition process. This paper presents an improved segmentation method for Persian script; to increase the quality of segmentation, some structural features of the Persian language are used to adjust the fragments. The method is robust as well as flexible, and it increases the system's tolerance to font variations. The proposed method segments existing Persian fonts with up to 99.2% accuracy.
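
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what segmenting a connected word can look like, not the paper's method. It assumes a generic vertical-projection heuristic: columns with very little ink, bounded on both sides by thicker columns, are treated as the thin stroke joining two letters. The names `vertical_projection`, `candidate_cuts`, and `max_stroke` are hypothetical, and the fragment adjustment step based on structural features of Persian is not modeled here.

```python
from typing import List, Sequence


def vertical_projection(image: Sequence[Sequence[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def candidate_cuts(image: Sequence[Sequence[int]], max_stroke: int = 1) -> List[int]:
    """Return one cut column per run of thin columns between thicker ones.

    A column is "thin" when it holds between 1 and max_stroke ink pixels,
    a rough stand-in for the connecting stroke between joined letters.
    A run only counts if the columns on both sides hold more ink.
    """
    profile = vertical_projection(image)
    cuts: List[int] = []
    run_start = None
    # The trailing 0 is a sentinel that closes a run ending at the last column.
    for x, count in enumerate(profile + [0]):
        thin = 1 <= count <= max_stroke
        if thin and run_start is None:
            run_start = x
        elif not thin and run_start is not None:
            run_end = x - 1
            left_ok = run_start > 0 and profile[run_start - 1] > max_stroke
            right_ok = run_end + 1 < len(profile) and profile[run_end + 1] > max_stroke
            if left_ok and right_ok:
                cuts.append((run_start + run_end) // 2)  # cut mid-stroke
            run_start = None
    return cuts


# Two 3-pixel-tall blobs joined by a 1-pixel stroke along one row.
word = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(candidate_cuts(word))  # [4]
```

A real system would need more than this heuristic, since thin strokes also occur inside single letters; that is the kind of over-segmentation a later adjustment of fragments would have to correct.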

Index Terms

Computer Science
Information Sciences

Keywords

Cursive Script, Persian, Segmentation, Optical Character Recognition, Adjustment the Fragments