Research Article

The Thinning Problem in Arabic Text Recognition - A Comprehensive Review

by Atallah M. Al-shatnawi, Khairuddin Omar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 103 - Number 3
Year of Publication: 2014
Authors: Atallah M. Al-shatnawi, Khairuddin Omar
DOI: 10.5120/18055-8969

Atallah M. Al-shatnawi, Khairuddin Omar. The Thinning Problem in Arabic Text Recognition - A Comprehensive Review. International Journal of Computer Applications. 103, 3 (October 2014), 35-42. DOI=10.5120/18055-8969

@article{ 10.5120/18055-8969,
author = { Atallah M. Al-shatnawi and Khairuddin Omar },
title = { The Thinning Problem in Arabic Text Recognition - A Comprehensive Review },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 103 },
number = { 3 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 35-42 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume103/number3/18055-8969/ },
doi = { 10.5120/18055-8969 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:33:36.409412+05:30
%A Atallah M. Al-shatnawi
%A Khairuddin Omar
%T The Thinning Problem in Arabic Text Recognition - A Comprehensive Review
%J International Journal of Computer Applications
%@ 0975-8887
%V 103
%N 3
%P 35-42
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents an overview of the thinning problem in Arabic text recognition. Thinning (skeletonization) is a crucial stage in Arabic character recognition (ACR): it simplifies the text shape, reduces the amount of data that must be handled, and is usually used as a pre-processing stage for recognition and storage systems. The skeleton of Arabic text can be used for baseline detection, character segmentation, and feature extraction, and it ultimately supports classification. Choosing or designing an effective thinning algorithm for Arabic text is therefore crucial in ACR. The paper discusses the importance of thinning for ACR and the uses of the text skeleton in an ACR system, as well as the challenges that affect the thinning of Arabic text. Methods of Arabic text thinning are reviewed according to the technique used, and their advantages and drawbacks are discussed in detail.
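
To make the idea of thinning concrete, here is a minimal Python sketch of one widely known iterative parallel method, the Zhang-Suen algorithm. It is offered only as an illustration of what a thinning pass does, not as the method this paper recommends for Arabic text. The function name `zhang_suen_thin` and the image format (a list of lists with 1 for ink and 0 for background, surrounded by a background border) are assumptions made for this example.

```python
def zhang_suen_thin(image):
    """Thin a binary image (1 = ink, 0 = background) toward a skeleton.

    Assumption for this sketch: the outermost rows and columns are
    background, so border pixels are never examined.
    """
    img = [row[:] for row in image]  # work on a copy, leave the input intact
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9: the 8 neighbours, clockwise, starting directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1],
                img[r + 1][c + 1], img[r + 1][c], img[r + 1][c - 1],
                img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # two sub-iterations per pass
            to_delete = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    b = sum(n)  # number of ink neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8)
                            if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((r, c))
            # Delete in parallel: flags are computed first, then applied.
            for r, c in to_delete:
                img[r][c] = 0
            if to_delete:
                changed = True
    return img


# A 3-pixel-thick horizontal bar inside a one-pixel background border.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
skeleton = zhang_suen_thin(bar)
for row in skeleton:
    print("".join("#" if v else "." for v in row))
```

On this toy input the thick bar collapses to a one-pixel-wide stroke along its middle row. Real Arabic text adds the difficulties the abstract alludes to, so a general-purpose method like this one should be treated as a baseline rather than a solution.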

Index Terms

Computer Science
Information Sciences

Keywords

Thinning, Skeleton, Iterative, Non-iterative, Parallel, Sequential, Pre-processing, Arabic character recognition