Research Article

Handwritten Arabic Documents Indexation using HOG Feature

by Y. Elfakir, G. Khaissidi, M. Mrabti, D. Chenouni
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 126 - Number 9
Year of Publication: 2015
Authors: Y. Elfakir, G. Khaissidi, M. Mrabti, D. Chenouni
10.5120/ijca2015906182

Y. Elfakir, G. Khaissidi, M. Mrabti, D. Chenouni. Handwritten Arabic Documents Indexation using HOG Feature. International Journal of Computer Applications. 126, 9 (September 2015), 14-18. DOI=10.5120/ijca2015906182

@article{ 10.5120/ijca2015906182,
author = { Y. Elfakir and G. Khaissidi and M. Mrabti and D. Chenouni },
title = { Handwritten Arabic Documents Indexation using HOG Feature },
journal = { International Journal of Computer Applications },
issue_date = { September 2015 },
volume = { 126 },
number = { 9 },
month = { September },
year = { 2015 },
issn = { 0975-8887 },
pages = { 14-18 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume126/number9/22579-2015906182/ },
doi = { 10.5120/ijca2015906182 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:16:59.147946+05:30
%A Y. Elfakir
%A G. Khaissidi
%A M. Mrabti
%A D. Chenouni
%T Handwritten Arabic Documents Indexation using HOG Feature
%J International Journal of Computer Applications
%@ 0975-8887
%V 126
%N 9
%P 14-18
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Old manuscripts are among the richest parts of the cultural heritage and legacy of civilizations, and digitization is one way to preserve them. Handwriting recognition systems are expanding rapidly and have become a necessity for exploiting the wealth of information contained in ancient manuscripts. Searching these manuscripts by hand takes a great deal of time and effort. This paper proposes a holistic approach to spotting and searching for a query in images of handwritten Arabic documents. First, the handwritten document image is segmented into text lines using partial projection. A sliding-window approach is then used to locate the document regions that are most similar to the query. Histograms of Oriented Gradients (HOG) serve as the feature vectors that represent the query and the document images, and a Support Vector Machine (SVM) is used to produce a better representation of the query and to classify the feature vectors. Finally, applying a reclassification technique at the indexation stage leads to better results.
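
The pipeline outlined in the abstract (line segmentation, sliding window, HOG features, similarity scoring) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the page is a small binary image (1 = ink), lines are found with a plain full-width horizontal projection rather than the paper's partial projection, the HOG descriptor is a simplified version with one global L2 normalization, and cosine similarity stands in for the SVM scoring stage. The names `segment_lines`, `hog`, and `spot` are hypothetical.

```python
import math

def segment_lines(page):
    """Return (top, bottom) row ranges of text lines in a binary page (1 = ink).
    Assumption: full-width projection profile, not the paper's partial projection."""
    profile = [sum(row) for row in page]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink and start is None:
            start = y                      # first inked row of a line
        elif not ink and start is not None:
            lines.append((start, y))       # blank row closes the line
            start = None
    if start is not None:
        lines.append((start, len(page)))
    return lines

def hog(patch, cell=4, bins=9):
    """Simplified HOG: per-cell unsigned orientation histograms, one global L2 norm."""
    h, w = len(patch), len(patch[0])
    ncy, ncx = h // cell, w // cell
    hist = [0.0] * (ncy * ncx * bins)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            cy, cx = y // cell, x // cell
            if mag == 0 or cy >= ncy or cx >= ncx:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(angle * bins / 180.0) % bins
            hist[(cy * ncx + cx) * bins + b] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist

def spot(line, query, step=1):
    """Slide a query-sized window along a text line.
    Returns (score, x) pairs sorted best first; score is cosine similarity
    (a stand-in for the SVM decision value used in the paper)."""
    qh, qw = len(query), len(query[0])
    q = hog(query)
    hits = []
    for x in range(0, len(line[0]) - qw + 1, step):
        window = [row[x:x + qw] for row in line[:qh]]
        score = sum(a * b for a, b in zip(q, hog(window)))
        hits.append((score, x))
    return sorted(hits, reverse=True)

# Toy data: the "word" is two vertical strokes; it is pasted at column 10 of a page.
query = [[1 if x in (2, 5) else 0 for x in range(8)] for _ in range(8)]
blank = [0] * 30
page = [blank] * 2 + [[0] * 10 + row + [0] * 12 for row in query] + [blank] * 3

for top, bottom in segment_lines(page):
    score, x = spot(page[top:bottom], query)[0]
    print(f"rows {top}-{bottom - 1}: best match at x={x}, score={score:.3f}")
```

In a fuller system the cosine score would be replaced by a classifier trained on HOG vectors of the query and of negative windows, and the ranked hits would then be re-scored, which is where the reclassification step mentioned in the abstract would fit.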

Index Terms

Computer Science
Information Sciences

Keywords

Indexation, Classification, SVM, Segmentation, Arabic handwritten documents, Histograms of Oriented Gradients (HOG)