Research Article

Optical Character Recognition Techniques in Urdu- A Survey

Published on July 2018 by Vippon Preet Kour, Naveen Kumar Gondhi
International Conference on Advances in Emerging Technology
Foundation of Computer Science USA
ICAET2017 - Number 3
July 2018
Authors: Vippon Preet Kour, Naveen Kumar Gondhi

Vippon Preet Kour, Naveen Kumar Gondhi. Optical Character Recognition Techniques in Urdu- A Survey. International Conference on Advances in Emerging Technology. ICAET2017, 3 (July 2018), 22-26.

@article{
author = { Vippon Preet Kour and Naveen Kumar Gondhi },
title = { Optical Character Recognition Techniques in Urdu- A Survey },
journal = { International Conference on Advances in Emerging Technology },
issue_date = { July 2018 },
volume = { ICAET2017 },
number = { 3 },
month = { July },
year = { 2018 },
issn = { 0975-8887 },
pages = { 22-26 },
numpages = 5,
url = { /proceedings/icaet2017/number3/29655-7074/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference on Advances in Emerging Technology
%A Vippon Preet Kour
%A Naveen Kumar Gondhi
%T Optical Character Recognition Techniques in Urdu- A Survey
%J International Conference on Advances in Emerging Technology
%@ 0975-8887
%V ICAET2017
%N 3
%P 22-26
%D 2018
%I International Journal of Computer Applications
Abstract

This survey of optical character readers for Urdu and similar cursive languages draws on the techniques and studies that have been applied to the design and implementation of such readers. Because Urdu is written in the Nastaliq font, several different approaches have been applied to this font to obtain the desired results. The survey covers the available techniques, including segmentation-based, online, and offline methods, and presents the gathered data in tabular form so that the concepts can be grasped at a glance. The absence of an Urdu OCR has held back the idea of a digital Urdu library, and this gap opens a path for extensive research in the field.

Index Terms

Computer Science
Information Sciences

Keywords

Image Segmentation
Optical Character Reader
Feature Extraction
Classification