Research Article

Pre-processing of Indian Script Handwritten Document

Published in 2011 by Varsha Hole, Leena Ragha, Pravin Hole
International Conference and Workshop on Emerging Trends in Technology
Foundation of Computer Science USA
ICWET - Number 1
2011
Authors: Varsha Hole, Leena Ragha, Pravin Hole

Varsha Hole, Leena Ragha, Pravin Hole. Pre-processing of Indian Script Handwritten Document. International Conference and Workshop on Emerging Trends in Technology. ICWET, 1 (2011), 35-42.

@article{
author = { Varsha Hole, Leena Ragha, Pravin Hole },
title = { Pre-processing of Indian Script Handwritten Document },
journal = { International Conference and Workshop on Emerging Trends in Technology },
issue_date = { 2011 },
volume = { ICWET },
number = { 1 },
year = { 2011 },
issn = {0975-8887},
pages = { 35-42 },
numpages = 8,
url = { /proceedings/icwet/number1/2066-aca205/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference and Workshop on Emerging Trends in Technology
%A Varsha Hole
%A Leena Ragha
%A Pravin Hole
%T Pre-processing of Indian Script Handwritten Document
%J International Conference and Workshop on Emerging Trends in Technology
%@ 0975-8887
%V ICWET
%N 1
%P 35-42
%D 2011
%I International Journal of Computer Applications
Abstract

Preprocessing of a document image is an important step for handling deformations such as noise and the handwriting irregularities that produce baseline skew, word skew, and character skew, accents placed above or below the text line, and connections between parts of neighboring text lines. The paper presents a novel preprocessing technique for handwritten documents that handles some of the deformations usually present in such documents, including touching components, overlapping components, skewed lines, and words with individual skews, and builds a clean text image with these deformations removed.

Index Terms

Computer Science
Information Sciences

Keywords

Optical character recognition, connected components, touching components, overlapping components