International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 39 - Number 9
Year of Publication: 2012
Authors: S. Karthik, Hemanth.V.K, V. Balaji, K. P. Soman
DOI: 10.5120/4846-7117
S. Karthik, Hemanth.V.K, V. Balaji, K. P. Soman. Level Set Methodology for Tamil Document Image Binarization and Segmentation. International Journal of Computer Applications. 39, 9 (February 2012), 7-12. DOI=10.5120/4846-7117
The most challenging task in OCR is segmenting the characters correctly. Segmentation accuracy depends on the quality of the binarization technique applied. Binarization sets all intensity values greater than some threshold value to "on" and all remaining values to "off". It converts the document image into a binary image, extracting the text and eliminating the background, and it also removes noise. The output of binarization is used as the input to the image segmentation process. Conventionally, separate methods are used for binarization and segmentation. In this paper we investigate the use of two recently introduced convex optimization methods, the selective local/global segmentation (SLGS) algorithm [16] and the fast global minimization (FGM) algorithm [15], for simultaneous binarization and segmentation. Of the two methods we tried, one is found to be suitable for the OCR task. The FGM algorithm provides an average accuracy of 89.97% for Tamil character segmentation.
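As a minimal illustration of the conventional threshold binarization described above (not the SLGS or FGM level set methods studied in the paper), the following Python sketch marks every pixel whose intensity exceeds a threshold as "on". The function name `binarize`, the toy pixel values, and the threshold of 128 are assumptions chosen only for illustration.

```python
def binarize(image, threshold):
    """Return a binary image: 1 ("on") where intensity > threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Hypothetical 3x4 grayscale patch (0 = black, 255 = white); values are made up.
patch = [
    [12, 200, 198, 15],
    [10, 220, 30, 18],
    [9, 210, 205, 11],
]

# The threshold 128 is an arbitrary illustrative choice, not a value from the paper.
mask = binarize(patch, threshold=128)
for row in mask:
    print(row)
# [0, 1, 1, 0]
# [0, 1, 0, 0]
# [0, 1, 1, 0]
```

In scanned documents the text is usually darker than the paper, so in practice the comparison may need to be inverted (`pixel <= threshold`) for the "on" pixels to correspond to the text rather than the background.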