Recent Trends in Image Processing and Pattern Recognition |
Foundation of Computer Science USA |
RTIPPR - Number 1 |
2010 |
Authors: Nallapareddy Priyanka, Srikanta Pal, Ranju Manda |
Nallapareddy Priyanka, Srikanta Pal, Ranju Manda. Line and Word Segmentation Approach for Printed Documents. Recent Trends in Image Processing and Pattern Recognition. RTIPPR, 1 (2010), 30-36.
Line and word segmentation is one of the important steps in OCR systems. In this paper we propose a robust method for segmenting individual text lines, based on a modified histogram obtained from run-length-based smearing. We present a complete line and word segmentation system for some popular Indian printed languages. Both foreground and background information are used for accurate line segmentation. Characters in two consecutive text lines may touch or overlap, and most line segmentation errors are caused by such touching and overlapping characters. Interline spacing and noise can also make line segmentation difficult. Our method handles these situations accurately. Word segmentation within individual lines is also discussed. We tested our method on documents in Bangla, Devnagari, Kannada and Telugu scripts, as well as on some multi-script documents, and obtained encouraging results.
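The general idea of smearing followed by a projection histogram can be illustrated with a minimal sketch. This is not the authors' modified histogram, whose details are not given in the abstract. It is a plain horizontal run-length smearing step followed by a row-sum profile, with a simple column-gap rule for words. The function names and the parameters `max_gap`, `min_fill` and `word_gap` are assumptions chosen for illustration, and the input is assumed to be a binarized page (1 = ink, 0 = background).

```python
def smear_rows(img, max_gap):
    """Horizontal run-length smearing: fill background runs of at most
    max_gap pixels that lie between two ink pixels in the same row."""
    out = [row[:] for row in img]
    for row in out:
        last_ink = None
        for x, v in enumerate(row):
            if v:
                if last_ink is not None and 0 < x - last_ink - 1 <= max_gap:
                    for k in range(last_ink + 1, x):
                        row[k] = 1
                last_ink = x
    return out


def runs_above(profile, threshold):
    """Return (start, end) pairs, end exclusive, of maximal runs
    where profile[i] > threshold."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(img, max_gap=3, min_fill=0):
    """Text lines are row ranges whose smeared row-sum exceeds min_fill."""
    smeared = smear_rows(img, max_gap)
    profile = [sum(row) for row in smeared]
    return runs_above(profile, min_fill)


def segment_words(img, line, word_gap=2):
    """Words are column ranges with ink inside one line; ranges separated
    by fewer than word_gap empty columns are merged into one word."""
    top, bottom = line
    width = len(img[0])
    cols = [sum(img[y][x] for y in range(top, bottom)) for x in range(width)]
    words = []
    for start, end in runs_above(cols, 0):
        if words and start - words[-1][1] < word_gap:
            words[-1] = (words[-1][0], end)
        else:
            words.append((start, end))
    return words


# Tiny hypothetical page: two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)
words = [segment_words(page, ln) for ln in lines]
```

On this toy page the sketch finds two lines with two words each. A plain profile like this one splits lines only at rows with no ink at all, so it does not address the touching and overlapping characters the paper targets. Raising `min_fill` is one crude way to cut through sparse contact rows, under the assumption that contacts are much thinner than the text body.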