National Conference on Recent Trends in Computing
Foundation of Computer Science USA
NCRTC - Number 5
May 2012
Authors: V. K. Yeotikar, M. P. Dhore
V. K. Yeotikar, M. P. Dhore. Comparison of Some Text Extraction Methodologies. National Conference on Recent Trends in Computing. NCRTC, 5 (May 2012), 30-33.
In document image analysis, digitized images of printed documents typically consist of a mixture of text, graphics, and image elements. For proper processing and efficient representation, these elements have to be separated. Most applications need to distinguish text from non-text, because text carries most of the information. Text lines may have different orientations or may follow curved paths. Some of the techniques proposed for text string extraction are completely independent of text orientation and can handle text in various font styles and sizes. Many fast and efficient methods exist for extracting graphics and text paragraphs from printed documents. This paper compares some of the text extraction techniques proposed by researchers.