Research Article

Comparison of Some Text Extraction Methodologies

Published in May 2012 by V. K. Yeotikar, M. P. Dhore
National Conference on Recent Trends in Computing
Foundation of Computer Science USA
NCRTC - Number 5
May 2012
Authors: V. K. Yeotikar, M. P. Dhore

V. K. Yeotikar, M. P. Dhore. Comparison of Some Text Extraction Methodologies. National Conference on Recent Trends in Computing. NCRTC, 5 (May 2012), 30-33.

@article{
author = { V. K. Yeotikar and M. P. Dhore },
title = { Comparison of Some Text Extraction Methodologies },
journal = { National Conference on Recent Trends in Computing },
issue_date = { May 2012 },
volume = { NCRTC },
number = { 5 },
month = { May },
year = { 2012 },
issn = { 0975-8887 },
pages = { 30-33 },
numpages = 4,
url = { /proceedings/ncrtc/number5/6550-1040/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Recent Trends in Computing
%A V. K. Yeotikar
%A M. P. Dhore
%T Comparison of Some Text Extraction Methodologies
%J National Conference on Recent Trends in Computing
%@ 0975-8887
%V NCRTC
%N 5
%P 30-33
%D 2012
%I International Journal of Computer Applications
Abstract

In document image analysis, digitized images of printed documents typically contain a mixture of text, graphics, and image elements. For proper processing and efficient representation, these elements have to be separated. For most applications it is essential to separate text from non-text, because text carries the most information. Text lines may have different orientations or may be curved. Some of the techniques proposed for text string extraction are completely independent of text orientation and can handle text in various font styles and sizes. There are also many fast and efficient methods for extracting graphics and text paragraphs from printed documents. This paper compares some of the text extraction techniques proposed by researchers.

References
  1. Frank Hones, Jürgen Lichter. "Text String Extraction within Mixed-Mode Documents." 1993 IEEE.
  2. Xuhong Li, Peter A. Ng. "A Document Classification and Extraction System with Learning Ability."
  3. T. Perroud, K. Sobottka, and H. Bunke. "Text Extraction from Color Documents - Clustering Approaches in Three and Four Dimensions." 2001 IEEE.
  4. Jiangying Zhou, Daniel Lopresti. "Extracting Text from WWW Images." 1997 IEEE.
  5. Xuewen Wang. "Character Extraction and Recognition in Natural Scene Images." 2001 IEEE.
  6. F. Lebourgeois, Z. Bublinski, and H. Emptoz. "A Fast and Efficient Method for Extracting Text Paragraphs and Graphics from Unconstrained Documents." 1992 IEEE.
  7. U. Pal and Partha Pratim Roy. "Multioriented and Curved Text Lines Extraction from Indian Documents." 2004 IEEE.
Index Terms

Computer Science
Information Sciences

Keywords

Directed Weight Graph Technique, Mixed Mode Technique, Histogram-based Clustering Technique, Clustering Technique