Research Article

Direct Processing of Run-Length Compressed Document Image for Segmentation and Characterization of a Specified Block

by Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 83 - Number 15
Year of Publication: 2013
10.5120/14521-2926

Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri. Direct Processing of Run-Length Compressed Document Image for Segmentation and Characterization of a Specified Block. International Journal of Computer Applications. 83, 15 (December 2013), 1-6. DOI=10.5120/14521-2926

@article{ 10.5120/14521-2926,
author = { Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri },
title = { Direct Processing of Run-Length Compressed Document Image for Segmentation and Characterization of a Specified Block },
journal = { International Journal of Computer Applications },
issue_date = { December 2013 },
volume = { 83 },
number = { 15 },
month = { December },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume83/number15/14521-2926/ },
doi = { 10.5120/14521-2926 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:59:25.270752+05:30
%A Mohammed Javed
%A P. Nagabhushan
%A B. B. Chaudhuri
%T Direct Processing of Run-Length Compressed Document Image for Segmentation and Characterization of a Specified Block
%J International Journal of Computer Applications
%@ 0975-8887
%V 83
%N 15
%P 1-6
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Extracting a block of interest from an image, referred to here as segmenting a specified block, and studying its characteristics is of general research interest. The task can be challenging when the segmentation has to be carried out directly on a compressed image, and that is the objective of the present research work. The proposal is to develop a method that segments and extracts a specified block and characterizes it without decompressing the compressed image, for two major reasons: most image archives store images in compressed format, and decompressing an image requires additional computing time and space. Specifically, this work operates on run-length compressed document images.
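
To make the idea of working on the compressed data concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a hypothetical representation in which each image row is a list of alternating background/foreground run lengths, starting with a background run (a leading 0 means the row begins with foreground). The function names `crop_rle_row`, `crop_rle_block` and `foreground_density` are illustrative, and density is taken here simply as the fraction of foreground pixels in the block, which may differ from the definition used in the paper.

```python
def crop_rle_row(runs, c1, c2):
    """Crop one run-length encoded row to columns [c1, c2) without decoding pixels.

    Assumption: `runs` alternates background/foreground run lengths and
    starts with a background run (possibly of length 0).
    """
    clipped = []  # list of (colour, length); colour 0 = background, 1 = foreground
    pos = 0       # column at which the current run starts
    for i, length in enumerate(runs):
        start, end = pos, pos + length
        pos = end
        if start >= c2:
            break  # every remaining run lies to the right of the block
        overlap = min(end, c2) - max(start, c1)
        if overlap <= 0:
            continue  # run lies entirely to the left of the block (or is empty)
        colour = i % 2
        if clipped and clipped[-1][0] == colour:
            # merge with the previous run of the same colour
            clipped[-1] = (colour, clipped[-1][1] + overlap)
        else:
            clipped.append((colour, overlap))
    # keep the "starts with background" convention in the output row
    out = [0] if clipped and clipped[0][0] == 1 else []
    out.extend(length for _, length in clipped)
    return out


def crop_rle_block(rows, r1, r2, c1, c2):
    """Extract rows [r1, r2) and columns [c1, c2) as a new run-length block."""
    return [crop_rle_row(rows[r], c1, c2) for r in range(r1, r2)]


def foreground_density(block, width):
    """Fraction of foreground pixels, computed from the run lengths alone."""
    foreground = sum(sum(row[1::2]) for row in block)
    return foreground / (width * len(block))
```

For example, under these assumptions the row `[2, 3, 4, 1]` (two background pixels, three foreground, four background, one foreground) cropped to columns 3 to 7 becomes `[0, 2, 3]`, and a block consisting of that single cropped row has a density of 2/5. The pixel array is never materialized; only run boundaries are compared with the block boundaries.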

Index Terms

Computer Science
Information Sciences

Keywords

Compressed data, Document Block Extraction, Document Characterization, Entropy, Density