Research Article

A Shape Feature based Identification of a Complex Document

Published on April 2015 by Shridevi Soma, B.v. Dhandra
National conference on Digital Image and Signal Processing
Foundation of Computer Science USA
DISP2015 - Number 3
April 2015
Authors: Shridevi Soma, B.v. Dhandra

Shridevi Soma, B.v. Dhandra. A Shape Feature based Identification of a Complex Document. National conference on Digital Image and Signal Processing. DISP2015, 3 (April 2015), 1-6.

@article{
author = { Shridevi Soma and B.v. Dhandra },
title = { A Shape Feature based Identification of a Complex Document },
journal = { National conference on Digital Image and Signal Processing },
issue_date = { April 2015 },
volume = { DISP2015 },
number = { 3 },
month = { April },
year = { 2015 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = 6,
url = { /proceedings/disp2015/number3/20489-3024/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National conference on Digital Image and Signal Processing
%A Shridevi Soma
%A B.v. Dhandra
%T A Shape Feature based Identification of a Complex Document
%J National conference on Digital Image and Signal Processing
%@ 0975-8887
%V DISP2015
%N 3
%P 1-6
%D 2015
%I International Journal of Computer Applications
Abstract

Identifying the ownership of a complex document is a challenging task in the field of Document Image Processing. The constituent parts of a document image can be used in many ways to establish ownership. These parts include the seal, logo, signature, letter number, and the name of the organization written in different font types and sizes. This paper presents a system that identifies a document from the Reference Number part of the letter, which is found in most official letters and typically contains the name of the organization, the name of the department, the academic year, and the letter number. The segmented reference number is used to define a feature set of 17 features, of which 10 are shape features and 7 are Hu's moment invariants. A Support Vector Machine classifier with an RBF kernel is used for pattern matching. The proposed algorithm was tested on a data set of letters from Gulbarga University, Gulbarga. The experimental results show an average recognition accuracy of 85.42%.
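
To make the moment-invariant part of the feature set concrete, the following is a minimal sketch of how Hu's seven moment invariants can be computed for a segmented binary region. It is not the authors' implementation: the function names (hu_moments, rotate90), the toy L_SHAPE region, and the pure-Python formulation are assumptions made for illustration. The 10 shape features and the SVM configuration are not detailed in this abstract, so they are not reproduced here.

```python
def hu_moments(image):
    """Return Hu's seven moment invariants for a 2-D list of pixel values.

    Illustrative helper (hypothetical name): x is the column index,
    y is the row index, and non-zero pixels count as foreground.
    """
    pixels = [(x, y, v)
              for y, row in enumerate(image)
              for x, v in enumerate(row) if v]
    m00 = sum(v for _, _, v in pixels)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")

    # Centroid of the region, used to make the moments translation invariant.
    xc = sum(x * v for x, _, v in pixels) / m00
    yc = sum(y * v for _, y, v in pixels) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized by m00^(1 + (p+q)/2) for scale invariance.
        mu = sum((x - xc) ** p * (y - yc) ** q * v for x, y, v in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)

    # Sub-expressions shared by several invariants.
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03

    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]


def rotate90(image):
    """Rotate a 2-D list by 90 degrees (illustrative helper)."""
    return [list(row) for row in zip(*image[::-1])]


# Toy binary region standing in for a segmented reference-number component.
L_SHAPE = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]

features = hu_moments(L_SHAPE)
rotated_features = hu_moments(rotate90(L_SHAPE))
```

In this sketch, features and rotated_features should agree up to floating-point error, which is the property that makes these seven values useful alongside shape features when the same text block appears shifted or rotated in different scans.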

Index Terms

Computer Science
Information Sciences

Keywords

Connected Component Labeling, Moment Invariant, SVM