Research Article

A Kannada Document Image Retrieval System based on Correlation Method

by Chandrakala H T
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 77 - Number 3
Year of Publication: 2013
Authors: Chandrakala H T
10.5120/13377-0989

Chandrakala H T. A Kannada Document Image Retrieval System based on Correlation Method. International Journal of Computer Applications. 77, 3 (September 2013), 39-46. DOI=10.5120/13377-0989

@article{ 10.5120/13377-0989,
author = { Chandrakala H T },
title = { A Kannada Document Image Retrieval System based on Correlation Method },
journal = { International Journal of Computer Applications },
issue_date = { September 2013 },
volume = { 77 },
number = { 3 },
month = { September },
year = { 2013 },
issn = { 0975-8887 },
pages = { 39-46 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume77/number3/13377-0989/ },
doi = { 10.5120/13377-0989 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:49:19.259006+05:30
%A Chandrakala H T
%T A Kannada Document Image Retrieval System based on Correlation Method
%J International Journal of Computer Applications
%@ 0975-8887
%V 77
%N 3
%P 39-46
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The rapid growth in the digitization of books and manuscripts demands an immediate solution for accessing them electronically. This requires research in document image understanding, and specifically in document image retrieval. There is immense scope for such a retrieval system in a digital library of Kannada document images. This paper presents an efficient content-based document image retrieval system for a Kannada document image collection. A recognition-free approach is followed because recognition-based approaches are inefficient in terms of performance. The data is pre-processed and segmented for faster matching and retrieval. An efficient search technique, the correlation method, is used to search large collections of document images. Performance evaluation on different datasets of Kannada documents shows the effectiveness of the approach.
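As an illustration of recognition-free matching by correlation, the sketch below scores a query word image against stored word images using the standard two-dimensional Pearson correlation coefficient. The abstract does not give the formula, the image representation, or any threshold, so all of these are assumptions: images are taken to be equal-size grids of pixel intensities, and the names `corr2` and `retrieve` and the `threshold` value are hypothetical, chosen only for this example.

```python
from math import sqrt


def corr2(a, b):
    """Pearson correlation coefficient between two equal-size 2D images.

    Assumption: each image is a list of rows of numeric pixel values.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    n = len(xs)
    if n == 0:
        raise ValueError("images must not be empty")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant image carries no shape information
    return cov / sqrt(var_x * var_y)


def retrieve(query, collection, threshold=0.8):
    """Return (name, score) pairs scoring at least `threshold`, best first.

    Assumption: `collection` maps a name to a word image that has already
    been pre-processed and resized to the shape of `query`.
    """
    scored = [(name, corr2(query, img)) for name, img in collection.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy 3x3 binary "word images" (hypothetical data, not from the paper).
query = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
collection = {
    "same": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "near": [[0, 1, 0], [1, 1, 1], [0, 1, 1]],
    "inverse": [[1, 0, 1], [0, 0, 0], [1, 0, 1]],
}
results = retrieve(query, collection, threshold=0.7)
```

A score close to 1 indicates that the two images vary together pixel by pixel, while a score near 0 or below indicates a poor match. How the paper normalizes word-image sizes before matching is not stated in the abstract, so the equal-shape requirement here is a simplification.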

Index Terms

Computer Science
Information Sciences

Keywords

Content based image retrieval, Correlation coefficient, Median Filtering, Kannada Document Images, Segmentation