Research Article

A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script

by Rajnish M. Rakholia, Jatinderkumar R. Saini
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 106 - Number 2
Year of Publication: 2014
Authors: Rajnish M. Rakholia, Jatinderkumar R. Saini
DOI: 10.5120/18496-9558

Rajnish M. Rakholia, Jatinderkumar R. Saini. A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script. International Journal of Computer Applications. 106, 2 (November 2014), 45-50. DOI=10.5120/18496-9558

@article{ 10.5120/18496-9558,
author = { Rajnish M. Rakholia, Jatinderkumar R. Saini },
title = { A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script },
journal = { International Journal of Computer Applications },
issue_date = { November 2014 },
volume = { 106 },
number = { 2 },
month = { November },
year = { 2014 },
issn = { 0975-8887 },
pages = { 45-50 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume106/number2/18496-9558/ },
doi = { 10.5120/18496-9558 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:38:22.162855+05:30
%A Rajnish M. Rakholia
%A Jatinderkumar R. Saini
%T A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script
%J International Journal of Computer Applications
%@ 0975-8887
%V 106
%N 2
%P 45-50
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A lot of work has been reported on optical character recognition for non-Indian scripts such as Chinese, English and Japanese, and for Indian scripts such as Tamil, Hindi and Telugu. In this paper, we present a literature review of stemmer, optical character recognition (OCR) and text mining work on Indian scripts, mainly the Gujarati language. We discuss the different techniques for OCR and text mining in Gujarati script, summarize most of the published work on this topic, and give future directions for research in the field of Indian scripts.

Index Terms

Computer Science
Information Sciences

Keywords

Classification, feature extraction, Gujarati script, Gujarati stemmer, Indian script, pre-processing, segmentation.