Research Article

Text Classification for Marathi Documents using Supervised Learning Methods

by Pooja Bolaj, Sharvari Govilkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 155 - Number 8
Year of Publication: 2016
Authors: Pooja Bolaj, Sharvari Govilkar
10.5120/ijca2016912374

Pooja Bolaj, Sharvari Govilkar. Text Classification for Marathi Documents using Supervised Learning Methods. International Journal of Computer Applications. 155, 8 (Dec 2016), 6-10. DOI=10.5120/ijca2016912374

@article{ 10.5120/ijca2016912374,
author = { Pooja Bolaj and Sharvari Govilkar },
title = { Text Classification for Marathi Documents using Supervised Learning Methods },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2016 },
volume = { 155 },
number = { 8 },
month = { Dec },
year = { 2016 },
issn = { 0975-8887 },
pages = { 6-10 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume155/number8/26623-2016912374/ },
doi = { 10.5120/ijca2016912374 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:00:42.956568+05:30
%A Pooja Bolaj
%A Sharvari Govilkar
%T Text Classification for Marathi Documents using Supervised Learning Methods
%J International Journal of Computer Applications
%@ 0975-8887
%V 155
%N 8
%P 6-10
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The evolution of Information Technology has led to the collection of a large number of text documents. Most research so far has focused on English text documents. Today, millions of documents are available in Indian regional languages, and classifying them manually is an expensive and time-consuming task. Automatic classification can help in better management and retrieval of these documents. A literature survey shows that little work has been done on the classification of Marathi text documents. This paper presents an efficient Marathi text classification system using supervised learning methods and ontology-based classification.

Index Terms

Computer Science
Information Sciences

Keywords

Text Mining, Support Vector Machine, Naïve Bayes, Modified K Nearest Neighbor, Ontology.