Research Article

Text Categorization System for English Text Documents using Naïve Bayes Classifier

by Kiran Bolaj
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 177 - Number 48
Year of Publication: 2020
Authors: Kiran Bolaj
10.5120/ijca2020919950

Kiran Bolaj. Text Categorization System for English Text Documents using Naïve Bayes Classifier. International Journal of Computer Applications. 177, 48 (Mar 2020), 7-10. DOI=10.5120/ijca2020919950

@article{ 10.5120/ijca2020919950,
author = { Kiran Bolaj },
title = { Text Categorization System for English Text Documents using Naïve Bayes Classifier },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2020 },
volume = { 177 },
number = { 48 },
month = { Mar },
year = { 2020 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume177/number48/31231-2020919950/ },
doi = { 10.5120/ijca2020919950 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:48:57.522236+05:30
%A Kiran Bolaj
%T Text Categorization System for English Text Documents using Naïve Bayes Classifier
%J International Journal of Computer Applications
%@ 0975-8887
%V 177
%N 48
%P 7-10
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Information technology has generated a huge volume of data on the internet, most of it in English. Automatic text categorization helps manage these text documents and simplifies their retrieval. Various learning techniques exist for classifying text documents, such as Naïve Bayes, Support Vector Machines and Decision Trees. The proposed system uses a Naïve Bayes method. Bayesian algorithms are often used to classify data into categories because such systems can be trained and can learn from human corrections.
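The abstract does not give implementation details, so the following is only a minimal sketch of how a Naïve Bayes text categorizer is commonly built, not the author's system. It assumes a multinomial model with add-one (Laplace) smoothing and whitespace tokenization; the function names (`tokenize`, `train`, `classify`) and the tiny training set are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real system would
    # likely also strip punctuation and stop words.
    return text.lower().split()


def train(labeled_docs):
    """Build count tables from an iterable of (text, category) pairs."""
    doc_counts = Counter()               # number of documents per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocab = set()                        # all words seen during training
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest log posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, float("-inf")
    for category in doc_counts:
        # Log prior: share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, category) pairs
            # from driving the probability to zero.
            count = word_counts[category][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training data, for illustration only.
training_data = [
    ("the team won the football match", "sports"),
    ("the player scored a late goal", "sports"),
    ("the new processor doubles laptop speed", "technology"),
    ("the software update fixes a security bug", "technology"),
]
model = train(training_data)
print(classify("a goal in the final match", model))
```

In this sketch, learning from a human correction would amount to appending the corrected (text, category) pair to the training data and calling `train` again, since the model is just a set of counts.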

Index Terms

Computer Science
Information Sciences

Keywords

Text categorization, Naïve Bayes