Text Categorization System for English Text Documents using Naïve Bayes Classifier

Kiran Bolaj

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 177 - Number 48

Year of Publication: 2020

Authors: Kiran Bolaj

10.5120/ijca2020919950

Kiran Bolaj. Text Categorization System for English Text Documents using Naïve Bayes Classifier. International Journal of Computer Applications. 177, 48 (Mar 2020), 7-10. DOI=10.5120/ijca2020919950

@article{ 10.5120/ijca2020919950,
author = { Kiran Bolaj },
title = { Text Categorization System for English Text Documents using Naïve Bayes Classifier },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2020 },
volume = { 177 },
number = { 48 },
month = { Mar },
year = { 2020 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume177/number48/31231-2020919950/ },
doi = { 10.5120/ijca2020919950 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:48:57.522236+05:30
%A Kiran Bolaj
%T Text Categorization System for English Text Documents using Naïve Bayes Classifier
%J International Journal of Computer Applications
%@ 0975-8887
%V 177
%N 48
%P 7-10
%D 2020
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Information technology has generated a huge volume of data on the internet, most of it in English. Automatic text categorization supports better management of these text documents and makes their retrieval a simple task. Various learning techniques exist for classifying text documents, such as Naïve Bayes, Support Vector Machines, and Decision Trees. The proposed system uses the Naïve Bayes method. Bayesian algorithms are often used to classify data into categories in such a way that the system can be trained and can learn from human corrections.
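
The abstract does not give the details of the proposed system, so the following is only a minimal sketch of how a Naïve Bayes text categorizer can work in general. It assumes a multinomial model with add-one (Laplace) smoothing and a simple whitespace tokenizer. The class name `NaiveBayesTextClassifier`, the helper `tokenize`, and the sample categories are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    tokens = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [t for t in tokens if t]


class NaiveBayesTextClassifier:
    """Illustrative multinomial Naive Bayes (assumed model, not the paper's code)."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per category
        self.word_counts = defaultdict(Counter)   # word frequencies per category
        self.vocab = set()                        # all words seen in training

    def train(self, text, label):
        """Add one labelled document; can be called again with a corrected label."""
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def predict(self, text):
        """Return the category with the highest log posterior, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing avoids log(0) for words unseen in this category.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Usage with made-up training documents.
classifier = NaiveBayesTextClassifier()
classifier.train("The team won the football match", "sports")
classifier.train("The player scored a goal in the match", "sports")
classifier.train("The new processor speeds up the computer", "technology")
classifier.train("Software update fixes the computer bug", "technology")
print(classifier.predict("football player scored"))
```

Because `train` only updates counts, a human correction can be fed back by calling it again with the corrected category, which is one simple way to realize the "learn from human corrections" behaviour the abstract mentions.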

Index Terms

Computer Science

Information Sciences

Keywords

Text categorization, Naïve Bayes