Data Mining: Document Classification using Naive Bayes Classifier

Ekta Jadon; Roopesh Sharma


Research Article


International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 167 - Number 6

Year of Publication: 2017

Authors: Ekta Jadon, Roopesh Sharma

10.5120/ijca2017913925

Ekta Jadon, Roopesh Sharma. Data Mining: Document Classification using Naive Bayes Classifier. International Journal of Computer Applications. 167, 6 (Jun 2017), 13-16. DOI=10.5120/ijca2017913925

@article{ 10.5120/ijca2017913925,
author = { Ekta Jadon and Roopesh Sharma },
title = { Data Mining: Document Classification using Naive Bayes Classifier },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2017 },
volume = { 167 },
number = { 6 },
month = { Jun },
year = { 2017 },
issn = { 0975-8887 },
pages = { 13-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume167/number6/27774-2017913925/ },
doi = { 10.5120/ijca2017913925 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:14:05.979292+05:30
%A Ekta Jadon
%A Roopesh Sharma
%T Data Mining: Document Classification using Naive Bayes Classifier
%J International Journal of Computer Applications
%@ 0975-8887
%V 167
%N 6
%P 13-16
%D 2017
%I Foundation of Computer Science (FCS), NY, USA

Abstract

In data mining, classification splits data into several dependent and independent regions, each of which is referred to as a class. Different kinds of classifiers are used to accomplish the classification task, and classification is more constrained when the data are text documents. The aim of the work presented in this article is to evaluate multiclass document classification and to measure the classification accuracy achieved on text documents. The Naive Bayes approach is used to address the document classification problem via a deceptively simplistic model. It is applied in both a flat (linear) and a hierarchical manner to improve the efficiency of the classification model. Hierarchical classification is found to be more effective than flat classification, and it also performs better for multi-label document classification. Compared with the past, the amount of data generated each day has increased significantly. Hence, with the advent of smarter technologies, data must be classified and sorted before decisions can be drawn from it. Many techniques are available for classifying documents into categories or labels. Data mining is the process of non-trivial extraction of novel, implicit, and actionable knowledge from large data sets.
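The difference between flat and hierarchical Naive Bayes classification can be illustrated with a minimal Python sketch. This is not the authors' implementation: the multinomial model with add-one smoothing, the whitespace tokenizer, the two-level hierarchy, the toy documents, and all names (`NaiveBayes`, `HierarchicalNaiveBayes`, `PARENT_OF`) are assumptions made for illustration only. The flat classifier picks one of the leaf classes directly, while the hierarchical one first picks a parent category and then a leaf class within it.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for text, label in zip(docs, labels):
            tokens = text.lower().split()         # assumed whitespace tokenizer
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)      # log prior
            for t in tokens:                      # smoothed log likelihoods
                freq = self.word_counts[label][t]
                score += math.log((freq + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class HierarchicalNaiveBayes:
    """Two-level classifier: parent category first, then leaf class."""

    def __init__(self, parent_of):
        self.parent_of = parent_of                # maps leaf label -> parent label

    def fit(self, docs, labels):
        parents = [self.parent_of[label] for label in labels]
        self.top = NaiveBayes().fit(docs, parents)
        self.children = {}
        for parent in set(parents):
            pairs = [(d, l) for d, l in zip(docs, labels)
                     if self.parent_of[l] == parent]
            sub_docs, sub_labels = zip(*pairs)
            self.children[parent] = NaiveBayes().fit(sub_docs, sub_labels)
        return self

    def predict(self, text):
        parent = self.top.predict(text)
        return self.children[parent].predict(text)


# Hypothetical toy corpus, invented for this sketch.
DOCS = [
    "goal match striker football",
    "football league goal referee",
    "racket serve court tennis",
    "tennis serve match umpire",
    "processor chip memory hardware",
    "hardware chip motherboard memory",
    "compiler code bug software",
    "software code release bug",
]
LABELS = ["football", "football", "tennis", "tennis",
          "hardware", "hardware", "software", "software"]
PARENT_OF = {"football": "sport", "tennis": "sport",
             "hardware": "computing", "software": "computing"}

flat = NaiveBayes().fit(DOCS, LABELS)
hier = HierarchicalNaiveBayes(PARENT_OF).fit(DOCS, LABELS)

print(flat.predict("striker scores a goal"))
print(hier.predict("striker scores a goal"))
```

On a corpus this small both classifiers agree, so the sketch shows only how the two arrangements differ structurally. It says nothing about the accuracy comparison reported in the article.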


Index Terms

Computer Science

Information Sciences

Keywords

Data Mining, Mining Techniques, Classification, Document Classification, Naïve Bayes Classifier