Research Article

A Comparative Study of Techniques for Data Classification based on Naive Bayes

Published on December 2015 by Antriksh Pandita, Ajinkya Jadhav, Vijay Singh, Ashok Pawar, Nilav Mukhopadhyay
National Conference on Advances in Computing
Foundation of Computer Science USA
NCAC2015 - Number 5
December 2015

Antriksh Pandita, Ajinkya Jadhav, Vijay Singh, Ashok Pawar, Nilav Mukhopadhyay. A Comparative Study of Techniques for Data Classification based on Naive Bayes. National Conference on Advances in Computing. NCAC2015, 5 (December 2015), 1-4.

@article{
author = { Antriksh Pandita and Ajinkya Jadhav and Vijay Singh and Ashok Pawar and Nilav Mukhopadhyay },
title = { A Comparative Study of Techniques for Data Classification based on Naive Bayes },
journal = { National Conference on Advances in Computing },
issue_date = { December 2015 },
volume = { NCAC2015 },
number = { 5 },
month = { December },
year = { 2015 },
issn = { 0975-8887 },
pages = { 1-4 },
numpages = 4,
url = { /proceedings/ncac2015/number5/23384-5050/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Advances in Computing
%A Antriksh Pandita
%A Ajinkya Jadhav
%A Vijay Singh
%A Ashok Pawar
%A Nilav Mukhopadhyay
%T A Comparative Study of Techniques for Data Classification based on Naive Bayes
%J National Conference on Advances in Computing
%@ 0975-8887
%V NCAC2015
%N 5
%P 1-4
%D 2015
%I International Journal of Computer Applications
Abstract

The Naïve Bayes model is widely used for text classification. This paper examines data classification with the Naïve Bayes classifier, a probabilistic model. Discrete variables are modeled with a multinomial distribution and numeric variables with a Gaussian distribution. The graphical structure of the model is considered because of the properties of the Naïve Bayes classifier, such as flexibility, energy efficiency, and high performance. The paper introduces the main idea of classification and the basic techniques for data classification, including the Naïve Bayesian classifier.
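
As an illustration of the modeling choice described in the abstract, the following is a minimal sketch of a Naïve Bayes classifier that treats discrete features with a smoothed multinomial (categorical) estimate and numeric features with a Gaussian density. It is not the authors' implementation. The function names (`fit`, `log_posteriors`, `predict`), the Laplace smoothing parameter `alpha`, the variance floor, and the toy weather data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels, discrete_cols, numeric_cols, alpha=1.0):
    """Estimate class priors, per-class value counts, and Gaussian parameters."""
    model = {"priors": {}, "cat": {}, "gauss": {}, "levels": {},
             "discrete_cols": discrete_cols, "numeric_cols": numeric_cols,
             "alpha": alpha}
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    # Distinct values seen per discrete column (used by Laplace smoothing).
    for c in discrete_cols:
        model["levels"][c] = {row[c] for row in rows}
    for label, members in by_class.items():
        n = len(members)
        model["priors"][label] = n / len(rows)
        for c in discrete_cols:
            model["cat"][(label, c)] = (Counter(r[c] for r in members), n)
        for c in numeric_cols:
            values = [r[c] for r in members]
            mean = sum(values) / n
            var = sum((v - mean) ** 2 for v in values) / n
            # Assumed floor so a constant feature does not give zero variance.
            model["gauss"][(label, c)] = (mean, max(var, 1e-9))
    return model


def log_posteriors(model, row):
    """Return the unnormalized log posterior of each class for one row."""
    alpha = model["alpha"]
    scores = {}
    for label, prior in model["priors"].items():
        score = math.log(prior)
        # Discrete features: smoothed relative frequency within the class.
        for c in model["discrete_cols"]:
            counts, n = model["cat"][(label, c)]
            k = len(model["levels"][c])
            score += math.log((counts[row[c]] + alpha) / (n + alpha * k))
        # Numeric features: log of the Gaussian density.
        for c in model["numeric_cols"]:
            mean, var = model["gauss"][(label, c)]
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (row[c] - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, row):
    """Pick the class with the highest log posterior."""
    scores = log_posteriors(model, row)
    return max(scores, key=scores.get)


# Hypothetical toy data: column 0 is discrete, column 1 is numeric.
rows = [("sunny", 30.0), ("sunny", 28.0), ("rainy", 15.0), ("rainy", 17.0)]
labels = ["out", "out", "in", "in"]
model = fit(rows, labels, discrete_cols=[0], numeric_cols=[1])
print(predict(model, ("sunny", 29.0)))
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied, and the independence assumption is what allows the per-feature terms to be summed.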

Index Terms

Computer Science
Information Sciences

Keywords

I.5.3 Clustering Similarity Measure; H.3.1 Information Storage and Retrieval; G.1.6 Global Optimization