National Conference on Emerging Trends in Computer Science and Information Technology
Foundation of Computer Science USA
NCETCSIT - Number 1
February 2012
Authors: Hetal Doshi, Maruti Zalte
Hetal Doshi, Maruti Zalte. Performance of Naive Bayes Classifier Multinomial Model on Different Categories of Documents. National Conference on Emerging Trends in Computer Science and Information Technology. NCETCSIT, 1 (February 2012), 10-13.
Automatic sorting of documents is becoming vital because manual handling and organization of documents is not feasible: it is too time consuming given the number of documents. This paper explores text classification, a machine learning application used for document classification, and discusses a generative learning algorithm, the Naïve Bayes classifier. Documents from the 20 Newsgroups dataset are divided into two groups: group 1 consists of two relatively unrelated categories of documents, and group 2 consists of two relatively similar categories of documents. The multinomial model of the Naïve Bayes classifier is implemented to classify both groups. It is observed that accuracy can be improved by increasing the training set size for both groups, and that classification accuracy is higher for the categories of documents with lower similarity.
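As an illustration of the multinomial model named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, add-one (Laplace) smoothing, and a tiny made-up corpus with two hypothetical category labels, "autos" and "space"; the paper's actual preprocessing, categories, and parameters are not given in this record.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods per class.

    docs   : list of token lists
    labels : list of class labels, one per document
    alpha  : additive smoothing constant (assumed value; 1.0 is Laplace smoothing)
    """
    vocab = sorted({w for d in docs for w in d})
    log_prior = {}
    log_likelihood = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        # Prior: fraction of training documents that belong to class c.
        log_prior[c] = math.log(len(class_docs) / len(docs))
        # Likelihood: smoothed relative frequency of each word in class c.
        counts = Counter(w for d in class_docs for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((counts[w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood


def predict(doc, log_prior, log_likelihood):
    """Return the class with the highest posterior log score for a token list."""
    scores = {}
    for c in log_prior:
        # Words never seen in training are skipped rather than scored.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in doc if w in log_likelihood[c]
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus standing in for two newsgroup categories.
train_docs = [
    "the engine and the brakes need repair".split(),
    "new car engine with better mileage".split(),
    "the orbit of the satellite around the moon".split(),
    "rocket launch to orbit delayed".split(),
]
train_labels = ["autos", "autos", "space", "space"]

log_prior, log_likelihood = train_multinomial_nb(train_docs, train_labels)
print(predict("engine repair cost".split(), log_prior, log_likelihood))
print(predict("satellite orbit".split(), log_prior, log_likelihood))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing constant keeps a word that is absent from one class from forcing that class's probability to zero.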