Research Article

Some Investigations on Machine Learning Techniques for Automated Text Categorization

by Bhagirath Prajapati, Sanjay Garg, N C Chauhan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 71 - Number 3
Year of Publication: 2013
DOI: 10.5120/12340-8617

Bhagirath Prajapati, Sanjay Garg, N C Chauhan. Some Investigations on Machine Learning Techniques for Automated Text Categorization. International Journal of Computer Applications. 71, 3 (June 2013), 32-36. DOI=10.5120/12340-8617

@article{ 10.5120/12340-8617,
author = { Bhagirath Prajapati and Sanjay Garg and N C Chauhan },
title = { Some Investigations on Machine Learning Techniques for Automated Text Categorization },
journal = { International Journal of Computer Applications },
issue_date = { June 2013 },
volume = { 71 },
number = { 3 },
month = { June },
year = { 2013 },
issn = { 0975-8887 },
pages = { 32-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume71/number3/12340-8617/ },
doi = { 10.5120/12340-8617 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:34:33.175968+05:30
%A Bhagirath Prajapati
%A Sanjay Garg
%A N C Chauhan
%T Some Investigations on Machine Learning Techniques for Automated Text Categorization
%J International Journal of Computer Applications
%@ 0975-8887
%V 71
%N 3
%P 32-36
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The automated categorization (classification) of texts into predefined categories is one of the most widely explored research areas in text mining. Digital data is now available in very large volumes, and organizing it into predefined categories has become a challenging task. Machine learning offers an approach in which an automated classifier is trained to classify documents with minimal human assistance. This paper discusses the Naïve Bayes, Rocchio, k-Nearest Neighbor, and Support Vector Machine methods within the machine learning paradigm for the automated categorization of documents into predefined categories.
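
As a rough illustration of two of the methods named in the abstract, the sketch below trains a multinomial Naïve Bayes classifier and a centroid-based Rocchio classifier on a tiny invented corpus. It is not the authors' implementation: the corpus, the function names, the whitespace tokenizer, the add-one smoothing, and the use of raw term frequencies (rather than a weighting scheme such as TF-IDF) are all assumptions made for the example, and the Rocchio variant shown uses class centroids only, with no negative-example term.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (not from the paper): (text, category) pairs.
TRAIN = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the election results were announced", "politics"),
    ("parliament passed the new budget", "politics"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

# --- Multinomial Naive Bayes with add-one (Laplace) smoothing ---

def train_naive_bayes(docs):
    """Count documents per class, term occurrences per class, and the vocabulary."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab

def classify_naive_bayes(model, text):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_terms = sum(term_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log(
                    (term_counts[label][tok] + 1) / (total_terms + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# --- Rocchio (centroid form) with cosine similarity ---

def unit_vector(counts):
    """Scale a term-frequency vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {t: v / norm for t, v in counts.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train_rocchio(docs):
    """Average the unit-length document vectors of each class into a centroid."""
    sums = defaultdict(Counter)
    counts = Counter()
    for text, label in docs:
        for term, weight in unit_vector(Counter(tokenize(text))).items():
            sums[label][term] += weight
        counts[label] += 1
    return {
        label: {t: w / counts[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }

def classify_rocchio(centroids, text):
    """Assign the class whose centroid is most similar to the document."""
    vec = unit_vector(Counter(tokenize(text)))
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))

nb_model = train_naive_bayes(TRAIN)
centroids = train_rocchio(TRAIN)
for query in ["a goal in the football match", "parliament announced the budget"]:
    print(query, "|", classify_naive_bayes(nb_model, query),
          "|", classify_rocchio(centroids, query))
```

Both classifiers here share the same bag-of-words representation, so the difference lies only in the decision rule: Naïve Bayes scores a class by smoothed term probabilities, while Rocchio compares the document with a per-class prototype vector. A realistic experiment would need a proper dataset, stop-word handling, and held-out evaluation.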

Index Terms

Computer Science
Information Sciences

Keywords

Machine learning, Text categorization