Research Article

Machine Learning approach to Document Classification using Concept based Features

by C. Saranya Jothi, D. Thenmozhi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 118 - Number 20
Year of Publication: 2015
Authors: C. Saranya Jothi, D. Thenmozhi
10.5120/20864-3578

C. Saranya Jothi, D. Thenmozhi. Machine Learning approach to Document Classification using Concept based Features. International Journal of Computer Applications. 118, 20 (May 2015), 33-36. DOI=10.5120/20864-3578

@article{ 10.5120/20864-3578,
author = { C. Saranya Jothi and D. Thenmozhi },
title = { Machine Learning approach to Document Classification using Concept based Features },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 118 },
number = { 20 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 33-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume118/number20/20864-3578/ },
doi = { 10.5120/20864-3578 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:02:17.353893+05:30
%A C. Saranya Jothi
%A D. Thenmozhi
%T Machine Learning approach to Document Classification using Concept based Features
%J International Journal of Computer Applications
%@ 0975-8887
%V 118
%N 20
%P 33-36
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Text mining is the process of deriving high-quality information from text. Text processing involves operations such as search and replace on text in electronic form. A number of approaches have been developed to represent and classify text documents. Most of these approaches try to attain good classification performance while representing a document only by its words. We propose a concept-based methodology that uses concepts instead of terms. Concepts represent the meaning of the text and reduce the number of features. The Support Vector Machine (SVM) algorithm is applied for document classification. Classification performance with the original features is then compared against performance with the concept-based features. The methodology improves document classification accuracy.
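
The idea of replacing terms with concepts to shrink the feature space can be illustrated with a minimal sketch. It is not the authors' implementation: the `TERM_TO_CONCEPT` lookup, the toy documents, and all function names are hypothetical, and a real system would presumably derive the mapping from a lexical resource such as WordNet rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical term-to-concept lookup, invented for illustration only.
# A real pipeline would build this from a lexical resource.
TERM_TO_CONCEPT = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "doctor": "medical_professional",
    "physician": "medical_professional",
    "nurse": "medical_professional",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stop-word or punctuation handling)."""
    return text.lower().split()


def term_features(text):
    """Bag-of-words counts: one feature per distinct term."""
    return Counter(tokenize(text))


def concept_features(text):
    """Bag-of-concepts counts: terms sharing a concept collapse into one feature.

    Terms missing from the lookup are kept as their own feature.
    """
    return Counter(TERM_TO_CONCEPT.get(tok, tok) for tok in tokenize(text))


def vocabulary(feature_dicts):
    """Sorted list of all feature names seen across the documents."""
    return sorted(set().union(*feature_dicts))


# Toy corpus, assumed for the example.
docs = [
    "the doctor drove a car",
    "the physician bought an automobile",
    "a nurse repaired the truck",
]

term_vocab = vocabulary([term_features(d) for d in docs])
concept_vocab = vocabulary([concept_features(d) for d in docs])

print(len(term_vocab), "term features")
print(len(concept_vocab), "concept features")
```

On this toy corpus the vocabulary shrinks from 12 term features to 8 concept features, because synonyms such as "car" and "automobile" map to the same concept. The resulting count vectors are what would be passed to an SVM classifier in either configuration, so the two feature sets can be compared on the same data.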

Index Terms

Computer Science
Information Sciences

Keywords

Text classification, Support Vector Machine, Feature Selection