International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 113 - Number 10
Year of Publication: 2015
Authors: Rajul Jain, Nitin Pise
DOI: 10.5120/19861-1818
Rajul Jain, Nitin Pise. Feature Selection for Effective Text Classification using Semantic Information. International Journal of Computer Applications. 113, 10 (March 2015), 18-25. DOI=10.5120/19861-1818
Text categorization is the task of assigning texts or documents to pre-specified classes or categories. To classify documents well, text-based learning needs to take context into account: just as humans judge the relevance of a text from the context associated with it, a machine learning method needs context information alongside the text to achieve better classification accuracy. This can be achieved by using semantic information, such as part-of-speech tags, associated with the text. The aim of this experimentation is therefore to use this semantic information to select features that may provide better classification results. Several datasets are constructed, each from a different collection of features, to understand which representation of text data works best for different types of classifiers.
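The idea of building several feature sets from part-of-speech information can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon `TOY_POS_LEXICON`, the tag names, and the helpers `tag` and `select_features` are all hypothetical, and a real experiment would use a trained part-of-speech tagger instead of a lookup table.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_POS_LEXICON = {
    "the": "DET",
    "stocks": "NOUN", "team": "NOUN", "match": "NOUN",
    "fell": "VERB", "won": "VERB",
    "sharply": "ADV", "easily": "ADV",
    "local": "ADJ",
}


def tag(tokens):
    """Pair each token with its tag; unknown words get the tag 'X'."""
    return [(tok, TOY_POS_LEXICON.get(tok, "X")) for tok in tokens]


def select_features(text, keep_tags):
    """Return term counts for only those tokens whose tag is in keep_tags."""
    tokens = text.lower().split()
    return Counter(tok for tok, pos in tag(tokens) if pos in keep_tags)


docs = [
    "The stocks fell sharply",
    "The local team won the match easily",
]

# Two alternative representations of the same documents,
# each built from a different collection of features.
noun_only = [select_features(d, {"NOUN"}) for d in docs]
noun_verb = [select_features(d, {"NOUN", "VERB"}) for d in docs]
```

Each resulting list of term counts could then be passed to a different classifier to compare how the choice of representation affects accuracy.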