International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 166 - Number 11
Year of Publication: 2017
Authors: Dixa Saxena, S. K. Saritha, K. N. S. S. V. Prasad |
DOI: 10.5120/ijca2017914145
Dixa Saxena, S. K. Saritha, K. N. S. S. V. Prasad. Survey Paper on Feature Extraction Methods in Text Categorization. International Journal of Computer Applications. 166, 11 (May 2017), 11-17. DOI=10.5120/ijca2017914145
As the world moves towards globalization, the volume of digitized text has grown rapidly, and organizing, categorizing, and classifying that text has become a necessity. Poorly organized or weakly categorized text can slow the response time of information retrieval. Text data also suffers from the ‘curse of dimensionality’ (as termed by Bellman) [1], namely the inherent sparsity of high-dimensional spaces, which makes it difficult to search such a space for structure that is not known in advance. Feature reduction methods address this problem: they extract the most relevant information from the original data and represent it in a lower-dimensional space. This paper surveys the feature extraction methods applied to text categorization, from the traditional bag-of-words model to less conventional neural network approaches.