International Conference and Workshop on Emerging Trends in Technology
Foundation of Computer Science USA
ICWET - Number 9
2011
Authors: B S Harish, S Manjunath, D S Guru
B S Harish, S Manjunath, D S Guru. Cluster Based Text Classification: A Symbolic Approach. International Conference and Workshop on Emerging Trends in Technology. ICWET, 9 (2011), 38-44.
This paper presents a clustering-based approach to the classification of text documents. We propose a new method of representing documents using symbolic data analysis. For each class of documents, we create multiple clusters in order to preserve intraclass variations. The term frequency vectors of each cluster are then summarized as a symbolic representation using interval-valued features. We further propose a novel symbolic method for feature selection, which reduces the number of features at the representation stage. The method retains the most useful features for representation and, at the same time, reduces the time taken to classify a given document. To corroborate the effectiveness of the proposed model, we conducted experiments on various datasets. The experimental results show that the proposed method gives better results than state-of-the-art techniques.
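The idea of an interval-valued cluster representation can be illustrated with a minimal sketch. The abstract does not state how the intervals are formed or how a document is matched against them, so everything below is an assumption made for illustration only: each interval is taken as the per-term [min, max] over a cluster, a document is scored by how many of its term frequencies fall inside a cluster's intervals, and the cluster memberships are supplied by hand rather than computed. The names interval_representation, containment_score and classify are hypothetical, and the feature selection step is not shown.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]


def interval_representation(vectors: Sequence[Sequence[float]]) -> List[Interval]:
    """Summarize one cluster of term frequency vectors as per-term intervals.

    Assumption: the interval for a term is [min, max] over the cluster.
    """
    return [(min(column), max(column)) for column in zip(*vectors)]


def containment_score(doc: Sequence[float], intervals: Sequence[Interval]) -> int:
    """Count the terms whose frequency in `doc` lies inside the cluster's interval."""
    return sum(1 for value, (low, high) in zip(doc, intervals) if low <= value <= high)


def classify(doc: Sequence[float], model: Dict[str, List[List[Interval]]]) -> str:
    """Return the class label of the best-matching cluster (assumed decision rule)."""
    best_label, best_score = "", -1
    for label, clusters in model.items():
        for intervals in clusters:
            score = containment_score(doc, intervals)
            if score > best_score:
                best_label, best_score = label, score
    return best_label


# Toy term frequency vectors over three terms, already grouped into clusters.
# "sports" has two clusters to mimic intraclass variation.
clusters_by_class = {
    "sports": [[[3, 0, 1], [4, 1, 1]], [[0, 5, 1], [1, 6, 2]]],
    "finance": [[[0, 0, 7], [1, 0, 9]]],
}

# One interval-valued representation per cluster, grouped by class.
model = {
    label: [interval_representation(cluster) for cluster in clusters]
    for label, clusters in clusters_by_class.items()
}

print(classify([4, 1, 1], model))
print(classify([0, 0, 8], model))
```

Keeping several interval vectors per class, rather than a single one, is what allows a class with heterogeneous documents to be described without the intervals becoming so wide that they match almost anything.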