International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 16
Year of Publication: 2015
Authors: Thomas Tesha
DOI: 10.5120/ijca2015906707
Thomas Tesha. The Impact of Transformed Features in Automating the Swahili Document Classification. International Journal of Computer Applications. 127, 16 (October 2015), 37-42. DOI=10.5120/ijca2015906707
This paper reports experimental results from an attempt to identify transformation techniques that can improve features for the automated classification of Swahili documents, that is, techniques that raise the classification rate by enhancing separability and accuracy. The experiments covered Relative Frequency (RF), Power Transformation (PT), and Relative Frequency with Power Transformation (RFPT) features. Term weighting with TFIDF and absolute features (AF) were also studied. Feature dimensionality was reduced using Principal Component Analysis. The learning algorithms were the Support Vector Machine and k-NN, and the effect of each feature type on classifier performance was evaluated with the micro-averaged F-measure. The extensive experimental results demonstrated that RFPT features worked better with the Support Vector Machine classifier than with k-NN, improving the classification rate by enhancing document separability and accuracy in the automated classification of Swahili documents.
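The feature types and the evaluation measure named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions here are assumptions: RF is taken to be a term count divided by the document's total count, PT is taken to be raising each value to a fixed power (the exponent 0.5 is an arbitrary illustrative choice), and RFPT is taken to be PT applied to RF. The function names and the toy vocabulary are hypothetical, and TFIDF weighting, Principal Component Analysis, and the classifiers themselves are left out.

```python
from collections import Counter


def absolute_features(tokens, vocabulary):
    """AF: raw term counts, listed in vocabulary order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def relative_frequency(counts):
    """RF (assumed definition): each count divided by the document total."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def power_transform(values, exponent=0.5):
    """PT (assumed definition): raise each value to a fixed power."""
    return [v ** exponent for v in values]


def rfpt(counts, exponent=0.5):
    """RFPT (assumed definition): power transformation applied to RF."""
    return power_transform(relative_frequency(counts), exponent)


def micro_f_measure(true_labels, predicted_labels):
    """Micro-averaged F-measure: pool TP, FP and FN over all classes."""
    tp = fp = fn = 0
    for cls in set(true_labels) | set(predicted_labels):
        for truth, pred in zip(true_labels, predicted_labels):
            if pred == cls and truth == cls:
                tp += 1
            elif pred == cls:
                fp += 1
            elif truth == cls:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    vocabulary = ["habari", "michezo", "siasa"]
    document = ["habari", "siasa", "habari", "habari"]
    af = absolute_features(document, vocabulary)
    print("AF:  ", af)
    print("RF:  ", relative_frequency(af))
    print("RFPT:", rfpt(af))
    print("micro-F:", micro_f_measure(["a", "b", "a", "c"], ["a", "b", "c", "c"]))
```

Under these assumed definitions, a fractional exponent compresses the gap between frequent and rare terms, which is one plausible reason such a transformation could affect separability. The exponent actually used in the paper is not stated in the abstract.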