International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 31 - Number 3
Year of Publication: 2011
Authors: P. Perumal, R. Nedunchezhian, D. Brindha
DOI: 10.5120/3803-5249
P. Perumal, R. Nedunchezhian, D. Brindha. Article: An Empirical Selection Method for Document Clustering. International Journal of Computer Applications 31(3), 15-19 (October 2011). DOI=10.5120/3803-5249
Model selection is the task of choosing a model from a set of candidate models. Probabilistic Latent Semantic Analysis (PLSA) is capable of establishing hidden semantic relations among the observed features by using a number of latent variables, and selecting the correct number of these latent variables is critical. In most previous research, the number of latent topics was chosen based on the number of invoked classes. This paper presents a method, based on a backward elimination approach, that is capable of unsupervised order selection in PLSA. During the elimination process, the central problem is choosing which latent variables to delete, since this choice directly affects the final performance of the pruned model. To treat this problem, a new combined pruning method, which selects the best candidates for removal, is introduced. The obtained results show that this algorithm leads to an optimized number of latent variables. In this paper, we propose a novel approach, namely DPMFS, to address this issue.
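The backward elimination idea described above can be illustrated with a minimal sketch. It is not the authors' combined pruning method, whose criteria the abstract does not specify. It assumes two stand-ins: the topic with the smallest prior P(z) is the one removed at each step, and candidate orders are compared with a BIC-style score. The function names `em`, `loglik`, and `select_order` are hypothetical.

```python
import math
import random


def em(counts, p_z, p_d_z, p_w_z, iters=30):
    """Run PLSA EM (symmetric form) from the given parameters."""
    n_docs, n_words = len(counts), len(counts[0])
    for _ in range(iters):
        k = len(p_z)
        # Tiny initial mass keeps every distribution strictly positive.
        new_z = [1e-12] * k
        new_d = [[1e-12] * n_docs for _ in range(k)]
        new_w = [[1e-12] * n_words for _ in range(k)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, w), normalised over topics.
                post = [p_z[z] * p_d_z[z][d] * p_w_z[z][w] for z in range(k)]
                total = sum(post) or 1e-300
                for z in range(k):
                    r = n * post[z] / total
                    new_z[z] += r
                    new_d[z][d] += r
                    new_w[z][w] += r
        # M-step: renormalise the expected counts.
        s = sum(new_z)
        p_z = [v / s for v in new_z]
        p_d_z = [[v / sum(row) for v in row] for row in new_d]
        p_w_z = [[v / sum(row) for v in row] for row in new_w]
    return p_z, p_d_z, p_w_z


def loglik(counts, p_z, p_d_z, p_w_z):
    """Log-likelihood of the count matrix under the PLSA mixture."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                mix = sum(p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                          for z in range(len(p_z)))
                ll += n * math.log(max(mix, 1e-300))
    return ll


def select_order(counts, k_max=6, seed=0):
    """Backward elimination: start at k_max topics, prune one at a time."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    n_tokens = sum(map(sum, counts))

    def rand_dist(n):
        v = [rng.random() + 0.1 for _ in range(n)]
        s = sum(v)
        return [x / s for x in v]

    p_z = rand_dist(k_max)
    p_d_z = [rand_dist(n_docs) for _ in range(k_max)]
    p_w_z = [rand_dist(n_words) for _ in range(k_max)]
    best_score, best_k = None, None
    while True:
        p_z, p_d_z, p_w_z = em(counts, p_z, p_d_z, p_w_z)
        k = len(p_z)
        # Assumed scoring rule: BIC with the model's free-parameter count.
        n_params = k * (n_docs + n_words - 1) - 1
        score = -2 * loglik(counts, p_z, p_d_z, p_w_z) \
            + n_params * math.log(n_tokens)
        if best_score is None or score < best_score:
            best_score, best_k = score, k
        if k == 1:
            return best_k
        # Assumed pruning rule: drop the topic with the smallest P(z).
        drop = min(range(k), key=p_z.__getitem__)
        keep = [z for z in range(k) if z != drop]
        s = sum(p_z[z] for z in keep)
        p_z = [p_z[z] / s for z in keep]
        p_d_z = [p_d_z[z] for z in keep]
        p_w_z = [p_w_z[z] for z in keep]
```

Calling `select_order(counts, k_max=4)` on a small document-by-word count matrix returns the number of latent variables with the lowest score along the pruning path. Warm-starting EM from the pruned parameters, rather than reinitialising at each order, is what makes this a backward elimination rather than a grid search over k.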