International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 156 - Number 11
Year of Publication: 2016
Authors: Suraj Subramanian, Deepali Vora |
DOI: 10.5120/ijca2016912570
Suraj Subramanian, Deepali Vora. Unsupervised Text Classification and Search using Word Embeddings on a Self-Organizing Map. International Journal of Computer Applications. 156, 11 (Dec 2016), 35-37. DOI=10.5120/ijca2016912570
This paper presents the results of an experimental implementation of a document classifier that clusters contextual word embeddings on a self-organizing map. Document categorization becomes harder when there are no predefined categories or, conversely, when there are too many categories into which documents may be bucketed. The paper addresses these problems by modelling the major themes of the document corpus as a cluster-map using a self-organizing neural network. The cluster-map provides a visual representation for exploring the corpus and a near-semantic search interface over the concepts found across it.
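
To make the idea of clustering word embeddings on a self-organizing map concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small rectangular grid, linearly decaying learning rate and neighbourhood radius, and a Gaussian neighbourhood function. The three-dimensional `toy_embeddings`, the function names `train_som` and `best_matching_unit`, and all parameter values are hypothetical stand-ins chosen for illustration; real word embeddings would have far more dimensions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(vectors, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map on the given vectors (illustrative defaults)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Random initial weight vector for every node of the grid.
    weights = [[[rng.uniform(-1, 1) for _ in range(dim)]
                for _ in range(cols)] for _ in range(rows)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            decay = 1.0 - step / total_steps
            lr = lr0 * decay                 # learning rate shrinks over time
            sigma = sigma0 * decay + 1e-3    # neighbourhood radius shrinks too
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood centred on the best matching unit.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Hypothetical 3-dimensional "embeddings", made up for this sketch.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "stock": [0.0, 0.9, 0.8],
    "bond": [0.1, 0.8, 0.9],
}

som = train_som(list(toy_embeddings.values()))
# Each word is assigned to the map node that best matches its vector.
word_to_node = {w: best_matching_unit(som, v) for w, v in toy_embeddings.items()}
```

Under these assumptions, a search query could be handled the same way: embed the query, look up its best matching unit with `best_matching_unit`, and return the words or documents assigned to that node and its neighbours on the grid.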