International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 150 - Number 2
Year of Publication: 2016 |
Authors: Jaideepsinh K. Raulji, Jatinderkumar R. Saini |
DOI: 10.5120/ijca2016911462
Jaideepsinh K. Raulji, Jatinderkumar R. Saini. Stop-Word Removal Algorithm and its Implementation for Sanskrit Language. International Journal of Computer Applications. 150, 2 (Sep 2016), 15-17. DOI=10.5120/ijca2016911462
In the information era, optimizing the processes behind Information Retrieval, Text Summarization, and Text and Data Analytics systems is of utmost importance. To achieve accuracy, redundant words with little or no semantic meaning must be filtered out. Such words are known as stopwords. Stopword lists have been developed for languages such as English, Chinese, Arabic, and Hindi, and a stopword list is also available for Sanskrit. Stop-word removal is an important preprocessing technique in Natural Language Processing applications; it improves the performance of Information Retrieval systems, Text Analytics and Processing systems, Text Summarization, Question-Answering systems, stemming, etc. In this paper, a simple approach is used to design a stop-word removal algorithm and implement it for the Sanskrit language. The algorithm and its implementation use a dictionary-based approach, in which a predefined list of stopwords is compared against the target text from which stopwords are to be removed.
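The dictionary-based approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `remove_stopwords`, the small `STOPWORDS` set of common Sanskrit particles, and the `PUNCTUATION` string are all assumptions made for illustration, and the sketch further assumes that the input is Devanagari text whose words are already separated by whitespace (that is, sandhi has been split beforehand).

```python
# Illustrative stopword set. These particles are an assumption for this
# sketch and are NOT the stopword list used in the paper.
STOPWORDS = frozenset({"च", "एव", "तु", "हि", "वा", "अपि", "इति"})

# Characters stripped from token edges, including danda and double danda.
PUNCTUATION = "।॥,.;:!?\"'()"


def remove_stopwords(text, stopwords=STOPWORDS):
    """Return the tokens of `text` that do not appear in `stopwords`."""
    kept = []
    for token in text.split():            # whitespace tokenization (assumed)
        word = token.strip(PUNCTUATION)   # drop surrounding punctuation
        if word and word not in stopwords:
            kept.append(word)
    return kept


if __name__ == "__main__":
    sentence = "रामः च लक्ष्मणः च वनम् अगच्छताम् ।"
    print(remove_stopwords(sentence))
```

Storing the list in a set keeps each lookup at roughly constant time, so the cost of filtering grows with the length of the text rather than with the size of the stopword list.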