Emerging Paradigms of Information and Communication Technologies and its Impact on Society
Foundation of Computer Science USA
EPICTIS2014 - Number 1
January 2015
Authors: Aastha Gupta, Rachna Rajput, Richa Gupta, Monika Arora
Aastha Gupta, Rachna Rajput, Richa Gupta, Monika Arora. Comparative Study of POS Taggers. Emerging Paradigms of Information and Communication Technologies and its Impact on Society. EPICTIS2014, 1 (January 2015), 19-25.
POS tagging provides important grammatical and contextual information for each word in a corpus. It enables companies to track user reviews and can also be used in speech synthesis. This paper compares three POS tagging algorithms, namely the Memory-Based Learning Algorithm, the Multi-Domain Web Based Algorithm and the Hybrid Model, on the basis of their execution time and efficiency. In the Memory-Based Learning algorithm, the word to be tagged is searched for in the lexicon using a weighted similarity matrix. If an exact match is found, its lexical representation is retrieved; otherwise, the lexical representation of its nearest neighbor is retrieved. The algorithm therefore does not work efficiently for sparse data. The Multi-Domain Web Based Algorithm, on the other hand, is used to tag unknown words: the word is searched over the web for its possible tags, and the tag with the highest probability of occurrence is assigned to the word. The web search induces a runtime overhead for each word. The Hybrid Model executes the Memory-Based Learning algorithm for known words and the Multi-Domain Web Based Algorithm for unknown words.
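The division of labor in the Hybrid Model can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name and data value in it is an assumption made for illustration: the toy `LEXICON`, the `WEB_EVIDENCE` dictionary that stands in for live web search results, the `difflib` string similarity that stands in for the weighted similarity matrix, and the default tag returned when no evidence is available.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy lexicon (assumption): known word -> tag.
LEXICON = {"run": "VB", "dog": "NN", "quick": "JJ", "quickly": "RB"}

# Hypothetical stand-in for web search results (assumption):
# unknown word -> tags observed for it across retrieved pages.
WEB_EVIDENCE = {
    "selfie": ["NN", "NN", "VB"],
    "googled": ["VBD", "VBD", "NN"],
}


def memory_based_tag(word, lexicon=LEXICON):
    """Return the tag of an exact match, else that of the nearest neighbor.

    Plain string similarity is used here in place of the weighted
    similarity matrix described in the abstract (assumption).
    """
    if word in lexicon:
        return lexicon[word]
    nearest = max(
        lexicon,
        key=lambda known: SequenceMatcher(None, word, known).ratio(),
    )
    return lexicon[nearest]


def web_based_tag(word, evidence=WEB_EVIDENCE, default="NN"):
    """Return the most frequent tag among the (simulated) web results."""
    tags = evidence.get(word)
    if not tags:
        # No evidence found: fall back to a default tag (assumption).
        return default
    return Counter(tags).most_common(1)[0][0]


def hybrid_tag(word):
    """Use the memory-based tagger for known words, the web-based one otherwise."""
    if word in LEXICON:
        return memory_based_tag(word)
    return web_based_tag(word)


if __name__ == "__main__":
    for token in ["run", "selfie", "googled"]:
        print(token, hybrid_tag(token))
```

In this sketch a word counts as known only if it appears in the lexicon, so the nearest-neighbor branch of `memory_based_tag` is reached only when that tagger is called on its own. A real web-based tagger would also pay a network round trip per word, which is the runtime overhead the abstract refers to.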