International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 86 - Number 2
Year of Publication: 2014
Authors: Marwan Akeel, R. B. Mishra
DOI: 10.5120/14957-3124
Marwan Akeel, R. B. Mishra. A Statistical Method for English to Arabic Machine Translation. International Journal of Computer Applications. 86, 2 (January 2014), 13-19. DOI=10.5120/14957-3124
Translating from English into a morphologically richer language such as Arabic is a challenge for statistical machine translation. Segmentation of Arabic text has been introduced to bridge the inflectional morphology gap. In this work, we investigate the impact of supporting a phrase-based statistical machine translation system, trained on a morphologically segmented Arabic corpus, with a one-to-one dictionary, and we examine the effects on system performance. The results show that the dictionary improves the quality of the translation output, especially when the corpus is normalized and fully segmented except for the determiner. The dictionary also decreases the out-of-vocabulary rate. The effect of dictionary support on different baseline and factored models, using data ranging from full word forms to fully segmented forms, is also demonstrated.
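As a rough illustration of why a one-to-one dictionary can lower the out-of-vocabulary rate, the following Python sketch treats the rate as the fraction of test tokens that do not occur in the training vocabulary. This token-level definition, the function name `oov_rate`, and the toy sentences and dictionary entries are assumptions made for illustration only; they are not taken from the paper's data or experimental setup.

```python
def oov_rate(test_tokens, vocabulary):
    """Return the fraction of test tokens that are missing from the vocabulary.

    Assumption: a token-level rate; a type-level definition is also common.
    """
    if not test_tokens:
        return 0.0
    unseen = sum(1 for token in test_tokens if token not in vocabulary)
    return unseen / len(test_tokens)


# Toy English-side training data (hypothetical, for illustration only).
train_sentences = ["the boy reads a book", "the girl writes a letter"]
train_vocab = {token for sentence in train_sentences for token in sentence.split()}

# Hypothetical one-to-one dictionary: one English word maps to one Arabic word.
dictionary = {"teacher": "معلم", "school": "مدرسة"}

# Adding the dictionary's source-side entries extends the known vocabulary.
vocab_with_dictionary = train_vocab | set(dictionary)

test_tokens = "the teacher reads a book at school".split()

baseline_rate = oov_rate(test_tokens, train_vocab)
supported_rate = oov_rate(test_tokens, vocab_with_dictionary)

print(f"OOV rate without dictionary: {baseline_rate:.3f}")
print(f"OOV rate with dictionary:    {supported_rate:.3f}")
```

In this toy setting the dictionary covers two of the three unseen test words, so the rate drops; how large the reduction is on a real corpus depends on the dictionary's coverage of the test data.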