Research Article

Urdu to English Machine Translation using Bilingual Evaluation Understudy

by Asad Abdul Malik, Asad Habib
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 82 - Number 7
Year of Publication: 2013
Authors: Asad Abdul Malik, Asad Habib
10.5120/14126-1040

Asad Abdul Malik, Asad Habib. Urdu to English Machine Translation using Bilingual Evaluation Understudy. International Journal of Computer Applications. 82, 7 (November 2013), 5-12. DOI=10.5120/14126-1040

@article{ 10.5120/14126-1040,
author = { Asad Abdul Malik and Asad Habib },
title = { Urdu to English Machine Translation using Bilingual Evaluation Understudy },
journal = { International Journal of Computer Applications },
issue_date = { November 2013 },
volume = { 82 },
number = { 7 },
month = { November },
year = { 2013 },
issn = { 0975-8887 },
pages = { 5-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume82/number7/14126-1040/ },
doi = { 10.5120/14126-1040 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:57:08.020019+05:30
%A Asad Abdul Malik
%A Asad Habib
%T Urdu to English Machine Translation using Bilingual Evaluation Understudy
%J International Journal of Computer Applications
%@ 0975-8887
%V 82
%N 7
%P 5-12
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Machine Translation (MT) is challenging because it involves several difficult problems, such as intrinsic language ambiguities, linguistic complexities, and differences between the source and target languages. Traditionally, MT has depended on rules that encode linguistic information. At present, corpus-based MT approaches are used, including Example Based MT (EBMT) and Statistical MT (SMT). These two corpus-based techniques, among others, use different frameworks within the contemporary data-driven paradigm. SMT systems generate outputs using probabilities, whereas EBMT systems translate input text by matching examples from a large amount of training data. Urdu MT is in its infancy, with very limited availability of the required data and computational resources. In this paper, we analyze and evaluate the main MT techniques using qualitative as well as quantitative approaches. The strengths and weaknesses of each technique are brought to light through focused discussion of examples from the Urdu MT literature. We evaluated the machine-translated outputs using Bilingual Evaluation Understudy (BLEU). The EBMT approach produced the highest accuracy, 84.21%, whereas the accuracy of the online SMT system was 62.68%. We found that the BLEU scores of machine-translated long Urdu sentences are low in comparison with those of short sentences. Similarly, source text containing low-frequency words negatively affects the quality of Urdu machine translation. The Experiments and Findings section of this paper explains our reported results in detail. The paper concludes with proposed future directions for research in Urdu machine translation.
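
To make the evaluation metric concrete, the following is a minimal sketch of sentence-level BLEU as originally defined by Papineni et al. (2002): the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is an illustration only. The abstract does not state the n-gram order, number of reference translations, tokenization, or smoothing used in the paper, so whitespace tokenization, uniform weights, and no smoothing are assumptions here, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the single reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties are broken toward the shorter reference.
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed BLEU with uniform weights over 1..max_n grams (assumed settings)."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


if __name__ == "__main__":
    reference = "the cat is on the mat".split()
    hypothesis = "the cat sat on the mat".split()
    print(round(sentence_bleu(hypothesis, [reference], max_n=2), 4))
```

Because the score is a product of precisions up to order max_n, a single missing higher-order n-gram match drives an unsmoothed sentence-level score to zero; corpus-level BLEU, which pools n-gram counts over all sentences before taking the ratio, is less sensitive to this.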

Index Terms

Computer Science
Information Sciences

Keywords

Machine Translation Comparison, Rule Based Machine Translation, Statistical Machine Translation, Example Based Machine Translation, Bilingual Evaluation Understudy, Urdu to English Machine Translation