International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 26
Year of Publication: 2023
Authors: Nasrin Ehassan, Cosimo Ieracitano, Mandar Gogate, Kia Dashtipour, Amir Hussain
DOI: 10.5120/ijca2023922989
Nasrin Ehassan, Cosimo Ieracitano, Mandar Gogate, Kia Dashtipour, Amir Hussain. An Overview of Speech-to-Speech Translation Framework and its Modules. International Journal of Computer Applications. 185, 26 (Aug 2023), 16-26. DOI=10.5120/ijca2023922989
Speech is the most natural form of human communication and arguably the most efficient method of exchanging information. However, communication between people who do not share a common language remains very challenging. Speech-to-Speech Translation (S2ST) attempts to overcome this barrier, making it one of the most promising research domains in speech and Natural Language Processing (NLP). This article reviews the most recent S2ST systems employed for different languages in terms of their constituent modules, namely Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-To-Speech (TTS). Furthermore, the paper critically highlights the main advantages and disadvantages of state-of-the-art S2ST techniques in order to provide researchers with an up-to-date picture of current systems and potential directions for future work.