International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 46
Year of Publication: 2024
Authors: K. Sriram, Hemanth S.
DOI: 10.5120/ijca2024924095
K. Sriram, Hemanth S. Speech Dereverberation for Robust ASR using Deep Learning Techniques. International Journal of Computer Applications. 186, 46 (Nov 2024), 19-23. DOI=10.5120/ijca2024924095
This paper presents a comprehensive study of deep-learning-based speech dereverberation techniques and compares them to identify the most effective solution to the problem. In acoustics, reverberation (or reverb) is the persistence of sound after the sound has been produced. It arises when a sound or signal is reflected by many surfaces in close proximity. These surfaces might be furniture, people, or even the surrounding air. The reflections build up and then gradually decay. This is most noticeable when the sound source stops but the reflections continue, their amplitude decreasing until it reaches zero. Deep learning is based on neural networks with three or more layers. By simulating the function of the human brain, although not exactly mimicking it, these neural networks enable systems to "learn" from vast quantities of data. A neural network with only one layer can still produce rough predictions, but additional hidden layers help optimize and refine the predictions for accuracy. Deep learning techniques, including UNet, GANs, and LSTM, are implemented in this paper to study speech dereverberation. In speech, reverberation degrades the entire signal through reflections of the target signal, which diminishes speech quality. The objective is to enhance the voice signal by eliminating this reverberation.
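The build-up and decay of reflections described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' method: it assumes the common model in which a reverberant signal is the dry signal convolved with a room impulse response, and it substitutes a toy impulse response (white noise under an exponential decay) for a measured one. The names `synthetic_rir`, `reverberate`, and `rt60_s`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def synthetic_rir(rt60_s, sample_rate, rng):
    """Toy room impulse response (assumption): noise under an exponential decay."""
    n = int(round(rt60_s * sample_rate))
    t = np.arange(n) / sample_rate
    # The envelope falls by 60 dB (a factor of 10**-3) at t = rt60_s.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # the direct-path sound arrives first, at full strength
    return rir


def reverberate(dry, rir):
    """Model reverberation as convolution of the dry signal with the impulse response."""
    return np.convolve(dry, rir)


sample_rate = 16000  # Hz, illustrative value
rng = np.random.default_rng(0)

# A 50 ms, 440 Hz tone stands in for the sound source.
n_dry = int(round(0.05 * sample_rate))
dry = np.sin(2.0 * np.pi * 440.0 * np.arange(n_dry) / sample_rate)

rir = synthetic_rir(rt60_s=0.3, sample_rate=sample_rate, rng=rng)
wet = reverberate(dry, rir)

# Everything after the source stops is reflections only.
tail = wet[len(dry):]
half = len(tail) // 2
early_energy = float(np.sum(tail[:half] ** 2))
late_energy = float(np.sum(tail[half:] ** 2))
print(f"tail energy, first half: {early_energy:.4f}")
print(f"tail energy, second half: {late_energy:.4f}")
```

In this sketch the signal keeps going after the source has cut out, and the energy in the tail falls over time. A dereverberation model would be trained to recover `dry` from `wet`.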