Research Article

Speech Dereverberation for Robust ASR using Deep Learning Techniques

by K. Sriram, Hemanth S.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 46
Year of Publication: 2024
Authors: K. Sriram, Hemanth S.
10.5120/ijca2024924095

K. Sriram, Hemanth S. Speech Dereverberation for Robust ASR using Deep Learning Techniques. International Journal of Computer Applications. 186, 46 (Nov 2024), 19-23. DOI=10.5120/ijca2024924095

@article{ 10.5120/ijca2024924095,
author = { K. Sriram and Hemanth S. },
title = { Speech Dereverberation for Robust ASR using Deep Learning Techniques },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2024 },
volume = { 186 },
number = { 46 },
month = { Nov },
year = { 2024 },
issn = { 0975-8887 },
pages = { 19-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number46/speech-dereverberation-for-robust-asr-using-deep-learning-techniques/ },
doi = { 10.5120/ijca2024924095 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-11-08T23:09:21.250861+05:30
%A K. Sriram
%A Hemanth S.
%T Speech Dereverberation for Robust ASR using Deep Learning Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 46
%P 19-23
%D 2024
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents a comparative study of deep learning techniques for speech dereverberation, with the aim of identifying the most effective approach. In acoustics, reverberation (or reverb) is the persistence of sound after the sound has been produced. It arises when a sound or signal reflects off many nearby surfaces, such as furniture, people, or even the surrounding air. The reflections build up and then decay. This is most apparent when the sound source stops but the reflections continue, their amplitude falling until it reaches zero. Deep learning is, in essence, a neural network with three or more layers. These networks attempt to simulate the behavior of the human brain, although they do not match it exactly, which enables them to "learn" from large quantities of data. A neural network with a single layer can still produce rough predictions, but additional hidden layers help optimize and refine its accuracy. In this paper, deep learning techniques including UNet, GANs, and LSTM are implemented to study speech dereverberation. In speech, reverberation degrades the entire signal through reflections of the target signal, which reduces speech quality. The objective is to enhance the speech signal by removing this reverberation.
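The build-up and decay of reflections described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's method or data: it assumes reverberation can be modeled as convolution of the dry signal with a room impulse response, and it fabricates that response as exponentially decaying noise. The names `synthetic_rir` and `reverberate` and the `rt60` value (the time for the response to fall by 60 dB) are hypothetical choices made for illustration.

```python
import numpy as np

def synthetic_rir(rt60=0.5, sample_rate=16000, seed=0):
    """Toy room impulse response: white noise under an exponential decay.

    Assumption for illustration only; a real room response would be
    measured or produced by a room simulator.
    """
    rng = np.random.default_rng(seed)
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    # Envelope drops by 60 dB (a factor of 10**-3) at t = rt60.
    envelope = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct path from source to microphone
    return rir / np.max(np.abs(rir))

def reverberate(dry, rir):
    """Reverberant signal = dry signal convolved with the impulse response."""
    return np.convolve(dry, rir)

sample_rate = 16000
t = np.arange(int(0.1 * sample_rate)) / sample_rate
dry = np.sin(2 * np.pi * 440.0 * t)   # 0.1 s tone burst, then the source stops
rir = synthetic_rir(rt60=0.5, sample_rate=sample_rate)
wet = reverberate(dry, rir)

# Everything after the dry signal ends is the reverberant tail.
tail = wet[len(dry):]
half = len(tail) // 2
early_energy = float(np.sum(tail[:half] ** 2))
late_energy = float(np.sum(tail[half:] ** 2))
print(f"tail length: {len(tail) / sample_rate:.2f} s")
print(f"early tail energy > late tail energy: {early_energy > late_energy}")
```

Under this model, dereverberation amounts to estimating `dry` given only `wet`, which is the mapping the deep learning models named in the abstract would be trained to approximate.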

References
  1. K. Kinoshita et al., ‘The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques’, in New Era for Robust Speech Recognition, Springer, 2017.
  2. O. Ernst, S. E. Chazan, S. Gannot and J. Goldberger, "Speech Dereverberation Using Fully Convolutional Networks," 2018 26th European Signal Processing Conference (EUSIPCO), 2018, pp. 390-394, doi: 10.23919/EUSIPCO.2018.8553141.
  3. Y. Zhao, Z. Wang and D. Wang, "Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 1, pp. 53-62, 2019. Available: 10.1109/taslp.2018.2870725.
  4. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi and B.-H. Juang, "Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction," in IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 7, pp. 1717-1731, Sept. 2010, doi: 10.1109/TASL.2010.2052251.
Index Terms

Computer Science
Information Sciences
Deep Learning techniques
GAN
acoustics
speech
dereverberation

Keywords

UNet, GAN, dereverberation