Research Article

Application of Machine Learning Methods for Enhancing the Quality of Medical Audio Recordings: Comparative Analysis of Classical and Modern Approaches

by Nataliya Boyko, Petro Slobodian
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 69
Year of Publication: 2025
Authors: Nataliya Boyko, Petro Slobodian
10.5120/ijca2025924502

Nataliya Boyko, Petro Slobodian. Application of Machine Learning Methods for Enhancing the Quality of Medical Audio Recordings: Comparative Analysis of Classical and Modern Approaches. International Journal of Computer Applications. 186, 69 (Mar 2025), 31-43. DOI=10.5120/ijca2025924502

@article{ 10.5120/ijca2025924502,
author = { Nataliya Boyko and Petro Slobodian },
title = { Application of Machine Learning Methods for Enhancing the Quality of Medical Audio Recordings: Comparative Analysis of Classical and Modern Approaches },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2025 },
volume = { 186 },
number = { 69 },
month = { Mar },
year = { 2025 },
issn = { 0975-8887 },
pages = { 31-43 },
numpages = {13},
url = { https://ijcaonline.org/archives/volume186/number69/application-of-machine-learning-methods-for-enhancing-the-quality-of-medical-audio-recordings-comparative-analysis-of-classical-and-modern-approaches/ },
doi = { 10.5120/ijca2025924502 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-03-01T12:38:53.203156+05:30
%A Nataliya Boyko
%A Petro Slobodian
%T Application of Machine Learning Methods for Enhancing the Quality of Medical Audio Recordings: Comparative Analysis of Classical and Modern Approaches
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 69
%P 31-43
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The aim of this study is to reduce noise in audio recordings and improve sound quality using existing machine learning methods, and to compare those methods. To test, analyze, and compare the methods on the sound processing problem, several approaches are used: classical audio signal processing methods, namely the Wiener filter and spectral subtraction, and a more modern one, convolutional neural networks (CNNs). Each method has its own strengths and weaknesses, which are analyzed experimentally to determine the cases in which each is useful. Using these methods together allows for in-depth analysis and comprehensive results for audio processing. The results show that spectral subtraction performs slightly better than the Wiener filter, as evidenced by both the PESQ scores for the two methods and the audiovisual analysis. Among all the selected methods, convolutional neural networks perform best; based on the metrics, the best CNN results are achieved with L1/L2 regularization and Dropout. Further research may include investigating new CNN architectures for audio de-noising and exploring other types of neural networks, such as Recurrent Neural Networks and Generative Adversarial Networks.
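The two classical methods named in the abstract can be illustrated with a minimal sketch, not the authors' implementation. It assumes one frame of magnitude spectrum is already available (for example, from a short-time Fourier transform) together with a per-bin noise magnitude estimate. The function names and the parameters `alpha`, `floor`, and `eps` are hypothetical choices made for illustration.

```python
def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Magnitude spectral subtraction for one frame (illustrative sketch).

    alpha scales the noise estimate; floor keeps a small fraction of the
    noisy magnitude so that no bin becomes negative. Both are assumed values.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        s = y - alpha * n                  # subtract the noise magnitude estimate
        cleaned.append(max(s, floor * y))  # apply the spectral floor
    return cleaned


def wiener_filter(noisy_mag, noise_mag, eps=1e-12):
    """Per-bin Wiener gain for one frame (illustrative sketch).

    The clean-signal power is estimated by power subtraction, which is one
    common choice and an assumption here. eps guards against division by zero.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        noise_pow = n * n
        signal_pow = max(y * y - noise_pow, 0.0)            # estimated clean power
        gain = signal_pow / (signal_pow + noise_pow + eps)  # gain in [0, 1)
        cleaned.append(gain * y)
    return cleaned


if __name__ == "__main__":
    noisy = [1.0, 0.5, 0.2, 2.0]   # toy magnitude spectrum of a noisy frame
    noise = [0.2, 0.2, 0.3, 0.2]   # toy noise magnitude estimate
    print(spectral_subtraction(noisy, noise))
    print(wiener_filter(noisy, noise))
```

In both functions, a bin whose noise estimate exceeds the observed magnitude is suppressed almost entirely, while bins dominated by the signal pass through nearly unchanged. A full pipeline would apply these gains frame by frame and resynthesize the waveform with the noisy phase.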

Index Terms

Computer Science
Information Sciences
Convolutional Neural Networks
Mean Square Error
Mean Absolute Error
Structural Similarity Index
Peak Signal-To-Noise Ratio
Perceptual Evaluation of Speech Quality

Keywords

Audio De-noising
Wiener filter
Spectral subtraction
Audio processing
Speech enhancement
Noise estimation
Machine learning model