Research Article

Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation

by A. Adewusi, K. A. Amusa, A. R. Zubair
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 153 - Number 9
Year of Publication: 2016
Authors: A. Adewusi, K. A. Amusa, A. R. Zubair
10.5120/ijca2016912112

A. Adewusi, K. A. Amusa, A. R. Zubair. Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation. International Journal of Computer Applications. 153, 9 (Nov 2016), 17-22. DOI=10.5120/ijca2016912112

@article{ 10.5120/ijca2016912112,
author = { A. Adewusi and K. A. Amusa and A. R. Zubair },
title = { Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2016 },
volume = { 153 },
number = { 9 },
month = { Nov },
year = { 2016 },
issn = { 0975-8887 },
pages = { 17-22 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume153/number9/26431-2016912112/ },
doi = { 10.5120/ijca2016912112 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:58:41.047906+05:30
%A A. Adewusi
%A K. A. Amusa
%A A. R. Zubair
%T Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation
%J International Journal of Computer Applications
%@ 0975-8887
%V 153
%N 9
%P 17-22
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Monaural source separation has received much attention in the signal processing community because it is a pre-processing step in many applications. Many solutions based on Non-Negative Matrix Factorization (NMF) have been developed to achieve clean separation. In this work, we propose a variant of Itakura-Saito divergence NMF based on a source-filter model that captures the temporal continuity of the speech signal. The algorithm shows very good separation results for mixtures of two speech sources in terms of artifact reduction. In addition, the Source-to-Distortion Ratio (SDR) and Source-to-Artifact Ratio (SAR) were found to be higher than those obtained with NMF algorithms using the Kullback-Leibler and Euclidean divergences.
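
The abstract contrasts three NMF cost functions: the Itakura-Saito, Kullback-Leibler, and Euclidean divergences. The sketch below is a minimal illustration using the standard textbook definitions of these costs; it is not the source-filter variant proposed in the paper. The function names and the toy values are assumptions made for illustration only. It shows one property often given as a reason for preferring the Itakura-Saito divergence on audio power spectra: the cost depends only on the ratio x/y, so low-energy and high-energy time-frequency bins are weighted equally.

```python
import math


def euclidean(x, y):
    """Squared Euclidean cost between two positive scalars."""
    return 0.5 * (x - y) ** 2


def kullback_leibler(x, y):
    """Generalized Kullback-Leibler divergence d_KL(x | y) for x, y > 0."""
    return x * math.log(x / y) - x + y


def itakura_saito(x, y):
    """Itakura-Saito divergence d_IS(x | y) = x/y - log(x/y) - 1 for x, y > 0."""
    ratio = x / y
    return ratio - math.log(ratio) - 1.0


def total_divergence(d, V, V_hat):
    """Sum an element-wise divergence d over two equally shaped nested lists."""
    return sum(
        d(v, v_hat)
        for row, row_hat in zip(V, V_hat)
        for v, v_hat in zip(row, row_hat)
    )


# Hypothetical toy values: a loud bin and a quiet bin with the same
# relative error (the observed value is twice the approximation in both).
loud = (4.0, 2.0)
quiet = (0.04, 0.02)

for name, d in [("Euclidean", euclidean),
                ("Kullback-Leibler", kullback_leibler),
                ("Itakura-Saito", itakura_saito)]:
    # Only the Itakura-Saito cost is the same for the loud and quiet bins.
    print(f"{name:>16}: loud={d(*loud):.6f}  quiet={d(*quiet):.6f}")
```

In an NMF setting, `total_divergence` would be evaluated between a non-negative spectrogram V and its approximation WH, and the factors W and H would be updated to reduce it. The update rules themselves are outside the scope of this sketch.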

Index Terms

Computer Science
Information Sciences

Keywords

Itakura-Saito divergence, monaural source separation, Non Negative Matrix Factorization