Research Article

Convolutive Blind Speech Separation using Cross Spectral Density Matrix and Clustering for Resolving Permutation

by C.Prabhu, S.Pradeep, R.Baskaran, C.Chellappan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 2
Year of Publication: 2010
Authors: C.Prabhu, S.Pradeep, R.Baskaran, C.Chellappan
DOI: 10.5120/1141-1494

C.Prabhu, S.Pradeep, R.Baskaran, C.Chellappan. Convolutive Blind Speech Separation using Cross Spectral Density Matrix and Clustering for Resolving Permutation. International Journal of Computer Applications. 7, 2 (September 2010), 1-6. DOI=10.5120/1141-1494

@article{ 10.5120/1141-1494,
author = { C.Prabhu and S.Pradeep and R.Baskaran and C.Chellappan },
title = { Convolutive Blind Speech Separation using Cross Spectral Density Matrix and Clustering for Resolving Permutation },
journal = { International Journal of Computer Applications },
issue_date = { September 2010 },
volume = { 7 },
number = { 2 },
month = { September },
year = { 2010 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume7/number2/1141-1494/ },
doi = { 10.5120/1141-1494 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:55:22.007994+05:30
%A C.Prabhu
%A S.Pradeep
%A R.Baskaran
%A C.Chellappan
%T Convolutive Blind Speech Separation using Cross Spectral Density Matrix and Clustering for Resolving Permutation
%J International Journal of Computer Applications
%@ 0975-8887
%V 7
%N 2
%P 1-6
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The problem of separating audio sources recorded in real-world situations is well established in the literature. The method used to solve this problem is Blind Speech Separation (BSS). The recording environment is usually modeled as convolutive, with the number of speech sources assumed to be equal to or less than the number of microphones. In this paper, we propose a new frequency-domain approach to convolutive blind speech separation. A matrix diagonalization method is applied to the cross-power spectral density matrices of the microphone inputs to determine the mixing system at each frequency bin, up to a permutation ambiguity. We then propose an efficient algorithm to resolve the permutation ambiguity: the vectors of estimated frequency responses are grouped into clusters such that each cluster contains the frequency responses associated with the same source. The inverse of the mixing system is then used to recover the separated sources. The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms.
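
The clustering idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes that a mixing matrix has already been estimated at each frequency bin, that each column is known only up to a complex scale and an arbitrary ordering, and that the frequency responses of one source stay similar from bin to bin. The function names `normalize_columns` and `resolve_permutation`, and the greedy running-centroid assignment, are illustrative choices rather than anything taken from the paper.

```python
import itertools
import numpy as np


def normalize_columns(H):
    """Remove the per-column scaling ambiguity: unit norm, first entry real and >= 0."""
    H = H / np.linalg.norm(H, axis=0, keepdims=True)
    return H * np.exp(-1j * np.angle(H[0:1, :]))


def resolve_permutation(H_bins):
    """Reorder the columns of each per-bin estimate so that column k
    falls in the same cluster (same source) at every frequency bin.

    H_bins: complex array, shape (n_bins, n_mics, n_sources).
    Returns (aligned, perms): the reordered, normalized estimates and,
    for each bin, the column order that was applied.
    """
    H_bins = np.asarray(H_bins, dtype=complex)
    n_bins, _, n_src = H_bins.shape
    aligned = np.empty_like(H_bins)
    perms = np.empty((n_bins, n_src), dtype=int)

    # The first bin defines the cluster labels (assumption of this sketch).
    aligned[0] = normalize_columns(H_bins[0])
    perms[0] = np.arange(n_src)
    cluster_sum = aligned[0].copy()

    for f in range(1, n_bins):
        H = normalize_columns(H_bins[f])
        centroids = normalize_columns(cluster_sum)
        # sim[i, j] = |<centroid_i, column_j>|, insensitive to column phase.
        sim = np.abs(centroids.conj().T @ H)
        # Exhaustive search over orderings; only practical for a few sources.
        best = max(
            itertools.permutations(range(n_src)),
            key=lambda p: sum(sim[k, p[k]] for k in range(n_src)),
        )
        perms[f] = best
        aligned[f] = H[:, list(best)]
        cluster_sum += aligned[f]  # update the running cluster centroids
    return aligned, perms
```

Calling `resolve_permutation(H_bins)` on an array of shape `(n_bins, n_mics, n_sources)` returns estimates whose column `k` refers to the same cluster in every bin. In real reverberant rooms the responses vary with frequency, so a practical method would need a more robust similarity measure than this sketch uses.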

Index Terms

Computer Science
Information Sciences

Keywords

Cross-Power Spectral Density
Matrix Diagonalization
Blind Speech Separation
Permutation Ambiguity
Cluster