Research Article

STFT based Blind Separation of Underdetermined Speech Mixtures

Published in February 2013 by Prasanna Kumar M K, Padmavathi K
International Conference on Electronic Design and Signal Processing
Foundation of Computer Science USA
ICEDSP - Number 3
February 2013
Authors: Prasanna Kumar M K, Padmavathi K

Prasanna Kumar M K, Padmavathi K. STFT based Blind Separation of Underdetermined Speech Mixtures. International Conference on Electronic Design and Signal Processing. ICEDSP, 3 (February 2013), 10-13.

@article{
author = { Prasanna Kumar M K and Padmavathi K },
title = { STFT based Blind Separation of Underdetermined Speech Mixtures },
journal = { International Conference on Electronic Design and Signal Processing },
issue_date = { February 2013 },
volume = { ICEDSP },
number = { 3 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 10-13 },
numpages = 4,
url = { /specialissues/icedsp/number3/10362-1021/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Special Issue Article
%1 International Conference on Electronic Design and Signal Processing
%A Prasanna Kumar M K
%A Padmavathi K
%T STFT based Blind Separation of Underdetermined Speech Mixtures
%J International Conference on Electronic Design and Signal Processing
%@ 0975-8887
%V ICEDSP
%N 3
%P 10-13
%D 2013
%I International Journal of Computer Applications
Abstract

Analysis of non-stationary signals such as audio, speech, and biomedical signals requires good resolution in both time and frequency, because their spectral content changes over time. Time-frequency analysis of such signals has many applications, including source separation and signal denoising. This paper presents an application of time-frequency analysis using the Short Time Fourier Transform (STFT) to speech separation. The method is blind, since no information about the sources or the type of mixing is available. It uses the relative amplitude information of the speech mixtures in the time-frequency domain together with the ideal binary mask of the source signals. The speech mixture is underdetermined, meaning that the number of sources exceeds the number of sensors. A mixture of male and female speech with a musical note is separated, first with a strong mixing matrix and then with a weak mixing matrix. The signal-to-noise ratio (SNR) obtained with this approach indicates that time-frequency analysis using the STFT can be useful for identifying the tracks to separate from underdetermined speech mixtures. A short-time spectrum representation of a speech signal requires on the order of two to four times as many samples as the waveform itself. In return, it gives a very flexible representation of the signal, in which extensive modifications can be made in both the time and frequency domains.
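
The masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three synthetic tones standing in for the male speech, female speech, and musical note, the 2x3 mixing matrix `A`, the frame length `N_FFT`, the hop `HOP`, and the helper names `stft`, `istft`, and `snr_db` are all assumptions made for illustration. The sketch builds an ideal binary mask from the true sources (an oracle, so only an upper bound) and, for comparison, a mask derived from the relative amplitude of the two mixtures, assuming the per-source level ratios are already known. A blind method would have to estimate those ratios from the mixtures themselves.

```python
import numpy as np

N_FFT, HOP = 256, 128                      # assumed frame length and hop (50% overlap)
WIN = np.hanning(N_FFT + 1)[:-1]           # periodic Hann window

def stft(x):
    """Hann-windowed STFT, shape (frames, N_FFT // 2 + 1)."""
    pad = np.concatenate([np.zeros(N_FFT), x, np.zeros(N_FFT)])
    n_frames = 1 + (len(pad) - N_FFT) // HOP
    frames = np.stack([pad[i * HOP:i * HOP + N_FFT] * WIN for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, length):
    """Weighted overlap-add inverse of stft()."""
    frames = np.fft.irfft(X, n=N_FFT, axis=1)
    out = np.zeros(N_FFT + HOP * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N_FFT] += frame * WIN
        norm[i * HOP:i * HOP + N_FFT] += WIN ** 2
    return (out / np.maximum(norm, 1e-12))[N_FFT:N_FFT + length]

def snr_db(target, estimate):
    """Signal-to-noise ratio of an estimate against its target, in dB."""
    return 10 * np.log10(np.sum(target ** 2) / np.sum((target - estimate) ** 2))

# Synthetic stand-ins for the three sources (not the paper's recordings).
fs = 8000
t = np.arange(fs) / fs
s = np.stack([np.sin(2 * np.pi * 300 * t),
              np.sin(2 * np.pi * (900 * t + 200 * t ** 2)),   # chirp, 900 to 1300 Hz
              np.sin(2 * np.pi * 2500 * t)])
A = np.array([[0.9, 0.6, 0.3],
              [0.3, 0.6, 0.9]])            # hypothetical 2x3 mixing matrix
x = A @ s                                  # 2 mixtures of 3 sources: underdetermined
X = np.stack([stft(ch) for ch in x])       # shape (2, frames, bins)
targets = A[0][:, None] * s                # each source as it appears in mixture 0
idx = np.arange(3)[:, None, None]

# Ideal binary mask: each time-frequency bin goes to the source that dominates it.
# It needs the true sources, so it serves only as an oracle reference.
S_img = np.stack([stft(img) for img in targets])
ideal_masks = np.argmax(np.abs(S_img), axis=0)[None] == idx

# Amplitude-ratio mask: each bin goes to the source whose level ratio between
# the two mixtures is closest to the ratio observed in that bin.
eps = 1e-12
log_ratio = np.log((np.abs(X[1]) + eps) / (np.abs(X[0]) + eps))
centres = np.log(A[1] / A[0])              # assumed known here, not estimated
ratio_masks = np.argmin(np.abs(log_ratio[None] - centres[:, None, None]), axis=0)[None] == idx

results = {}
for name, masks in [("ideal", ideal_masks), ("ratio", ratio_masks)]:
    est = [istft(m * X[0], len(t)) for m in masks]
    results[name] = [float(snr_db(targets[j], est[j])) for j in range(3)]
    print(name, [round(v, 1) for v in results[name]])
```

Because the synthetic sources barely overlap in frequency, both masks separate them cleanly here. Real speech overlaps far more in the time-frequency plane, so the SNR figures from this sketch say nothing about the results reported in the paper.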

References
  1. E. Vincent, R. Gribonval, and C. Févotte, "Performance measurement in blind audio source separation," IEEE Transactions on Speech and Audio Processing, vol. 14, no. 4, pp. 1462–1469, 2006.
  2. M. Babaie-Zadeh, C. Jutten, and A. Mansour, "Sparse ICA via clusterwise PCA," Neurocomputing, vol. 69, pp. 1458–1466, 2006.
  3. F. Abrard and Y. Deville, "A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources," Signal Processing, vol. 85, no. 7, pp. 1389–1403, July 2005.
  4. O. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830–1847, July 2004.
  5. S. Araki, S. Makino, H. Sawada, and R. Mukai, "Underdetermined blind separation of convolutive mixtures of speech with directivity pattern based mask and ica," Fifth International Conference on Independent Component Analysis and Blind Signal Separation, pp. 898–905, 2004.
  6. S. Araki, R. Mukai, S. Makino, T. Nishikawa, and H. Saruwatari, "The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech," IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 2737–2740, 2001.
  7. A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. John Wiley & Sons, 2001.
  8. A. Jourjine, S. Rickard, and O. Yilmaz, "Blind separation of disjoint orthogonal signals: Demixing n sources from 2 mixtures," in IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), vol. 5, pp. 2985–2988, June 2000.
  9. K. Torkkola, "Blind separation of convolved sources based on information maximization," IEEE Workshop on Neural Networks for Signal Processing, Kyoto, pp. 423–432, September 1996.
  10. A. J. Bell and T. J. Sejnowski, "An information-maximization approach to blind separation and blind deconvolution," Neural Computation, vol. 7, no. 6, pp. 1129–1159, 1995.
Index Terms

Computer Science
Information Sciences

Keywords

STFT, SNR, ISTFT, ABS, TFM, ASR