International Conference on Electronic Design and Signal Processing |
Foundation of Computer Science USA |
ICEDSP - Number 3 |
February 2013 |
Authors: Prasanna Kumar M K, Padmavathi K |
Prasanna Kumar M K, Padmavathi K. STFT based Blind Separation of Underdetermined Speech Mixtures. International Conference on Electronic Design and Signal Processing. ICEDSP, 3 (February 2013), 10-13.
Analysis of non-stationary signals such as audio, speech, and biomedical signals requires good resolution in both time and frequency, because their spectral components are not fixed. Time-frequency analysis of non-stationary signals has many applications, including source separation and signal denoising. This paper presents an application of time-frequency analysis using the Short Time Fourier Transform (STFT) to speech separation. The method is blind, since no information about the sources or the type of mixing is available. It uses the relative amplitude information of the speech mixtures in the time-frequency domain and the ideal binary mask of the source signals. The speech mixture is underdetermined, meaning the number of sources is greater than the number of sensors. A mixture of male and female speech with a musical note is separated first with a strong mixing matrix and then with a weak mixing matrix. The signal-to-noise ratio (SNR) obtained with this approach shows that time-frequency analysis using the STFT can be useful for identifying the tracks to separate from underdetermined speech mixtures. A short-time spectrum representation of a speech signal requires on the order of two to four times as many samples as the waveform itself. In return, however, it provides a very flexible representation of the signal, from which extensive modifications in both the time and frequency domains can be made.
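The idea of separating an underdetermined mixture with relative amplitudes and binary masks can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the paper: three synthetic tones stand in for the speech and music sources, the 2x3 mixing matrix `A` is hypothetical, the first row of `A` is all ones, and the amplitude ratios of the mixing columns are taken as known. In a truly blind setting those ratios would have to be estimated from the mixtures. The helper names `stft`, `istft`, `separate`, and `snr_db` are illustrative only.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, n_fft=256, hop=64):
    """Weighted overlap-add inverse of stft() above."""
    win = np.hanning(n_fft + 1)[:-1]
    out = np.zeros(n_fft + hop * (X.shape[0] - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    for i in range(X.shape[0]):
        out[i * hop:i * hop + n_fft] += frames[i] * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)

def separate(x1, x2, ratios, n_fft=256, hop=64):
    """Assign each time-frequency bin to the source whose assumed amplitude
    ratio |a2/a1| is closest to the observed |X2|/|X1| (binary masking)."""
    X1, X2 = stft(x1, n_fft, hop), stft(x2, n_fft, hop)
    eps = 1e-12  # avoids division by zero in empty bins
    log_ratio = np.log((np.abs(X2) + eps) / (np.abs(X1) + eps))
    dist = np.abs(log_ratio[None, :, :] - np.log(ratios)[:, None, None])
    labels = np.argmin(dist, axis=0)
    masks = [labels == j for j in range(len(ratios))]
    # Masking X1 recovers each source directly because row 0 of A is all ones.
    return [istft(m * X1, n_fft, hop) for m in masks], masks

def snr_db(ref, est):
    """Signal-to-noise ratio of an estimate against its reference, in dB."""
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))

# Synthetic stand-ins for the sources: three tones, one second at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
sources = np.stack([np.sin(2 * np.pi * f * t) for f in (440.0, 1000.0, 2500.0)])

# Hypothetical mixing matrix: 3 sources, 2 sensors (underdetermined).
A = np.array([[1.0, 1.0, 1.0],
              [0.2, 1.0, 5.0]])
x1, x2 = A @ sources

estimates, masks = separate(x1, x2, A[1] / A[0])

# Compare on interior samples only; the frame edges are poorly conditioned.
snrs = [snr_db(s[256:-256], e[256:-256]) for s, e in zip(sources, estimates)]
print([round(v, 1) for v in snrs])
```

The tones were chosen so that each time-frequency bin is dominated by a single source, which is the condition under which binary masking works well; real speech overlaps more, so the SNR would be lower. With a hop of one quarter of the frame length, the STFT here stores roughly four times as many real values as the waveform has samples, in line with the two-to-four-times figure quoted in the abstract.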