Research Article

Determining Number of Speakers in Multi-Speaker Condition with Additive Noise

Published on September 2015 by Namratha N, R Kumaraswamy
National Conference “Electronics, Signals, Communication and Optimization”
Foundation of Computer Science USA
NCESCO2015 - Number 3
September 2015
Authors: Namratha N, R Kumaraswamy

Namratha N, R Kumaraswamy. Determining Number of Speakers in Multi-Speaker Condition with Additive Noise. National Conference “Electronics, Signals, Communication and Optimization”. NCESCO2015, 3 (September 2015), 15-18.

@article{
author = { Namratha N, R Kumaraswamy },
title = { Determining Number of Speakers in Multi-Speaker Condition with Additive Noise },
journal = { National Conference “Electronics, Signals, Communication and Optimization" },
issue_date = { September 2015 },
volume = { NCESCO2015 },
number = { 3 },
month = { September },
year = { 2015 },
issn = { 0975-8887 },
pages = { 15-18 },
numpages = 4,
url = { /proceedings/ncesco2015/number3/22309-5325/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference “Electronics, Signals, Communication and Optimization"
%A Namratha N
%A R Kumaraswamy
%T Determining Number of Speakers in Multi-Speaker Condition with Additive Noise
%J National Conference “Electronics, Signals, Communication and Optimization"
%@ 0975-8887
%V NCESCO2015
%N 3
%P 15-18
%D 2015
%I International Journal of Computer Applications
Abstract

The performance of a speaker recognition system degrades considerably if the sample used for the recognition task contains voices from different speakers in close vicinity. Solutions to this problem are needed, especially for signals collected in a practical environment, such as a room with background noise and reverberation. This paper presents a method for determining the number of speakers in a multi-speaker condition using excitation source information. Speech in a multi-speaker environment is collected using two spatially separated microphones, so the speech of a given speaker reaches the two microphones with a time delay. This time delay is estimated from the cross-correlation function of the Hilbert envelopes of the linear prediction (LP) residual signals. Since different speakers produce different time delays, estimating these delays allows the number of speakers to be determined. The proposed method is evaluated by adding different types of noise to the clean speech signal, and the results illustrate its robustness.
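The delay-estimation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the LP order, the synthetic two-microphone signal, the noise level, and all function names (`lp_residual`, `hilbert_envelope`, `estimate_delay`, `synth_two_mics`, `count_distinct_delays`) are assumptions made for the example, and the final counting rule is only one plausible way to turn per-frame delays into a speaker count.

```python
from collections import Counter
import numpy as np

def lp_residual(x, order=10):
    """LP residual by the autocorrelation method (order is an assumed value)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularised
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Residual e[n] = x[n] - sum_k a_k x[n-k]
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * x[:n - k]
    return x - pred

def hilbert_envelope(e):
    """Magnitude of the analytic signal, built with the FFT."""
    n = len(e)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(e) * h))

def estimate_delay(mic1, mic2, max_lag, order=10):
    """Delay of mic2 relative to mic1, in samples (max_lag must be >= 1)."""
    h1 = hilbert_envelope(lp_residual(mic1, order))
    h2 = hilbert_envelope(lp_residual(mic2, order))
    h1 = h1 - h1.mean()
    h2 = h2 - h2.mean()
    lags = np.arange(-max_lag, max_lag + 1)
    core = slice(max_lag, -max_lag)  # drop edges touched by np.roll wrap-around
    cc = [np.dot(h1[core], np.roll(h2, -lag)[core]) for lag in lags]
    return int(lags[int(np.argmax(cc))])

def synth_two_mics(delay=7, n=4000, pitch=80, noise=0.02, seed=0):
    """Hypothetical test signal: impulse train through a two-pole resonator."""
    rng = np.random.default_rng(seed)
    exc = 0.01 * rng.standard_normal(n)
    exc[::pitch] += 1.0
    s = np.zeros(n)
    for i in range(n):
        s[i] = exc[i]
        if i >= 1:
            s[i] += 1.3 * s[i - 1]
        if i >= 2:
            s[i] -= 0.8 * s[i - 2]
    mic1 = s + noise * rng.standard_normal(n)
    delayed = np.concatenate([np.zeros(delay), s[:n - delay]])
    mic2 = delayed + noise * rng.standard_normal(n)
    return mic1, mic2

def count_distinct_delays(frame_delays, min_share=0.2):
    """Assumed counting rule: delays seen in at least min_share of frames."""
    counts = Counter(frame_delays)
    total = len(frame_delays)
    return sum(1 for c in counts.values() if c / total >= min_share)

if __name__ == "__main__":
    mic1, mic2 = synth_two_mics(delay=7)
    print("estimated delay (samples):", estimate_delay(mic1, mic2, max_lag=20))
```

In this sketch a single synthetic source is delayed by a known number of samples, so the location of the cross-correlation peak can be checked against that value. With real multi-speaker recordings the delay would be estimated frame by frame, and the set of recurring delays would then be counted, for instance with `count_distinct_delays`.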

Index Terms

Computer Science
Information Sciences

Keywords

Excitation Source Information, Instants of Glottal Closure (GCIs), Linear Prediction (LP) Residual, Hilbert Envelope (HE), Time Delay Estimation, Different Types of Noise