Research Article

Hierarchical Speaker Identification based on Latent Variable Decomposition

by Sabyasachi Patra, Subhendu Kumar Acharya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 19 - Number 7
Year of Publication: 2011
10.5120/2376-3131

Sabyasachi Patra, Subhendu Kumar Acharya. Hierarchical Speaker Identification based on Latent Variable Decomposition. International Journal of Computer Applications. 19, 7 (April 2011), 6-11. DOI=10.5120/2376-3131

@article{ 10.5120/2376-3131,
author = { Sabyasachi Patra and Subhendu Kumar Acharya },
title = { Hierarchical Speaker Identification based on Latent Variable Decomposition },
journal = { International Journal of Computer Applications },
issue_date = { April 2011 },
volume = { 19 },
number = { 7 },
month = { April },
year = { 2011 },
issn = { 0975-8887 },
pages = { 6-11 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume19/number7/2376-3131/ },
doi = { 10.5120/2376-3131 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:06:20.499954+05:30
%A Sabyasachi Patra
%A Subhendu Kumar Acharya
%T Hierarchical Speaker Identification based on Latent Variable Decomposition
%J International Journal of Computer Applications
%@ 0975-8887
%V 19
%N 7
%P 6-11
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes a novel hierarchical speaker identification method based on Latent Variable Decomposition (LVD). First, a fast scan of all registered speakers using LVD-based features and a GMM classifier yields a coarse decision in the form of R candidate target speakers; MFCC- or PCA-based features are then used to make the final decision. LVD has a further advantage: it reduces the dimensionality of the feature vectors and removes noise from the speech at the same time, which lowers computational complexity and improves speaker identification performance. Experimental results show that, compared with the traditional speaker identification method, the proposed method markedly improves recognition accuracy and is more robust.
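
The two-stage decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a single diagonal Gaussian per speaker stands in for a full GMM, the feature values are made-up toy numbers rather than real LVD or MFCC features, and every name (`diag_gauss_loglik`, `identify`, `coarse_models`, `fine_models`) is hypothetical.

```python
import math

def diag_gauss_loglik(frames, mean, var):
    """Average per-frame log-likelihood under one diagonal Gaussian.

    Assumption: a single Gaussian replaces the GMM to keep the sketch short.
    """
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def identify(coarse_frames, fine_frames, coarse_models, fine_models, r):
    """Shortlist r speakers with coarse features, then rescore only those."""
    # Stage 1: fast scan over all registered speakers (low-dimensional features).
    coarse_scores = {
        spk: diag_gauss_loglik(coarse_frames, mean, var)
        for spk, (mean, var) in coarse_models.items()
    }
    shortlist = sorted(coarse_scores, key=coarse_scores.get, reverse=True)[:r]
    # Stage 2: final decision among the r candidates (richer features).
    fine_scores = {
        spk: diag_gauss_loglik(fine_frames, *fine_models[spk])
        for spk in shortlist
    }
    return max(fine_scores, key=fine_scores.get), shortlist

# Toy speaker models as (mean, variance) pairs; all values are illustrative only.
coarse_models = {"A": ([0.0], [1.0]), "B": ([0.2], [1.0]), "C": ([5.0], [1.0])}
fine_models = {
    "A": ([0.0, 0.0], [1.0, 1.0]),
    "B": ([1.0, 1.0], [1.0, 1.0]),
    "C": ([5.0, 5.0], [1.0, 1.0]),
}

# One test utterance, represented by two frames in each feature space.
coarse_frames = [[0.1], [0.15]]
fine_frames = [[0.1, -0.1], [0.0, 0.2]]

best, shortlist = identify(coarse_frames, fine_frames, coarse_models, fine_models, r=2)
print("shortlist:", shortlist, "final decision:", best)
```

In this toy setup the coarse stage discards the clearly mismatched speaker and the second stage decides between the two close candidates, so the more expensive scoring runs on R speakers instead of the whole registered set.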

Index Terms

Computer Science
Information Sciences

Keywords

MFCC, PCA, GMM, LVD, SNR, Features, Classifiers