Research Article

Performance of Complementary Features for Robust Speaker Identification

by Sharada V. Chougule, Mahesh S. Chavan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 123 - Number 9
Year of Publication: 2015
Authors: Sharada V. Chougule, Mahesh S. Chavan
10.5120/ijca2015905617

Sharada V. Chougule, Mahesh S. Chavan. Performance of Complementary Features for Robust Speaker Identification. International Journal of Computer Applications. 123, 9 (August 2015), 21-27. DOI=10.5120/ijca2015905617

@article{ 10.5120/ijca2015905617,
author = { Sharada V. Chougule, Mahesh S. Chavan },
title = { Performance of Complementary Features for Robust Speaker Identification },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 123 },
number = { 9 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 21-27 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume123/number9/21987-2015905617/ },
doi = { 10.5120/ijca2015905617 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:12:14.892688+05:30
%A Sharada V. Chougule
%A Mahesh S. Chavan
%T Performance of Complementary Features for Robust Speaker Identification
%J International Journal of Computer Applications
%@ 0975-8887
%V 123
%N 9
%P 21-27
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper considers the problem of acoustic mismatch caused by the use of different sensors in digital gadgets and hand-held devices. Two complementary features derived from conventional cepstral features are proposed: linear/mel spectral subband centroids (L/M-SSC) and log filter bank energies (LFBE). Their performance is compared with that of conventional features under acoustic mismatch conditions. To isolate the effect of the features themselves, all other processing and classification steps are kept constant, allowing a controlled comparison. A multi-variability speech database (IITG-MV) with acoustic mismatch (different microphones) is used for experimental evaluation. All of these features show almost equal performance for text-independent speaker identification when the acoustic conditions match. Under mismatch, however, the spectral subband centroid (L/M-SSC) features prove more robust than the other features when used alone. Further, adding dynamic features together with channel and noise compensation improves the identification rate of the system in all cases of acoustic mismatch, with spectral subband centroid features performing comparably to conventional features.
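As an illustration of the two complementary features named in the abstract, the following is a minimal sketch of how log filter bank energies and spectral subband centroids might be computed for a single speech frame. It is not the authors' implementation: the abstract does not give their exact definitions, so this uses a common formulation in which the centroid of each subband is the filter-weighted mean frequency of the power spectrum. The function names `linear_filterbank` and `frame_features`, the triangular filter shape, the Hamming window, and all parameter values (8 filters, 256-point FFT, 8 kHz sampling) are assumptions chosen for the example. A mel-scale variant would differ only in how the filter edges are spaced.

```python
import numpy as np

def linear_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are spaced uniformly in Hz (assumed shape)."""
    n_bins = n_fft // 2 + 1
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)       # bin centre frequencies
    edges = np.linspace(0.0, sample_rate / 2.0, n_filters + 2)
    fbank = np.zeros((n_filters, n_bins))
    for m in range(n_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle = min of the two slopes, clipped to zero outside the band.
        fbank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank, freqs

def frame_features(frame, fbank, freqs, eps=1e-10):
    """Return (LFBE, SSC) for one windowed frame; eps guards empty subbands."""
    n_fft = 2 * (fbank.shape[1] - 1)
    power = np.abs(np.fft.rfft(frame, n=n_fft)) ** 2          # power spectrum
    energy = fbank @ power                                     # subband energies
    lfbe = np.log(energy + eps)                                # log filter bank energies
    ssc = (fbank @ (freqs * power)) / (energy + eps)           # centroid in Hz per subband
    return lfbe, ssc

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz, one 256-sample frame.
sample_rate, n_fft, n_filters = 8000, 256, 8
t = np.arange(n_fft) / sample_rate
frame = np.sin(2 * np.pi * 1000.0 * t) * np.hamming(n_fft)

fbank, freqs = linear_filterbank(n_filters, n_fft, sample_rate)
lfbe, ssc = frame_features(frame, fbank, freqs)
```

In this sketch the subband containing the tone has the largest log energy, and its centroid sits close to the tone frequency. The centroid tracks where the energy lies within a band rather than how much energy there is, which is the usual argument for its relative insensitivity to level and channel changes.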

Index Terms

Computer Science
Information Sciences

Keywords

MFCC, LFCC, Linear/Mel scale spectral subband centroids (L/M-SSC), Log filter bank energy (LFBE)