Research Article

Noise Robust Speaker Identification using PCA based Genetic Algorithm

by Md. Fayzur Rahman, Md. Rabiul Islam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 4 - Number 12
Year of Publication: 2010
Authors: Md. Fayzur Rahman, Md. Rabiul Islam
DOI: 10.5120/875-1238

Md. Fayzur Rahman, Md. Rabiul Islam. Noise Robust Speaker Identification using PCA based Genetic Algorithm. International Journal of Computer Applications. 4, 12 (August 2010), 27-31. DOI=10.5120/875-1238

@article{ 10.5120/875-1238,
author = { Md. Fayzur Rahman and Md. Rabiul Islam },
title = { Noise Robust Speaker Identification using PCA based Genetic Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { August 2010 },
volume = { 4 },
number = { 12 },
month = { August },
year = { 2010 },
issn = { 0975-8887 },
pages = { 27-31 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume4/number12/875-1238/ },
doi = { 10.5120/875-1238 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:52:56.051870+05:30
%A Md. Fayzur Rahman
%A Md. Rabiul Islam
%T Noise Robust Speaker Identification using PCA based Genetic Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 4
%N 12
%P 27-31
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents a text-dependent speaker identification system based on a Principal Component Analysis (PCA) based Genetic Algorithm, which identifies a particular speaker from a known population in a noisy environment. The system first prompts the user for a speech utterance. Noise is removed from the utterance with a Wiener filter. Features are extracted from the speech using several techniques: RCC, LPCC, MFCC, ΔMFCC and ΔΔMFCC. Principal Component Analysis reduces the dimensionality of the speech feature vector, and a Genetic Algorithm classifies the utterances. The NOIZEUS speech database is used to measure the performance of the system at various SNRs. Experimental results show the superiority of the proposed closed-set, text-dependent speaker identification system, which can be used for security and access control purposes.
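
The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of their PCA configuration, so the function name `pca_reduce`, the 39-coefficient frame size, the choice of 10 components, and the random stand-in data are all assumptions made for illustration. The sketch uses NumPy and computes the principal directions from a singular value decomposition of the mean-centered feature matrix.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their top principal components.

    Hypothetical helper for illustration; not taken from the paper.
    """
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value (and therefore decreasing variance).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Sample variance captured along each retained direction.
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    reduced = centered @ components.T
    return reduced, components, mean, explained_variance


# Assumed stand-in for real speech features: 200 frames x 39 coefficients.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 39))

reduced, components, mean, variance = pca_reduce(frames, n_components=10)
print(reduced.shape)  # (200, 10)
```

In a real system the rows of `frames` would be the cepstral feature vectors extracted from the filtered speech. The `components` and `mean` learned from training utterances would then be reused to project test utterances into the same reduced space before classification.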

Index Terms

Computer Science
Information Sciences

Keywords

Biometric Technology, Noise Robust Speaker Identification, Speech Feature Extraction, Principal Component Analysis, Genetic Algorithm