Noise Robust Speaker Identification using PCA based Genetic Algorithm

Md. Fayzur Rahman; Md. Rabiul Islam

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 4 - Number 12

Year of Publication: 2010

Authors: Md. Fayzur Rahman, Md. Rabiul Islam

10.5120/875-1238

Md. Fayzur Rahman, Md. Rabiul Islam. Noise Robust Speaker Identification using PCA based Genetic Algorithm. International Journal of Computer Applications. 4, 12 (August 2010), 27-31. DOI=10.5120/875-1238

@article{10.5120/875-1238,
  author = {Md. Fayzur Rahman and Md. Rabiul Islam},
  title = {Noise Robust Speaker Identification using PCA based Genetic Algorithm},
  journal = {International Journal of Computer Applications},
  issue_date = {August 2010},
  volume = {4},
  number = {12},
  month = {August},
  year = {2010},
  issn = {0975-8887},
  pages = {27-31},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume4/number12/875-1238/},
  doi = {10.5120/875-1238},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T19:52:56.051870+05:30
%A Md. Fayzur Rahman
%A Md. Rabiul Islam
%T Noise Robust Speaker Identification using PCA based Genetic Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 4
%N 12
%P 27-31
%D 2010
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper presents a text-dependent speaker identification system built on a Principal Component Analysis (PCA) based Genetic Algorithm (GA), which identifies a particular speaker from a known population in a noisy environment. The system first prompts the user for a speech utterance. Noise is removed from the utterance with Wiener filtering. Several feature extraction techniques, namely RCC, LPCC, MFCC, ΔMFCC and ΔΔMFCC, are used to extract features from the speech. PCA reduces the dimensionality of the speech feature vector, and a GA classifies the speech utterances. The NOIZEUS speech database is used to measure the performance of the system at various SNRs. Experimental results show the superiority of the proposed closed-set, text-dependent speaker identification system, which can be used for security and access control purposes.
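The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details, so the function names (`mean_center`, `covariance`, `top_eigenpair`, `pca`), the power-iteration eigen-solver, and the toy three-dimensional vectors standing in for speech features are all assumptions made for illustration.

```python
import math
import random


def mean_center(rows):
    """Subtract the per-dimension mean from every feature vector."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat, iters=500, seed=0):
    """Dominant eigenpair of a symmetric matrix via power iteration."""
    rng = random.Random(seed)
    d = len(mat)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix maps v to zero; nothing left to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate for unit vector v.
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam, v


def pca(rows, k):
    """Project rows onto their first k principal components.

    Returns (projected rows, components, variance along each component).
    """
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components, variances


# Toy stand-in for per-utterance feature vectors (hypothetical values).
features = [
    [1.0, 2.1, -0.9],
    [2.0, 3.9, -2.1],
    [3.0, 6.2, -2.9],
    [4.0, 7.8, -4.1],
    [5.0, 10.1, -5.0],
]
reduced, comps, variances = pca(features, 2)
print(len(reduced), "vectors reduced to", len(reduced[0]), "dimensions")
```

In a real system the input rows would be cepstral feature vectors of much higher dimension, and a library eigen-solver would normally replace the power iteration used here to keep the sketch dependency-free.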


Index Terms

Computer Science

Information Sciences

Keywords

Biometric Technology, Noise Robust Speaker Identification, Speech Feature Extraction, Principal Component Analysis, Genetic Algorithm