Research Article

Composite Feature Selection Method based on Spoken Word and Speaker Recognition

by Dipen Nath, Sanjib Kr. Kalita
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 121 - Number 8
Year of Publication: 2015
Authors: Dipen Nath, Sanjib Kr. Kalita
10.5120/21560-4593

Dipen Nath, Sanjib Kr. Kalita. Composite Feature Selection Method based on Spoken Word and Speaker Recognition. International Journal of Computer Applications. 121, 8 (July 2015), 18-23. DOI=10.5120/21560-4593

@article{ 10.5120/21560-4593,
author = { Dipen Nath and Sanjib Kr. Kalita },
title = { Composite Feature Selection Method based on Spoken Word and Speaker Recognition },
journal = { International Journal of Computer Applications },
issue_date = { July 2015 },
volume = { 121 },
number = { 8 },
month = { July },
year = { 2015 },
issn = { 0975-8887 },
pages = { 18-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume121/number8/21560-4593/ },
doi = { 10.5120/21560-4593 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:07:54.897570+05:30
%A Dipen Nath
%A Sanjib Kr. Kalita
%T Composite Feature Selection Method based on Spoken Word and Speaker Recognition
%J International Journal of Computer Applications
%@ 0975-8887
%V 121
%N 8
%P 18-23
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper measures the recognition capability of composite features extracted from speech signals and compares the results with those of individually considered features, for both spoken word recognition and speaker recognition. Standard features, namely formants (F1, F2, F3), Linear Predictive Coefficients (LPC), and Mel Frequency Cepstral Coefficients (MFCC), are considered along with various combinations of them. The study uses six different speakers and six different strings (words). The threshold is set through an iterative approach for both the spoken word and the speaker recognition experiments. The combination of LPC and MFCC is found to be the most promising of all the combinations tested. Another interesting conclusion from the study is that the composite feature approach gives accuracy very near 100% in the speaker recognition task, as compared with the spoken word recognition task.
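
As a rough illustration of the two ideas in the abstract, the sketch below builds a composite feature vector and picks a decision threshold iteratively. It is not the authors' implementation. It assumes that a composite feature is formed by concatenating the individual feature vectors, and that the threshold is chosen by sweeping evenly spaced candidates over classifier scores in [0, 1]; neither detail is stated in the abstract. The function names `composite_feature`, `accuracy_at`, and `pick_threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple


def composite_feature(*parts: Sequence[float]) -> List[float]:
    """Concatenate individual feature vectors (e.g. LPC and MFCC) into one.

    Assumption: "composite" means plain concatenation.
    """
    combined: List[float] = []
    for part in parts:
        combined.extend(part)
    return combined


def accuracy_at(threshold: float,
                scores: Sequence[float],
                labels: Sequence[int]) -> float:
    """Fraction of samples whose thresholded score matches its 0/1 label."""
    hits = sum((score >= threshold) == bool(label)
               for score, label in zip(scores, labels))
    return hits / len(scores)


def pick_threshold(scores: Sequence[float],
                   labels: Sequence[int],
                   steps: int = 100) -> Tuple[float, float]:
    """Try evenly spaced thresholds in [0, 1] and keep the most accurate.

    Assumption: the iterative approach is a simple sweep; ties keep the
    lowest threshold.
    """
    best_threshold, best_accuracy = 0.0, -1.0
    for i in range(steps + 1):
        threshold = i / steps
        acc = accuracy_at(threshold, scores, labels)
        if acc > best_accuracy:
            best_threshold, best_accuracy = threshold, acc
    return best_threshold, best_accuracy


# Illustrative values only, not data from the paper.
lpc = [0.12, -0.40, 0.33]
mfcc = [11.5, -3.2]
features = composite_feature(lpc, mfcc)
threshold, acc = pick_threshold([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1], steps=10)
```

In practice the scores would come from a trained classifier (the keywords mention a feed forward neural network), and the threshold would be selected on held-out data rather than on the data used for training.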

Index Terms

Computer Science
Information Sciences

Keywords

Feature Extraction and Selection, Feed Forward Neural Network, Speech and Speaker Recognition