Research Article

Speech Recognition System: A Review

by Nitin Washani, Sandeep Sharma
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 115 - Number 18
Year of Publication: 2015
Authors: Nitin Washani, Sandeep Sharma
10.5120/20249-2617

Nitin Washani, Sandeep Sharma. Speech Recognition System: A Review. International Journal of Computer Applications. 115, 18 (April 2015), 7-10. DOI=10.5120/20249-2617

@article{ 10.5120/20249-2617,
author = { Nitin Washani and Sandeep Sharma },
title = { Speech Recognition System: A Review },
journal = { International Journal of Computer Applications },
issue_date = { April 2015 },
volume = { 115 },
number = { 18 },
month = { April },
year = { 2015 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume115/number18/20249-2617/ },
doi = { 10.5120/20249-2617 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:55:11.848271+05:30
%A Nitin Washani
%A Sandeep Sharma
%T Speech Recognition System: A Review
%J International Journal of Computer Applications
%@ 0975-8887
%V 115
%N 18
%P 7-10
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The ability to control devices by voice has long intrigued mankind. Today, after intense research, speech recognition systems have carved out a niche for themselves and can be seen in many walks of life. Accuracy remains one of the most important research challenges for speech recognition systems; it is affected by factors such as noise, speaker variability, language variability, vocabulary size, and domain. The design of a speech recognition system requires careful attention to issues such as the types of speech classes and speech representation, speech preprocessing stages, feature extraction techniques, databases, and performance evaluation. This paper presents the advances made and highlights the pressing problems for speech recognition systems. The paper also divides the system into a Front End and a Back End for better understanding and representation of each part.
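
As a rough illustration of the kind of front-end stages the abstract names (preprocessing and feature extraction, followed by a simple voice activity decision), a minimal Python sketch is given here. It is not the authors' method. The function names, the pre-emphasis coefficient of 0.97, the frame and hop sizes, and the fixed energy threshold are all assumptions chosen for a toy signal.

```python
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    return [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def front_end(signal, frame_len=200, hop=80, energy_threshold=0.01):
    """Return one (energy, zcr, is_speech) tuple per frame.

    The threshold is a hypothetical fixed value; practical systems usually
    adapt it to the background noise level.
    """
    features = []
    for frame in split_frames(pre_emphasize(signal), frame_len, hop):
        energy = short_time_energy(frame)
        is_speech = energy > energy_threshold
        features.append((energy, zero_crossing_rate(frame), is_speech))
    return features


# Toy input: 0.1 s of silence followed by 0.1 s of a 1 kHz tone at 8 kHz.
sample_rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(800)]
features = front_end(silence + tone)
```

In this sketch the per-frame features would be handed to a back end (for example an HMM or neural network classifier) only for frames flagged as speech. A real front end would typically compute richer features such as cepstral coefficients rather than energy and zero-crossing rate alone.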

Index Terms

Computer Science
Information Sciences

Keywords

VAD, Feature Extraction, Hidden Markov Model, Neural Networks