Research Article

Article:A Review on Speech Recognition Technique

by Santosh K.Gaikwad, Bharti W.Gawali, Pravin Yannawar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 10 - Number 3
Year of Publication: 2010
10.5120/1462-1976

Santosh K.Gaikwad, Bharti W.Gawali, Pravin Yannawar. Article:A Review on Speech Recognition Technique. International Journal of Computer Applications. 10, 3 (November 2010), 16-24. DOI=10.5120/1462-1976

@article{ 10.5120/1462-1976,
author = { Santosh K.Gaikwad, Bharti W.Gawali, Pravin Yannawar },
title = { Article:A Review on Speech Recognition Technique },
journal = { International Journal of Computer Applications },
issue_date = { November 2010 },
volume = { 10 },
number = { 3 },
month = { November },
year = { 2010 },
issn = { 0975-8887 },
pages = { 16-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume10/number3/1462-1976/ },
doi = { 10.5120/1462-1976 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:59:16.659846+05:30
%A Santosh K.Gaikwad
%A Bharti W.Gawali
%A Pravin Yannawar
%T Article:A Review on Speech Recognition Technique
%J International Journal of Computer Applications
%@ 0975-8887
%V 10
%N 3
%P 16-24
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech is the most prominent and primary mode of communication among human beings. The medium of communication between humans and computers is called the human-computer interface. Speech has the potential to be an important mode of interaction with computers. This paper gives an overview of the major technological perspectives and the fundamental progress of speech recognition, and surveys the techniques developed at each stage of speech recognition. It helps in choosing among these techniques by setting out their relative merits and demerits. A comparative study of the different techniques is carried out stage by stage. The paper concludes with a decision on the future direction for developing techniques in human-computer interface systems using the Marathi language.

Index Terms

Computer Science
Information Sciences

Keywords

Analysis, feature extraction, modeling, testing, speech processing, HCI