Research Article

A Review of Challenges in Automatic Speech Recognition

by Harshalata Petkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 151 - Number 3
Year of Publication: 2016
Authors: Harshalata Petkar
10.5120/ijca2016911706

Harshalata Petkar. A Review of Challenges in Automatic Speech Recognition. International Journal of Computer Applications. 151, 3 (Oct 2016), 23-26. DOI=10.5120/ijca2016911706

@article{ 10.5120/ijca2016911706,
author = { Harshalata Petkar },
title = { A Review of Challenges in Automatic Speech Recognition },
journal = { International Journal of Computer Applications },
issue_date = { Oct 2016 },
volume = { 151 },
number = { 3 },
month = { Oct },
year = { 2016 },
issn = { 0975-8887 },
pages = { 23-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume151/number3/26214-2016911706/ },
doi = { 10.5120/ijca2016911706 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:56:07.605662+05:30
%A Harshalata Petkar
%T A Review of Challenges in Automatic Speech Recognition
%J International Journal of Computer Applications
%@ 0975-8887
%V 151
%N 3
%P 23-26
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech is nature’s gift to human beings; it contributes to human intelligence and distinguishes humans from the rest of the animal kingdom. From a technological perspective, speech recognition is a buzzword today, as communication and hands-free computing evolve day by day. Speech is a very important mode of communication and interaction with digital computers. Although speech recognition has a wide range of applications in computer science, medical science, psychology, sports, and neurology, developing it poses many challenges. A real-time speech recognizer may face hurdles ranging from adverse environments to the anatomy of the human body, and linguistic aspects are involved as well. This paper explores various challenges in developing a robust automatic speech recognition (ASR) system.

Index Terms

Computer Science
Information Sciences

Keywords

Speech, Speech recognition, Communication, Linguistics