Research Article

Performance Improvement in Keyword Spotting for Telephony Services

by M. Assadi, M. M. Homayounpour
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 77 - Number 8
Year of Publication: 2013
Authors: M. Assadi, M. M. Homayounpour
10.5120/13414-1079

M. Assadi, M. M. Homayounpour. Performance Improvement in Keyword Spotting for Telephony Services. International Journal of Computer Applications. 77, 8 (September 2013), 18-22. DOI=10.5120/13414-1079

@article{ 10.5120/13414-1079,
author = { M. Assadi and M. M. Homayounpour },
title = { Performance Improvement in Keyword Spotting for Telephony Services },
journal = { International Journal of Computer Applications },
issue_date = { September 2013 },
volume = { 77 },
number = { 8 },
month = { September },
year = { 2013 },
issn = { 0975-8887 },
pages = { 18-22 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume77/number8/13414-1079/ },
doi = { 10.5120/13414-1079 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:49:43.938688+05:30
%A M. Assadi
%A M. M. Homayounpour
%T Performance Improvement in Keyword Spotting for Telephony Services
%J International Journal of Computer Applications
%@ 0975-8887
%V 77
%N 8
%P 18-22
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, a new hybrid approach to keyword spotting is presented. The proposed method is based on Hidden Markov Models (HMMs) and operates in two stages. In the first stage, phoneme models are used to recognize a series of candidate keywords. In the second stage, word models are used to accept or reject each candidate keyword. Two different methods are presented for the second stage to improve on the spotting performance of the first stage. In the first method, each candidate keyword is accepted or rejected according to the similarity between the candidate word and the corresponding word model. In the second method, similarity values are calculated between the candidate keyword and the HMMs of the keywords, as well as the HMMs of some out-of-vocabulary words. These similarity values form a feature vector that is given to an SVM classifier, which makes the final decision on whether the first-stage decision was correct. The proposed method was evaluated on two evaluation datasets. Compared with one-stage keyword spotting using filler models, the proposed method with the first second-stage method achieved an improvement of 5.6% on the first test set and 4.5% on the second test set. With the second method in the second stage, an improvement of 10.3% was achieved on the second dataset.
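The second-stage verification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Candidate` type, the toy scorers standing in for word-level HMM likelihoods, the threshold, the feature ordering, and the hand-set linear decision function standing in for a trained SVM are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# A scorer maps an acoustic segment to a similarity value.
# Assumption: in a real system this would be a word-level HMM likelihood.
Scorer = Callable[[Sequence[float]], float]


@dataclass
class Candidate:
    keyword: str             # keyword hypothesised by the first (phoneme-model) stage
    frames: Sequence[float]  # toy 1-D stand-in for the segment's feature frames


def toy_model(mean: float) -> Scorer:
    """Hypothetical stand-in for a word HMM: negative mean squared distance to `mean`."""
    def score(frames: Sequence[float]) -> float:
        return -sum((x - mean) ** 2 for x in frames) / len(frames)
    return score


def verify_by_threshold(cand: Candidate, word_models: Dict[str, Scorer],
                        threshold: float) -> bool:
    """First method: accept if the candidate is similar enough to its own word model."""
    return word_models[cand.keyword](cand.frames) >= threshold


def similarity_features(cand: Candidate, keyword_models: Dict[str, Scorer],
                        oov_models: List[Scorer]) -> List[float]:
    """Second method: similarities to all keyword models (sorted by name), then OOV models."""
    feats = [keyword_models[name](cand.frames) for name in sorted(keyword_models)]
    feats += [model(cand.frames) for model in oov_models]
    return feats


def linear_decision(features: Sequence[float], weights: Sequence[float],
                    bias: float = 0.0) -> bool:
    """Hand-set linear decision function; a trained SVM would take its place."""
    return sum(w * f for w, f in zip(weights, features)) + bias >= 0.0


# Toy setup: two keywords and one out-of-vocabulary model (all values invented).
keyword_models = {"help": toy_model(1.0), "stop": toy_model(-1.0)}
oov_models = [toy_model(0.0)]

true_hit = Candidate("help", [0.9, 1.1, 1.0])      # segment close to the "help" model
false_alarm = Candidate("help", [0.1, -0.1, 0.0])  # segment closer to the OOV model

for cand in (true_hit, false_alarm):
    accepted_1 = verify_by_threshold(cand, keyword_models, threshold=-0.5)
    feats = similarity_features(cand, keyword_models, oov_models)
    # Weights compare the "help" score with the OOV score (illustrative only).
    accepted_2 = linear_decision(feats, weights=[1.0, 0.0, -1.0])
    print(cand.keyword, accepted_1, accepted_2)
```

In this sketch the first method looks only at the score of the hypothesised keyword's own model, whereas the second method also sees how well competing keyword and out-of-vocabulary models explain the same segment, which is the extra evidence the abstract says is passed to the SVM classifier.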

Index Terms

Computer Science
Information Sciences

Keywords

Keyword spotting, Speech recognition, Confidence measure, Hidden Markov Model (HMM), Support Vector Machine (SVM)