Analysis of a Small Vocabulary Bangla Speech Database for Recognition

Sumana Huque; Ahsan Habib Rasel; M. Babul Islam

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 133 - Number 6

Year of Publication: 2016

Authors: Sumana Huque, Ahsan Habib Rasel, M. Babul Islam

DOI: 10.5120/ijca2016907827

Sumana Huque, Ahsan Habib Rasel, M. Babul Islam. Analysis of a Small Vocabulary Bangla Speech Database for Recognition. International Journal of Computer Applications. 133, 6 (January 2016), 22-28. DOI=10.5120/ijca2016907827

@article{10.5120/ijca2016907827,
author = { Sumana Huque and Ahsan Habib Rasel and M. Babul Islam },
title = { Analysis of a Small Vocabulary Bangla Speech Database for Recognition },
journal = { International Journal of Computer Applications },
issue_date = { January 2016 },
volume = { 133 },
number = { 6 },
month = { January },
year = { 2016 },
issn = { 0975-8887 },
pages = { 22-28 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume133/number6/23791-2016907827/ },
doi = { 10.5120/ijca2016907827 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:30:25.607196+05:30
%A Sumana Huque
%A Ahsan Habib Rasel
%A M. Babul Islam
%T Analysis of a Small Vocabulary Bangla Speech Database for Recognition
%J International Journal of Computer Applications
%@ 0975-8887
%V 133
%N 6
%P 22-28
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

A standard database is essential for any research in speech signal processing. Many such databases exist for other languages, but not for Bangla. This article therefore describes the development and analysis of a small vocabulary Bangla speech database for recognition. The database covers 11 Bangla digit words (/ak/, /dui/, /tin/, /chaar/, /panch/, /chhoy/, /shaat/, /aat/, /noy/, /zero/, /shunno/) and consists of a training dataset and a testing dataset. The training dataset contains 3824 utterances from 50 speakers. The testing dataset contains 1985 utterances from 52 speakers and is subdivided into four groups (clean1, clean2, clean3 and clean4). All recordings were made in a quiet, but not soundproof, room using an A4Tech HS-60 headset microphone connected to a computer with an Intel Dual Core 2.0 GHz CPU. The speech files were recorded and edited with WavePad. Finally, an HMM-based recognizer was developed to evaluate the database, using a Mel-LPC based front-end and HTK (Hidden Markov Model Toolkit) as the reference recognizer. The word accuracy on the test sets is 98.05% on average.
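As an illustration of how a word accuracy figure of this kind is usually obtained, the following minimal Python sketch applies the HTK-style definition, Accuracy = (N - D - S - I) / N x 100, where N is the number of reference words and D, S and I are the deletion, substitution and insertion errors. The per-set error counts below are hypothetical placeholders, not results from the article, and the unweighted mean over the four test groups is an assumption about how the reported average was formed.

```python
def word_accuracy(n_words, deletions, substitutions, insertions):
    """Return HTK-style word accuracy in percent: (N - D - S - I) / N * 100."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    errors = deletions + substitutions + insertions
    return 100.0 * (n_words - errors) / n_words


def average_accuracy(per_set_counts):
    """Unweighted mean of the word accuracy over all test sets.

    per_set_counts maps a set name to a tuple (N, D, S, I).
    """
    accuracies = [word_accuracy(*counts) for counts in per_set_counts.values()]
    return sum(accuracies) / len(accuracies)


# Hypothetical (N, D, S, I) counts, for illustration only.
# They are NOT taken from the article.
example_counts = {
    "clean1": (500, 3, 5, 2),
    "clean2": (500, 2, 4, 1),
    "clean3": (500, 4, 6, 2),
    "clean4": (485, 3, 4, 2),
}

if __name__ == "__main__":
    for name, counts in example_counts.items():
        print(f"{name}: {word_accuracy(*counts):.2f}%")
    print(f"average: {average_accuracy(example_counts):.2f}%")
```

For an isolated-digit task such as this one, each utterance contains a single word, so N equals the number of test utterances in a group.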


Index Terms

Computer Science

Information Sciences

Keywords

Bangla Speech Database, Bangla Speech Recognition, HMM, Mel-LPC