Research Article

AAYUDHA: A Tool for Automatic Segmentation and Labelling of Continuous Tamil Speech

by Laxmi Sree B. R., Suguna M.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 143 - Number 1
Year of Publication: 2016
Authors: Laxmi Sree B. R., Suguna M.
10.5120/ijca2016910002

Laxmi Sree B. R., Suguna M. AAYUDHA: A Tool for Automatic Segmentation and Labelling of Continuous Tamil Speech. International Journal of Computer Applications. 143, 1 (Jun 2016), 31-35. DOI=10.5120/ijca2016910002

@article{ 10.5120/ijca2016910002,
author = { Laxmi Sree B. R. and Suguna M. },
title = { AAYUDHA: A Tool for Automatic Segmentation and Labelling of Continuous Tamil Speech },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2016 },
volume = { 143 },
number = { 1 },
month = { Jun },
year = { 2016 },
issn = { 0975-8887 },
pages = { 31-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume143/number1/25044-2016910002/ },
doi = { 10.5120/ijca2016910002 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:45:12.789271+05:30
%A Laxmi Sree B. R.
%A Suguna M.
%T AAYUDHA: A Tool for Automatic Segmentation and Labelling of Continuous Tamil Speech
%J International Journal of Computer Applications
%@ 0975-8887
%V 143
%N 1
%P 31-35
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech, an effective means of communication between humans, is becoming an alternative means of communication between humans and machines. It is now used in many real-time systems for faster, easier, and more comfortable interaction. Speech segmentation and labelling are key processes that determine the accuracy of much speech-related research. This paper proposes a tool, ‘AAYUDHA’, for automatic segmentation and labelling of continuous Tamil speech. Two segmentation algorithms are implemented: one based on a Fast Fourier Transform (FFT) feature set with 2D filtering, and the other based on a Discrete Wavelet Transform (DWT) feature set and its energy variation across sub-bands. The segmentation accuracy of both algorithms is analyzed. The segmented speech is then labelled using a baseline Hidden Markov Model (HMM) based acoustic model. A speech corpus named ‘KAZHANGIYAM’ is created, containing recorded Tamil speech from various speakers together with manual segmentations of those recordings. This corpus is used to evaluate the accuracy of the algorithms in the proposed tool. The tool targets phonetic-level segmentation of Tamil speech and shows acceptable segmentation and labelling accuracy.
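
As an illustration of the general idea behind the DWT-based approach (segment boundaries placed where the energy distribution across wavelet sub-bands changes), the following is a minimal sketch and not the algorithm implemented in AAYUDHA. It assumes a hand-written Haar wavelet, fixed-length overlapping frames, and an L1 distance between consecutive normalised sub-band energy profiles. The function names, frame sizes, and threshold are hypothetical choices made for this example only.

```python
import numpy as np

def haar_dwt_energies(frame, levels=3):
    """Energy of each Haar detail sub-band, followed by the final approximation."""
    approx = np.asarray(frame, dtype=float)
    energies = []
    for _ in range(levels):
        if len(approx) % 2:                  # drop a trailing sample so pairs line up
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)  # high-pass half of the band
        approx = (even + odd) / np.sqrt(2.0)  # low-pass half, decomposed again
        energies.append(np.sum(detail ** 2))
    energies.append(np.sum(approx ** 2))
    return np.array(energies)

def candidate_boundaries(signal, frame_len=256, hop=128, levels=3, threshold=0.5):
    """Sample positions where the sub-band energy profile changes sharply.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    profiles = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        e = haar_dwt_energies(signal[start:start + frame_len], levels)
        profiles.append(e / (e.sum() + 1e-12))   # normalise to a distribution
    profiles = np.array(profiles)

    # L1 distance between consecutive frame profiles (ranges from 0 to 2).
    change = np.abs(np.diff(profiles, axis=0)).sum(axis=1)

    boundaries = []
    for i in range(len(change)):
        left = change[i - 1] if i > 0 else -np.inf
        right = change[i + 1] if i + 1 < len(change) else -np.inf
        # Keep local maxima of the change curve that exceed the threshold.
        if change[i] >= threshold and change[i] >= left and change[i] > right:
            boundaries.append((i + 1) * hop)     # start sample of the later frame
    return boundaries

# Synthetic stand-in for two adjacent phones: a low tone followed by a high tone.
n = np.arange(2048)
demo = np.concatenate([np.sin(2 * np.pi * 0.01 * n), np.sin(2 * np.pi * 0.45 * n)])
print(candidate_boundaries(demo))
```

On the synthetic signal, the detected boundary should fall within one hop of the true change point at sample 2048. Real speech would need a smoother wavelet, per-band smoothing, and a threshold tuned against manually segmented data such as the corpus described above.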

Index Terms

Computer Science
Information Sciences

Keywords

FFT, DWT, automatic segmentation, labelling, Tamil speech