Research Article

Optimization of Arabic Database and an Implementation for Arabic Speech Synthesis System using HMM: HTS_ARAB_TALK

by Krichi Mohamed Khalil, Cherif Adnan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 73 - Number 17
Year of Publication: 2013
Authors: Krichi Mohamed Khalil, Cherif Adnan
10.5120/12832-9941

Krichi Mohamed Khalil, Cherif Adnan. Optimization of Arabic Database and an Implementation for Arabic Speech Synthesis System using HMM: HTS_ARAB_TALK. International Journal of Computer Applications. 73, 17 (July 2013), 11-17. DOI=10.5120/12832-9941

@article{ 10.5120/12832-9941,
author = { Krichi Mohamed Khalil and Cherif Adnan },
title = { Optimization of Arabic Database and an Implementation for Arabic Speech Synthesis System using HMM: HTS_ARAB_TALK },
journal = { International Journal of Computer Applications },
issue_date = { July 2013 },
volume = { 73 },
number = { 17 },
month = { July },
year = { 2013 },
issn = { 0975-8887 },
pages = { 11-17 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume73/number17/12832-9941/ },
doi = { 10.5120/12832-9941 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:40:21.125352+05:30
%A Krichi Mohamed Khalil
%A Cherif Adnan
%T Optimization of Arabic Database and an Implementation for Arabic Speech Synthesis System using HMM: HTS_ARAB_TALK
%J International Journal of Computer Applications
%@ 0975-8887
%V 73
%N 17
%P 11-17
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents an optimization of an Arabic speech database and a prototype for real-time speech synthesis. Statistical parametric speech synthesis is a relatively new approach to speech synthesis, and hidden Markov model (HMM) based synthesis, one of the techniques in this approach, has been shown to be very effective at producing high-quality, natural, and expressive speech. This work modifies the publicly available HTS to build a complete system architecture, called HTS_ARAB_TALK, which provides a basis for further research toward a fully real-time speech synthesis system. We give an overview of the Arabic speech synthesis system using HMM and briefly describe HTS_ARAB_TALK, with emphasis on the features relevant to the Arabic language. Finally, a mean opinion score for the synthesized speech is presented. These results are supported by a subjective evaluation.

Index Terms

Computer Science
Information Sciences

Keywords

HMM, Speech Synthesis, Text to Speech, Arabic Language, Statistical Parametric Speech Synthesis, Hidden Markov Model