Research Article

Estimation of Spectral Mismatch for Joint Cost Evaluation in Marathi TTS

by Smita P. Kawachale, Janardan S. Chitode
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 65 - Number 17
Year of Publication: 2013
Authors: Smita P. Kawachale, Janardan S. Chitode
DOI: 10.5120/11019-6387

Smita P. Kawachale, Janardan S. Chitode. Estimation of Spectral Mismatch for Joint Cost Evaluation in Marathi TTS. International Journal of Computer Applications. 65, 17 (March 2013), 43-50. DOI=10.5120/11019-6387

@article{ 10.5120/11019-6387,
author = { Smita P. Kawachale and Janardan S. Chitode },
title = { Estimation of Spectral Mismatch for Joint Cost Evaluation in Marathi TTS },
journal = { International Journal of Computer Applications },
issue_date = { March 2013 },
volume = { 65 },
number = { 17 },
month = { March },
year = { 2013 },
issn = { 0975-8887 },
pages = { 43-50 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume65/number17/11019-6387/ },
doi = { 10.5120/11019-6387 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:19:08.268000+05:30
%A Smita P. Kawachale
%A Janardan S. Chitode
%T Estimation of Spectral Mismatch for Joint Cost Evaluation in Marathi TTS
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 17
%P 43-50
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Among speech synthesis methods, concatenative synthesis is widely used because it sounds natural and requires little signal processing. However, concatenative TTS needs a large database, and joining units introduces spectral mismatch in the output speech. The position of a syllable plays an important role during segmentation: if syllables from the appropriate position are used when forming new words from existing syllables, the resulting spectral mismatch is smaller. If syllable position is ignored during concatenation of speech units, the synthesis incurs a higher concatenation cost. This paper presents three techniques for finding spectral mismatch in concatenated segments: power spectral density (PSD), wavelet analysis, and dynamic time warping (DTW). Of the three, PSD gives the best results and shows the spectral mismatch in graphical form. Direct formant modification can overcome spectral mismatch and smooth some of the frames, which helps reduce glitch-like sounds at the concatenation point. The wavelet-based audio results sound more natural than those of the other two methods. In the proposed work, discontinuities at the cutting point are smoothed by changing the spectral characteristics before and after the cutting point, so that the spectral mismatch is distributed equally over a number of adjacent frames. This work shows how calculating and reducing spectral mismatch increases the naturalness of concatenative Marathi TTS.
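The abstract does not give the paper's algorithms, so the following Python sketch only illustrates the general ideas under stated assumptions. It measures the mismatch at a join as an RMS log-spectral distance between DFT power spectra (a simple stand-in for a PSD comparison), computes a classic DTW distance between two sequences of feature vectors, and spreads the spectral jump at the cutting point linearly over adjacent frames. All function names, the Hann window, the 64-sample frames, and the linear weighting are assumptions made for illustration, not details from the paper. Wavelet analysis is not covered.

```python
import cmath
import math


def log_power_spectrum(frame):
    """Log-power spectrum (dB) of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power = abs(acc) ** 2 / n
        spectrum.append(10.0 * math.log10(power + 1e-12))  # floor avoids log(0)
    return spectrum


def spectral_mismatch(left_frame, right_frame):
    """RMS log-spectral distance (dB) between the two frames meeting at a join."""
    a = log_power_spectrum(left_frame)
    b = log_power_spectrum(right_frame)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of feature vectors."""
    inf = float("inf")
    rows, cols = len(seq_a), len(seq_b)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[rows][cols]


def smooth_join(left_spectra, right_spectra, width):
    """Spread the spectral jump at the join over `width` frames on each side.

    Assumed scheme: the frame nearest the join moves halfway toward the other
    side, and the correction decays linearly to zero `width` frames away.
    """
    jump = [r - l for l, r in zip(left_spectra[-1], right_spectra[0])]
    new_left = [list(f) for f in left_spectra]
    new_right = [list(f) for f in right_spectra]
    for i in range(min(width, len(new_left))):
        w = 0.5 * (1.0 - i / width)
        new_left[-1 - i] = [v + w * d for v, d in zip(new_left[-1 - i], jump)]
    for i in range(min(width, len(new_right))):
        w = 0.5 * (1.0 - i / width)
        new_right[i] = [v - w * d for v, d in zip(new_right[i], jump)]
    return new_left, new_right


def tone(freq_hz, n=64, rate=8000):
    """Synthetic test frame: a pure sine, standing in for a speech frame."""
    return [math.sin(2 * math.pi * freq_hz * t / rate) for t in range(n)]


if __name__ == "__main__":
    print(spectral_mismatch(tone(500), tone(500)))   # identical frames
    print(spectral_mismatch(tone(500), tone(1500)))  # clearly different frames
    left = [[0.0, 0.0], [0.0, 0.0]]
    right = [[4.0, 8.0], [4.0, 8.0]]
    print(smooth_join(left, right, width=2))
```

In this sketch, a larger `spectral_mismatch` value at a candidate join would translate into a higher join cost, and `smooth_join` removes the jump at the cutting point by sharing it among the neighbouring frames rather than leaving it concentrated at one boundary.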

Index Terms

Computer Science
Information Sciences

Keywords

TTS (Text-to-Speech System), Spectral Smoothing, Concatenative TTS, Speech Synthesizer