Research Article

Cepstrum Based Voice Transformation using ANN

Published on March 2012 by J.H.Nirmal, Suparva Patnaik, Mukesh Zaveri
International Conference in Computational Intelligence
Foundation of Computer Science USA
ICCIA - Number 2
March 2012
Authors: J.H.Nirmal, Suparva Patnaik, Mukesh Zaveri

J.H.Nirmal, Suparva Patnaik, Mukesh Zaveri. Cepstrum Based Voice Transformation using ANN. International Conference in Computational Intelligence. ICCIA, 2 (March 2012), 13-16.

author = { J.H.Nirmal, Suparva Patnaik, Mukesh Zaveri },
title = { Cepstrum Based Voice Transformation using ANN },
journal = { International Conference in Computational Intelligence },
issue_date = { March 2012 },
volume = { ICCIA },
number = { 2 },
month = { March },
year = { 2012 },
issn = 0975-8887,
pages = { 13-16 },
numpages = 4,
url = { /proceedings/iccia/number2/5099-1011/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
%0 Proceeding Article
%1 International Conference in Computational Intelligence
%A J.H.Nirmal
%A Suparva Patnaik
%A Mukesh Zaveri
%T Cepstrum Based Voice Transformation using ANN
%J International Conference in Computational Intelligence
%@ 0975-8887
%N 2
%P 13-16
%D 2012
%I International Journal of Computer Applications

The basic goal of a voice conversion system is to mimic the characteristics of the target speaker's voice while keeping the linguistic and paralinguistic information intact. Speaker characteristics are reflected in speech at different levels, such as the vocal tract, excitation, and prosodic parameters. The proposed work is based on the cepstrum, which represents the vocal tract and excitation parameters of speech. This paper proposes decomposing the cepstrum with wavelets and mapping the source cepstrum features onto the target cepstrum features using a radial basis function neural network. The results are evaluated using subjective and objective measures based on a voice quality method, and the listening tests show that the proposed algorithm converts speaker individuality while maintaining high speech quality.
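The pipeline named in the abstract (cepstrum, wavelet decomposition, radial basis function mapping) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and not taken from the paper: the Haar wavelet, the single decomposition level, the use of every training vector as an RBF center, the median-distance kernel width, and the synthetic random "speech" frames. The paper's own wavelet family, network configuration, and frame alignment procedure are not stated in this abstract.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

def haar_dwt(x):
    """One-level Haar transform (assumed wavelet): (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_idwt(approx, detail):
    """Inverse of haar_dwt; reconstructs the original sequence."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

class RBFNetwork:
    """Gaussian RBF hidden layer with a linear output layer fit by least squares."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)
        self.width = float(width)
        self.weights = None

    def _hidden(self, X):
        # Squared distance from every input vector to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-d2 / (2.0 * self.width ** 2))
        return np.hstack([phi, np.ones((len(X), 1))])  # extra column = bias

    def fit(self, X, Y):
        H = self._hidden(np.asarray(X, dtype=float))
        self.weights, *_ = np.linalg.lstsq(H, np.asarray(Y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.weights

# Synthetic stand-in data (hypothetical): 40 already-aligned frames of 64 samples.
rng = np.random.default_rng(0)
source_frames = rng.standard_normal((40, 64))
target_frames = 0.8 * source_frames + 0.1 * rng.standard_normal((40, 64))

# Features: approximation coefficients of the wavelet-decomposed cepstrum.
src = np.array([haar_dwt(real_cepstrum(f))[0] for f in source_frames])
tgt = np.array([haar_dwt(real_cepstrum(f))[0] for f in target_frames])

# Kernel width heuristic (assumption): median pairwise distance between features.
dist = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=2)
width = np.median(dist[dist > 0])

net = RBFNetwork(centers=src, width=width).fit(src, tgt)
converted = net.predict(src)
train_mse = np.mean((converted - tgt) ** 2)
```

In a real system the source and target frames would come from parallel recordings aligned in time, and the converted features would be passed through the inverse wavelet transform and a synthesis stage. The sketch stops at the feature mapping because those stages are not described here.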

Index Terms

Computer Science
Information Sciences


Wavelet transforms, Voice conversion, Speech cepstrum, Radial basis artificial neural network