Speech Modeling Using the Complex Cepstrum
Conventional cepstral speech modeling is based on the minimum-phase parametric speech production model with an infinite impulse response. That approach approximates only the logarithmic magnitude frequency response of the corresponding speech frame. This contribution describes the principle of cepstral speech modeling using the complex cepstrum. The resulting mixed-phase vocal tract model with a finite impulse response also contains information about the phase properties of the modeled speech frame. This model approximates the speech signal with higher accuracy than the model based on the real cepstrum, but its numerical complexity and memory requirements are at least twice as high.
Keywords: Impulse Response, Speech Signal, Finite Impulse Response, Vocal Tract, Infinite Impulse Response
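
The difference between the two cepstra described in the abstract can be illustrated with a small numerical sketch. It is not the author's implementation: the function names, the FFT length, and the three-sample toy "frame" (one zero inside and one zero outside the unit circle) are assumptions chosen for illustration, and the sketch assumes an even FFT length and a spectrum with positive DC value. The real cepstrum keeps only the log magnitude, so the model rebuilt from it is minimum phase; the complex cepstrum also keeps the unwrapped phase, so the mixed-phase frame can be recovered.

```python
import numpy as np

def real_cepstrum(frame, n_fft):
    """Inverse DFT of the log magnitude spectrum; the phase is discarded."""
    spectrum = np.fft.fft(frame, n_fft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

def complex_cepstrum(frame, n_fft):
    """Inverse DFT of the complex log spectrum (log magnitude + j * unwrapped phase).

    The linear phase term (an integer sample delay) is removed first and
    returned separately so that it can be restored later.
    """
    spectrum = np.fft.fft(frame, n_fft)
    phase = np.unwrap(np.angle(spectrum))
    half = n_fft // 2  # assumes n_fft is even
    ndelay = int(np.round(phase[half] / np.pi))
    phase = phase - np.pi * ndelay * np.arange(n_fft) / half
    log_spectrum = np.log(np.abs(spectrum)) + 1j * phase
    return np.fft.ifft(log_spectrum).real, ndelay

def minimum_phase_response(c_real):
    """Fold the real cepstrum onto n >= 0 and exponentiate: minimum-phase model."""
    n_fft = len(c_real)
    half = n_fft // 2
    folded = np.zeros(n_fft)
    folded[0] = c_real[0]
    folded[1:half] = 2.0 * c_real[1:half]
    folded[half] = c_real[half]
    return np.fft.ifft(np.exp(np.fft.fft(folded))).real

def mixed_phase_response(c_complex, ndelay):
    """Exponentiate the complex cepstrum and restore the removed delay."""
    h = np.fft.ifft(np.exp(np.fft.fft(c_complex))).real
    return np.roll(h, -ndelay)

# Toy mixed-phase frame (hypothetical data): zeros at z = -0.5 and z = -3.
frame = np.convolve([1.0, 0.5], [1.0, 3.0])
N_FFT = 1024

c_real = real_cepstrum(frame, N_FFT)
c_complex, ndelay = complex_cepstrum(frame, N_FFT)
h_min = minimum_phase_response(c_real)
h_mixed = mixed_phase_response(c_complex, ndelay)

print("frame           :", frame)
print("mixed-phase     :", np.round(h_mixed[:3], 6))
print("minimum-phase   :", np.round(h_min[:3], 6))
```

In this toy case the mixed-phase reconstruction reproduces the original samples, while the minimum-phase reconstruction has the same magnitude spectrum but different samples (the zero outside the unit circle is reflected inside). The sketch also computes two FFT-sized log spectra components (magnitude and unwrapped phase) instead of one, which is consistent in spirit with the higher cost mentioned above, although it does not reproduce the paper's complexity analysis.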