Abstract
To obtain high-quality results in speech synthesis, the recorded utterance elements to be concatenated should be as long as possible, so that listeners do not perceive the result as unnatural. For train announcements, for example, we prepare word- or segment-length utterances and concatenate them to generate simple sentences. However, synthesizing the unrestricted sentences needed in daily life with this method would require a database too large to construct by recording real utterances. An alternative is to sample short utterance elements and store their short-time spectra for concatenation. In practice, however, tuning the concatenation is very difficult, because producing reasonable speech requires reproducing waveform features such as pulse sharpness. In this chapter, we present a complex-valued neural network that adaptively adjusts the phase values of the frequency spectra to realize an ideal concatenation. The network operates in the frequency domain to obtain the desired waveforms in the time domain. A phase shift in the frequency domain corresponds to a temporal shift in the time domain. Such frequency-domain processing with complex-valued neural networks is also useful in other fields, such as image processing, where we deal with spatial frequency.
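The correspondence between phase shift and temporal shift noted in the abstract can be checked numerically. The following is a minimal NumPy sketch of that general Fourier property only; it is not the chapter's neural network, and the function name `shift_by_phase`, the 64-sample frame, and the 5-sample delay are assumptions chosen for illustration.

```python
import numpy as np

def shift_by_phase(x, delay):
    """Circularly delay a real signal by `delay` samples using phase rotation only.

    Illustrative helper (hypothetical name): the magnitude spectrum is left
    untouched and each frequency bin k is rotated by -2*pi*k*delay/N.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    k = np.fft.fftfreq(n) * n  # signed integer bin indices
    rotated = spectrum * np.exp(-2j * np.pi * k * delay / n)
    return np.fft.ifft(rotated).real

# A single sharp pulse at sample 10 of a 64-sample frame (assumed test signal).
pulse = np.zeros(64)
pulse[10] = 1.0

delayed = shift_by_phase(pulse, 5)
peak_index = int(np.argmax(delayed))  # position of the pulse after the shift

# The magnitude spectrum is unchanged; only the phase differs.
same_magnitude = np.allclose(np.abs(np.fft.fft(delayed)),
                             np.abs(np.fft.fft(pulse)))
```

In this sketch the pulse keeps its shape and only its position changes, which suggests why adjusting phase values alone can align waveform features between the elements being concatenated.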
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hirose, A. (2012). Pitch-Asynchronous Overlap-Add Waveform-Concatenation Speech Synthesis by Optimizing Phase Spectrum in Frequency Domain. In: Complex-Valued Neural Networks. Studies in Computational Intelligence, vol 400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27632-3_11
DOI: https://doi.org/10.1007/978-3-642-27632-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27631-6
Online ISBN: 978-3-642-27632-3
eBook Packages: Engineering, Engineering (R0)