Abstract
Spectral analysis of time-series has been an important tool of science for a very long time. Indeed, periodicities of celestial events and of weather phenomena sparked the curiosity and imagination of early thinkers in the history of science. More recently, the refined mathematical techniques for spectral analysis of the past fifty years form the basis for a wide range of technological developments from medical imaging to communications. However, in spite of the centrality of a spectral representation of time-series, no universal agreement exists on what is a suitable metric between such representations. In this paper we discuss three alternative metrics along with their application in morphing speech signals. Morphing can be naturally effected via a deformation of power spectra along geodesics of the corresponding geometry. The acoustic effect of morphing between two speakers is documented at a website.
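As a rough illustration of morphing along a geodesic, the sketch below interpolates between two power spectra by displacement (optimal-transport) interpolation on the frequency axis. The abstract does not name its three metrics, so this transport-based choice is an assumption and not necessarily the authors' method. The function name `transport_morph`, the uniform frequency grid, and the normalization of both spectra to unit total power are likewise assumptions made for the example.

```python
import numpy as np

def transport_morph(freqs, f0, f1, t, n_quantiles=20000):
    """Displacement interpolation between two spectra sampled on a uniform grid.

    Hypothetical helper: both spectra are normalized to unit mass, their
    quantile functions are blended linearly, and the result is binned back
    onto the frequency grid as a density.
    """
    freqs = np.asarray(freqs, dtype=float)
    # Probability levels strictly inside (0, 1).
    p = (np.arange(n_quantiles) + 0.5) / n_quantiles

    def quantiles(f):
        f = np.asarray(f, dtype=float)
        f = f + 1e-12 * f.max()          # tiny floor keeps the CDF strictly increasing
        cdf = np.cumsum(f)
        cdf /= cdf[-1]                   # normalize to unit total power (assumption)
        return np.interp(p, cdf, freqs)  # invert the CDF numerically

    # In one dimension, the transport geodesic is linear in the quantile functions.
    q_t = (1.0 - t) * quantiles(f0) + t * quantiles(f1)

    # Bin the interpolated quantiles back onto the grid to recover a density.
    step = freqs[1] - freqs[0]
    edges = np.concatenate([freqs - step / 2, [freqs[-1] + step / 2]])
    counts, _ = np.histogram(q_t, bins=edges)
    return counts / (n_quantiles * step)

# Usage: two narrow spectral peaks, morphed halfway.
freqs = np.linspace(0.0, 4.0, 401)
f0 = np.exp(-(freqs - 1.0) ** 2 / (2 * 0.1 ** 2))
f1 = np.exp(-(freqs - 3.0) ** 2 / (2 * 0.1 ** 2))
halfway = transport_morph(freqs, f0, f1, 0.5)
```

With this construction a spectral peak slides along the frequency axis as `t` goes from 0 to 1, whereas a pointwise average of the two spectra would fade one peak out while the other fades in.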
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, X., Takyar, S., Georgiou, T.T. (2008). Metrics and Morphing of Power Spectra. In: Blondel, V.D., Boyd, S.P., Kimura, H. (eds) Recent Advances in Learning and Control. Lecture Notes in Control and Information Sciences, vol 371. Springer, London. https://doi.org/10.1007/978-1-84800-155-8_9
DOI: https://doi.org/10.1007/978-1-84800-155-8_9
Publisher Name: Springer, London
Print ISBN: 978-1-84800-154-1
Online ISBN: 978-1-84800-155-8
eBook Packages: Engineering, Engineering (R0)