Reviewing the Definition of Timbre as it Pertains to the Perception of Speech and Musical Sounds

Conference paper


The purpose of this paper is to draw attention to the definition of timbre as it pertains to the vowels of speech. These “source-filter” sounds carry two forms of size information: information about the size of the excitation mechanism (the vocal folds), and information about the size of the resonators in the vocal tract, which filter the excitation before it is projected into the air. The current definitions of pitch and timbre treat the two forms of size information differently. We argue that human perception of speech sounds suggests the definition of timbre would be more useful if it grouped the two size variables together and separated the pair from the remaining properties of these sounds.


Keywords: Musical pitch, Voice pitch, Vocal timbre



Research supported by the UK Medical Research Council [G0500221, G9900369].



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Centre for the Neural Basis of Hearing, University of Cambridge, Cambridge, UK
