Perception of Concurrent Sentences with Harmonic or Frequency-Shifted Voiced Excitation: Performance of Human Listeners and of Computational Models Based on Autocorrelation

Roberts, Brian; Holmes, Stephen D.; Darwin, Christopher J.; Brown, Guy J.

doi:10.1007/978-1-4419-5686-6_48

Abstract

Keyword identification in one of two simultaneous sentences is improved when the sentences differ in F0, particularly when they are almost continuously voiced. Sentences of this kind were recorded, monotonised using PSOLA, and re-synthesised to give a range of harmonic ∆F0s (0, 1, 3, and 10 semitones). They were additionally re-synthesised by LPC with the LPC residual frequency shifted by 25% of F0, to give excitation with inharmonic but regularly spaced components. Perceptual identification of frequency-shifted sentences showed a similar large improvement with nominal ∆F0 as seen for harmonic sentences, although overall performance was about 10% poorer. We compared performance with that of two autocorrelation-based computational models comprising four stages: (i) peripheral frequency selectivity and half-wave rectification; (ii) within-channel periodicity extraction; (iii) identification of the two major peaks in the summary autocorrelation function (SACF); (iv) a template-based approach to speech recognition using dynamic time warping. One model sampled the correlogram at the target-F0 period and performed spectral matching; the other deselected channels dominated by the interferer and performed matching on the short-lag portion of the residual SACF. Both models reproduced the monotonic increase observed in human performance with increasing ∆F0 for the harmonic stimuli, but not for the frequency-shifted stimuli. A revised version of the spectral-matching model, which groups patterns of periodicity that lie on a curve in the frequency-delay plane, showed a closer match to the perceptual data for frequency-shifted sentences. The results extend the range of phenomena originally attributed to harmonic processing to grouping by common spectral pattern.
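
The stimulus manipulations and the periodicity-extraction stage described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the equal-amplitude cosine excitation, the 16 kHz sampling rate, the 100 Hz target F0, and the single broadband channel (no peripheral filterbank) are all assumptions made for illustration. It reads "frequency shifted by 25% of F0" as moving every harmonic up by 0.25 × F0, which leaves the component spacing equal to F0 while making the components inharmonic.

```python
import numpy as np

def semitones_to_ratio(semitones):
    """Frequency ratio for an interval given in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def component_frequencies(f0, n_components, shift_fraction=0.0):
    """Harmonics k*f0, each displaced by shift_fraction*f0 (0.25 mimics the 25% shift)."""
    k = np.arange(1, n_components + 1)
    return k * f0 + shift_fraction * f0

def synthesize(freqs, fs, duration):
    """Sum of equal-amplitude cosines: a crude stand-in for voiced excitation."""
    t = np.arange(int(round(fs * duration))) / fs
    return np.cos(2.0 * np.pi * np.outer(freqs, t)).sum(axis=0)

def autocorrelation(x, max_lag):
    """Unnormalised autocorrelation over a fixed-length window, lags 0..max_lag."""
    n = len(x) - max_lag
    return np.array([np.dot(x[:n], x[lag:lag + n]) for lag in range(max_lag + 1)])

def estimate_f0(x, fs, f_min=70.0, f_max=200.0):
    """Largest autocorrelation value between the lags for f_max and f_min.

    Single broadband channel only; the models in the abstract work per
    frequency channel and then sum across channels (the SACF).
    """
    rectified = np.maximum(x, 0.0)           # half-wave rectification
    lo, hi = int(fs / f_max), int(fs / f_min)
    acf = autocorrelation(rectified, hi)
    best_lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return fs / best_lag

fs = 16000                                   # assumed sampling rate (Hz)
target_f0 = 100.0                            # assumed target F0 (Hz)

# Interferer F0 for each harmonic delta-F0 condition (0, 1, 3, 10 semitones).
interferer_f0s = [target_f0 * semitones_to_ratio(s) for s in (0, 1, 3, 10)]

harmonic = component_frequencies(target_f0, 10)
shifted = component_frequencies(target_f0, 10, shift_fraction=0.25)

# Period estimate for the harmonic complex from its autocorrelation peak.
harmonic_estimate = estimate_f0(synthesize(harmonic, fs, 0.2), fs)
print(harmonic[:3], shifted[:3], harmonic_estimate)
```

For the shifted complex the components remain 100 Hz apart but are no longer integer multiples of a common fundamental, so a single autocorrelation lag cannot align all of them. This is one way to picture why the abstract's revised model groups periodicity along a curve in the frequency-delay plane instead of sampling one lag.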

Author information

Authors and Affiliations

Psychology, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK
Brian Roberts

Corresponding author

Correspondence to Brian Roberts.

Editor information

Editors and Affiliations

Inst. Neurociencias de Castilla y León, Universidad de Salamanca, Av. Alfonso X El Sabio s/n, Salamanca, 37007, Spain
Enrique A. Lopez-Poveda
MRC Inst. of Hearing Research, University Park, Nottingham, NG7 2RD, United Kingdom
Alan R. Palmer
University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ, United Kingdom
Ray Meddis

Cite this paper

Roberts, B., Holmes, S.D., Darwin, C.J., Brown, G.J. (2010). Perception of Concurrent Sentences with Harmonic or Frequency-Shifted Voiced Excitation: Performance of Human Listeners and of Computational Models Based on Autocorrelation. In: Lopez-Poveda, E., Palmer, A., Meddis, R. (eds) The Neurophysiological Bases of Auditory Perception. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5686-6_48

DOI: https://doi.org/10.1007/978-1-4419-5686-6_48
Published: 16 February 2010
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-5685-9
Online ISBN: 978-1-4419-5686-6
eBook Packages: Biomedical and Life Sciences (R0)
