Energetic Masking and Masking Release

The Auditory System at the Cocktail Party

Part of the book series: Springer Handbook of Auditory Research (SHAR, volume 60)

Abstract

Masking is of central interest in the cocktail party problem, because interfering voices may be sufficiently intense or numerous to mask the voice to which the listener is attending, rendering its discourse unintelligible. The definition of energetic masking is problematic, but it may be considered to consist of effects by which an interfering sound disrupts the processing of the speech signal in the lower levels of the auditory system. Maskers can affect speech intelligibility by overwhelming its representation on the auditory nerve and by obscuring its amplitude modulations. A release from energetic masking is obtained by using mechanisms at these lower levels that can recover a useful representation of the speech. These mechanisms can exploit differences between the target and masking speech such as in harmonic structure or in interaural time delay. They can also exploit short-term dips in masker strength or improvements in speech-to-masker ratio at one or other ear.

Compliance with Ethics Requirements

John Culling has no conflicts of interest.

Michael Stone has no conflicts of interest.

Corresponding author

Correspondence to John F. Culling.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Culling, J.F., Stone, M.A. (2017). Energetic Masking and Masking Release. In: Middlebrooks, J., Simon, J., Popper, A., Fay, R. (eds) The Auditory System at the Cocktail Party. Springer Handbook of Auditory Research, vol 60. Springer, Cham. https://doi.org/10.1007/978-3-319-51662-2_3
