Abstract
Masking is of central interest in the cocktail party problem, because interfering voices may be sufficiently intense or numerous to mask the voice to which the listener is attending, rendering its discourse unintelligible. The definition of energetic masking is problematic, but it may be considered to consist of effects by which an interfering sound disrupts the processing of the speech signal in the lower levels of the auditory system. Maskers can affect speech intelligibility by overwhelming its representation on the auditory nerve and by obscuring its amplitude modulations. A release from energetic masking is obtained by using mechanisms at these lower levels that can recover a useful representation of the speech. These mechanisms can exploit differences between the target and masking speech, such as differences in harmonic structure or in interaural time delay. They can also exploit short-term dips in masker strength or improvements in speech-to-masker ratio at one or other ear.
Compliance with Ethics Requirements
John Culling has no conflicts of interest.
Michael Stone has no conflicts of interest.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Culling, J.F., Stone, M.A. (2017). Energetic Masking and Masking Release. In: Middlebrooks, J., Simon, J., Popper, A., Fay, R. (eds) The Auditory System at the Cocktail Party. Springer Handbook of Auditory Research, vol 60. Springer, Cham. https://doi.org/10.1007/978-3-319-51662-2_3
DOI: https://doi.org/10.1007/978-3-319-51662-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51660-8
Online ISBN: 978-3-319-51662-2
eBook Packages: Biomedical and Life Sciences (R0)