Abstract
Experimental neuroscience in human subjects that investigates how the auditory system solves the cocktail party problem is a young and active field. Traditional neurophysiological methods are tightly constrained in human subjects, but whole-brain monitoring techniques are considerably more advanced for humans than for animals. These whole-brain methods in particular allow routine recording of neural activity from humans while they perform complex auditory tasks that would be very difficult for animals to learn. The findings reviewed in this chapter were obtained with a variety of experimental methodologies, including electroencephalography, magnetoencephalography, electrocorticography, and functional magnetic resonance imaging. Topics covered in detail include investigations in humans of the neural basis of spatial hearing, auditory stream segregation of simple sounds, auditory stream segregation of speech, and the neural role of attention. A key conceptual advance noted is a shift of interpretational focus from the specific notion of attention-based neural gain to the general role played by attention in neural auditory scene analysis and sound segregation. Similarly, investigations have gradually shifted their emphasis from explaining how auditory representations remain faithful to the acoustics of the stimulus to explaining how neural processing transforms them into new representations corresponding to the percept of an auditory scene. An additional important methodological advance has been the successful transfer of linear systems theory analysis techniques, commonly used in single-unit recordings, to whole-brain noninvasive recordings.
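As a rough illustration of what a linear systems analysis can look like in this setting, the sketch below estimates a linear kernel (often called a temporal response function) that maps a stimulus time series onto a recorded response, using ridge regression over time-lagged copies of the stimulus. It is a minimal sketch under stated assumptions, not the chapter's method: the function names `lagged_design` and `estimate_trf`, the ridge parameter, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Build a design matrix whose column k is the stimulus delayed by k samples."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]
    return X

def estimate_trf(stimulus, response, n_lags, ridge=1.0):
    """Ridge-regression estimate of a linear kernel from stimulus to response.

    Hypothetical helper: solves (X^T X + ridge * I) w = X^T y for w.
    """
    X = lagged_design(stimulus, n_lags)
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ response)

# Synthetic example (assumed data, not real recordings): a white-noise
# "stimulus" is filtered by a known kernel and corrupted by additive noise.
rng = np.random.default_rng(0)
stimulus = rng.standard_normal(5000)
true_kernel = np.array([0.0, 0.5, 1.0, 0.3, -0.2])
clean = np.convolve(stimulus, true_kernel)[:len(stimulus)]
response = clean + 0.1 * rng.standard_normal(len(stimulus))

estimated = estimate_trf(stimulus, response, n_lags=5, ridge=1.0)
print(np.round(estimated, 2))
```

With a white-noise stimulus the estimate should land close to the kernel used to generate the data. Real stimuli such as speech envelopes are strongly autocorrelated, so the choice of regularization matters far more there than in this toy case.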
Acknowledgements
Support for the author’s work was provided by the National Institute on Deafness and Other Communication Disorders Grant R01-DC-014085.
Ethics declarations
Jonathan Z. Simon declares that he has no conflict of interest.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Simon, J.Z. (2017). Human Auditory Neuroscience and the Cocktail Party Problem. In: Middlebrooks, J., Simon, J., Popper, A., Fay, R. (eds) The Auditory System at the Cocktail Party. Springer Handbook of Auditory Research, vol 60. Springer, Cham. https://doi.org/10.1007/978-3-319-51662-2_7
DOI: https://doi.org/10.1007/978-3-319-51662-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51660-8
Online ISBN: 978-3-319-51662-2
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)