Abstract
Natural stimulus functional magnetic resonance imaging (N-fMRI), such as fMRI acquired while participants watch video streams or listen to audio streams, has been increasingly used in recent years to investigate the functional mechanisms of the human brain. One of the fundamental challenges in functional brain mapping based on N-fMRI is modeling the brain’s functional responses to continuous, naturalistic, and dynamic stimuli. To address this challenge, we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model brain responses in N-fMRI by measuring functional interactions on large-scale brain networks with intrinsically established structural correspondence, and we use music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions while multiple subjects listened to music and speech in multiple categories. The underlying premise is that the functional interactions derived from the N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems, including attention, memory, auditory/language, emotion, and action networks, are among the most relevant to differentiating classical music, pop music, and speech. Our study provides an alternative approach to investigating the human brain’s mechanisms for comprehending complex natural music and speech.
Notes
This is the number of entries in the upper triangle of the 358 × 358 functional interaction matrix, excluding the diagonal, i.e., 63903 = (358 × 358 − 358)/2.
The number of functional interactions (200) was chosen based on the later experiment on consistently discriminative functional interactions.
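The count given in the first note can be checked with a short sketch. The Python below is illustrative only and is not the authors' code: the function names and the toy 4 × 4 matrix are hypothetical, and the only facts it relies on are the ones stated in the note (a symmetric 358 × 358 matrix whose diagonal is discarded).

```python
from itertools import combinations


def count_pairwise_interactions(n_rois: int) -> int:
    """Number of entries strictly above the diagonal of an n x n matrix."""
    return (n_rois * n_rois - n_rois) // 2


def upper_triangle_features(matrix):
    """Flatten the strict upper triangle of a square matrix, row by row."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    # combinations(range(n), 2) yields every index pair (i, j) with i < j.
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


# Toy symmetric matrix with invented values, standing in for a
# functional interaction matrix (hypothetical data, not from the study).
toy = [
    [1.0, 0.2, 0.5, 0.1],
    [0.2, 1.0, 0.3, 0.7],
    [0.5, 0.3, 1.0, 0.4],
    [0.1, 0.7, 0.4, 1.0],
]

features = upper_triangle_features(toy)
n_features_358 = count_pairwise_interactions(358)

print(len(features))    # feature vector length for the 4 x 4 toy matrix
print(n_features_358)   # feature vector length for a 358 x 358 matrix
```

Because the matrix is symmetric, keeping only the entries above the diagonal retains every distinct pairwise interaction exactly once, which is why the feature vector has (358 × 358 − 358)/2 elements.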
Acknowledgments
J Fang was supported by the National Natural Science Foundation of China under Grant 61202186. T Liu was supported by NIH Career Award (NIH EB 006878), NSF CAREER Award (IIS-1149260), NIH R01 DA033393, NSF BME-1302089 and NIH R01 AG-042599. X Hu was supported by the National Natural Science Foundation of China under Grant 61103061, China Postdoctoral Science Foundation under Grants 20110490174 and 2012T50819, and Program for New Century Excellent Talents in University under Grant NCET-13-0472. L Guo was supported by the National Natural Science Foundation of China under Grants 61273362 and 61333017. J Han was supported by the National Science Foundation of China under Grants 61005018 and 91120005, NPU-FFR-JC20120237 and Program for New Century Excellent Talents in University under Grant NCET-10-0079.
Conflict of Interest
Jun Fang, Xintao Hu, Junwei Han, Xi Jiang, Dajiang Zhu, Lei Guo, and Tianming Liu declare that they have no conflict of interest.
Informed Consent Statement
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, and the applicable revisions at the time of the investigation. Informed consent was obtained from all patients for being included in the study.
About this article
Cite this article
Fang, J., Hu, X., Han, J. et al. Data-driven analysis of functional brain interactions during free listening to music and speech. Brain Imaging and Behavior 9, 162–177 (2015). https://doi.org/10.1007/s11682-014-9293-0
DOI: https://doi.org/10.1007/s11682-014-9293-0