Crossmodal interactions in the perception of expressivity in musical performance


In musical performance, bodily gestures play an important role in communicating expressive intentions to audiences. Although previous studies have demonstrated that visual information can affect the perceived expressivity of musical performances, the investigation of audiovisual interactions has been held back by the technical difficulties of generating controlled, mismatching stimuli. In the present study, we aimed to address this issue by using a novel method to generate controlled, balanced stimuli comprising both matching and mismatching bimodal combinations of different expressive intentions. Experiment 1 investigated the relative contributions of auditory and visual kinematic cues to the perceived expressivity of piano performances, and Experiment 2 explored possible crossmodal interactions in the perception of auditory and visual expressivity. The results revealed that although both auditory and visual kinematic cues contribute significantly to the perception of overall expressivity, the effect of visual kinematic cues appears to be somewhat stronger. The results also provide preliminary evidence of crossmodal interactions in the perception of auditory and visual expressivity: in certain performance conditions, visual cues had an effect on the ratings of auditory expressivity, and auditory cues had a small effect on the ratings of visual expressivity.


Author information

Correspondence to Jonna K. Vuoskoski.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Pianist1_Deadpan (MOV 9009 kb)

Pianist1_Exaggerated (MOV 10209 kb)

Pianist1_Deadpan_video_Exaggerated_audio (MOV 10043 kb)

Pianist2_Exaggerated_video_Deadpan_audio (MOV 10035 kb)

Pianist2_Normal_video_Exaggerated_audio (MOV 10416 kb)



Frédéric Chopin: Piano Prelude in E minor, Op. 28, No. 4


About this article

Cite this article

Vuoskoski, J.K., Thompson, M.R., Clarke, E.F. et al. Crossmodal interactions in the perception of expressivity in musical performance. Atten Percept Psychophys 76, 591–604 (2014).


  • Crossmodal interaction
  • Multisensory integration
  • Music cognition
  • Performance
  • Expressivity
  • Gesture