
Capturing Expressive and Indicative Qualities of Conducting Gesture: An Application of Temporal Expectancy Models

  • Conference paper
Computer Music Modeling and Retrieval. Sense of Sounds (CMMR 2007)

Abstract

Many event sequences in everyday human movement exhibit temporal structure: for instance, footsteps in walking, the striking of balls in a tennis match, the movements of a dancer set to rhythmic music, and the gestures of an orchestra conductor. These events generate prior expectancies regarding the occurrence of future events. Moreover, these expectancies play a critical role in conveying expressive qualities and communicative intent through the movement; thus they are of considerable interest in musical control contexts. To this end, we introduce a novel Bayesian framework, which we call the temporal expectancy model, and use it to develop an analysis tool for capturing expressive and indicative qualities of conducting gesture based on temporal expectancies. The temporal expectancy model is a general dynamic Bayesian network (DBN) that can encode prior knowledge regarding temporal structure to improve event segmentation. The conducting analysis tool infers beat and tempo (indicative qualities) and articulation (an expressive quality), as well as temporal expectancies regarding beat (ictus and preparation instances), from conducting gesture. Experimental results from our analysis framework reveal a strong correlation between articulation (staccato vs. legato) and how markedly the preparation expectancy builds up, which bolsters the case for temporal expectancy as a cognitive model for event anticipation and as a key factor in communicating the expressive qualities of conducting gesture. Our system operates on data obtained from a marker-based motion capture system, but can easily be adapted to more affordable technologies such as video camera arrays.
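The paper's own DBN formulation is not reproduced here, but the core notion of a temporal expectancy — anticipation of the next event building up as its expected time approaches — can be illustrated with a standard hazard-rate computation. The sketch below is purely illustrative and assumes a Gaussian prior over the interval to the next beat (the period `mu` and spread `sigma` are made-up values, not parameters from the paper): the expectancy at time t is the density of the next event at t, conditioned on it not having occurred yet.

```python
import math

def gaussian_pdf(t, mu, sigma):
    """Density of a Gaussian prior over the next event time."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_cdf(t, mu, sigma):
    """Probability that the event has occurred by time t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def expectancy(t, mu, sigma):
    """Hazard rate: density of the next event at t, given no event so far.

    This rises as t approaches mu, mirroring how anticipation of an
    upcoming ictus builds up as its expected time draws near.
    """
    survival = 1.0 - gaussian_cdf(t, mu, sigma)
    return gaussian_pdf(t, mu, sigma) / max(survival, 1e-12)

# Hypothetical numbers: last beat at t = 0, next ictus expected at 0.5 s.
build_up = [expectancy(t, mu=0.5, sigma=0.05) for t in (0.30, 0.40, 0.48)]
assert build_up[0] < build_up[1] < build_up[2]  # expectancy grows toward the ictus
```

For a Gaussian interval prior the hazard is monotonically increasing, so expectancy always sharpens toward the anticipated beat; the paper's contribution is embedding such interval priors in a DBN so that expectancies are inferred jointly with segmentation from observed gesture data.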




Editor information

Richard Kronland-Martinet Sølvi Ystad Kristoffer Jensen


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Swaminathan, D. et al. (2008). Capturing Expressive and Indicative Qualities of Conducting Gesture: An Application of Temporal Expectancy Models. In: Kronland-Martinet, R., Ystad, S., Jensen, K. (eds) Computer Music Modeling and Retrieval. Sense of Sounds. CMMR 2007. Lecture Notes in Computer Science, vol 4969. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85035-9_3


  • DOI: https://doi.org/10.1007/978-3-540-85035-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85034-2

  • Online ISBN: 978-3-540-85035-9

  • eBook Packages: Computer Science, Computer Science (R0)
