Abstract
Many event sequences in everyday human movement exhibit temporal structure: footsteps in walking, the striking of balls in a tennis match, the movements of a dancer set to rhythmic music, and the gestures of an orchestra conductor. Such events generate prior expectancies regarding the occurrence of future events. These expectancies play a critical role in conveying expressive qualities and communicative intent through movement, and are therefore of considerable interest in musical control contexts. To this end, we introduce a novel Bayesian framework, which we call the temporal expectancy model, and use it to develop an analysis tool that captures expressive and indicative qualities of conducting gesture based on temporal expectancies. The temporal expectancy model is a general dynamic Bayesian network (DBN) that encodes prior knowledge of temporal structure to improve event segmentation. The conducting analysis tool infers beat and tempo (which are indicative) and articulation (which is expressive), as well as temporal expectancies regarding the beat (ictus and preparation instances), from conducting gesture. Experimental results obtained with our analysis framework reveal a very strong correlation between articulation type (staccato vs. legato) and how significantly the preparation expectancy builds up, which bolsters the case for temporal expectancy both as a cognitive model for event anticipation and as a key factor in the communication of expressive qualities of conducting gesture. Our system operates on data from a marker-based motion capture system, but can easily be adapted to more affordable technologies such as video camera arrays.
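The paper's full DBN is not reproduced here, but the core idea of Bayesian inference over temporal structure can be illustrated with a toy sketch. Under the assumption of Gaussian observation noise on inter-onset intervals (the function name `tempo_posterior`, the grid discretization, and the noise level `sigma` are illustrative choices, not the paper's model), a minimal grid-based tempo estimate from event onset times might look like:

```python
import math

def tempo_posterior(onsets, tempo_grid, sigma=0.05):
    """Toy grid-based Bayesian inference of a constant beat period
    from noisy onset times (a stand-in for a full DBN filter)."""
    # Uniform prior over candidate beat periods (seconds per beat),
    # kept in log space for numerical stability.
    log_post = [0.0] * len(tempo_grid)
    # Observed inter-onset intervals between consecutive events.
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    for ioi in iois:
        for i, period in enumerate(tempo_grid):
            # Gaussian likelihood: observed IOI ~ N(period, sigma^2).
            log_post[i] += -((ioi - period) ** 2) / (2 * sigma ** 2)
    # Normalize the posterior to sum to one.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]

# Onsets of events near a 0.5 s beat period, with small timing noise.
onsets = [0.0, 0.51, 1.0, 1.51, 2.0]
grid = [0.4 + 0.01 * k for k in range(31)]  # candidate periods 0.40..0.70 s
post = tempo_posterior(onsets, grid)
best = grid[post.index(max(post))]  # posterior-mode beat period
```

As more onsets arrive, the posterior sharpens around the true period; in the same spirit, the paper's DBN lets an expectancy about the next event (e.g., the next ictus) build up as evidence accumulates.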
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Swaminathan, D. et al. (2008). Capturing Expressive and Indicative Qualities of Conducting Gesture: An Application of Temporal Expectancy Models. In: Kronland-Martinet, R., Ystad, S., Jensen, K. (eds) Computer Music Modeling and Retrieval. Sense of Sounds. CMMR 2007. Lecture Notes in Computer Science, vol 4969. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85035-9_3
Print ISBN: 978-3-540-85034-2
Online ISBN: 978-3-540-85035-9