“Watch and Learn”—Computer Vision for Musical Gesture Analysis

  • Gil Weinberg
  • Mason Bretan
  • Guy Hoffman
  • Scott Driscoll
Part of the Automation, Collaboration, & E-Services book series (ACES, volume 8)


In the previous chapter we showed how human musicians can benefit from the visual and physical cues that robotic musicians afford. Similarly, robotic musicians can augment their own abilities by analyzing visual cues from humans. Like humans, robotic musicians can use vision to anticipate, coordinate, and synchronize their playing with human collaborators.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Gil Weinberg (1)
  • Mason Bretan (2)
  • Guy Hoffman (3)
  • Scott Driscoll (4)
  1. Georgia Institute of Technology, Atlanta, USA
  2. Novato, USA
  3. Cornell University, Ithaca, USA
  4. Atlanta, USA
