Tracking in Action Space

  • Dennis L. Herzog
  • Volker Krüger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6553)


The recognition of human actions such as pointing at objects (“Give me that...”) is difficult because they should be recognized independently of scene parameters such as viewing direction. Furthermore, the parameters of the action, such as the pointing direction, are important pieces of information. One common way to achieve recognition is to use 3D human body tracking followed by action recognition based on the captured tracking data. General 3D body tracking is, however, still a difficult problem. In this paper, we look at human body tracking for action recognition from a context-driven perspective. Instead of the space of human body poses, we consider the space of possible actions in a given context and argue that 3D body tracking reduces to action tracking in the parameter space in which the actions live. This reduces the high-dimensional problem to a low-dimensional one. In our approach, we use parametric hidden Markov models to represent parametric movements; particle filtering is used to track in the space of action parameters. Our approach requires only monocular video data, and we demonstrate its effectiveness on synthetic and real image sequences. In the experiments we focus on human arm movements.
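
The idea of tracking in a low-dimensional action parameter space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a toy straight-line arm model (`mean_hand_position`) stands in for the parametric hidden Markov model, the observations are noisy 2D hand positions rather than image features, and all function names and noise levels are hypothetical. Each particle holds only a movement phase and a pointing angle, so the filter searches two dimensions instead of a full body pose.

```python
import math
import random


def mean_hand_position(phase, theta):
    """Toy parametric movement (assumption): the hand moves outward from the
    origin along direction theta as phase goes from 0 to 1."""
    return (phase * math.cos(theta), phase * math.sin(theta))


def likelihood(obs, phase, theta, sigma=0.05):
    """Isotropic Gaussian observation likelihood (unnormalized)."""
    mx, my = mean_hand_position(phase, theta)
    d2 = (obs[0] - mx) ** 2 + (obs[1] - my) ** 2
    return math.exp(-0.5 * d2 / sigma ** 2)


def track(observations, n_particles=500, seed=0):
    """Bootstrap particle filter over (phase, theta); returns the estimated
    pointing angle after the last frame."""
    rng = random.Random(seed)
    # Start every particle at phase 0 with an unknown pointing direction.
    particles = [(0.0, rng.uniform(-math.pi, math.pi))
                 for _ in range(n_particles)]
    step = 1.0 / len(observations)
    for obs in observations:
        # Predict: advance the phase and diffuse both state components.
        moved = []
        for phase, theta in particles:
            phase = min(1.0, max(0.0, phase + step + rng.gauss(0.0, 0.02)))
            theta = theta + rng.gauss(0.0, 0.05)
            moved.append((phase, theta))
        # Update: weight each particle by how well it explains the frame.
        weights = [likelihood(obs, p, t) for p, t in moved]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles  # fall back to uniform weights
        # Resample in proportion to the weights.
        particles = rng.choices(moved, weights=weights, k=n_particles)
    # Circular mean of the angle over the final particle set.
    s = sum(math.sin(t) for _, t in particles)
    c = sum(math.cos(t) for _, t in particles)
    return math.atan2(s, c)


def simulate(theta_true, n_frames=30, noise=0.02, seed=1):
    """Generate noisy 2D hand positions for one synthetic pointing action."""
    rng = random.Random(seed)
    frames = []
    for k in range(1, n_frames + 1):
        x, y = mean_hand_position(k / n_frames, theta_true)
        frames.append((x + rng.gauss(0.0, noise), y + rng.gauss(0.0, noise)))
    return frames


if __name__ == "__main__":
    estimate = track(simulate(theta_true=0.8))
    print(f"estimated pointing angle: {estimate:.3f} rad")
```

In this toy setting the early frames say little about the pointing angle because the hand is still close to the origin; the particle set narrows as the movement unfolds, and the recovered angle is itself the action parameter of interest.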


Action Space, Action Recognition, Action Tracking, Observation Function, Action Primitive
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Dennis L. Herzog (1)
  • Volker Krüger (1)
  1. Copenhagen Institute of Technology, Aalborg University, Ballerup, Denmark
