Recognizing Human Actions by Their Pose

  • Christian Thurau
  • Václav Hlaváč
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5604)


Human action recognition from image sequences has gained increasing interest in recent years. Interestingly, the majority of approaches are restricted to dynamic motion features and are therefore not universally applicable. In this paper, we propose to recognize human actions by evaluating a distribution over a set of predefined static poses, which we refer to as pose primitives. We aim at a generally applicable approach that also works on still images or on images taken from a moving camera. Experimental validation takes varying video sequence lengths into account and emphasizes the possibility of action recognition from single images, which we believe is an often overlooked but nevertheless important aspect of action recognition.

The proposed approach uses a set of training video sequences to estimate pose and action class representations. To incorporate the local temporal context of poses, atomic subsequences of poses are explored using n-gram expressions. Action classes can be represented by histograms of pose primitive n-grams, which allows for action recognition by means of histogram comparison. Although the suggested action recognition method is independent of the underlying low-level representation of poses, representations remain important for targeting practical problems. Thus, to deal with common problems in video-based action recognition, e.g. articulated poses and cluttered background, a recently introduced Histogram of Oriented Gradients based descriptor is extended using a non-negative matrix factorization reconstruction.
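The histogram-of-n-grams idea described above can be illustrated with a minimal sketch. It assumes each frame has already been mapped to an integer pose primitive index; how that mapping is obtained (the HOG and non-negative matrix factorization step) is not shown. The function names, the toy sequences, and the choice of the Bhattacharyya coefficient as the comparison measure are assumptions made for illustration only and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def ngram_histogram(pose_ids, n=2):
    """Normalized histogram of n-grams over a sequence of pose primitive indices."""
    grams = [tuple(pose_ids[i:i + n]) for i in range(len(pose_ids) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    # An empty histogram results when the sequence is shorter than n.
    return {g: c / total for g, c in counts.items()} if total else {}


def bhattacharyya_similarity(h1, h2):
    """Bhattacharyya coefficient of two sparse normalized histograms.

    Assumed comparison measure; any histogram similarity could be substituted.
    """
    return sum(sqrt(p * h2[g]) for g, p in h1.items() if g in h2)


def classify(pose_ids, class_histograms, n=2):
    """Return the action label whose class histogram best matches the query."""
    query = ngram_histogram(pose_ids, n)
    return max(
        class_histograms,
        key=lambda label: bhattacharyya_similarity(query, class_histograms[label]),
    )


# Hypothetical training data: one pose index sequence per action class.
training = {
    "walk": [0, 1, 2, 1, 0, 1, 2, 1, 0],
    "wave": [3, 4, 3, 4, 3, 4, 3],
}

# Bi-gram histograms capture local temporal context in video sequences.
bigram_models = {label: ngram_histogram(seq, n=2) for label, seq in training.items()}
print(classify([1, 2, 1, 0, 1], bigram_models, n=2))

# Uni-gram histograms allow a query consisting of a single image (one pose).
unigram_models = {label: ngram_histogram(seq, n=1) for label, seq in training.items()}
print(classify([4], unigram_models, n=1))
```

With n = 1 the histogram ignores temporal order entirely, which is what makes a single still image a valid query in this sketch; larger n trades that flexibility for more temporal context.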


Action Class, Action Recognition, Latent Dirichlet Allocation, Background Clutter, Human Action Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Christian Thurau (1)
  • Václav Hlaváč (2)
  1. Fraunhofer IAIS, Schloss Birlinghoven, Sankt Augustin, Germany
  2. Faculty of Electrical Engineering, Center for Machine Perception, Czech Technical University in Prague, Prague 2, Czech Republic
