3D Action Recognition in an Industrial Environment

  • Markus Hahn
  • Lars Krüger
  • Christian Wöhler
  • Franz Kummert
Part of the Cognitive Systems Monographs book series (COSMOS, volume 6)


In this study we introduce a method for recognising and discriminating between different working actions based on their 3D trajectories. The 3D pose of the human hand-forearm limb is tracked over time with a two-hypothesis tracking framework based on the Shape Flow algorithm. A sequence of working actions is recognised with a particle filter based non-stationary Hidden Markov Model framework. This framework relies on the spatial context and on a classification of the observed 3D trajectories, using the Levenshtein Distance on Trajectories as a measure of the similarity between the observed trajectories and a set of reference trajectories. An experimental evaluation is performed on 20 real-world test sequences acquired from different viewpoints in an industrial working environment. The action-specific recognition rates of our system exceed 90%. The actions are typically recognised with a delay of some tenths of a second. Our system is able to detect disturbances, i.e. interruptions of the sequence of working actions, by entering a safety mode, and it returns to the regular mode as soon as the working actions continue.
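The abstract names the Levenshtein Distance on Trajectories but does not define it. As an illustration only, the following minimal sketch shows one way an edit distance can be applied to 3D point sequences. It assumes that two points count as a match when their Euclidean distance is at most a threshold, and that an observed trajectory is assigned to the reference with the smallest length-normalised distance. The function names, the `match_radius` parameter and the normalisation are hypothetical and may differ from the measure used by the authors.

```python
import math


def trajectory_edit_distance(observed, reference, match_radius):
    """Levenshtein-style edit distance between two sequences of 3D points.

    Assumption (not taken from the chapter): two points match, with
    substitution cost 0, when their Euclidean distance is at most
    match_radius; otherwise substitution costs 1. Insertions and
    deletions always cost 1.
    """
    n, m = len(observed), len(reference)
    # previous[j] is the distance between observed[:i-1] and reference[:j]
    previous = list(range(m + 1))
    for i in range(1, n + 1):
        current = [i] + [0] * m
        for j in range(1, m + 1):
            close = math.dist(observed[i - 1], reference[j - 1]) <= match_radius
            mismatch = 0 if close else 1
            current[j] = min(
                previous[j] + 1,             # drop a point of the observed trajectory
                current[j - 1] + 1,          # skip a point of the reference trajectory
                previous[j - 1] + mismatch,  # match or substitute
            )
        previous = current
    return previous[m]


def classify(observed, references, match_radius):
    """Return the label of the reference with the smallest normalised distance.

    `references` maps a hypothetical action label to a reference trajectory.
    """
    def normalised(ref):
        longest = max(len(observed), len(ref), 1)
        return trajectory_edit_distance(observed, ref, match_radius) / longest

    return min(references, key=lambda label: normalised(references[label]))
```

In such a sketch the threshold trades tolerance to tracking noise against the ability to separate actions whose trajectories lie close together, so it would have to be chosen relative to the scale of the workspace.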


Gesture Recognition, Hide State, Reference Trajectory, Industrial Robot, Dynamic Bayesian Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Markus Hahn (1)
  • Lars Krüger (1)
  • Christian Wöhler (1, 2)
  • Franz Kummert (2)

  1. Group Research and Advanced Engineering, Daimler AG, Ulm, Germany
  2. Applied Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany
