Model-Based Human Motion Tracking and Behavior Recognition Using Hierarchical Finite State Automata

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3046)

Abstract

Generating motion for an articulated body in computer animation is expensive and time-consuming, and recognizing human actions and interactions is important for video annotation, automated surveillance, and content-based video retrieval. This paper presents a new model-based approach, free of human intervention, to articulated body motion tracking and recognition of human interaction in monocular video sequences with a static background. Two major applications build on the basic motion tracking: motion capture and human behavior recognition.

To determine the human body configuration in a scene, a 3D human body model is postulated and projected onto a 2D projection plane so that it overlaps the foreground image silhouette. We convert the problem of overlapping the model body with the silhouette into a parameter optimization problem, which avoids the kinematic singularity problem. Unlike other methods, our body tracking does not need any user intervention. A cost function estimates the degree of overlap between the foreground silhouette of the input image and the projected silhouette of the 3D model body. We seek the configuration that has the best overlap with the image foreground and the least overlap with the background. The overlap is computed using computational geometry, by converting a set of pixels from the image domain to a polygon in the 2D projection plane domain.
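As a rough illustration of the overlap idea, the sketch below scores a candidate model silhouette against a foreground silhouette. It is a minimal sketch under stated assumptions, not the paper's method: silhouettes are plain sets of pixels rather than polygons, the "model" is a hypothetical rectangle with a single parameter (its horizontal offset), and an exhaustive search stands in for whatever optimizer the authors use. The names `overlap_cost`, `rectangle`, and `best_offset` are invented for this example.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]


def overlap_cost(foreground: Set[Pixel], model: Set[Pixel]) -> int:
    """Return a cost that is lower when the model fits the foreground better.

    Assumed form: foreground pixels the model fails to cover, plus
    model pixels that fall on the background.
    """
    uncovered_foreground = len(foreground - model)
    model_on_background = len(model - foreground)
    return uncovered_foreground + model_on_background


def rectangle(x0: int, y0: int, width: int, height: int) -> Set[Pixel]:
    """Toy stand-in for a projected body-part silhouette."""
    return {(x, y) for x in range(x0, x0 + width) for y in range(y0, y0 + height)}


def best_offset(foreground: Set[Pixel], width: int, height: int,
                candidates: Iterable[int]) -> int:
    """Pick the horizontal offset whose rectangle minimizes the cost."""
    return min(
        candidates,
        key=lambda x0: overlap_cost(foreground, rectangle(x0, 0, width, height)),
    )


# Hypothetical foreground: a 2x4 blob whose left edge is at x = 3.
foreground = rectangle(3, 0, 2, 4)
best = best_offset(foreground, width=2, height=4, candidates=range(0, 6))
print(best, overlap_cost(foreground, rectangle(best, 0, 2, 4)))
```

A real tracker would search over many joint parameters at once, which is why the problem is framed as parameter optimization rather than enumeration.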

We recognize human interaction using hierarchical finite state automata (FA). A low-level behavior recognition model analyzes the model motion data obtained from tracking to extract states and events for the feet, torso, and hands. The recognition model represents human behaviors as sequences of states that classify the configuration of individual body parts in space and time. To overcome the exponential growth in the number of states that usually occurs in a single-level FA, we present a new hierarchical FA that abstracts states and events from motion data at three levels: the low-level FAs analyze body parts only, the middle-level FAs recognize motion, and the high-level FAs analyze a human interaction. Motion tracking and behavior recognition results on video sequences are very encouraging.
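The three-level structure can be sketched as a small pipeline in which each level consumes the states of the level below as its events. This is only an illustrative sketch: the state names, event names, and the "handshake" example are assumptions made here, not taken from the paper, and only one body part (a hand) per person is modeled.

```python
from typing import Dict, List, Sequence, Tuple


class FiniteAutomaton:
    """Deterministic FA that stays in place on events it does not know."""

    def __init__(self, start: str, transitions: Dict[Tuple[str, str], str]):
        self.state = start
        self.transitions = transitions

    def step(self, event: str) -> str:
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state


def make_hand_fa() -> FiniteAutomaton:
    # Low level: classifies one body part from per-frame observations.
    return FiniteAutomaton("hand_down", {
        ("hand_down", "arm_raised"): "hand_extended",
        ("hand_extended", "arm_lowered"): "hand_down",
    })


def make_motion_fa() -> FiniteAutomaton:
    # Middle level: consumes body-part states as events.
    return FiniteAutomaton("idle", {
        ("idle", "hand_extended"): "reaching",
        ("reaching", "hand_down"): "idle",
    })


def make_interaction_fa() -> FiniteAutomaton:
    # High level: consumes the joint motion states of two people.
    return FiniteAutomaton("no_interaction", {
        ("no_interaction", "reaching+reaching"): "handshake",
        ("handshake", "idle+idle"): "no_interaction",
    })


def recognize(frames: Sequence[Tuple[str, str]]) -> List[str]:
    """Run two people through all three levels, one label per frame."""
    hands = [make_hand_fa(), make_hand_fa()]
    motions = [make_motion_fa(), make_motion_fa()]
    interaction = make_interaction_fa()
    labels = []
    for observations in frames:
        motion_states = []
        for hand, motion, observation in zip(hands, motions, observations):
            part_state = hand.step(observation)
            motion_states.append(motion.step(part_state))
        labels.append(interaction.step("+".join(motion_states)))
    return labels


# Hypothetical per-frame observations for person A and person B.
frames = [
    ("none", "none"),
    ("arm_raised", "none"),
    ("none", "arm_raised"),
    ("arm_lowered", "arm_lowered"),
]
labels = recognize(frames)
print(labels)
```

The design point is that each automaton stays small: a flat automaton over k body parts with n states each would need up to n^k joint states, whereas the layered version keeps a separate small table per part and per level.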

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Park, J., Park, S., Aggarwal, J.K. (2004). Model-Based Human Motion Tracking and Behavior Recognition Using Hierarchical Finite State Automata. In: Laganá, A., Gavrilova, M.L., Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds) Computational Science and Its Applications – ICCSA 2004. ICCSA 2004. Lecture Notes in Computer Science, vol 3046. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24768-5_33

  • DOI: https://doi.org/10.1007/978-3-540-24768-5_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22060-2

  • Online ISBN: 978-3-540-24768-5

  • eBook Packages: Springer Book Archive
