Combining Densely Sampled Form and Motion for Human Action Recognition

Schindler, Konrad; van Gool, Luc

doi:10.1007/978-3-540-69321-5_13

Konrad Schindler¹ & Luc van Gool¹,²

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5096)

Included in the following conference series:

Joint Pattern Recognition Symposium

Abstract

We present a method for human action recognition from video, which exploits both form (local shape) and motion (local flow). Inspired by models of the human visual system, the two feature sets are processed independently in separate channels. The form channel extracts a dense local shape representation from every frame, while the motion channel extracts dense optic flow from the frame and its immediate predecessor. The same processing pipeline is applied in both channels: feature maps are pooled locally, down-sampled, and compared to a collection of learnt templates, yielding a vector of similarity scores. In a final step, the two score vectors are merged, and recognition is performed with a discriminative classifier. In an evaluation on two standard datasets, our method outperforms the state of the art, confirming that the combination of form and motion improves recognition.
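The per-channel pipeline described in the abstract (local pooling, down-sampling, comparison to templates, merging of the two score vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of max pooling over non-overlapping blocks, and cosine similarity as the template comparison are all assumptions made for illustration, and the final discriminative classifier is omitted.

```python
import math
from typing import List, Sequence

Grid = List[List[float]]  # one 2-D feature map, stored as rows of values


def pool_and_downsample(fmap: Grid, size: int = 2) -> Grid:
    """Max-pool over non-overlapping size x size blocks.

    Pooling and down-sampling happen in one step here; the choice of
    max pooling is an assumption, not taken from the abstract.
    """
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(
                fmap[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
                if r + dr < rows and c + dc < cols
            )
            for c in range(0, cols, size)
        ]
        for r in range(0, rows, size)
    ]


def flatten(fmap: Grid) -> List[float]:
    """Turn a 2-D map into a flat feature vector (row-major order)."""
    return [v for row in fmap for v in row]


def similarity(x: Sequence[float], t: Sequence[float]) -> float:
    """Cosine similarity between a feature vector and a template.

    Hypothetical similarity measure chosen for this sketch.
    """
    dot = sum(a * b for a, b in zip(x, t))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in t))
    return dot / norm if norm > 0 else 0.0


def channel_scores(
    fmap: Grid, templates: Sequence[Sequence[float]], size: int = 2
) -> List[float]:
    """Run one channel: pool, down-sample, compare to every template."""
    pooled = flatten(pool_and_downsample(fmap, size))
    return [similarity(pooled, t) for t in templates]


def merge_scores(
    form_scores: Sequence[float], motion_scores: Sequence[float]
) -> List[float]:
    """Concatenate the two channels' score vectors into one descriptor."""
    return list(form_scores) + list(motion_scores)
```

In this sketch, `channel_scores` would be called once with a form feature map and once with a motion feature map, each against its own template set, and the vector returned by `merge_scores` would then be passed to whatever discriminative classifier is used.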

Author information

Authors and Affiliations

BIWI / ETH Zürich, Sternwartstrasse 7, CH-8092, Zürich, Switzerland
Konrad Schindler & Luc van Gool
ESAT / KU Leuven, Kasteelpark Arenberg 10, B-3001, Heverlee, Belgium
Luc van Gool

Editor information

Gerhard Rigoll

Cite this paper

Schindler, K., van Gool, L. (2008). Combining Densely Sampled Form and Motion for Human Action Recognition. In: Rigoll, G. (ed.) Pattern Recognition. DAGM 2008. Lecture Notes in Computer Science, vol 5096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69321-5_13

DOI: https://doi.org/10.1007/978-3-540-69321-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69320-8
Online ISBN: 978-3-540-69321-5
eBook Packages: Computer Science, Computer Science (R0)
