Local Descriptors for Spatio-temporal Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3667)

Abstract

This paper presents and investigates a set of local space-time descriptors for representing and recognizing motion patterns in video. Following the idea of local features in the spatial domain, we use the notion of space-time interest points and represent video data in terms of local space-time events. To describe such events, we define several types of image descriptors over local spatio-temporal neighborhoods and evaluate these descriptors in the context of recognizing human activities. In particular, we compare motion representations in terms of spatio-temporal jets, position-dependent histograms, position-independent histograms, and principal component analysis computed for either spatio-temporal gradients or optic flow. An experimental evaluation on a video database with human actions shows that high classification performance can be achieved, and that there is a clear advantage of using local position-dependent histograms, consistent with previously reported findings regarding spatial recognition.
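As a rough illustration of the difference between the position-independent and position-dependent histograms mentioned in the abstract, the following Python sketch builds both from the spatio-temporal gradients of a small video patch. It is not the authors' implementation. The function names, the 2×2×2 cell grid, the eight bins, the gradient-magnitude weighting, and the use of plain finite differences (`np.gradient`) in place of scale-space derivatives are all assumptions made for illustration.

```python
import numpy as np


def spatio_temporal_gradients(patch):
    """Finite-difference gradients of a (T, Y, X) patch, stacked as (gx, gy, gt)."""
    gt, gy, gx = np.gradient(np.asarray(patch, dtype=float))
    return np.stack([gx, gy, gt])  # shape (3, T, Y, X)


def component_histograms(grads, n_bins=8):
    """One histogram per normalized gradient component, weighted by magnitude."""
    # Small constant avoids division by zero in flat regions (assumed choice).
    mag = np.sqrt((grads ** 2).sum(axis=0)) + 1e-12
    parts = [
        np.histogram(g / mag, bins=n_bins, range=(-1.0, 1.0), weights=mag)[0]
        for g in grads
    ]
    hist = np.concatenate(parts)
    return hist / hist.sum()  # normalize to unit sum


def position_independent_descriptor(patch, n_bins=8):
    """Single histogram accumulated over the whole neighborhood."""
    return component_histograms(spatio_temporal_gradients(patch), n_bins)


def position_dependent_descriptor(patch, cells=2, n_bins=8):
    """Separate histograms for each cell of a cells x cells x cells grid."""
    grads = spatio_temporal_gradients(patch)
    parts = []
    for t_blk in np.array_split(grads, cells, axis=1):
        for y_blk in np.array_split(t_blk, cells, axis=2):
            for x_blk in np.array_split(y_blk, cells, axis=3):
                parts.append(component_histograms(x_blk, n_bins))
    return np.concatenate(parts)


# Usage on a synthetic patch (random data stands in for a real neighborhood).
rng = np.random.default_rng(0)
patch = rng.random((8, 16, 16))
print(position_independent_descriptor(patch).shape)  # (24,)
print(position_dependent_descriptor(patch).shape)    # (192,)
```

The position-dependent variant keeps coarse information about where in the neighborhood each gradient occurs, at the cost of a descriptor that is `cells**3` times longer.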

The support from the Swedish Research Council and from the Royal Swedish Academy of Sciences as well as the Knut and Alice Wallenberg Foundation is gratefully acknowledged. We also thank Christian Schüldt and Barbara Caputo for their help in obtaining the experimental video data.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Laptev, I., Lindeberg, T. (2006). Local Descriptors for Spatio-temporal Recognition. In: MacLean, W.J. (eds) Spatial Coherence for Visual Motion Analysis. SCVMA 2004. Lecture Notes in Computer Science, vol 3667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11676959_8

  • DOI: https://doi.org/10.1007/11676959_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32533-8

  • Online ISBN: 978-3-540-32534-5

  • eBook Packages: Computer Science, Computer Science (R0)