
Detecting and Tracking Action Content

Computer Analysis of Human Behavior

Abstract

Detecting and tracking action content in a camera's field of view is a significant step that must be completed before the action can be analyzed. Detection can be performed either by examining a single image or by analyzing the motion in a series of consecutive video frames. Tracking, in contrast, requires multiple images and can be performed either by associating detected objects across frames or by iteratively estimating the motion between consecutive frames. This chapter provides insight into both tasks as they relate to the analysis of human actions.
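To make the two tasks concrete, the following is a minimal sketch, not the chapter's method. It assumes grayscale frames stored as nested lists, a known empty reference frame, and a single moving object per frame; the function names `detect_by_differencing` and `associate`, the thresholds, and the greedy nearest-neighbor rule are all illustrative assumptions.

```python
from math import hypot


def detect_by_differencing(reference, frame, threshold=30):
    """Detection: centroid (row, col) of pixels that differ from a reference frame.

    Assumes `reference` shows the empty scene; returns None if nothing changed.
    """
    changed = [
        (r, c)
        for r, (ref_row, row) in enumerate(zip(reference, frame))
        for c, (p, q) in enumerate(zip(ref_row, row))
        if abs(q - p) > threshold
    ]
    if not changed:
        return None
    n = len(changed)
    return (sum(r for r, _ in changed) / n, sum(c for _, c in changed) / n)


def associate(tracks, detections, max_dist=5.0):
    """Tracking by association: greedily match each track to its nearest detection.

    `tracks` maps a track id to its last known (row, col) position.
    Returns a dict mapping track id to the index of the matched detection.
    Pairs farther apart than `max_dist` are left unmatched.
    """
    pairs = sorted(
        (hypot(pos[0] - det[0], pos[1] - det[1]), tid, j)
        for tid, pos in tracks.items()
        for j, det in enumerate(detections)
    )
    assignment, used = {}, set()
    for dist, tid, j in pairs:
        if dist <= max_dist and tid not in assignment and j not in used:
            assignment[tid] = j
            used.add(j)
    return assignment


if __name__ == "__main__":
    empty = [[0] * 4 for _ in range(4)]
    frame = [row[:] for row in empty]
    for r, c in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        frame[r][c] = 255  # a bright 2x2 blob stands in for a moving object
    print(detect_by_differencing(empty, frame))
    print(associate({0: (1.0, 1.0), 1: (5.0, 5.0)}, [(5.0, 6.0), (1.0, 2.0)]))
```

The greedy rule is chosen here only for brevity; an optimal assignment over all track and detection pairs would be a natural substitute when several objects move close to one another.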

Notes

  1. In this chapter, differential geometric descriptors are treated differently from differential descriptors, as will be discussed under appearance-based descriptors.

  2. Subpixel refers to spatial locations whose coordinates are not integers.

Author information

Correspondence to Alper Yilmaz.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Yilmaz, A. (2011). Detecting and Tracking Action Content. In: Salah, A., Gevers, T. (eds) Computer Analysis of Human Behavior. Springer, London. https://doi.org/10.1007/978-0-85729-994-9_3

  • DOI: https://doi.org/10.1007/978-0-85729-994-9_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-993-2

  • Online ISBN: 978-0-85729-994-9

  • eBook Packages: Computer Science, Computer Science (R0)
