Human Action Recognition Using Distribution of Oriented Rectangular Patches

  • Conference paper
Human Motion – Understanding, Modeling, Capture and Animation (HuMo 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4814)

Abstract

We describe a “bag-of-rectangles” method for representing and recognizing human actions in videos. Each human pose in an action sequence is represented by oriented rectangular patches extracted over the whole body, and spatial oriented histograms are then formed to represent the distribution of these patches. To carry the information from the spatial domain described by the bag-of-rectangles descriptor to the temporal domain for action recognition, we propose four methods: (i) frame-by-frame voting, which recognizes actions by matching the descriptors of each frame; (ii) global histogramming, which extends the Motion Energy Image idea of Bobick and Davis to rectangular patches; (iii) a classifier-based approach using SVMs; and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the descriptor. Detailed experiments are carried out on the action dataset of Blank et al. The high success rates (100%) show that a very simple and compact representation can achieve robust recognition of human actions, compared to more complex representations.
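As a rough illustration of the pipeline outlined in the abstract, the following Python sketch builds a spatial histogram of rectangle orientations for one frame and then compares sequences either by frame-by-frame voting or by Dynamic Time Warping. It is a minimal sketch under stated assumptions, not the authors' implementation: the 3×3 grid, the 12 orientation bins, the Euclidean distance, the nearest-neighbour matching rule, and all function names are hypothetical choices that the abstract does not specify, and the rectangle detector itself is assumed to exist elsewhere.

```python
import math
from collections import Counter


def pose_descriptor(rects, frame_size, grid=3, n_bins=12):
    """Spatial histogram of rectangle orientations for one frame.

    rects: iterable of (cx, cy, angle_deg) giving each patch's center and
    orientation. Grid size and bin count are assumed values.
    """
    width, height = frame_size
    hist = [0.0] * (grid * grid * n_bins)
    for cx, cy, angle in rects:
        # Spatial cell containing the rectangle center, clamped to the grid.
        col = min(max(int(cx / width * grid), 0), grid - 1)
        row = min(max(int(cy / height * grid), 0), grid - 1)
        # Orientation is taken modulo 180 degrees: a rectangle has no direction.
        b = int((angle % 180.0) / 180.0 * n_bins) % n_bins
        hist[(row * grid + col) * n_bins + b] += 1.0
    return hist


def l2(u, v):
    """Euclidean distance between two descriptors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def vote_frames(query, train_descs, train_labels):
    """Frame-by-frame voting: each query frame votes for the label of its
    nearest training frame; the majority label wins."""
    votes = Counter()
    for d in query:
        nearest = min(range(len(train_descs)), key=lambda k: l2(d, train_descs[k]))
        votes[train_labels[nearest]] += 1
    return votes.most_common(1)[0][0]


def dtw_distance(a, b):
    """Classic Dynamic Time Warping cost between two descriptor sequences."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = l2(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With DTW, a query sequence would be assigned the label of the training sequence with the smallest warping cost; this tolerates differences in action speed that plain frame-by-frame matching ignores.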


References

  1. Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: ICCV, pp. 1395–1402 (2005)

  2. Bobick, A., Davis, J.: The recognition of human movement using temporal templates. IEEE T. Pattern Analysis and Machine Intelligence 23(3), 257–267 (2001)

  3. Brand, M., Oliver, N., Pentland, A.: Coupled hidden Markov models for complex action recognition. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 994–999. IEEE Computer Society Press, Los Alamitos (1997)

  4. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conf. on Computer Vision and Pattern Recognition, vol. I, pp. 886–893. IEEE Computer Society Press, Los Alamitos (2005)

  5. Efros, A.A., Berg, A.C., Mori, G., Malik, J.: Recognizing action at a distance. In: ICCV 2003, pp. 726–733 (2003)

  6. Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: IEEE Conf. on Computer Vision and Pattern Recognition, IEEE Computer Society Press, Los Alamitos (2005)

  7. Forsyth, D., Fleck, M.: Body plans. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 678–683. IEEE Computer Society Press, Los Alamitos (1997)

  8. Forsyth, D., Arikan, O., Ikemoto, L., O’Brien, J., Ramanan, D.: Computational studies of human motion i: Tracking and animation. Foundations and Trends in Computer Graphics and Vision 1(2/3) (2006)

  9. Freeman, W., Roth, M.: Orientation histograms for hand gesture recognition. In: International Workshop on Automatic Face and Gesture Recognition (1995)

  10. Hong, P., Turk, M., Huang, T.: Gesture modeling and recognition using finite state machines. In: Int. Conf. Automatic Face and Gesture Recognition, pp. 410–415 (2000)

  11. Hongeng, S., Nevatia, R., Bremond, F.: Video-based event recognition: activity representation and probabilistic recognition methods. Computer Vision and Image Understanding 96(2), 129–162 (2004)

  12. Hu, W., Tan, T., Wang, L., Maybank, S.: A survey on visual surveillance of object motion and behaviors. IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews 34(3) (2004)

  13. Ikizler, N., Forsyth, D.: Searching video for complex activities with finite state models. In: IEEE Conf. on Computer Vision and Pattern Recognition (2007)

  14. Leung, T., Malik, J.: Representing and recognizing the visual appearance of materials using three-dimensional textons. Int. J. Computer Vision 43(1), 29–44 (2001)

  15. Ling, H., Okada, K.: Diffusion distance for histogram comparison. In: IEEE Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 246–253 (2006)

  16. Monay, F., Gatica-Perez, D.: Modeling semantic aspects for cross-media image retrieval. IEEE T. Pattern Analysis and Machine Intelligence (accepted for publication)

  17. Niebles, J.C., Fei-Fei, L.: A hierarchical model of shape and appearance for human action classification. In: IEEE Conf. on Computer Vision and Pattern Recognition, IEEE Computer Society Press, Los Alamitos (2007)

  18. Oliver, N., Garg, A., Horvitz, E.: Layered representations for learning and inferring office activity from multiple sensory channels. Computer Vision and Image Understanding 96(2), 163–180 (2004)

  19. Pinhanez, C., Bobick, A.: PNF propagation and the detection of actions described by temporal intervals. In: DARPA IU Workshop, pp. 227–234 (1997)

  20. Pinhanez, C., Bobick, A.: Human action detection using PNF propagation of temporal constraints. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 898–904. IEEE Computer Society Press, Los Alamitos (1998)

  21. Polana, R., Nelson, R.: Detecting activities. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 2–7. IEEE Computer Society Press, Los Alamitos (1993)

  22. Ramanan, D., Forsyth, D., Zisserman, A.: Strike a pose: Tracking people by finding stylized poses. In: IEEE Conf. on Computer Vision and Pattern Recognition, vol. I, pp. 271–278. IEEE Computer Society Press, Los Alamitos (2005)

  23. Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a metric for image retrieval. Int. J. Computer Vision 40(2), 99–121 (2000)

  24. Siskind, J.M.: Reconstructing force-dynamic models from video sequences. Artificial Intelligence 151, 91–154 (2003)

  25. Sivic, J., Russell, B., Efros, A., Zisserman, A., Freeman, W.: Discovering object categories in image collections. In: Int. Conf. on Computer Vision (2005)

  26. Sminchisescu, C., Kanaujia, A., Li, Z., Metaxas, D.: Conditional models for contextual human motion recognition. In: Int. Conf. on Computer Vision, pp. 1808–1815 (2005)

  27. Wilson, A., Bobick, A.: Parametric hidden Markov models for gesture recognition. IEEE T. Pattern Analysis and Machine Intelligence 21(9), 884–900 (1999)

  28. Jiang, Y.-G., Ngo, C.-W., Yang, J.: Towards optimal bag-of-features for object categorization and semantic video retrieval. In: Int. Conf. Image Video Retrieval (2007)

Editor information

Ahmed Elgammal Bodo Rosenhahn Reinhard Klette

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

İkizler, N., Duygulu, P. (2007). Human Action Recognition Using Distribution of Oriented Rectangular Patches. In: Elgammal, A., Rosenhahn, B., Klette, R. (eds) Human Motion – Understanding, Modeling, Capture and Animation. HuMo 2007. Lecture Notes in Computer Science, vol 4814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75703-0_19

  • DOI: https://doi.org/10.1007/978-3-540-75703-0_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75702-3

  • Online ISBN: 978-3-540-75703-0

  • eBook Packages: Computer Science, Computer Science (R0)
