Pose Filter Based Hidden-CRF Models for Activity Detection

Banerjee, Prithviraj; Nevatia, Ram

doi:10.1007/978-3-319-10605-2_46

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8690)

Included in the following conference series: European Conference on Computer Vision

Abstract

Detecting activities that involve a sequence of complex pose and motion changes in unsegmented videos is a challenging task. Common approaches use sequential graphical models to infer the human pose-state in every frame. We propose an alternative model based on detecting the key-poses in a video, in which only the temporal positions of a few key-poses are inferred. We also introduce a novel pose summarization algorithm to automatically discover the key-poses of an activity. We learn a detection filter for each key-pose; these filters are combined with a bag-of-words root filter in an HCRF model whose parameters are learned using latent-SVM optimization. We evaluate the detection performance of our model on unsegmented videos from four human action datasets, which include challenging crowded scenes with dynamic backgrounds, inter-person occlusions, multi-human interactions and hard-to-detect daily use objects.
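To make the idea of inferring only the temporal positions of a few key-poses concrete, the following is a minimal sketch and not the authors' formulation. It assumes the per-frame responses of each key-pose filter are already available as a score matrix, that key-poses must occur in a fixed temporal order, and that the root filter contributes a single scalar score. The function name `place_key_poses` and its arguments are hypothetical; the dynamic program simply picks strictly increasing frame indices that maximize the summed responses.

```python
import numpy as np

def place_key_poses(filter_scores, root_score=0.0):
    """Pick one frame per key-pose, in temporal order, maximizing total score.

    filter_scores: (K, T) array; entry [k, t] is the (assumed precomputed)
                   response of key-pose filter k at frame t.
    root_score:    scalar score of a hypothetical root filter, added as-is.
    Returns (total_score, positions) with strictly increasing positions.
    """
    scores = np.asarray(filter_scores, dtype=float)
    K, T = scores.shape
    if K > T:
        raise ValueError("need at least as many frames as key-poses")

    # best[k, t]: best score for key-poses 0..k with key-pose k at frame t.
    best = np.full((K, T), -np.inf)
    back = np.zeros((K, T), dtype=int)
    best[0] = scores[0]

    for k in range(1, K):
        run_best, run_arg = -np.inf, -1
        for t in range(1, T):
            # Track the best placement of key-pose k-1 at any frame < t.
            if best[k - 1, t - 1] > run_best:
                run_best, run_arg = best[k - 1, t - 1], t - 1
            best[k, t] = run_best + scores[k, t]
            back[k, t] = run_arg

    # Backtrack from the best frame of the last key-pose.
    t = int(np.argmax(best[K - 1]))
    total = float(best[K - 1, t] + root_score)
    positions = [t]
    for k in range(K - 1, 0, -1):
        t = int(back[k, t])
        positions.append(t)
    return total, positions[::-1]

# Toy example: 2 key-pose filters over 4 frames (made-up numbers).
toy_scores = [[1, 5, 0, 0],
              [9, 0, 2, 3]]
total, positions = place_key_poses(toy_scores, root_score=0.5)
```

In this toy example the second filter's strongest response (frame 0) cannot be used because it would precede the first key-pose, which illustrates why the positions are inferred jointly rather than per filter. A fuller model would typically add terms penalizing unlikely gaps between key-poses; those are omitted here.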

Author information

Authors and Affiliations

University of Southern California, Los Angeles, USA
Prithviraj Banerjee & Ram Nevatia

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
KU Leuven, ESAT - PSI, iMinds, Kasteelpark Arenberg, 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

Cite this paper

Banerjee, P., Nevatia, R. (2014). Pose Filter Based Hidden-CRF Models for Activity Detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8690. Springer, Cham. https://doi.org/10.1007/978-3-319-10605-2_46

DOI: https://doi.org/10.1007/978-3-319-10605-2_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10604-5
Online ISBN: 978-3-319-10605-2
eBook Packages: Computer Science, Computer Science (R0)
