Abstract
Space-time detection of human activities in videos can significantly enhance visual search. For such tasks, low-level features alone have proven insufficient on complex datasets, while the mid-level features typically considered (such as body parts) are used without robustly accounting for their detection inaccuracy. Moreover, existing activity detection mechanisms do not constructively exploit the importance and trustworthiness of the features.
This paper addresses these problems and introduces a unified formulation for robustly detecting activities in videos. Our first contribution is the formulation of the detection task over an undirected node- and edge-weighted graphical structure called Part Bricolage (PB), where the node weights represent the type of each feature along with its importance, and the edge weights incorporate the probability of the features belonging to a known activity class, while also accounting for the trustworthiness of the features connected by the edge. The Prize-Collecting Steiner Tree (PCST) problem [19] is then solved on this graph, yielding the best connected subgraph comprising the activity of interest. Our second contribution is a novel technique for robust body-part estimation, which uses two types of state-of-the-art pose detectors and resolves plausible detection ambiguities with pre-trained classifiers that predict the trustworthiness of the pose detectors. Our third contribution is the fusion of low-level descriptors with mid-level ones, while maintaining the spatial structure between the features.
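The PCST objective named above can be sketched with a toy exhaustive-search solver: pick the connected subset of nodes that maximizes collected node prizes (feature importance) minus the cost of a spanning tree over the chosen edges (feature/class incompatibility). This is a minimal illustrative sketch; the part names, graph values, and function names below are assumptions for demonstration, not the paper's actual formulation or the exact PCST algorithm it solves.

```python
# Toy Prize-Collecting Steiner Tree (PCST) by exhaustive search.
# Node prizes stand in for feature importance; edge costs stand in for
# (in)compatibility with an activity class. All values are illustrative.
from itertools import combinations


def mst_cost(nodes, edges):
    """Kruskal MST cost over `nodes` with edges {(u, v): cost}.
    Returns None if the induced subgraph is disconnected."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return 0.0
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    total, merged = 0.0, 0
    usable = sorted((c, u, v) for (u, v), c in edges.items()
                    if u in nodes and v in nodes)
    for c, u, v in usable:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += c
            merged += 1
    return total if merged == len(nodes) - 1 else None


def pcst_brute_force(prizes, edges):
    """Return (best_score, best_nodes) maximizing prize sum minus tree cost."""
    best = (float("-inf"), frozenset())
    verts = list(prizes)
    for r in range(1, len(verts) + 1):
        for sub in combinations(verts, r):
            cost = mst_cost(sub, edges)
            if cost is None:
                continue  # skip disconnected candidates
            score = sum(prizes[v] for v in sub) - cost
            if score > best[0]:
                best = (score, frozenset(sub))
    return best


# Example: three body-part nodes plus one noisy, low-importance detection.
prizes = {"head": 3.0, "torso": 2.0, "hand": 2.5, "noise": 0.2}
edges = {("head", "torso"): 1.0, ("torso", "hand"): 1.5, ("hand", "noise"): 2.0}
score, nodes = pcst_brute_force(prizes, edges)
# The low-prize "noise" node is pruned: including it costs more than it pays.
```

Brute force is exponential in the node count and only serves to make the objective concrete; the paper instead cites an exact branch-and-cut approach [19] that scales to realistic graphs.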
To quantitatively evaluate the detection power of PB, we run it on the Hollywood and MSR-Actions datasets and outperform the state-of-the-art by a significant margin across various detection paradigms.
References
Black, M.J., Anandan, P.: A framework for the robust estimation of optical flow. In: Proceedings of the Fourth International Conference on Computer Vision, pp. 231–236. IEEE (1993)
Bourdev, L., Malik, J.: Poselets: Body part detectors trained using 3d human pose annotations. In: ICCV (2009)
Burges, C.J.: A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2(2), 121–167 (1998)
Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, 27:1–27:27 (2011), software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Chen, C.Y., Grauman, K.: Efficient activity detection with max-subgraph search. In: CVPR (2012)
Chen, J., Kim, M., Wang, Y., Ji, Q.: Switching Gaussian process dynamic models for simultaneous composite motion tracking and recognition. In: CVPR (2009)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
Dittrich, M.T., Klau, G.W., Rosenwald, A., Dandekar, T., Müller, T.: Identifying functional modules in protein–protein interaction networks: an integrated exact approach. Bioinformatics 24(13), i223–i231 (2008)
Efros, A.A., Berg, A.C., Mori, G., Malik, J.: Recognizing action at a distance. In: CVPR (2003)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC 2007) Results (2007), http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html
Fragkiadaki, K., Hu, H., Shi, J.: Pose from flow and flow from pose. In: CVPR (2013)
Gopalan, R.: Joint sparsity-based representation and analysis of unconstrained activities. In: CVPR (2013)
Jain, A., Gupta, A., Rodriguez, M., Davis, L.S.: Representing videos using mid-level discriminative patches. In: CVPR (2013)
Jain, M., Jégou, H., Bouthemy, P., et al.: Better exploiting motion for better action recognition. In: CVPR (2013)
Kovashka, A., Grauman, K.: Learning a hierarchy of discriminative space-time neighborhood features for human action recognition. In: CVPR (2010)
Laptev, I.: On space-time interest points. IJCV 64(2-3), 107–123 (2005)
Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR (2008)
Lee, C.S., Elgammal, A.: Coupled visual and kinematic manifold models for tracking. IJCV 87(1-2), 118–139 (2010)
Ljubić, I., Weiskircher, R., Pferschy, U., Klau, G.W., Mutzel, P., Fischetti, M.: An algorithmic framework for the exact solution of the prize-collecting steiner tree problem. Mathematical Programming 105(2-3), 427–449 (2006)
Ma, S., Zhang, J., Ikizler-Cinbis, N., Sclaroff, S.: Action recognition and localization by hierarchical space-time segments. In: ICCV (2013)
Maji, S., Bourdev, L., Malik, J.: Action recognition from a distributed representation of pose and appearance. In: CVPR (2011)
Malgireddy, M., Inwogu, I., Govindaraju, V.: A temporal Bayesian model for classifying, detecting and localizing activities in video sequences. In: CVPR (2012)
Marszalek, M., Laptev, I., Schmid, C.: Actions in context. In: CVPR (2009)
Ramanan, D., Forsyth, D.A.: Automatic annotation of everyday movements. In: NIPS (2003)
Raptis, M., Sigal, L.: Poselet key-framing: A model for human activity recognition. In: CVPR (2013)
Sadanand, S., Corso, J.J.: Action bank: A high-level representation of activity in video. In: CVPR (2012)
Sapp, B., Weiss, D., Taskar, B.: Parsing human motion with stretchable models. In: CVPR (2011)
Schindler, K., Van Gool, L.: Action snippets: How many frames does human action recognition require? In: CVPR (2008)
Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: ICPR (2004)
Shi, F., Petriu, E., Laganiere, R.: Sampling strategies for real-time action recognition. In: CVPR (2013)
Sullivan, M., Shah, M.: Action MACH: Maximum average correlation height filter for action recognition. In: CVPR (2008)
Taylor, G.W., Sigal, L., Fleet, D.J., Hinton, G.E.: Dynamical binary latent variable models for 3d human pose tracking. In: CVPR (2010)
Thurau, C., Hlavác, V.: Pose primitive based human action recognition in videos or still images. In: CVPR (2008)
Wang, C., Wang, Y., Yuille, A.L.: An approach to pose-based action recognition. In: CVPR (2013)
Wang, H., Klaser, A., Schmid, C., Liu, C.L.: Action recognition by dense trajectories. In: CVPR (2011)
Wang, H., Schmid, C., et al.: Action recognition with improved trajectories. In: ICCV (2013)
Wang, H., Ullah, M.M., Klaser, A., Laptev, I., Schmid, C., et al.: Evaluation of local spatio-temporal features for action recognition. In: BMVC (2009)
Yang, W., Wang, Y., Mori, G.: Recognizing human actions from still images with latent poses. In: CVPR (2010)
Yang, Y., Ramanan, D.: Articulated pose estimation with flexible mixtures-of-parts. In: CVPR (2011)
Yao, A., Gall, J., Van Gool, L.: Coupled action recognition and pose estimation from multiple views. IJCV 100(1), 16–37 (2012)
Yeffet, L., Wolf, L.: Local trinary patterns for human action recognition. In: ICCV (2009)
Yuan, J., Liu, Z., Wu, Y.: Discriminative subvolume search for efficient action detection. In: CVPR (2009)
Zanfir, M., Leordeanu, M., Sminchisescu, C.: The moving pose: An efficient 3D kinematics descriptor for low-latency action recognition and detection. In: ICCV (2013)
Zhu, J., Wang, B., Yang, X., Zhang, W., Tu, Z.: Action recognition with actons. In: ICCV (2013)
© 2014 Springer International Publishing Switzerland
Shankar, S., Badrinarayanan, V., Cipolla, R. (2014). Part Bricolage: Flow-Assisted Part-Based Graphs for Detecting Activities in Videos. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham. https://doi.org/10.1007/978-3-319-10599-4_38
Print ISBN: 978-3-319-10598-7
Online ISBN: 978-3-319-10599-4