PoseField: An Efficient Mean-Field Based Method for Joint Estimation of Human Pose, Segmentation, and Depth

Vineet, Vibhav; Sheasby, Glenn; Warrell, Jonathan; Torr, Philip H. S.

doi:10.1007/978-3-642-40395-8_14


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8081)

Included in the following conference series:

International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition


Abstract

Many models have been proposed to estimate human pose and segmentation by leveraging information from several sources. A standard approach is to formulate the problem in a dual decomposition framework. However, these models generally suffer from high computational complexity. In this work, we propose PoseField, a new, highly efficient filter-based mean-field inference approach for jointly estimating human segmentation, pose, per-pixel body parts, and depth given stereo pairs of images. We extensively evaluate the efficiency and accuracy of our approach on the H2View [1] and Buffy [2] datasets. We achieve a 20 to 70 times speedup over the current state-of-the-art methods, while also achieving better accuracy in all these cases.
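To make the phrase "filter-based mean-field inference" concrete, the following is a minimal sketch of a generic mean-field update in which message passing is computed by filtering the current marginals with a Gaussian kernel. It is not the PoseField model: it assumes a single labelling task on a 1-D signal, a Potts pairwise term, and a brute-force spatial filter, whereas the abstract describes a joint model over segmentation, pose, body parts, and depth. All function and parameter names (`mean_field`, `filter_marginals`, `weight`, `radius`, `sigma`) are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Normalise a list of log-scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(radius, sigma):
    """Gaussian weights for offsets -radius..radius; the centre weight is 0
    so that a site never sends a message to itself."""
    return [0.0 if d == 0 else math.exp(-(d * d) / (2.0 * sigma * sigma))
            for d in range(-radius, radius + 1)]


def filter_marginals(q, kernel):
    """Message passing as filtering: convolve each label's marginals
    with the kernel (brute force, assumed here for clarity)."""
    n, num_labels = len(q), len(q[0])
    radius = len(kernel) // 2
    out = [[0.0] * num_labels for _ in range(n)]
    for i in range(n):
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                for l in range(num_labels):
                    out[i][l] += w * q[j][l]
    return out


def mean_field(unary, weight=1.0, radius=2, sigma=1.0, iterations=10):
    """Approximate marginals for a Potts-model random field on a 1-D signal.

    unary[i][l] is the cost of assigning label l to site i (lower is better).
    """
    kernel = gaussian_kernel(radius, sigma)
    q = [softmax([-c for c in costs]) for costs in unary]
    for _ in range(iterations):
        msg = filter_marginals(q, kernel)
        # Potts compatibility: label l is penalised by the filtered
        # probability mass that neighbours place on all other labels.
        q = [softmax([-unary[i][l] - weight * (sum(msg[i]) - msg[i][l])
                      for l in range(len(unary[i]))])
             for i in range(len(unary))]
    return q


def map_labels(q):
    """Pick the most probable label at each site."""
    return [max(range(len(p)), key=p.__getitem__) for p in q]
```

With unary costs that favour label 0 on the first four sites and label 1 on the last four, except for one site in the first half that weakly favours label 1, calling `map_labels(mean_field(unary))` pulls the outlier back to label 0, while `weight=0.0` leaves it unchanged. Because each iteration reduces to one filtering pass per label, the cost of an iteration is governed by the filter; fast high-dimensional filters are what make this style of inference efficient on images.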


References

1. Sheasby, G., Valentin, J., Crook, N., Torr, P.: A robust stereo prior for human segmentation. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012, Part II. LNCS, vol. 7725, pp. 94–107. Springer, Heidelberg (2013)
2. Ferrari, V., Marin-Jimenez, M., Zisserman, A.: Progressive search space reduction for human pose estimation. In: CVPR, pp. 1–8 (2008)
3. Sun, M., Kohli, P., Shotton, J.: Conditional regression forests for human pose estimation. In: CVPR, pp. 3394–3401. IEEE (2012)
4. Sigal, L., Black, M.: HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Brown University TR, 120 (2006)
5. Kumar, M., Zisserman, A., Torr, P.: Efficient discriminative learning of parts-based models. In: CVPR, pp. 552–559 (2009)
6. Winn, J., Shotton, J.: The layout consistent random field for recognizing and segmenting partially occluded objects (2006)
7. Yang, Y., Ramanan, D.: Articulated pose estimation with flexible mixtures-of-parts. In: CVPR, pp. 1385–1392 (2011)
8. Bray, M., Kohli, P., Torr, P.: PoseCut: Simultaneous segmentation and 3D pose estimation of humans using dynamic graph-cuts. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part II. LNCS, vol. 3952, pp. 642–655. Springer, Heidelberg (2006)
9. Kumar, M., Torr, P., Zisserman, A.: OBJCUT: Efficient segmentation using top-down and bottom-up cues. PAMI 32, 530–545 (2010)
10. Ladický, L., Russell, C., Kohli, P., Torr, P.H.S.: Graph cut based inference with co-occurrence statistics. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 239–253. Springer, Heidelberg (2010)
11. Ladický, L., Sturgess, P., Russell, C., Sengupta, S., Bastanlar, Y., Clocksin, W., Torr, P.: Joint optimisation for object class segmentation and dense stereo reconstruction. BMVC, 104.1–104.11 (2010), doi:10.5244/C.24.104
12. Komodakis, N., Paragios, N., Tziritas, G.: MRF energy minimization and beyond via dual decomposition. PAMI, 1
13. Wang, H., Koller, D.: Multi-level inference by relaxed dual decomposition for human pose segmentation. In: CVPR, pp. 2433–2440 (2011)
14. Sheasby, G., Warrell, J., Zhang, Y., Crook, N., Torr, P.: Simultaneous human segmentation, depth and pose estimation via dual decomposition. BMVW (2012)
15. Koller, D., Friedman, N.: Probabilistic graphical models: principles and techniques. MIT Press (2009)
16. Vineet, V., Warrell, J., Torr, P.H.S.: Filter-based mean-field inference for random fields with higher-order terms and product label-spaces. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part V. LNCS, vol. 7576, pp. 31–44. Springer, Heidelberg (2012)
17. Shotton, J., Fitzgibbon, A.W., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: CVPR, pp. 1297–1304 (2011)
18. Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: NIPS, pp. 109–117 (2011)
19. Shotton, J., Winn, J.M., Rother, C., Criminisi, A.: TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part I. LNCS, vol. 3951, pp. 1–15. Springer, Heidelberg (2006)
20. Adams, A., Baek, J., Davis, M.A.: Fast high-dimensional filtering using the permutohedral lattice. Comput. Graph. Forum 29, 753–762 (2010)
21. Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support vector machine learning for interdependent and structured output spaces. In: ICML (2004)
22. Ladický, L., Russell, C., Kohli, P., Torr, P.: Associative hierarchical CRFs for object class image segmentation. In: ICCV, pp. 739–746 (2009)
23. Andriluka, M., Roth, S., Schiele, B.: Pictorial structures revisited: People detection and articulated pose estimation. In: CVPR, pp. 1014–1021 (2009)


Author information

Authors and Affiliations

Oxford Brookes University, Oxford, UK
Vibhav Vineet, Glenn Sheasby, Jonathan Warrell & Philip H. S. Torr


Editor information

Editors and Affiliations

Centre for Mathematical Sciences, Lund University, Sweden
Anders Heyden, Fredrik Kahl, Carl Olsson & Magnus Oskarsson
Dept. of Mathematics, University of Bergen, Johannes Brunsgate 12, 5007, Bergen, Norway
Xue-Cheng Tai


About this paper

Cite this paper

Vineet, V., Sheasby, G., Warrell, J., Torr, P.H.S. (2013). PoseField: An Efficient Mean-Field Based Method for Joint Estimation of Human Pose, Segmentation, and Depth. In: Heyden, A., Kahl, F., Olsson, C., Oskarsson, M., Tai, X.-C. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2013. Lecture Notes in Computer Science, vol 8081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40395-8_14


DOI: https://doi.org/10.1007/978-3-642-40395-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40394-1
Online ISBN: 978-3-642-40395-8
eBook Packages: Computer Science, Computer Science (R0)
