Key Developments in Human Pose Estimation for Kinect

Kohli, Pushmeet; Shotton, Jamie

doi:10.1007/978-1-4471-4640-7_4

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

The last few years have seen a surge in the development of natural user interfaces. These interfaces do not require devices such as keyboards and mice that have been the dominant modes of interaction over the last few decades. An important milestone in the progress of natural user interfaces was the recent launch of Kinect with its unique ability to reliably estimate the pose of the human user in real time. Human pose estimation has been the subject of much research in Computer Vision, but only recently with the introduction of depth cameras and algorithmic advances has pose estimation made it out of the lab and into the living room. In this chapter we briefly summarize the work on human pose estimation for Kinect that has been undertaken at Microsoft Research Cambridge, and discuss some of the remaining open challenges. Due to the summary nature of this chapter, we limit our description to the key insights and refer the reader to the original publications for the technical details.

Notes

1.
This project would eventually be launched as Kinect.
2.
For more detail on the story behind Kinect, please see the Foreword.

Acknowledgements

This chapter is a summary of existing published work, and we would like to highlight the contributions of all the original authors.

Author information

Authors and Affiliations

Microsoft Research, Cambridge, UK
Pushmeet Kohli & Jamie Shotton

Corresponding author

Correspondence to Pushmeet Kohli.

Editor information

Editors and Affiliations

Computer Vision Laboratory, ETH Zürich, Sternwartstrasse 7, Zürich, 8092, Switzerland
Andrea Fossati
Perceiving Systems Department, Max Planck Inst. for Intelligent Systems, Spemannstrasse 41, Tübingen, 72076, Germany
Juergen Gall
Computer Vision Laboratory, ETH Zürich, Sternwartstrasse 7, Zürich, 8092, Switzerland
Helmut Grabner
Intel Science and Technology Center, Allen Center 462, Seattle, 98195, Washington, USA
Xiaofeng Ren
Industrial Perception, Industrial Ave 911, Palo Alto, 94303, California, USA
Kurt Konolige

About this chapter

Cite this chapter

Kohli, P., Shotton, J. (2013). Key Developments in Human Pose Estimation for Kinect. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds) Consumer Depth Cameras for Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-4640-7_4

DOI: https://doi.org/10.1007/978-1-4471-4640-7_4
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4639-1
Online ISBN: 978-1-4471-4640-7
eBook Packages: Computer Science, Computer Science (R0)
