Abstract
3D human pose estimation is a fundamental task in computer vision. However, most related work focuses on recovering human joint positions, which provide pose information too sparse for many applications such as 3D avatar animation. This paper therefore presents a deep network that recovers joint angles from 3D joint positions by learning the prior dependence between them. We test the validity and robustness of our method and discuss details of the network's design and training. Our method is simple, effective, and extensible: it can be combined with 3D human pose estimation methods that predict 3D joint positions from image or depth data to produce more detailed and natural poses, and it builds a mapping between two joint sets with different numbers of joints, providing a framework for unifying human pose estimation datasets with different annotation formats.
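To make the lifting idea concrete, here is a minimal sketch (not the authors' code) of a network mapping flattened 3D joint positions to per-joint rotations, represented as unit quaternions. The joint counts, layer size, and function names are hypothetical, and a plain one-hidden-layer MLP with random weights stands in for the trained deep network described in the paper.

```python
# Hypothetical sketch: lift (J_IN x 3) joint positions to (K_OUT x 4)
# unit-quaternion joint rotations. All sizes/names are illustrative.
import numpy as np

J_IN, K_OUT, HIDDEN = 17, 24, 256  # assumed joint counts; the paper maps
                                   # between joint sets of different sizes

rng = np.random.default_rng(0)
W1 = rng.standard_normal((J_IN * 3, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, K_OUT * 4)) * 0.01
b2 = np.zeros(K_OUT * 4)

def lift_positions_to_rotations(positions):
    """positions: (J_IN, 3) joint positions -> (K_OUT, 4) unit quaternions."""
    x = positions.reshape(-1)                  # flatten to a single vector
    h = np.maximum(0.0, x @ W1 + b1)           # ReLU hidden layer
    q = (h @ W2 + b2).reshape(K_OUT, 4)
    q = q + np.array([1.0, 0.0, 0.0, 0.0])     # bias outputs toward identity
    # normalize each row so every output is a valid unit quaternion
    return q / np.linalg.norm(q, axis=1, keepdims=True)

pose = rng.standard_normal((J_IN, 3))          # stand-in for an estimated pose
rots = lift_positions_to_rotations(pose)
```

Note that both joint sets appear explicitly (`J_IN` input positions, `K_OUT` output rotations), reflecting the abstract's point that the mapping need not preserve the number of joints.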
Acknowledgements
This work is supported by the National Key R&D Plan of China (No. 2017YFB1002804) and the National Natural Science Foundation of China (No. 61471359).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, Z., Che, W. (2019). 3D Human Pose Lifting: From Joint Position to Joint Rotation. In: Wang, Y., Huang, Q., Peng, Y. (eds) Image and Graphics Technologies and Applications. IGTA 2019. Communications in Computer and Information Science, vol 1043. Springer, Singapore. https://doi.org/10.1007/978-981-13-9917-6_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9916-9
Online ISBN: 978-981-13-9917-6
eBook Packages: Computer Science (R0)