Abstract
3D human pose estimation is a fundamental task in computer vision. However, most related work focuses on recovering human joint positions, which provide pose information too sparse for many applications such as 3D avatar animation. This paper therefore presents a deep network that recovers joint angles from 3D joint positions by learning the prior dependence between them. We test the validity and robustness of our method and discuss details of the network's design and training. Our method is simple, effective, and extensible: it can be combined with 3D human pose estimation methods that predict 3D joint positions from image or depth data to produce more detailed and natural poses, and it builds a mapping between two joint sets with different numbers of joints, providing a framework for unifying human pose estimation datasets with different annotation formats.
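To make the lifting idea concrete, here is a minimal sketch (not the authors' code) of a network mapping flattened 3D joint positions to per-joint rotations, represented as unit quaternions. The joint counts, layer size, and function names are hypothetical, and a plain one-hidden-layer MLP with random weights stands in for the trained deep network described in the paper.

```python
# Hypothetical sketch: lift (J_IN x 3) joint positions to (K_OUT x 4)
# unit-quaternion joint rotations. All sizes/names are illustrative.
import numpy as np

J_IN, K_OUT, HIDDEN = 17, 24, 256  # assumed joint counts; the paper maps
                                   # between joint sets of different sizes

rng = np.random.default_rng(0)
W1 = rng.standard_normal((J_IN * 3, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, K_OUT * 4)) * 0.01
b2 = np.zeros(K_OUT * 4)

def lift_positions_to_rotations(positions):
    """positions: (J_IN, 3) joint positions -> (K_OUT, 4) unit quaternions."""
    x = positions.reshape(-1)                  # flatten to a single vector
    h = np.maximum(0.0, x @ W1 + b1)           # ReLU hidden layer
    q = (h @ W2 + b2).reshape(K_OUT, 4)
    q = q + np.array([1.0, 0.0, 0.0, 0.0])     # bias outputs toward identity
    # normalize each row so every output is a valid unit quaternion
    return q / np.linalg.norm(q, axis=1, keepdims=True)

pose = rng.standard_normal((J_IN, 3))          # stand-in for an estimated pose
rots = lift_positions_to_rotations(pose)
```

Note that both joint sets appear explicitly (`J_IN` input positions, `K_OUT` output rotations), reflecting the abstract's point that the mapping need not preserve the number of joints.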
Acknowledgements
This work is supported by the National Key R&D Plan of China (No. 2017YFB1002804) and the National Natural Science Foundation of China (No. 61471359).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, Z., Che, W. (2019). 3D Human Pose Lifting: From Joint Position to Joint Rotation. In: Wang, Y., Huang, Q., Peng, Y. (eds) Image and Graphics Technologies and Applications. IGTA 2019. Communications in Computer and Information Science, vol 1043. Springer, Singapore. https://doi.org/10.1007/978-981-13-9917-6_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9916-9
Online ISBN: 978-981-13-9917-6
eBook Packages: Computer Science (R0)