
A Kinematic Chain Space for Monocular Motion Capture

  • Bastian Wandt
  • Hanno Ackermann
  • Bodo Rosenhahn
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11132)

Abstract

This paper addresses motion capture of kinematic chains (e.g., human skeletons) from monocular image sequences taken by uncalibrated cameras. We present a method based on projecting an observation onto a kinematic chain space (KCS). We propose an optimization of the nuclear norm that implicitly enforces structural properties of the kinematic chain. Unlike other approaches, our method does not rely on training data or on previously determined constraints such as particular body lengths. The proposed algorithm can reconstruct scenes with little or no camera motion as well as previously unseen motions. It is applicable not only to human skeletons but also to other kinematic chains, for instance animals or industrial robots. We achieve state-of-the-art results on different benchmark databases and real-world scenes.
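
The abstract names two ingredients, a mapping of joint observations into a kinematic chain space and a nuclear norm objective, without giving their construction. The sketch below illustrates one plausible reading only and is not the authors' implementation: it assumes that bone vectors are obtained by multiplying a 3 x J matrix of joint positions with a signed joint-to-bone matrix, and that the nuclear norm is the sum of singular values. The four-joint chain, the toy pose, and the names `BONES`, `chain_matrix`, `bone_vectors`, and `nuclear_norm` are all hypothetical.

```python
import numpy as np

# Hypothetical 4-joint chain: joint 0 -> 1 -> 2 -> 3 (assumption for illustration).
BONES = [(0, 1), (1, 2), (2, 3)]


def chain_matrix(num_joints, bones):
    """Build a (num_joints x num_bones) matrix with -1 at the parent
    joint and +1 at the child joint of each bone."""
    C = np.zeros((num_joints, len(bones)))
    for k, (parent, child) in enumerate(bones):
        C[parent, k] = -1.0
        C[child, k] = 1.0
    return C


def bone_vectors(X, C):
    """Map joint positions X (3 x J) to bone vectors (3 x B)."""
    return X @ C


def nuclear_norm(M):
    """Nuclear norm: the sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()


# Toy pose: one joint per column, rows are x, y, z coordinates.
X = np.array([[0.0, 1.0, 2.0, 2.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0, 0.0]])

C = chain_matrix(X.shape[1], BONES)
B = bone_vectors(X, C)                 # each column is child minus parent
lengths = np.linalg.norm(B, axis=0)    # bone lengths of the toy pose

# Rotating the whole pose changes neither the bone lengths nor the nuclear norm.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
B_rot = bone_vectors(R @ X, C)

print("bone lengths:", lengths)
print("nuclear norm:", nuclear_norm(B), nuclear_norm(B_rot))
```

In this toy setting the bone representation is a linear function of the joint positions, and both the bone lengths and the nuclear norm are unaffected by a rigid rotation of the pose. That kind of structure is what a nuclear norm objective on such a representation could exploit; the exact objective and constraints used in the paper are not stated in the abstract.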

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Bastian Wandt (1)
  • Hanno Ackermann (1)
  • Bodo Rosenhahn (1)
  1. Institut für Informationsverarbeitung, Leibniz Universität Hannover, Hanover, Germany
