ArticulatedFusion: Real-Time Reconstruction of Motion, Geometry and Segmentation Using a Single Depth Camera

  • Chao Li
  • Zheheng Zhao
  • Xiaohu Guo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11212)


This paper proposes a real-time dynamic scene reconstruction method capable of reproducing the motion, geometry, and segmentation simultaneously given a live depth stream from a single RGB-D camera. Our approach fuses geometry frame by frame and uses a segmentation-enhanced node graph structure to drive the deformation of geometry in the registration step. A two-level node motion optimization is proposed. The optimization space of node motions and the range of physically plausible deformations are largely reduced by taking advantage of the articulated motion prior, which is solved by an efficient node graph segmentation method. Compared to previous fusion-based dynamic scene reconstruction methods, our experiments show robust and improved reconstruction results for tangential and occluded motions.
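To make the claim about the reduced optimization space concrete, the following is a minimal counting sketch, not the authors' implementation. It assumes each graph node carries one rigid transform with 6 degrees of freedom, and that under an articulated prior all nodes in a segment share a single rigid transform at the coarse level. The function names and the toy label list are hypothetical.

```python
# Assumption: one rigid (rotation + translation) transform has 6 degrees of freedom.
DOF_PER_RIGID_TRANSFORM = 6


def unknowns_per_node(num_nodes):
    """Unknowns when every node moves independently."""
    return DOF_PER_RIGID_TRANSFORM * num_nodes


def unknowns_per_segment(segment_labels):
    """Unknowns when all nodes with the same label share one rigid transform."""
    return DOF_PER_RIGID_TRANSFORM * len(set(segment_labels))


# Hypothetical toy graph: 8 nodes grouped into 3 rigid segments.
labels = [0, 0, 0, 1, 1, 2, 2, 2]
free_unknowns = unknowns_per_node(len(labels))
articulated_unknowns = unknowns_per_segment(labels)
print(free_unknowns, articulated_unknowns)
```

In this toy case the unknown count drops from 6 per node to 6 per segment, which is the sense in which a segmentation of the node graph shrinks the search space before any per-node refinement.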


Keywords: Fusion, Articulated, Motion, Segmentation



We would like to thank the reviewers for their valuable comments. We are grateful to Matthias Innmann for help with the comparison results of VolumeDeform, Tao Yu for providing the Vicon-based ground-truth marker data from BodyFusion, and Dimitrios Tzionas for providing his data. This work was partially supported by the National Science Foundation under grant number IIS-1149737. Chao would like to thank Hua Guo for the support provided during the preparation of this paper.

Supplementary material

Supplementary material 1 (mp4 21539 KB)

Supplementary material 2 (pdf 2101 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science, The University of Texas at Dallas, Richardson, USA
