Joint 3D Tracking of a Deformable Object in Interaction with a Hand

  • Aggeliki Tsoli
  • Antonis A. Argyros
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)

Abstract

We present a novel method for tracking a complex deformable object in interaction with a hand. The method formulates and solves an optimization problem that jointly considers the hand, the deformable object and the hand/object contact points. The optimization evaluates several hand/object contact configuration hypotheses and adopts the one that yields the best fit of the object’s model to the available RGBD observations in the vicinity of the hand. Thus, the hand is treated not as a distractor that occludes parts of the deformable object, but as a source of valuable information. Experimental results on a dataset developed specifically for this new problem show that the proposed approach outperforms relevant state-of-the-art solutions.

Notes

Acknowledgments

This work was partially supported by the EU project Co4Robots.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Institute of Computer Science, FORTH, Heraklion, Greece
