Motion Capture of Hands in Action Using Discriminative Salient Points

  • Luca Ballan
  • Aparna Taneja
  • Jürgen Gall
  • Luc Van Gool
  • Marc Pollefeys
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7577)


Capturing the motion of two hands interacting with an object is a very challenging task due to the large number of degrees of freedom, self-occlusions, and similarity between the fingers, even in the case of multiple cameras observing the scene. In this paper we propose to use discriminatively learned salient points on the fingers and to estimate the finger-salient point associations simultaneously with the estimation of the hand pose. We introduce a differentiable objective function that also takes edges, optical flow and collisions into account. Our qualitative and quantitative evaluations show that the proposed approach achieves very accurate results for several challenging sequences containing hands and objects in action.


Keywords (added by machine, not by the authors): Particle Swarm Optimization, Motion Capture, Salient Point, Virtual Node, Hand Tracking



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Luca Ballan (1)
  • Aparna Taneja (1)
  • Jürgen Gall (1)
  • Luc Van Gool (1)
  • Marc Pollefeys (1)

  1. ETH Zurich, Switzerland
