Adapting Egocentric Visual Hand Pose Estimation Towards a Robot-Controlled Exoskeleton

  • Gerald Baulig
  • Thomas Gulde
  • Cristóbal Curio
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11134)


Abstract

The basic idea behind a wearable robotic grasp assistance system is to support people who suffer from severe motor impairments in daily activities. Such a system needs to act largely autonomously and in accordance with the user's intent. Vision-based hand pose estimation could be an integral part of a larger control and assistance framework. In this paper we evaluate the performance of egocentric monocular hand pose estimation for a robot-controlled hand exoskeleton in simulation. For hand pose estimation we adopt a Convolutional Neural Network (CNN), which we train and evaluate on computer graphics imagery created by our own data generator. To guide further design decisions, our experiments focus on two egocentric camera viewpoints, tested on synthetic data rendered from a 3D-scanned hand model with and without an exoskeleton attached to it. We observe that, in the context of our simulation, hand pose estimation with a wrist-mounted camera is more accurate than with a head-mounted camera. Further, a grasp assistance system attached to the hand alters the visual appearance and can improve hand pose estimation. Our experiments provide useful insights for the integration of sensors into a context-sensitive analysis framework for intelligent assistance.
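The viewpoint comparison described above reduces to measuring keypoint error of the CNN's predictions against ground truth from the synthetic renderer. The following sketch illustrates such an evaluation with a mean per-keypoint Euclidean error; the array shapes, the 21-joint hand layout, and the simulated noise levels are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def mean_keypoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth
    keypoints, averaged over all joints and frames.
    pred, gt: arrays of shape (frames, joints, coords)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Hypothetical synthetic evaluation: 100 frames, 21 hand joints, 2D image coordinates.
rng = np.random.default_rng(0)
gt = rng.uniform(0, 256, size=(100, 21, 2))

# Stand-in for CNN predictions from two camera setups: here the
# wrist-mounted view is simulated with lower noise than the head-mounted one.
wrist_pred = gt + rng.normal(0.0, 2.0, size=gt.shape)
head_pred = gt + rng.normal(0.0, 5.0, size=gt.shape)

err_wrist = mean_keypoint_error(wrist_pred, gt)
err_head = mean_keypoint_error(head_pred, gt)
print(f"wrist-mounted error: {err_wrist:.2f} px")
print(f"head-mounted error:  {err_head:.2f} px")
```

With ground truth available for every rendered frame, the same metric can be recomputed per configuration (with/without exoskeleton, per viewpoint) to rank camera placements.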


Keywords: Hand pose estimation · Egocentric view · Grasp assistance · Simulation



Acknowledgements

We gratefully acknowledge the Baden-Württemberg Foundation for supporting KONSENS-NHE (NEU007/3) in the program neurorobotics, as well as the Federal Ministry of Education and Research (BMBF) for funding the projects MoCap 4.0 (03FH015IN6) and KollRo 4.0 (13FH049PX5). The exoskeleton CAD model prototype was kindly provided by Jonathan Eckstein from the University of Stuttgart.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science, Reutlingen University, Reutlingen, Germany
