Learning to Gesticulate by Observation Using a Deep Generative Approach

  • Unai Zabala
  • Igor Rodriguez
  • José María Martínez-Otzeta
  • Elena Lazkano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11876)


This paper presents a system that generates natural talking gestures for a humanoid robot by training a Generative Adversarial Network (GAN) on human talking gestures recorded with a Kinect. A direct kinematic approach translates human poses into robot joint positions. The accompanying videos show that the robot can use a wide variety of gestures, giving it a natural, non-monotonous level of expression.
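To make the idea of a direct pose-to-joint mapping concrete, the following is a minimal sketch and not the authors' actual mapping. It assumes three 3D skeleton points (shoulder, elbow, wrist) such as a depth sensor might report, and computes one joint angle directly from them. The function names and the joint limits are hypothetical and chosen only for illustration.

```python
import math


def interior_angle(a, b, c):
    """Return the angle at point b (radians) formed by segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("degenerate segment: two skeleton points coincide")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def elbow_flexion(shoulder, elbow, wrist, lower=0.0, upper=math.pi / 2):
    """Map three human skeleton points to a single elbow flexion command.

    A straight arm gives 0 rad. The result is clamped to [lower, upper],
    which are placeholder limits (assumptions), not values from any real robot.
    """
    flexion = math.pi - interior_angle(shoulder, elbow, wrist)
    return max(lower, min(upper, flexion))


if __name__ == "__main__":
    # Upper arm pointing down, forearm pointing forward: a right-angle bend.
    angle = elbow_flexion((0.0, 0.0, 0.0), (0.0, -0.3, 0.0), (0.3, -0.3, 0.0))
    print(f"elbow flexion: {math.degrees(angle):.1f} degrees")
```

Computing each angle directly from the observed limb geometry avoids solving for joint values from an end-effector target, which is the usual contrast with inverse-kinematics approaches. A full mapping would repeat this kind of calculation for every joint of the robot's arms and head.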


Keywords: Social robots; Motion capturing and imitation; Generative Adversarial Networks; Talking movements



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science and Artificial Intelligence, Faculty of Informatics, UPV/EHU, Donostia, Spain
