• Adrià Colomé
  • Carme Torras
Part of the Springer Tracts in Advanced Robotics book series (STAR, volume 134)


In this part of the monograph, we use the kinematic and control architectures built in Part I to learn cooperative manipulation. To this end, we first introduce Policy Search (PS), a subtype of Reinforcement Learning, in Sect. 5.1. For further details on PS, see [4], which presents a more exhaustive review with a detailed description of many state-of-the-art PS algorithms and the reasoning behind them. In Sect. 5.2, we introduce Movement Primitives, a motion characterization well suited to PS.


  1. Abdolmaleki, A., Simões, D., Lau, N., Reis, L.P., Neumann, G.: Contextual relative entropy policy search with covariance matrix adaptation. In: 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 94–99 (2016)
  2. Colomé, A., Neumann, G., Peters, J., Torras, C.: Dimensionality reduction for probabilistic movement primitives. In: IEEE-RAS 14th International Conference on Humanoid Robots (Humanoids), pp. 794–800 (2014)
  3. Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)
  4. Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot. 2(1–2), 1–142 (2013)
  5. Fabisch, A., Metzen, J.H.: Active contextual policy search. J. Mach. Learn. Res. 15, 3371–3399 (2014)
  6. Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)
  7. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. IEEE Int. Conf. Robot. Autom. (ICRA) 2, 1398–1403 (2002)
  8. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. (1951)
  9. Kupcsik, A.G., Deisenroth, M.P., Peters, J., Neumann, G.: Data-efficient generalization of robot skills with contextual policy search. In: AAAI Conference on Artificial Intelligence, pp. 1401–1407 (2013)
  10. Lazaric, A., Ghavamzadeh, M.: Bayesian multi-task reinforcement learning. In: International Conference on Machine Learning (ICML), pp. 599–606 (2010)
  11. Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning, pp. 817–824 (2011)
  12. Paraschos, A., Daniel, C., Peters, J., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems (NIPS), pp. 2616–2624 (2013)
  13. Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)
  14. Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225 (2006)
  15. Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Netw. 21(4), 682–697 (2008)
  16. Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. In: 29th International Conference on Machine Learning, volume abs/1206.4621 (2012)
  17. Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)
  18. Theodorou, E., Buchli, J., Schaal, S.: A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137–3181 (2010)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Institut de Robòtica i Informàtica Industrial (UPC-CSIC), Barcelona, Spain
