Preliminaries
Abstract
In this part of the monograph, we use the kinematic and control architectures built in Part I to learn cooperative manipulation. To this end, we first introduce Policy Search (PS), a subfield of Reinforcement Learning, in Sect. 5.1. For further details on PS, [4] presents a more exhaustive review, with a detailed description of many state-of-the-art PS algorithms and the reasoning behind them. We also introduce the concept of Movement Primitives in Sect. 5.2, a motion representation particularly well suited to PS. A minimal sketch combining both ideas follows.
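To make the two concepts concrete before the formal treatment, the sketch below parameterizes a one-dimensional trajectory with normalized Gaussian basis functions, the core ingredient of the movement-primitive representations in Sect. 5.2 [6, 12], and improves its weights with a reward-weighted episodic policy-search update in the spirit of the algorithms surveyed in [4]. The basis count, softmax temperature, toy reward, and exploration decay are illustrative assumptions, not values from the monograph.

```python
import numpy as np

# --- Movement-primitive-style trajectory parameterization (toy, 1-D) ---
# A trajectory is a weighted sum of normalized Gaussian basis functions over
# a phase variable z in [0, 1], loosely following the basis-function
# representations used by DMPs [6] and ProMPs [12].
T = 50                                   # time steps per rollout
K = 10                                   # number of basis functions (illustrative)
z = np.linspace(0.0, 1.0, T)             # phase variable
centers = np.linspace(0.0, 1.0, K)
width = 0.5 * (centers[1] - centers[0])
Phi = np.exp(-0.5 * ((z[:, None] - centers[None, :]) / width) ** 2)
Phi /= Phi.sum(axis=1, keepdims=True)    # normalized basis matrix, shape (T, K)

def rollout(w):
    """Trajectory induced by the weight vector w."""
    return Phi @ w

def reward(w, target=1.0):
    """Toy episodic reward: end near `target` with a small effort penalty."""
    y = rollout(w)
    return -(y[-1] - target) ** 2 - 1e-4 * np.dot(w, w)

# --- Episodic policy search: reward-weighted update of a Gaussian over w ---
rng = np.random.default_rng(0)
mu, sigma = np.zeros(K), 1.0             # search distribution N(mu, sigma^2 I)
for it in range(100):
    W = mu + sigma * rng.standard_normal((20, K))  # sample 20 candidate weight vectors
    R = np.array([reward(w) for w in W])
    p = np.exp((R - R.max()) / 0.1)                # softmax over returns (temperature 0.1)
    p /= p.sum()
    mu = p @ W                                     # reward-weighted mean update
    sigma *= 0.98                                  # simple exploration decay (assumption)

print("final reward:", reward(mu))
```

The reward-weighted mean update mirrors the weighted maximum-likelihood step underlying path-integral methods such as PI² [18]; full algorithms like REPS [13] additionally bound each update with a Kullback-Leibler constraint [8], a refinement omitted here for brevity.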
References
- 1. Abdolmaleki, A., Simões, D., Lau, N., Reis, L.P., Neumann, G.: Contextual relative entropy policy search with covariance matrix adaptation. In: 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 94–99 (2016)
- 2. Colomé, A., Neumann, G., Peters, J., Torras, C.: Dimensionality reduction for probabilistic movement primitives. In: IEEE-RAS 14th International Conference on Humanoid Robots (Humanoids), pp. 794–800 (2014)
- 3. Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)
- 4. Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot. 2(1–2), 1–142 (2013)
- 5. Fabisch, A., Metzen, J.H.: Active contextual policy search. J. Mach. Learn. Res. 15, 3371–3399 (2014)
- 6. Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)
- 7. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp. 1398–1403 (2002)
- 8. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
- 9. Kupcsik, A.G., Deisenroth, M.P., Peters, J., Neumann, G.: Data-efficient generalization of robot skills with contextual policy search. In: AAAI Conference on Artificial Intelligence, pp. 1401–1407 (2013)
- 10. Lazaric, A., Ghavamzadeh, M.: Bayesian multi-task reinforcement learning. In: International Conference on Machine Learning (ICML), pp. 599–606 (2010)
- 11. Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning (ICML), pp. 817–824 (2011)
- 12. Paraschos, A., Daniel, C., Peters, J., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems (NIPS), pp. 2616–2624 (2013)
- 13. Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)
- 14. Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225 (2006)
- 15. Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Netw. 21(4), 682–697 (2008)
- 16. Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. In: 29th International Conference on Machine Learning (ICML) (2012)
- 17. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)
- 18. Theodorou, E., Buchli, J., Schaal, S.: A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137–3181 (2010)