Abstract
In this part of the monograph, we use the kinematic and control architectures built in Part I to learn cooperative manipulation. To this end, we first introduce Policy Search (PS), a subfield of Reinforcement Learning, in Sect. 5.1. For further details on PS, [4] presents a more exhaustive review, with a detailed description of many state-of-the-art PS algorithms and the reasoning behind them. We also introduce the concept of Movement Primitives in Sect. 5.2, a motion characterization well suited to PS.
Notes
1. For numerical reasons, \(\frac{R_k-\text {min}(\mathbf {R})}{\text {max}(\mathbf {R})-\text {min}(\mathbf {R})}\) is often used instead.
2. In standard DMPs, the radial basis functions, represented as Gaussian curves, are uniformly distributed over the time domain for each DoF of the trajectory. The weights corresponding to each Gaussian are then obtained through least-squares techniques. This approach is used throughout this book.
3. Note that other characterizations, such as \(g_i(x)=\frac{\phi _i \left( x \right) }{\sum _j \phi _j \left( x \right) } x (\mathbf {y}_g-\mathbf {y}_0)\), allow for better goal rescaling [6].
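The two numerical devices mentioned in these notes can be sketched in a few lines of NumPy. The snippet below is an illustrative sketch, not the book's implementation: the function names (`normalize_rewards`, `fit_dmp_weights`), the number of basis functions, and the basis width are assumptions chosen for the example. It shows min-max reward normalization (note 1) and least-squares fitting of weights for Gaussian basis functions uniformly spaced in time (note 2).

```python
import numpy as np

def normalize_rewards(R):
    """Min-max normalize rollout rewards to [0, 1] (note 1).
    Assumes the rewards are not all identical."""
    R = np.asarray(R, dtype=float)
    return (R - R.min()) / (R.max() - R.min())

def gaussian_basis(x, centers, width):
    """Evaluate phi_i(x) = exp(-width * (x - c_i)^2) for all bases."""
    return np.exp(-width * (x[:, None] - centers[None, :]) ** 2)

def fit_dmp_weights(x, f_target, n_basis=15, width=100.0):
    """Fit weights of uniformly spaced Gaussian bases by least squares (note 2).
    x: phase/time samples; f_target: forcing-term values to approximate."""
    centers = np.linspace(x.min(), x.max(), n_basis)
    Phi = gaussian_basis(x, centers, width)
    G = Phi / Phi.sum(axis=1, keepdims=True)   # normalized bases
    # w = argmin_w || G w - f_target ||^2
    w, *_ = np.linalg.lstsq(G, f_target, rcond=None)
    return w, centers, G

# Usage: approximate a smooth 1-DoF target trajectory with 15 bases.
x = np.linspace(0.0, 1.0, 200)
f_target = np.sin(2.0 * np.pi * x)
w, centers, G = fit_dmp_weights(x, f_target)
f_hat = G @ w   # reconstruction from the fitted weights
```

With enough overlapping bases, the least-squares reconstruction closely tracks any smooth target; in practice the number and width of the Gaussians trade off smoothness against accuracy.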
References
Abdolmaleki, A., Simões, D., Lau, N., Reis, L.P., Neumann, G.: Contextual relative entropy policy search with covariance matrix adaptation. In: 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 94–99 (2016)
Colomé, A., Neumann, G., Peters, J., Torras, C.: Dimensionality reduction for probabilistic movement primitives. In: IEEE-RAS 14th International Conference on Humanoid Robots (Humanoids), pp. 794–800 (2014)
Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)
Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot 2(1–2), 1–142 (2013)
Fabisch, A., Metzen, J.H.: Active contextual policy search. J. Mach. Learn. Res. 15, 3371–3399 (2014)
Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)
Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp. 1398–1403 (2002)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Kupcsik, A.G., Deisenroth, M.P., Peters, J., Neumann, G.: Data-efficient generalization of robot skills with contextual policy search. In: AAAI Conference on Artificial Intelligence, pp. 1401–1407 (2013)
Lazaric, A., Ghavamzadeh, M.: Bayesian multi-task reinforcement learning. In: International Conference on Machine Learning (ICML), pp. 599–606 (2010)
Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning, pp. 817–824 (2011)
Paraschos, A., Daniel, C., Peters, J., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems (NIPS), pp. 2616–2624 (2013)
Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)
Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225 (2006)
Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Netw. 21(4), 682–697 (2008)
Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. In: 29th International Conference on Machine Learning (ICML) (2012)
Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)
Theodorou, E., Buchli, J., Schaal, S.: A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137–3181 (2010)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Colomé, A., Torras, C. (2020). Preliminaries. In: Reinforcement Learning of Bimanual Robot Skills. Springer Tracts in Advanced Robotics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-26326-3_5
Print ISBN: 978-3-030-26325-6
Online ISBN: 978-3-030-26326-3
eBook Packages: Intelligent Technologies and Robotics