
Part of the book series: Springer Tracts in Advanced Robotics (STAR, volume 134)


Abstract

In this part of the monograph, we use the kinematic and control architectures built in Part I to learn cooperative manipulation. To this end, we first introduce Policy Search (PS), a subtype of Reinforcement Learning, in Sect. 5.1. For further details on PS, [4] presents a more exhaustive review, with detailed descriptions of many state-of-the-art PS algorithms and the reasoning behind them. We also introduce the concept of Movement Primitives in Sect. 5.2, a motion characterization well suited to PS.


Notes

  1. For numerical reasons, \(\frac{R_k-\min(\mathbf{R})}{\max(\mathbf{R})-\min(\mathbf{R})}\) is often used instead (see the first sketch after these notes).

  2. In standard DMPs, such a set of radial basis functions, represented by Gaussian curves, is uniformly distributed in the time domain for each DoF of the trajectory. The weights corresponding to each of these Gaussians are then obtained through least-squares techniques. This approach is used throughout this book (see the second sketch after these notes).

  3. Note that other characterizations, such as \(g_i(x)=\frac{\phi_i(x)}{\sum_j \phi_j(x)}\, x\, (\mathbf{y}_g-\mathbf{y}_0)\), allow for a better goal rescaling [6].

References

  1. Abdolmaleki, A., Simões, D., Lau, N., Reis, L.P., Neumann, G.: Contextual relative entropy policy search with covariance matrix adaptation. In: 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 94–99 (2016)

  2. Colomé, A., Neumann, G., Peters, J., Torras, C.: Dimensionality reduction for probabilistic movement primitives. In: IEEE-RAS 14th International Conference on Humanoid Robots (Humanoids), pp. 794–800 (2014)

  3. Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)

  4. Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot. 2(1–2), 1–142 (2013)

  5. Fabisch, A., Metzen, J.H.: Active contextual policy search. J. Mach. Learn. Res. 15, 3371–3399 (2014)

  6. Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)

  7. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp. 1398–1403 (2002)

  8. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. (1951)

  9. Kupcsik, A.G., Deisenroth, M.P., Peters, J., Neumann, G.: Data-efficient generalization of robot skills with contextual policy search. In: AAAI Conference on Artificial Intelligence, pp. 1401–1407 (2013)

  10. Lazaric, A., Ghavamzadeh, M.: Bayesian multi-task reinforcement learning. In: International Conference on Machine Learning (ICML), pp. 599–606 (2010)

  11. Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning (ICML), pp. 817–824 (2011)

  12. Paraschos, A., Daniel, C., Peters, J., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems (NIPS), pp. 2616–2624 (2013)

  13. Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)

  14. Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225 (2006)

  15. Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Netw. 21(4), 682–697 (2008)

  16. Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. In: 29th International Conference on Machine Learning (ICML) (2012). arXiv:1206.4621

  17. Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)

  18. Theodorou, E., Buchli, J., Schaal, S.: A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137–3181 (2010)


Author information

Correspondence to Adrià Colomé.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Colomé, A., Torras, C. (2020). Preliminaries. In: Reinforcement Learning of Bimanual Robot Skills. Springer Tracts in Advanced Robotics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-26326-3_5
