
Part of the book series: Springer Tracts in Advanced Robotics (STAR, volume 134)


Abstract

In this part of the monograph, we use the kinematic and control architectures built in Part I to learn cooperative manipulation. To this end, we first introduce Policy Search (PS), a subtype of Reinforcement Learning, in Sect. 5.1. For further details on PS, [4] presents a more exhaustive review, with detailed descriptions of many state-of-the-art PS algorithms and the reasoning behind them. We also introduce the concept of Movement Primitives in Sect. 5.2, a motion characterization well suited to PS.


Notes

  1. For numerical reasons, \(\frac{R_k-\min(\mathbf{R})}{\max(\mathbf{R})-\min(\mathbf{R})}\) is often used instead (see the first sketch after these notes).

  2. In standard DMPs, such a set of radial basis functions, represented by Gaussian curves, is uniformly distributed in the time domain for each DoF of the trajectory. The weights corresponding to each of these Gaussians are then obtained through least-squares techniques. This approach is used throughout this book (see the second sketch after these notes).

  3. Note that other characterizations, such as \(g_i(x)=\frac{\phi_i(x)}{\sum_j \phi_j(x)}\, x\, (\mathbf{y}_g-\mathbf{y}_0)\), allow for a better goal rescaling [6].

References

  1. Abdolmaleki, A., Simões, D., Lau, N., Reis, L.P., Neumann, G.: Contextual relative entropy policy search with covariance matrix adaptation. In: 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 94–99 (2016)

  2. Colomé, A., Neumann, G., Peters, J., Torras, C.: Dimensionality reduction for probabilistic movement primitives. In: IEEE-RAS 14th International Conference on Humanoid Robots (Humanoids), pp. 794–800 (2014)

  3. Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)

  4. Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot. 2(1–2), 1–142 (2013)

  5. Fabisch, A., Metzen, J.H.: Active contextual policy search. J. Mach. Learn. Res. 15, 3371–3399 (2014)

  6. Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)

  7. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp. 1398–1403 (2002)

  8. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. (1951)

  9. Kupcsik, A.G., Deisenroth, M.P., Peters, J., Neumann, G.: Data-efficient generalization of robot skills with contextual policy search. In: AAAI Conference on Artificial Intelligence, pp. 1401–1407 (2013)

  10. Lazaric, A., Ghavamzadeh, M.: Bayesian multi-task reinforcement learning. In: International Conference on Machine Learning (ICML), pp. 599–606 (2010)

  11. Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning (ICML), pp. 817–824 (2011)

  12. Paraschos, A., Daniel, C., Peters, J., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems (NIPS), pp. 2616–2624 (2013)

  13. Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)

  14. Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225 (2006)

  15. Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Netw. 21(4), 682–697 (2008)

  16. Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. In: 29th International Conference on Machine Learning (ICML) (2012). arXiv:1206.4621

  17. Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)

  18. Theodorou, E., Buchli, J., Schaal, S.: A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137–3181 (2010)


Author information

Correspondence to Adrià Colomé.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Colomé, A., Torras, C. (2020). Preliminaries. In: Reinforcement Learning of Bimanual Robot Skills. Springer Tracts in Advanced Robotics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-26326-3_5
