Sampling Efficiency in Learning Robot Motion

Colomé, Adrià; Torras, Carme

doi:10.1007/978-3-030-26326-3_6

Part of the book series: Springer Tracts in Advanced Robotics (STAR, volume 134)

Abstract

Policy Search (PS) algorithms are widely used for their simplicity and effectiveness in solving robotic problems. However, most current PS algorithms derive policies by statistically fitting the data from the best experiments only, so experiments that perform poorly are usually discarded or given too little influence on the policy update. In this chapter, we propose a generalization of the Relative Entropy Policy Search (REPS) algorithm that takes bad experiences into consideration when computing a policy. The proposed approach, named Dual REPS (DREPS) [1] after the philosophical duality between good and bad, finds clusters of experimental data that yield poor behavior and adds them to the optimization problem as a repulsive constraint. Both good and bad data samples are thus taken into account in the stochastic search for a policy. Additionally, a cluster with the best samples may be included as an attractor to enforce faster convergence to a single optimal solution in multimodal problems.
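
To illustrate why poorly performing experiments end up with little influence, the following is a minimal sketch of the sample weighting used in standard episodic REPS, the baseline that the chapter generalizes. It is not the DREPS method itself: the abstract gives no formula for the repulsive constraint, so none is reproduced here. The function name `reps_weights`, the KL bound `epsilon`, the search bounds, and the sample returns are all illustrative assumptions.

```python
import math


def reps_weights(returns, epsilon, log_eta_bounds=(math.log(1e-3), math.log(1e3))):
    """Episodic REPS sample weights under a KL bound `epsilon` (illustrative sketch).

    Minimizes the dual g(eta) = eta*epsilon + eta*log(mean(exp(R/eta)))
    over eta > 0, then returns eta and the normalized weights
    w_i proportional to exp(R_i / eta).
    """
    r_max = max(returns)  # subtracted inside exp() for numerical stability
    n = len(returns)

    def dual(log_eta):
        eta = math.exp(log_eta)
        mean_exp = sum(math.exp((r - r_max) / eta) for r in returns) / n
        return eta * epsilon + r_max + eta * math.log(mean_exp)

    # The dual is convex in eta, hence unimodal in log(eta):
    # a plain ternary search is enough for this one-dimensional problem.
    lo, hi = log_eta_bounds
    for _ in range(200):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(m1) < dual(m2):
            hi = m2
        else:
            lo = m1
    eta = math.exp(0.5 * (lo + hi))

    unnorm = [math.exp((r - r_max) / eta) for r in returns]
    total = sum(unnorm)
    return eta, [u / total for u in unnorm]


# Hypothetical returns of six rollouts, sorted from worst to best.
returns = [-5.0, -3.0, -2.5, -1.0, -0.5, -0.2]
epsilon = 0.5
eta, weights = reps_weights(returns, epsilon)

# KL divergence between the reweighted samples and the uniform sampling weights.
kl = sum(w * math.log(len(weights) * w) for w in weights if w > 0)

for r, w in zip(returns, weights):
    print(f"return {r:5.1f} -> weight {w:.4f}")
print(f"eta = {eta:.4f}, KL = {kl:.4f}")
```

In this sketch the weights decay exponentially with the gap to the best return, so the worst rollouts contribute almost nothing to a weighted policy fit. That discarded information is what the chapter proposes to exploit through clusters of bad samples.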

References

1. Colomé, A., Torras, C.: Dual REPS: a generalization of relative entropy policy search exploiting bad experiences. IEEE Trans. Robot. 33(4), 978–985 (2017)
2. Daniel, C., Neumann, G., Kroemer, O., Peters, J.: Hierarchical relative entropy policy search. J. Mach. Learn. Res. 17(93), 1–50 (2016)
3. Deisenroth, M.P., Neumann, G., Peters, J.: A survey on policy search for robotics. Found. Trends Robot. 2(1–2), 1–142 (2013)
4. Gómez, V., Kappen, H.J., Peters, J., Neumann, G.: Policy search for path integral control. In: European Conference in Machine Learning and Knowledge Discovery in Databases (ECML), pp. 482–497 (2014)
5. Jevtic, A., Colomé, A., Alenyà, G., Torras, C.: Learning robot motion through user intervention and policy search. In: ICRA Workshop on Nature versus Nurture in Robotics (2016)
6. Jevtic, A., Colomé, A., Alenyà, G., Torras, C.: User evaluation of an interactive learning framework for single-arm and dual-arm robots. In: 8th International Conference on Social Robotics, pp. 52–61 (2016)
7. Jevtic, A., Colomé, A., Alenyà, G., Torras, C.: Robot motion adaptation through user intervention and reinforcement learning. Pattern Recogn. Lett. 105, 67–75 (2018)
8. Khan, S.S., Ahmad, A.: Cluster center initialization algorithm for k-means clustering. Pattern Recogn. Lett. 25(11), 1293–1302 (2004)
9. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)
10. Neumann, G.: Variational inference for policy search in changing situations. In: International Conference on Machine Learning, pp. 817–824 (2011)
11. Peters, J., Mülling, K., Altün, Y.: Relative entropy policy search. In: AAAI Conference on Artificial Intelligence, pp. 1607–1612 (2010)
12. Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.J.: Learning movement primitives. In: 11th International Symposium on Robotics Research, pp. 561–572 (2005)

Author information

Authors and Affiliations

Institut de Robòtica i Informàtica Industrial (UPC-CSIC), Barcelona, Spain
Adrià Colomé & Carme Torras

Corresponding author

Correspondence to Adrià Colomé.

About this chapter

Cite this chapter

Colomé, A., Torras, C. (2020). Sampling Efficiency in Learning Robot Motion. In: Reinforcement Learning of Bimanual Robot Skills. Springer Tracts in Advanced Robotics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-26326-3_6

DOI: https://doi.org/10.1007/978-3-030-26326-3_6
Published: 28 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26325-6
Online ISBN: 978-3-030-26326-3
eBook Packages: Intelligent Technologies and Robotics (R0)
