Efficient and Robust Learning on Elaborated Gaits with Curriculum Learning

Zhou, Bo; Zeng, Hongsheng; Wang, Fan; Lian, Rongzhong; Tian, Hao

doi:10.1007/978-3-030-29135-8_10

Conference paper
First Online: 30 November 2019

Part of the book series: The Springer Series on Challenges in Machine Learning (SSCML)

Abstract

Developing efficient walking gaits for biomechanical robots is a difficult task that requires optimizing parameters in a continuous, multidimensional space. In this paper, we present a new framework for learning complex gaits with musculoskeletal models. We use Deep Deterministic Policy Gradient (DDPG), driven by an external control command, and apply curriculum learning to acquire a reasonable starting policy. We accelerate the learning process with large-scale distributed training and a bootstrapped deep exploration paradigm. As a result, our approach won the NeurIPS 2018: AI for Prosthetics competition, scoring more than 30 points higher than the second-placed solution.
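
The curriculum idea in the abstract can be illustrated with a minimal Python sketch: a policy is trained on a sequence of task settings and is promoted to the next setting once it scores well enough, so each stage starts from the policy learned in the previous one. This is only an illustration under stated assumptions and not the authors' implementation. The names `Stage`, `run_curriculum`, and `toy_trainer`, the stage speeds, and the promotion thresholds are all hypothetical, and the toy trainer stands in for a real reinforcement learning update such as DDPG.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One curriculum stage (all values here are hypothetical)."""
    target_speed: float  # task setting for this stage
    pass_score: float    # score required before moving to the next stage


def run_curriculum(
    stages: List[Stage],
    train_one_round: Callable[[dict, float], float],
    policy: dict,
    max_rounds_per_stage: int = 100,
) -> List[int]:
    """Train through the stages in order, reusing the same policy object.

    Returns the number of training rounds spent in each stage.
    """
    rounds_used = []
    for stage in stages:
        rounds = 0
        while rounds < max_rounds_per_stage:
            rounds += 1
            score = train_one_round(policy, stage.target_speed)
            if score >= stage.pass_score:
                break  # promote: the next stage starts from this policy
        rounds_used.append(rounds)
    return rounds_used


def toy_trainer(policy: dict, target_speed: float) -> float:
    """Stand-in for an RL update: nudges a scalar 'speed' toward the target."""
    policy["speed"] += 0.25 * (target_speed - policy["speed"])
    # Higher is better; 0.0 means the target is matched exactly.
    return -abs(target_speed - policy["speed"])


policy = {"speed": 0.0}
stages = [Stage(3.0, -0.1), Stage(2.0, -0.1), Stage(1.25, -0.1)]
rounds = run_curriculum(stages, toy_trainer, policy)
print(rounds, policy)
```

In a real setup, `train_one_round` would collect episodes in the simulator and update the actor and critic, and the policy would be a set of network weights rather than a single number. The point of the sketch is only the control flow: the same policy is carried across stages instead of being re-initialized.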

Notes

1. https://www.crowdai.org/challenges/31/leaderboards.
2. Our code is based on PARL, the PaddlePaddle Reinforcement Learning tool, https://github.com/PaddlePaddle/PARL.
3. https://github.com/PaddlePaddle/PARL/tree/develop/examples/NeurIPS2018-AI-for-Prosthetics-Challenge.

Author information

Authors and Affiliations

Baidu Inc., Shenzhen, China
Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian & Hao Tian

Corresponding author

Correspondence to Bo Zhou.

Editor information

Editors and Affiliations

Universitat de Barcelona and Computer Vision Center, Barcelona, Spain
Sergio Escalera
Amazon (Berlin), Berlin, Germany
Ralf Herbrich

Appendix

1.1 Hyper-Parameters

We present the hyper-parameters used in our experiments in Table 1. For more details about the implementation, please refer to our code repository.

Table 1 Hyper-parameters

Cite this paper

Zhou, B., Zeng, H., Wang, F., Lian, R., Tian, H. (2020). Efficient and Robust Learning on Elaborated Gaits with Curriculum Learning. In: Escalera, S., Herbrich, R. (eds) The NeurIPS '18 Competition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-29135-8_10

DOI: https://doi.org/10.1007/978-3-030-29135-8_10
Published: 30 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29134-1
Online ISBN: 978-3-030-29135-8
eBook Packages: Computer Science, Computer Science (R0)
