
Efficient and Robust Learning on Elaborated Gaits with Curriculum Learning

  • Conference paper

Part of the book series: The Springer Series on Challenges in Machine Learning ((SSCML))

Abstract

Developing efficient walking gaits for biomechanical robots is a difficult task that requires optimizing parameters in a continuous, multidimensional space. In this paper we present a new framework for learning complex gaits with musculoskeletal models. We use Deep Deterministic Policy Gradient (DDPG) conditioned on an external control command, and apply curriculum learning to acquire a reasonable starting policy. We accelerate the learning process with large-scale distributed training and a bootstrapped deep-exploration paradigm. As a result, our approach won the NeurIPS 2018 AI for Prosthetics competition, scoring more than 30 points higher than the second-placed solution.
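As a rough illustration of the curriculum idea above (this is a minimal, hypothetical sketch, not the authors' implementation: the stage ranges and reward threshold below are illustrative, and the actual schedule is in the code repository), a curriculum might start the policy on a fixed easy walking command and widen the range of target-velocity commands as recent episode rewards improve:

```python
import random

# Illustrative curriculum stages: (low, high) target velocity in m/s.
# Stage 0 is a single fixed command; later stages widen the command range.
_STAGES = [(1.25, 1.25), (1.0, 1.5), (0.75, 2.0)]

def sample_target_velocity(stage, rng=random.Random(0)):
    """Sample a target walking velocity for the given curriculum stage."""
    low, high = _STAGES[min(stage, len(_STAGES) - 1)]
    return rng.uniform(low, high)

def advance_stage(stage, recent_rewards, threshold=9000.0):
    """Advance to the next stage once the mean reward over recent
    episodes exceeds a (hypothetical) threshold; otherwise stay put."""
    if recent_rewards and sum(recent_rewards) / len(recent_rewards) > threshold:
        return stage + 1
    return stage
```

In this sketch, the early fixed-command stage plays the role of the "reasonable starting policy"; the widening velocity ranges then expose the policy to the externally commanded targets it must track.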


Notes

  1. https://www.crowdai.org/challenges/31/leaderboards.

  2. Our code is based on PARL, the PaddlePaddle Reinforcement Learning tool, https://github.com/PaddlePaddle/PARL.

  3. https://github.com/PaddlePaddle/PARL/tree/develop/examples/NeurIPS2018-AI-for-Prosthetics-Challenge.


Author information

Correspondence to Bo Zhou.

Appendix

1.1 Hyper-Parameters

We present the hyper-parameters used in our experiments in Table 1; for more details about the implementation, please refer to our code repository.

Table 1 Hyper-parameters


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, B., Zeng, H., Wang, F., Lian, R., Tian, H. (2020). Efficient and Robust Learning on Elaborated Gaits with Curriculum Learning. In: Escalera, S., Herbrich, R. (eds) The NeurIPS '18 Competition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-29135-8_10


  • DOI: https://doi.org/10.1007/978-3-030-29135-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29134-1

  • Online ISBN: 978-3-030-29135-8

  • eBook Packages: Computer Science (R0)
