In this chapter, we provide a practical project that gives readers hands-on experience with deep reinforcement learning applications. We adopt a challenge hosted by CrowdAI at NIPS (now NeurIPS) 2017: Learning to Run. The environment has a continuous 41-dimensional state space and a continuous 18-dimensional action space, which makes it a moderately large-scale environment for novices to gain experience with. We provide a soft actor-critic solution for the task, together with some tricks applied to boost performance. The environment and code are available at https://github.com/deep-reinforcement-learning-book/Chapter13-Learning-to-Run.
Keywords: Learning to run · Deep reinforcement learning · Soft actor-critic · Parallel training
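To make the scale of the task concrete, the following is a minimal sketch of the kind of squashed-Gaussian stochastic policy that soft actor-critic uses, sized to the 41-dimensional state and 18-dimensional action spaces mentioned above. It assumes PyTorch; the hidden width and network structure are illustrative choices, not the chapter's exact implementation.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 41, 18  # dimensions of the Learning to Run environment

class GaussianPolicy(nn.Module):
    """Illustrative SAC-style actor: outputs a tanh-squashed Gaussian action."""

    def __init__(self, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.trunk(state)
        mean = self.mean(h)
        log_std = self.log_std(h).clamp(-20, 2)  # keep the std in a sane range
        normal = torch.distributions.Normal(mean, log_std.exp())
        x = normal.rsample()              # reparameterized sample
        action = torch.tanh(x)            # squash into the bounded action space
        # log-probability with the tanh change-of-variables correction
        log_prob = (normal.log_prob(x)
                    - torch.log(1 - action.pow(2) + 1e-6)).sum(dim=-1)
        return action, log_prob

policy = GaussianPolicy()
state = torch.randn(4, STATE_DIM)  # a batch of 4 fake observations
action, log_prob = policy(state)
```

A forward pass on a batch of observations returns one 18-dimensional action and one log-probability per observation; the log-probability term is what SAC's entropy-regularized objective optimizes.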