A Deep Reinforcement Learning Method for Self-driving

  • Yong Fang
  • Jianfeng Gu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10955)


Self-driving technology is an important topic in artificial intelligence. Building on end-to-end architectures, deep reinforcement learning has been applied to self-driving research. However, the self-driving environment yields sparse rewards, which can trap network training in a local optimum. As a result, the self-driving vehicle does not obtain correct actions from the outputs of the neural network. This paper proposes a deep reinforcement learning method for self-driving. The sparse rewards are divided into three groups according to a classification threshold that is dynamically adjusted based on the reward distribution. The experience information for the different reward groups is fully utilized, and the local optimum problem in network training is avoided. Compared with the traditional method, simulation results show that the proposed method significantly reduces the network training time.
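The reward-classification idea can be illustrated with a minimal sketch. The abstract does not give the actual threshold rule or sampling scheme, so everything specific here is an assumption made for illustration: the class name `ClassifiedReplayBuffer`, thresholds set at the mean of recent rewards plus or minus `k` standard deviations, and a batch drawn evenly from the three groups. It is not the authors' implementation.

```python
import random
import statistics
from collections import deque


class ClassifiedReplayBuffer:
    """Hypothetical replay memory that files transitions into three reward groups."""

    def __init__(self, capacity_per_group=10000, window=1000, k=0.5):
        # One bounded pool per reward group.
        self.groups = {
            name: deque(maxlen=capacity_per_group)
            for name in ("low", "mid", "high")
        }
        # Sliding window of recent rewards used to estimate the distribution.
        self.recent_rewards = deque(maxlen=window)
        self.k = k  # assumed width of the "mid" band, in standard deviations

    def thresholds(self):
        """Return (lower, upper) thresholds from the recent reward distribution."""
        if len(self.recent_rewards) < 2:
            return 0.0, 0.0
        mean = statistics.fmean(self.recent_rewards)
        std = statistics.pstdev(self.recent_rewards)
        return mean - self.k * std, mean + self.k * std

    def classify(self, reward):
        """Assign a reward to one of the three groups."""
        lower, upper = self.thresholds()
        if reward < lower:
            return "low"
        if reward > upper:
            return "high"
        return "mid"

    def add(self, transition):
        """Store a (state, action, reward, next_state, done) tuple."""
        reward = transition[2]
        self.recent_rewards.append(reward)  # thresholds adapt as rewards arrive
        self.groups[self.classify(reward)].append(transition)

    def sample(self, batch_size, rng=random):
        """Draw roughly equal shares from every non-empty group."""
        non_empty = [g for g in self.groups.values() if g]
        if not non_empty:
            return []
        share = max(1, batch_size // len(non_empty))
        batch = []
        for group in non_empty:
            batch.extend(rng.sample(list(group), min(share, len(group))))
        return batch[:batch_size]


buffer = ClassifiedReplayBuffer()
for r in [0.0] * 8 + [-1.0, 1.0]:  # mostly zero rewards, two rare signals
    buffer.add((None, 0, r, None, False))
batch = buffer.sample(6)
```

Because each group contributes to every batch, the rare non-zero rewards are not drowned out by the many zero-reward transitions, which is the intuition the abstract describes. Note that `sample` may return fewer than `batch_size` transitions when a group holds less than its share.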


Keywords: Self-driving, Deep reinforcement learning, Sparse rewards, Reward classification



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Key Laboratory for Specialty Fiber Optics and Optical Access Networks, Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
