RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies

  • Vahid Behzadan
  • William Hsu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11699)


This paper investigates the resilience and robustness of Deep Reinforcement Learning (DRL) policies to adversarial perturbations in the state space. We first present an approach for the disentanglement of vulnerabilities caused by representation learning of DRL agents from those that stem from the sensitivity of the DRL policies to distributional shifts in state transitions. Building on this approach, we propose two RL-based techniques for quantitative benchmarking of adversarial resilience and robustness in DRL policies against perturbations of state transitions. We demonstrate the feasibility of our proposals through experimental evaluation of resilience and robustness in DQN, A2C, and PPO2 policies trained in the CartPole environment.


Deep Reinforcement Learning, Adversarial attack, Policy generalization, Resilience, Robustness, Benchmarking



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Kansas State University, Manhattan, USA
