Visual Rationalizations in Deep Reinforcement Learning for Atari Games

Weitkamp, Laurens; van der Pol, Elise; Akata, Zeynep

doi:10.1007/978-3-030-31978-6_12

Conference paper
First Online: 25 September 2019

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1021)

Abstract

Because deep learning performs well on high-dimensional problems, deep reinforcement learning agents perform well in challenging tasks such as Atari 2600 games. However, clearly explaining why the agent takes a certain action can be as important as the decision itself. Deep reinforcement learning models, like other deep learning models, tend to be opaque in their decision-making process. In this work, we propose to make deep reinforcement learning more transparent by visualizing the evidence on which the agent bases its decision. We also emphasize the importance of producing a justification for an observed action, which could be applied to a black-box decision agent.

Notes

1. Here we assume problems where partial observability can be addressed by representing a state as a small number of past observations.
2. k is usually chosen to be the last convolutional layer in the CNN.
3. The Exponential Linear Unit has been chosen in favor of the ReLU used in the original Grad-CAM paper due to the dying ReLU effect described in [22].
4. Evaluated in this case means having forwarded each state that has been manually sampled through the model.
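
Notes 2 and 3 describe a Grad-CAM variant in which the final ReLU is replaced by an Exponential Linear Unit. The full method is not reproduced on this page, so the following is only a minimal sketch under stated assumptions: it uses the standard Grad-CAM recipe of Selvaraju et al. (average each channel's gradients to get a weight, then take the weighted sum of the feature maps of layer k), and it assumes the gradients of some action score with respect to those feature maps are already available. The function names `elu` and `grad_cam_elu` and the toy numbers are hypothetical and chosen for illustration.

```python
import math


def elu(x, alpha=1.0):
    """Exponential Linear Unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def grad_cam_elu(feature_maps, gradients):
    """Grad-CAM style heatmap with ELU in place of the final ReLU.

    feature_maps: list of K channels, each an H x W nested list
                  (activations of the chosen layer k).
    gradients:    same shape; gradients[c][i][j] is the derivative of the
                  chosen score with respect to feature_maps[c][i][j].
                  Which score is differentiated is an assumption here.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # Channel weights: global average pooling of the gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    heatmap = []
    for i in range(height):
        row = []
        for j in range(width):
            # Weighted sum over channels at this spatial location.
            combined = sum(w * fm[i][j] for w, fm in zip(weights, feature_maps))
            # ELU keeps a small negative response instead of clipping to zero.
            row.append(elu(combined))
        heatmap.append(row)
    return heatmap


# Toy input: two 2 x 2 channels with constant gradients.
toy_maps = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
toy_grads = [[[0.5, 0.5], [0.5, 0.5]], [[-2.0, -2.0], [-2.0, -2.0]]]
heatmap = grad_cam_elu(toy_maps, toy_grads)
```

With a ReLU, every location whose weighted sum is negative would map to exactly zero; with the ELU those locations keep a bounded negative value, which is the behaviour Note 3 appeals to when it mentions the dying ReLU effect.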

References

1. Andreas, J., Rohrbach, M., Darrell, T., Klein, D.: Neural module networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
2. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. CoRR abs/1207.4708 (2012). http://arxiv.org/abs/1207.4708
3. Biran, O., McKeown, K.: Justification narratives for individual classifications. In: Proceedings of the AutoML Workshop at ICML 2014 (2014)
4. Greydanus, S., Koul, A., Dodge, J., Fern, A.: Visualizing and understanding Atari agents. CoRR abs/1711.00138 (2017). http://arxiv.org/abs/1711.00138
5. Hausknecht, M., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. CoRR abs/1507.06527 (2015)
6. Heess, N., et al.: Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 (2017)
7. Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., Meger, D.: Deep reinforcement learning that matters. arXiv preprint arXiv:1709.06560 (2017)
8. Hendricks, L.A., Akata, Z., Rohrbach, M., Donahue, J., Schiele, B., Darrell, T.: Generating visual explanations. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 3–19. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_1
9. Hessel, M., et al.: Rainbow: combining improvements in deep reinforcement learning. arXiv preprint arXiv:1710.02298 (2017)
10. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016)
11. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
12. Park, D.H., et al.: Multimodal explanations: justifying decisions and pointing to the evidence. In: IEEE CVPR (2018)
13. Schulman, J., Levine, S., Abbeel, P., Jordan, M., Moritz, P.: Trust region policy optimization. In: International Conference on Machine Learning, pp. 1889–1897 (2015)
14. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017). http://arxiv.org/abs/1707.06347
15. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: IEEE ICCV (2017)
16. Sharma, S., Lakshminarayanan, A.S., Ravindran, B.: Learning to repeat: fine grained action repetition for deep reinforcement learning. CoRR abs/1702.06054 (2017). http://arxiv.org/abs/1702.06054
17. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
18. Teach, R.L., Shortliffe, E.H.: An analysis of physician attitudes regarding computer-based clinical consultation systems. In: Anderson, J.G., Jay, S.J. (eds.) Use and Impact of Computers in Clinical Medicine. Computers and Medicine, pp. 68–85. Springer, New York (1981). https://doi.org/10.1007/978-1-4613-8674-2_6
19. Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. In: AAAI, vol. 16, pp. 2094–2100 (2016)
20. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. In: Sutton, R.S. (ed.) Reinforcement Learning. The Springer International Series in Engineering and Computer Science (Knowledge Representation, Learning and Expert Systems), vol. 173, pp. 5–32. Springer, Boston (1992). https://doi.org/10.1007/978-1-4615-3618-5_2
21. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8(3), 229–256 (1992). https://doi.org/10.1007/BF00992696
22. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. CoRR abs/1505.00853 (2015). http://arxiv.org/abs/1505.00853
23. Zintgraf, L.M., Cohen, T.S., Adel, T., Welling, M.: Visualizing deep neural network decisions: prediction difference analysis. In: ICLR (2017)

Author information

Authors and Affiliations

Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
Laurens Weitkamp
UvA-Bosch Delta Lab, University of Amsterdam, Amsterdam, The Netherlands
Elise van der Pol & Zeynep Akata

Corresponding author

Correspondence to Laurens Weitkamp.

Editor information

Editors and Affiliations

Tilburg University, Tilburg, The Netherlands
Martin Atzmueller
Eindhoven University of Technology, Eindhoven, The Netherlands
Wouter Duivesteijn

About this paper

Cite this paper

Weitkamp, L., van der Pol, E., Akata, Z. (2019). Visual Rationalizations in Deep Reinforcement Learning for Atari Games. In: Atzmueller, M., Duivesteijn, W. (eds) Artificial Intelligence. BNAIC 2018. Communications in Computer and Information Science, vol 1021. Springer, Cham. https://doi.org/10.1007/978-3-030-31978-6_12

DOI: https://doi.org/10.1007/978-3-030-31978-6_12
Published: 25 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31977-9
Online ISBN: 978-3-030-31978-6
eBook Packages: Computer Science, Computer Science (R0)
