Reinforcement Learning with Attention that Works: A Self-Supervised Approach

  • Conference paper

Neural Information Processing (ICONIP 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1143)

Abstract

Attention models have had a significant positive impact on deep learning across a range of tasks. However, previous attempts at integrating attention with reinforcement learning have failed to produce significant improvements. Unlike the selective attention models used in those attempts, which constrain attention via preconceived notions of importance, our implementation utilises the Markovian properties inherent in the state input. We propose the first combination of self-attention and reinforcement learning capable of producing significant improvements, including new state-of-the-art results in the Arcade Learning Environment.
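The paper's full text is not included on this page, so the following is only a minimal PyTorch sketch of the general idea in the abstract: a self-attention block, in the style of non-local neural networks, applied to the spatial positions of a convolutional feature map so that attention weights are computed from the state input itself rather than from a hand-designed saliency prior. The class name SelfAttention2d, the channel sizes, the zero-initialised residual gate, and the placement inside a DQN-style Atari encoder are all illustrative assumptions, not the authors' published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    """Self-attention over the spatial positions of a conv feature map
    (hypothetical sketch; not the authors' published architecture)."""

    def __init__(self, in_channels, key_channels=None):
        super().__init__()
        key_channels = key_channels or max(1, in_channels // 8)
        self.query = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.key = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # Zero-initialised gate so the block starts as an identity mapping.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.key(x).flatten(2)                    # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)  # (B, HW, C)
        # Every spatial position attends to every other position.
        attn = F.softmax(q @ k, dim=-1)               # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.gamma * out

# Illustrative placement between the conv layers of a standard Atari
# encoder, e.g. as the feature extractor for a PPO agent. Layer sizes
# follow the common DQN convention and are an assumption, not the
# paper's design.
encoder = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
    SelfAttention2d(64),
    nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
)
```

Because the attention map is derived from the state input rather than a preconceived notion of importance, the block can learn which spatial relationships matter for control, which is the property the abstract contrasts with earlier selective-attention approaches.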

Acknowledgments

We would like to thank Michele Sasdelli for his helpful discussions, and Damien Teney for his feedback and advice on writing this paper.

Author information

Corresponding author

Correspondence to Anthony Manchin.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Manchin, A., Abbasnejad, E., van den Hengel, A. (2019). Reinforcement Learning with Attention that Works: A Self-Supervised Approach. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1143. Springer, Cham. https://doi.org/10.1007/978-3-030-36802-9_25

  • DOI: https://doi.org/10.1007/978-3-030-36802-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36801-2

  • Online ISBN: 978-3-030-36802-9

  • eBook Packages: Computer Science, Computer Science (R0)
