Reinforcement Learning in Multi-agent Games: Open AI Gym Diplomacy Environment

Cruz, Diogo; Cruz, José Aleixo; Lopes Cardoso, Henrique

doi:10.1007/978-3-030-30241-2_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11804)

Included in the following conference series:

EPIA Conference on Artificial Intelligence

Abstract

Reinforcement learning has been applied successfully to adversarial games, demonstrating its potential. Most real-life scenarios, however, involve cooperation as well as competition, and the use of reinforcement learning in multi-agent cooperative games remains largely unexplored. In this paper, we present a reinforcement learning environment for the Diplomacy board game that uses the standard interface adopted by OpenAI Gym environments. Our main purpose is to enable straightforward comparison and reuse of existing reinforcement learning implementations when applied to cooperative games. As a proof of concept, we show preliminary results of reinforcement learning agents using this environment.
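
For readers unfamiliar with the interface mentioned in the abstract, the following is a minimal sketch of the classic Gym-style contract: `reset()` returns an initial observation and `step(action)` returns an `(observation, reward, done, info)` tuple. It is not the authors' Diplomacy environment. `ToyBoardEnv`, its single-unit "march to the last province" rule, and `run_episode` are hypothetical names invented for illustration, and the sketch avoids importing any library so that it runs on its own.

```python
class ToyBoardEnv:
    """Hypothetical stand-in that mimics the classic Gym reset/step contract.

    One unit starts in province 0 and is rewarded for reaching the last
    province. This is an illustrative toy, not the Diplomacy rules.
    """

    def __init__(self, n_provinces=5, max_turns=20):
        self.n_provinces = n_provinces
        self.max_turns = max_turns
        self.position = 0
        self.turn = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.position = 0
        self.turn = 0
        return self.position

    def step(self, action):
        # Actions (assumed encoding): 0 = hold, 1 = move one province forward.
        if action not in (0, 1):
            raise ValueError("action must be 0 (hold) or 1 (move)")
        self.turn += 1
        if action == 1:
            self.position = min(self.position + 1, self.n_provinces - 1)
        reached = self.position == self.n_provinces - 1
        reward = 1.0 if reached else 0.0
        done = reached or self.turn >= self.max_turns
        info = {"turn": self.turn}
        return self.position, reward, done, info


def run_episode(env, policy):
    """Run one episode with a policy mapping observation -> action."""
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    env = ToyBoardEnv()
    print(run_episode(env, lambda obs: 1))  # always move forward
    print(run_episode(env, lambda obs: 0))  # always hold
```

Because an agent only touches `reset` and `step`, any learning code written against this contract can, in principle, be pointed at a different environment that honours the same contract, which is the kind of reuse the abstract describes.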

Notes

1. Available at https://github.com/jazzchipc/gym-diplomacy.

Author information

Authors and Affiliations

Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
Diogo Cruz, José Aleixo Cruz & Henrique Lopes Cardoso
Laboratório de Inteligência Artificial e Ciências dos Computadores (LIACC), Porto, Portugal
Henrique Lopes Cardoso

Corresponding author

Correspondence to Henrique Lopes Cardoso.

Editor information

Editors and Affiliations

INESC-TEC, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
Paulo Moura Oliveira
University of Minho, Braga, Portugal
Paulo Novais
LIACC/UP, University of Porto, Porto, Portugal
Luís Paulo Reis

Cite this paper

Cruz, D., Cruz, J.A., Lopes Cardoso, H. (2019). Reinforcement Learning in Multi-agent Games: Open AI Gym Diplomacy Environment. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_5

DOI: https://doi.org/10.1007/978-3-030-30241-2_5
Published: 30 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30240-5
Online ISBN: 978-3-030-30241-2
eBook Packages: Computer Science, Computer Science (R0)
