Abstract
Reinforcement learning has been successfully applied to adversarial games, demonstrating its potential. However, most real-life scenarios involve cooperation as well as competition, and the use of reinforcement learning in multi-agent cooperative games remains largely unexplored. In this paper, we present a reinforcement learning environment for the Diplomacy board game that uses the standard interface adopted by OpenAI Gym environments. Our main purpose is to enable straightforward comparison and reuse of existing reinforcement learning implementations when applied to cooperative games. As a proof of concept, we show preliminary results of reinforcement learning agents exploiting this environment.
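To illustrate what a standard Gym-style interface buys in terms of reuse, the following is a minimal sketch of the classic reset/step contract, in which step returns an observation, a reward, a done flag, and an info dictionary. The class ToyNegotiationEnv, its action encoding, and its reward rule are hypothetical and invented for illustration only; they are not the gym-diplomacy environment or its API.

```python
import random


class ToyNegotiationEnv:
    """Hypothetical stand-in that mimics the classic Gym reset/step contract.

    This is NOT the gym-diplomacy environment; the state, actions and
    rewards below are assumptions made purely for illustration.
    """

    def __init__(self, max_turns=5, seed=0):
        self.max_turns = max_turns
        self.rng = random.Random(seed)
        self.turn = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.turn = 0
        return self.turn

    def step(self, action):
        # Assumed encoding: 0 = defect, 1 = cooperate.
        # The simulated partner picks 0 or 1 uniformly at random.
        partner = self.rng.randint(0, 1)
        # Reward is paid only when both sides cooperate.
        reward = 1.0 if action == 1 and partner == 1 else 0.0
        self.turn += 1
        done = self.turn >= self.max_turns
        info = {"partner_action": partner}
        return self.turn, reward, done, info


def run_episode(env, policy):
    """Generic agent loop: works with any object exposing reset() and step()."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


def always_cooperate(obs):
    # Trivial fixed policy used only to exercise the loop.
    return 1


episode_return = run_episode(ToyNegotiationEnv(), always_cooperate)
```

Because run_episode depends only on reset() and step(), the same loop could in principle drive any environment that follows this contract, which is the kind of reuse the abstract refers to.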
Notes
1. Available at https://github.com/jazzchipc/gym-diplomacy.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cruz, D., Cruz, J.A., Lopes Cardoso, H. (2019). Reinforcement Learning in Multi-agent Games: Open AI Gym Diplomacy Environment. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_5
DOI: https://doi.org/10.1007/978-3-030-30241-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30240-5
Online ISBN: 978-3-030-30241-2
eBook Packages: Computer Science, Computer Science (R0)