Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel

Lanctot, Marc; Lisý, Viliam; Winands, Mark H. M.

doi:10.1007/978-3-319-05428-5_3

Conference paper
First Online: 01 January 2014

Part of the book series: Communications in Computer and Information Science (CCIS, volume 408)

Abstract

Monte Carlo Tree Search (MCTS) has become a widely popular sample-based search algorithm for two-player games with perfect information. When actions are chosen simultaneously, players may need to mix between their strategies. In this paper, we discuss the adaptation of MCTS to simultaneous move games. We introduce a new algorithm, Online Outcome Sampling (OOS), that approaches a Nash equilibrium strategy over time. We compare both head-to-head performance and exploitability of several MCTS variants in Goofspiel. We show that regret matching and OOS perform best and that all variants produce less exploitable strategies than UCT.
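To illustrate why players may need to mix, and what regret matching does at a single simultaneous-move decision, the following is a minimal sketch and not the paper's implementation. It runs regret matching in self-play on one zero-sum matrix game, using expected (not sampled) payoffs so the run is deterministic. The payoff matrix `PAYOFF` and the function names are assumptions chosen for illustration; the matrix has no pure-strategy equilibrium, and its unique mixed equilibrium plays the first action with probability 0.4 for both players.

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve_matrix_game(payoff, iterations=20000):
    """Self-play regret matching on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    the negation. Returns the average strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the opponent's mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = -row_ev  # zero-sum game

        # Regret: how much better each pure action would have done.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical 2x2 zero-sum game (assumption, not taken from the paper).
PAYOFF = [[2.0, -1.0],
          [-1.0, 1.0]]

if __name__ == "__main__":
    row_avg, col_avg = solve_matrix_game(PAYOFF)
    print("row average strategy:", row_avg)
    print("col average strategy:", col_avg)
```

The current strategies may keep oscillating; it is the average strategies that approach the equilibrium, which is why the sketch returns the averages. In a tree search setting, a selection rule of this kind would be applied at every simultaneous-move node with sampled rather than exact payoffs, which this single-matrix sketch does not attempt to model.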

Acknowledgments

We would like to thank Laurent Bartholdi for sharing his code for solving Goofspiel. We would also like to thank Olivier Teytaud for advice on optimizing Exp3. This work is partially funded by the Netherlands Organisation for Scientific Research (NWO) in the framework of the project Go4Nature, grant number 612.000.938, and by the Czech Science Foundation, grant number P202/12/2054.

Author information

Authors and Affiliations

Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
Marc Lanctot & Mark H. M. Winands
Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
Viliam Lisý

Corresponding author

Correspondence to Marc Lanctot.

Editor information

Editors and Affiliations

Université Paris-Dauphine, Paris, France
Tristan Cazenave
Universiteit Maastricht, Maastricht, The Netherlands
Mark H.M. Winands
School of Information Science, JAIST, Nomi, Japan
Hiroyuki Iida

About this paper

Cite this paper

Lanctot, M., Lisý, V., Winands, M.H.M. (2014). Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel. In: Cazenave, T., Winands, M., Iida, H. (eds) Computer Games. CGW 2013. Communications in Computer and Information Science, vol 408. Springer, Cham. https://doi.org/10.1007/978-3-319-05428-5_3

DOI: https://doi.org/10.1007/978-3-319-05428-5_3
Published: 01 April 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05427-8
Online ISBN: 978-3-319-05428-5
eBook Packages: Computer Science, Computer Science (R0)
