Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 408)

Abstract

Monte Carlo Tree Search (MCTS) has become a widely popular sample-based search algorithm for two-player games with perfect information. When actions are chosen simultaneously, players may need to mix between their strategies. In this paper, we discuss the adaptation of MCTS to simultaneous move games. We introduce a new algorithm, Online Outcome Sampling (OOS), that approaches a Nash equilibrium strategy over time. We compare both head-to-head performance and exploitability of several MCTS variants in Goofspiel. We show that regret matching and OOS perform best and that all variants produce less exploitable strategies than UCT.
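
To illustrate why simultaneous moves call for mixed strategies, the following is a minimal sketch of textbook regret matching in self-play on rock-paper-scissors. It is not the paper's MCTS integration or OOS; the function names, the payoff matrix, the expected-value (rather than sampled) updates, and the iteration count are all assumptions chosen for illustration. The average strategies should drift toward the uniform mix, which is the equilibrium of this game.

```python
import numpy as np

def regret_matching_strategy(cum_regret):
    """Mix in proportion to positive cumulative regret; uniform if none is positive."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def self_play(payoff, iterations=20000):
    """Return both players' average strategies for a zero-sum matrix game.

    payoff[i, j] is the row player's reward; the column player receives -payoff[i, j].
    Regrets are updated with expected action values (an illustrative simplification).
    """
    n_rows, n_cols = payoff.shape
    regret_row = np.zeros(n_rows)
    regret_row[0] = 1.0  # assumed asymmetric start so play does not begin at equilibrium
    regret_col = np.zeros(n_cols)
    sum_row = np.zeros(n_rows)
    sum_col = np.zeros(n_cols)
    for _ in range(iterations):
        s_row = regret_matching_strategy(regret_row)
        s_col = regret_matching_strategy(regret_col)
        sum_row += s_row
        sum_col += s_col
        u_row = payoff @ s_col        # expected value of each row action
        u_col = -(s_row @ payoff)     # expected value of each column action
        regret_row += u_row - s_row @ u_row
        regret_col += u_col - s_col @ u_col
    return sum_row / iterations, sum_col / iterations

# Rock-paper-scissors from the row player's point of view (rows/cols: R, P, S).
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])

avg_row, avg_col = self_play(RPS)
print("row average strategy:", avg_row)
print("column average strategy:", avg_col)
```

Any pure strategy in this game can be exploited for the full payoff, whereas the averaged mix leaves an opponent little to gain, which is the sense of "less exploitable" used in the abstract.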

Acknowledgments

We would like to thank Laurent Bartholdi for sharing his code for solving Goofspiel. We would also like to thank Olivier Teytaud for advice in optimizing Exp3. This work is partially funded by the Netherlands Organisation for Scientific Research (NWO) in the framework of the project Go4Nature, grant number 612.000.938, and by the Czech Science Foundation, grant number P202/12/2054.

Author information

Corresponding author

Correspondence to Marc Lanctot.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Lanctot, M., Lisý, V., Winands, M.H.M. (2014). Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel. In: Cazenave, T., Winands, M., Iida, H. (eds) Computer Games. CGW 2013. Communications in Computer and Information Science, vol 408. Springer, Cham. https://doi.org/10.1007/978-3-319-05428-5_3

  • DOI: https://doi.org/10.1007/978-3-319-05428-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05427-8

  • Online ISBN: 978-3-319-05428-5

  • eBook Packages: Computer Science, Computer Science (R0)
