Monte Carlo Tree Search with Robust Exploration

Imagawa, Takahisa; Kaneko, Tomoyuki

doi:10.1007/978-3-319-50935-8_4

Conference paper
First Online: 10 December 2016

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10068)

Abstract

This paper presents a new Monte-Carlo tree search method that focuses on identifying the best move. UCT, which minimizes cumulative regret, has achieved remarkable success in Go and other games. However, recent studies on simple regret reveal that there are better exploration strategies. To improve performance further, the presented method chooses the leaf to explore using not only the mean reward but the whole reward distribution. We adopt a hybrid approach to obtain reliable distributions: a negamax-style backup of reward distributions is used in the shallower half of the search tree, and UCT is used in the rest of the tree. Experiments on synthetic trees show that the presented method outperformed UCT and similar methods, except on trees having uniform width and depth.
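As a rough illustration of the two ingredients named in the abstract, the sketch below shows a UCB1-based child selection (the rule UCT applies at each node) and one possible negamax-style backup of discrete reward distributions. This preview does not give the paper's actual selection rule or distribution representation, so everything here is an assumption made for illustration: the function names, the dict-based discrete distributions, the independence of children, rewards symmetric around zero, and the exploration constant.

```python
import math


def ucb1_index(mean, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child; unvisited children get +inf so each is tried once."""
    if visits == 0:
        return math.inf
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(stats, c=math.sqrt(2)):
    """stats: list of (mean_reward, visits). Returns the index with the highest UCB1 score."""
    parent_visits = sum(v for _, v in stats)
    scores = [ucb1_index(m, v, parent_visits, c) for m, v in stats]
    return max(range(len(stats)), key=scores.__getitem__)


def negamax_backup(child_dists):
    """Distribution of max_i(-X_i), assuming the children X_i are independent.

    Each child distribution is a dict {reward: probability} seen from the
    child's side to move; negating the reward flips it to the parent's view.
    (Illustrative assumption, not the paper's stated procedure.)
    """
    flipped = [{-r: p for r, p in d.items()} for d in child_dists]
    support = sorted({r for d in flipped for r in d})
    parent, prev_cdf = {}, 0.0
    for r in support:
        # CDF of a maximum of independent variables is the product of the CDFs.
        cdf = 1.0
        for d in flipped:
            cdf *= sum(p for x, p in d.items() if x <= r)
        if cdf > prev_cdf:
            parent[r] = cdf - prev_cdf
        prev_cdf = cdf
    return parent


if __name__ == "__main__":
    # The rarely visited child is preferred because its exploration bonus is larger.
    print(select_child([(0.6, 10), (0.5, 1)]))
    # One uncertain child and one child that is a sure win for the opponent.
    print(negamax_backup([{-1: 0.5, 1: 0.5}, {1: 1.0}]))
```

In a hybrid scheme of the kind the abstract describes, something like `negamax_backup` would be applied at nodes in the shallower part of the tree, while `select_child` would drive descent in the deeper part; how the two are actually combined is left to the full paper.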

Notes

1. http://www.althofer.de/crazy-shadows.html

Acknowledgement

This work was partially supported by Grant-in-Aid for JSPS Fellows 16J07455.

Author information

Authors and Affiliations

Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
Takahisa Imagawa & Tomoyuki Kaneko
Research Fellow of Japan Society for the Promotion of Science, Tokyo, Japan
Takahisa Imagawa

Corresponding author

Correspondence to Takahisa Imagawa.

Editor information

Editors and Affiliations

Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, Zuid-Holland, The Netherlands
Aske Plaat
Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, Zuid-Holland, The Netherlands
Walter Kosters
Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, Zuid-Holland, The Netherlands
Jaap van den Herik

About this paper

Cite this paper

Imagawa, T., Kaneko, T. (2016). Monte Carlo Tree Search with Robust Exploration. In: Plaat, A., Kosters, W., van den Herik, J. (eds) Computers and Games. CG 2016. Lecture Notes in Computer Science, vol 10068. Springer, Cham. https://doi.org/10.1007/978-3-319-50935-8_4

DOI: https://doi.org/10.1007/978-3-319-50935-8_4
Published: 10 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50934-1
Online ISBN: 978-3-319-50935-8
eBook Packages: Computer Science, Computer Science (R0)
