Monte Carlo Tree Search with Robust Exploration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10068)

Abstract

This paper presents a new Monte-Carlo tree search method that focuses on identifying the best move. UCT, which minimizes cumulative regret, has achieved remarkable success in Go and other games. However, recent studies on simple regret reveal that there are better exploration strategies. To further improve performance, the leaf to be explored is determined not only by the mean reward but also by the whole reward distribution. We adopt a hybrid approach to obtain reliable distributions: a negamax-style backup of reward distributions is used in the shallower half of the search tree, and UCT is used in the rest of the tree. Experiments on synthetic trees show that the presented method outperforms UCT and similar methods, except on trees with uniform width and depth.
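The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how distributions are represented or combined, so the sketch assumes discrete reward distributions over [0, 1], independence between sibling nodes, and the standard UCB1 index for the UCT part. All function names are hypothetical.

```python
import math

def ucb1_index(mean, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score of one child; unvisited children are tried first."""
    if visits == 0:
        return math.inf
    return mean + c * math.sqrt(math.log(parent_visits) / visits)

def select_child_uct(children):
    """children: list of (mean_reward, visit_count) pairs.
    Returns the index of the child with the highest UCB1 score."""
    total = sum(n for _, n in children)
    return max(
        range(len(children)),
        key=lambda i: ucb1_index(children[i][0], children[i][1], total),
    )

def negamax_backup(child_dists):
    """Assumed negamax-style backup of discrete reward distributions.

    Each child distribution maps a reward in [0, 1], seen from the child's
    side to move, to its probability. The parent's reward is taken to be
    max_i(1 - X_i), with the X_i assumed independent (an assumption of this
    sketch, not something stated in the abstract).
    """
    # Flip every child distribution to the parent's point of view.
    flipped = [{1.0 - r: p for r, p in d.items()} for d in child_dists]
    support = sorted({r for d in flipped for r in d})
    parent, prev_cdf = {}, 0.0
    for v in support:
        # P(max <= v) is the product of the children's CDFs at v.
        cdf = 1.0
        for d in flipped:
            cdf *= sum(p for r, p in d.items() if r <= v)
        parent[v] = cdf - prev_cdf
        prev_cdf = cdf
    return parent

# Two children that each win or lose with probability 1/2.
parent = negamax_backup([{0.0: 0.5, 1.0: 0.5}, {0.0: 0.5, 1.0: 0.5}])
```

In a hybrid scheme of the kind the abstract describes, something like `negamax_backup` would be applied to nodes in the shallower part of the tree, while `select_child_uct` would drive descent in the deeper part; where that boundary sits and how leaves are chosen from the distributions are details of the paper that this sketch does not attempt to reproduce.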

Notes

  1. http://www.althofer.de/crazy-shadows.html

Acknowledgement

This work was partially supported by Grant-in-Aid for JSPS Fellows 16J07455.

Corresponding author

Correspondence to Takahisa Imagawa.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Imagawa, T., Kaneko, T. (2016). Monte Carlo Tree Search with Robust Exploration. In: Plaat, A., Kosters, W., van den Herik, J. (eds) Computers and Games. CG 2016. Lecture Notes in Computer Science, vol 10068. Springer, Cham. https://doi.org/10.1007/978-3-319-50935-8_4

  • DOI: https://doi.org/10.1007/978-3-319-50935-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50934-1

  • Online ISBN: 978-3-319-50935-8

  • eBook Packages: Computer Science, Computer Science (R0)
