Monte-Carlo Simulation Balancing in Practice

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6515)

Abstract

Simulation balancing is a new technique to tune parameters of a playout policy for a Monte-Carlo game-playing program. So far, this algorithm had only been tested in a very artificial setting: it was limited to 5×5 and 6×6 Go, and required a stronger external program that served as a supervisor. In this paper, the effectiveness of simulation balancing is demonstrated in a more realistic setting. A state-of-the-art program, Erica, learned an improved playout policy on the 9×9 board, without requiring any external expert to provide position evaluations. The evaluations were collected by letting the program analyze positions by itself. The previous version of Erica learned pattern weights with the minorization-maximization algorithm. Thanks to simulation balancing, its playing strength was improved from a winning rate of 69% to 78% against Fuego 0.4.
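The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea, assuming the policy-gradient form of simulation balancing usually attributed to Silver and Tesauro (2009): estimate the mean playout outcome from one batch of simulations, estimate its gradient from a second, independent batch, and move the policy weights so that the mean outcome approaches a target evaluation. The one-move toy "game", the names `simulate`, `balance`, and `expected_outcome`, and all constants are hypothetical and are not taken from Erica or from the paper.

```python
import math
import random


def softmax(theta):
    """Turn move weights into move probabilities (numerically stable)."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(theta, win_prob, rng):
    """One toy playout: sample a move from the policy, then a win/loss outcome.

    Returns the outcome z in {0, 1} and the gradient of log pi(move)
    with respect to theta, which for a softmax is 1[j == move] - pi_j.
    """
    probs = softmax(theta)
    move = rng.choices(range(len(theta)), weights=probs)[0]
    z = 1.0 if rng.random() < win_prob[move] else 0.0
    grad_log = [(1.0 if j == move else 0.0) - probs[j] for j in range(len(theta))]
    return z, grad_log


def balance(theta, win_prob, target, rng, iterations=200, m=200, n=200, alpha=4.0):
    """Assumed simulation-balancing loop for a single training position."""
    theta = list(theta)
    for _ in range(iterations):
        # Batch 1: Monte-Carlo estimate of the mean playout outcome.
        v_hat = sum(simulate(theta, win_prob, rng)[0] for _ in range(m)) / m
        # Batch 2 (independent): estimate of the gradient of that mean outcome.
        g_hat = [0.0] * len(theta)
        for _ in range(n):
            z, grad_log = simulate(theta, win_prob, rng)
            for j, gj in enumerate(grad_log):
                g_hat[j] += z * gj / n
        # Move the weights so the mean outcome approaches the target evaluation.
        step = alpha * (target - v_hat)
        theta = [t + step * g for t, g in zip(theta, g_hat)]
    return theta


def expected_outcome(theta, win_prob):
    """Exact mean outcome of the toy playout policy, used only for checking."""
    return sum(p * w for p, w in zip(softmax(theta), win_prob))


# Hypothetical toy data: three candidate moves and a target evaluation of 0.7.
WIN_PROB = [0.9, 0.5, 0.1]
TARGET = 0.7
rng = random.Random(0)
theta = balance([0.0, 0.0, 0.0], WIN_PROB, TARGET, rng)
print(expected_outcome(theta, WIN_PROB))
```

In this sketch the target is a fixed constant. In the setting the abstract describes, the corresponding evaluations would come from the program's own analysis of training positions rather than from an external expert. Note that the objective is a balanced mean outcome, not a policy that always picks the strongest move.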

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, SC., Coulom, R., Lin, SS. (2011). Monte-Carlo Simulation Balancing in Practice. In: van den Herik, H.J., Iida, H., Plaat, A. (eds) Computers and Games. CG 2010. Lecture Notes in Computer Science, vol 6515. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17928-0_8

  • DOI: https://doi.org/10.1007/978-3-642-17928-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17927-3

  • Online ISBN: 978-3-642-17928-0

  • eBook Packages: Computer Science, Computer Science (R0)
