Abstract
Computer Go has reached a superhuman level thanks to Monte Carlo Tree Search (MCTS) combined with Deep Learning. The best computer Go programs use reinforcement learning to train a policy network and a value network. These networks are then used within an MCTS algorithm to produce strong computer Go players. In this paper, we propose improving the architecture of the value network using Spatial Average Pooling.
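To make the idea concrete, the following is a minimal sketch of one way spatial average pooling could sit at the end of a value network: each feature plane is averaged over the board, and the pooled vector is mapped to a scalar value. It is an illustration under stated assumptions, not the architecture evaluated in the paper. The function names, the 64-plane input, the single linear layer and the tanh output range are all hypothetical choices made for this example.

```python
import numpy as np


def spatial_average_pool(features: np.ndarray) -> np.ndarray:
    """Average every feature plane over the board: (C, H, W) -> (C,)."""
    return features.mean(axis=(1, 2))


def value_head(features: np.ndarray, weights: np.ndarray, bias: float) -> float:
    """Hypothetical value head: pool, apply one linear layer, squash with tanh.

    The linear layer and the tanh range (-1, 1) are assumptions made for
    this sketch; the abstract does not specify them.
    """
    pooled = spatial_average_pool(features)
    return float(np.tanh(pooled @ weights + bias))


# Example with random activations on a 19x19 board (assumed 64 planes).
rng = np.random.default_rng(0)
features = rng.standard_normal((64, 19, 19))
weights = rng.standard_normal(64) * 0.1
value = value_head(features, weights, 0.0)
```

Because the pooled vector has one entry per plane regardless of board size, a head of this shape has a parameter count that does not depend on the spatial dimensions of the input.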
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cazenave, T. (2019). Spatial Average Pooling for Computer Go. In: Cazenave, T., Saffidine, A., Sturtevant, N. (eds) Computer Games. CGW 2018. Communications in Computer and Information Science, vol 1017. Springer, Cham. https://doi.org/10.1007/978-3-030-24337-1_6
DOI: https://doi.org/10.1007/978-3-030-24337-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24336-4
Online ISBN: 978-3-030-24337-1
eBook Packages: Computer Science (R0)