Abstract
Computer Go has reached a superhuman level thanks to Monte Carlo Tree Search (MCTS) combined with Deep Learning. The best computer Go programs use reinforcement learning to train a policy network and a value network. These networks are then used within an MCTS algorithm to produce strong computer Go players. In this paper we propose to improve the architecture of the value network using Spatial Average Pooling.
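The abstract does not spell out how the pooling layer is defined. As a rough illustration only, the sketch below assumes that Spatial Average Pooling means averaging each feature plane over the whole board, followed by a small value head. The function names, the 64-plane and 19x19 sizes, the linear layer, and the tanh output are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def spatial_average_pooling(features):
    """Average each feature plane over the board: (C, H, W) -> (C,)."""
    return features.mean(axis=(1, 2))

def value_head(features, weights, bias):
    """Hypothetical value head: pool, apply a linear layer, squash with tanh."""
    pooled = spatial_average_pooling(features)
    return float(np.tanh(pooled @ weights + bias))

# Assumed sizes: 64 feature planes on a 19x19 board, random stand-in values.
rng = np.random.default_rng(0)
features = rng.standard_normal((64, 19, 19))
weights = rng.standard_normal(64) * 0.1
value = value_head(features, weights, 0.0)
```

Under this reading, the pooled vector has one entry per feature plane regardless of board size, so the same head could in principle be applied to boards of different dimensions.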
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cazenave, T. (2019). Spatial Average Pooling for Computer Go. In: Cazenave, T., Saffidine, A., Sturtevant, N. (eds) Computer Games. CGW 2018. Communications in Computer and Information Science, vol 1017. Springer, Cham. https://doi.org/10.1007/978-3-030-24337-1_6
DOI: https://doi.org/10.1007/978-3-030-24337-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24336-4
Online ISBN: 978-3-030-24337-1
eBook Packages: Computer Science (R0)