Spatial Average Pooling for Computer Go

  • Conference paper
  • Computer Games (CGW 2018)
  • Part of the book series: Communications in Computer and Information Science (CCIS, volume 1017)

Abstract

Computer Go has reached a superhuman level thanks to Monte Carlo Tree Search (MCTS) combined with Deep Learning. The best computer Go programs use reinforcement learning to train a policy network and a value network, which are then used within an MCTS algorithm to produce strong computer Go players. In this paper we propose to improve the architecture of the value network using Spatial Average Pooling.



Author information

Correspondence to Tristan Cazenave.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF, 910 KB)


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cazenave, T. (2019). Spatial Average Pooling for Computer Go. In: Cazenave, T., Saffidine, A., Sturtevant, N. (eds) Computer Games. CGW 2018. Communications in Computer and Information Science, vol 1017. Springer, Cham. https://doi.org/10.1007/978-3-030-24337-1_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-24337-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24336-4

  • Online ISBN: 978-3-030-24337-1

  • eBook Packages: Computer Science, Computer Science (R0)
