Abstract
AlphaZero has achieved remarkable results in Go, chess, and shogi without human knowledge. In general, hardware resources strongly influence the effectiveness of model training, so it is important to study game models that do not rely excessively on high-performance computing. Drawing on the methods of AlphaGo Zero, this paper studies a Gomoku model that combines deep learning (DL) and Monte Carlo tree search (MCTS) with a simple deep neural network (DNN) structure, without using human expert knowledge. In addition, an improved method to accelerate the MCTS search is proposed based on the characteristics of Gomoku. Experiments show that the model can improve playing strength within a short training time on limited hardware resources.
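To make the DL+MCTS combination concrete: in AlphaGo Zero-style search, the policy network supplies move priors and the value network supplies position evaluations, and the selection phase of MCTS picks children by a PUCT upper-confidence score. The sketch below is illustrative only (function names, the `c_puct` value, and the node representation are assumptions, not the authors' actual implementation):

```python
import math

def puct_score(q, n_child, n_parent, prior, c_puct=5.0):
    """PUCT score used to pick a child during MCTS selection.

    q        -- mean action value of the child (from value-network backups)
    n_child  -- visit count of the child
    n_parent -- visit count of the parent node
    prior    -- policy-network probability assigned to the move
    c_puct   -- exploration constant (an assumed value)
    """
    u = c_puct * prior * math.sqrt(n_parent) / (1 + n_child)
    return q + u

def select_move(children):
    """children: dict mapping move -> (q, n_child, prior).

    Returns the move with the highest PUCT score; unvisited moves with
    high priors are explored before well-visited ones are exploited.
    """
    n_parent = sum(n for _, n, _ in children.values())
    return max(
        children,
        key=lambda m: puct_score(children[m][0], children[m][1],
                                 n_parent, children[m][2]),
    )
```

For example, an unvisited move with a strong prior outranks a moderately valued, heavily visited one, which is how the policy network steers the search toward promising Gomoku moves early on.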
Acknowledgment
This work is funded by the National Natural Science Foundation of China (grants 61602539, 61873291, and 61773416).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
Cite this paper
Li, X., He, S., Wu, L., Chen, D., Zhao, Y. (2020). A Game Model for Gomoku Based on Deep Learning and Monte Carlo Tree Search. In: Deng, Z. (eds) Proceedings of 2019 Chinese Intelligent Automation Conference. CIAC 2019. Lecture Notes in Electrical Engineering, vol 586. Springer, Singapore. https://doi.org/10.1007/978-981-32-9050-1_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9049-5
Online ISBN: 978-981-32-9050-1
eBook Packages: Intelligent Technologies and Robotics