
Approximation Methods for Monte Carlo Tree Search

  • Kirill Aksenov
  • Aleksandr I. Panov
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1156)

Abstract

Planning algorithms are currently among the most sought after. One of the principal such algorithms is Monte Carlo Tree Search (MCTS). However, this architecture is difficult to parallelize and extend. We present possible approximations for the MCTS algorithm that significantly increase the agent's learning speed.
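For readers unfamiliar with the baseline the paper approximates, the following is a minimal, self-contained sketch of standard Monte Carlo Tree Search with UCT selection. The toy "count to the goal" environment, the constant c = 1.4, and all function names are illustrative assumptions, not the authors' method or environment.

```python
import math
import random

class Node:
    """One node of the search tree."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # running mean of simulation returns

def actions(state):
    return [0, 1]            # toy actions: stay or advance

def step(state, action):
    return state + action    # toy deterministic transition

def rollout(state, horizon=10):
    # Random simulation; return 1 if the goal state (>= 5) is reached.
    for _ in range(horizon):
        if state >= 5:
            return 1.0
        state = step(state, random.choice(actions(state)))
    return 1.0 if state >= 5 else 0.0

def uct_select(node, c=1.4):
    # Child maximizing the UCT upper confidence bound.
    return max(node.children.values(),
               key=lambda ch: ch.value
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(root_state, iterations=300):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while node.children and len(node.children) == len(actions(node.state)):
            node = uct_select(node)
        # 2. Expansion: add one untried action.
        untried = [a for a in actions(node.state) if a not in node.children]
        if untried:
            a = random.choice(untried)
            node.children[a] = Node(step(node.state, a), parent=node)
            node = node.children[a]
        # 3. Simulation: estimate the value with a random rollout.
        reward = rollout(node.state)
        # 4. Backpropagation: update running means up to the root.
        while node is not None:
            node.visits += 1
            node.value += (reward - node.value) / node.visits
            node = node.parent
    # Recommend the most-visited root action.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

The four-phase loop (selection, expansion, simulation, backpropagation) is exactly the part whose sequential tree traversal makes MCTS hard to parallelize, which motivates replacing it with learned approximations.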

Keywords

Reinforcement learning · Monte-Carlo tree search · Neural network approximation · Deep learning

Notes

Acknowledgements

The reported study was supported by RFBR, research projects No. 17-29-07079 and No. 18-29-22047.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. National Research University Higher School of Economics, Moscow, Russia
  2. Artificial Intelligence Research Institute, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, Russia
  3. Moscow Institute of Physics and Technology, Moscow, Russia
