Abstract
One of the most basic activities performed by an intelligent agent is deciding what to do next. The decision is usually between selecting the move with the highest expected value and exploring new scenarios. Monte-Carlo Tree Search (MCTS), originally developed as a game-playing agent, addresses this exploration–exploitation dilemma using a multi-armed bandit strategy. The success of MCTS in a wide range of problems, such as combinatorial optimisation, reinforcement learning, and games, is due to its ability to rapidly evaluate problem states without requiring domain-specific knowledge. However, it has been acknowledged that the trade-off between exploration and exploitation is crucial for the performance of the algorithm and affects how efficiently the agent learns deceptive states. One type of deception comprises states that give immediate rewards but lead to a suboptimal solution in the long run. These states, known as trap states, have been thoroughly investigated in previous research. In this work, we study the opposite of trap states, known as sacrifice states: deceptive moves that incur a local loss but are globally optimal. We investigate the efficiency of MCTS enhancements in identifying moves of this type.
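The multi-armed bandit strategy the abstract refers to is commonly realised as the UCB1 selection rule of UCT. A minimal sketch of that rule follows; the `Node` class and the exploration constant `c` are illustrative assumptions for this sketch, not details taken from the paper.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    visits: int = 0            # times this child has been selected
    total_reward: float = 0.0  # sum of rewards from simulations through it

    def mean(self) -> float:
        # Exploitation term: average simulation reward so far.
        return self.total_reward / self.visits if self.visits else 0.0

def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """UCB1 value: average reward plus an exploration bonus that grows
    for rarely visited children and shrinks as they are visited more."""
    if child.visits == 0:
        return float("inf")  # unvisited children are always tried first
    return child.mean() + c * math.sqrt(math.log(parent_visits) / child.visits)

def select(children: list[Node], parent_visits: int) -> Node:
    # Tree policy step of UCT: descend into the child maximising UCB1.
    return max(children, key=lambda ch: ucb1(ch, parent_visits))
```

The exploration constant `c` tunes the trade-off the abstract discusses: a larger `c` makes the agent revisit low-reward branches more often, which is exactly what is needed to discover sacrifice states whose immediate simulation rewards look poor.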
Acknowledgments
This research was supported under Australian Research Council’s Discovery Projects funding scheme, Project Number DE 140100017.
Cite this article
Companez, N., Aleti, A. Can Monte-Carlo Tree Search learn to sacrifice?. J Heuristics 22, 783–813 (2016). https://doi.org/10.1007/s10732-016-9320-y