Adaptive CGF Commander Behavior Modeling Through HTN Guided Monte Carlo Tree Search

  • Xiao Xu
  • Mei Yang
  • Ge Li


Improving the intelligence of virtual entities is an important issue in the construction of Computer Generated Forces (CGFs). Traditional approaches often specify how entities should react to predefined conditions, which is unsuitable for complex and dynamic environments. This paper applies Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. Through look-ahead reasoning, the model generates adaptive decisions that direct the whole force in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thereby improving search efficiency. The final implementation is tested in an infantry combat simulation in which a company commander must control three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN-guided MCTS commander outperforms commanders following fixed strategies.
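The core idea of HTN-guided MCTS — restricting node expansion to the actions an HTN decomposition would propose, while keeping UCB-based selection and random rollouts — can be sketched as follows. This is an illustrative toy, not the paper's implementation: the one-dimensional "distance to objective" state, the `htn_actions` filter, and all names are hypothetical stand-ins for the paper's abstracted combat state and task network.

```python
import math
import random

# Toy state: a platoon's distance to its objective; the commander
# "wins" if the distance reaches 0 within a fixed horizon.
HORIZON = 8
ACTIONS = ("advance", "hold", "retreat")

def step(dist, action):
    return {"advance": dist - 1, "hold": dist, "retreat": dist + 1}[action]

def htn_actions(dist):
    # Stand-in for HTN guidance: only expand actions a plausible task
    # network would decompose to (advance while the objective is ahead).
    return ("advance",) if dist > 0 else ("hold",)

class Node:
    def __init__(self, dist, depth):
        self.dist, self.depth = dist, depth
        self.children = {}           # action -> Node
        self.visits, self.value = 0, 0.0

def ucb_child(node, c=1.4):
    # UCB1 selection over the node's children.
    return max(node.children.values(),
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(dist, depth):
    # Default policy: uniformly random actions until the horizon.
    while depth < HORIZON and dist != 0:
        dist = step(dist, random.choice(ACTIONS))
        depth += 1
    return 1.0 if dist == 0 else 0.0

def search(root_dist, iters=400):
    root = Node(root_dist, 0)
    for _ in range(iters):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded
        # (w.r.t. the HTN-admissible action set, not all of ACTIONS).
        while node.children and len(node.children) == len(htn_actions(node.dist)):
            node = ucb_child(node)
            path.append(node)
        # Expansion restricted to HTN-admissible actions.
        untried = [a for a in htn_actions(node.dist) if a not in node.children]
        if untried and node.depth < HORIZON:
            a = random.choice(untried)
            node.children[a] = Node(step(node.dist, a), node.depth + 1)
            node = node.children[a]
            path.append(node)
        # Simulation and backpropagation.
        reward = rollout(node.dist, node.depth)
        for n in path:
            n.visits += 1
            n.value += reward
    # Recommend the most-visited root action.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Because expansion consults `htn_actions`, the tree never branches on moves the task network rules out, which is the efficiency gain the abstract describes; e.g. `search(3)` recommends `"advance"`.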


Keywords: Monte Carlo Tree Search; Hierarchical Task Network; Computer Generated Force; Behavior Modeling





This work was supported by the Hunan Provincial Natural Science Foundation of China (Grant No. 2017JJ3371).



Copyright information

© Systems Engineering Society of China and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of System Engineering, National University of Defense Technology, Changsha, China
