Application to the StarCraft Game Environment

Principles of Noology

Part of the book series: Socio-Affective Computing (SAC, volume 3)

Abstract

The StarCraft game environment provides an ideal computational platform to test and illustrate the various principles of noology described in the previous chapters. In this chapter we describe an implemented AI program that plays against the StarCraft built-in game engine. Causal learning is applied successfully to rapidly learn the causal rules for engaging and attacking enemy agents. Scripts are learned along the way to accelerate problem solving. Counterfactual information associated with scripts, alluded to in previous chapters, is shown here to play a critical role in providing information for the planning of battle strategies. Affective competition is implemented as a high-level goal-prioritizing mechanism for the agent involved. As in the previous chapter, the learning of heuristics is shown here to assist in reducing the search space needed for problem solving. Also, as in Chap. 5, it is illustrated here how the grounded conceptual representations used enable the system to learn problem-solving methods rapidly through language.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 169.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1.

    Suppose many instances of Attack events are triggered. In StarCraft, the outcome in which the Enemy agent's HP reaches 0 (death at the end) does not occur every time; sometimes it is the Self agent whose HP value drops to zero. Nevertheless, this script, besides initiating an attack event (i.e., ATT → 1), also provides one way to kill the Enemy agent. The script can therefore also be called ATTACK-AND-KILL-SCRIPT and be used by a Self agent accordingly. In Sect. 7.4.3.3 we show how the different HP OUTCOMES (i.e., sometimes the Enemy agent dies and sometimes the Self agent dies) can be organized in the counterfactual portion of a script (a minimal sketch of such an organization appears after these notes).

  2.

    As mentioned in footnote 3 of Chap. 3, fear and anxiousness are used synonymously here and in the rest of the book. There are subtle differences between the two which we will not specifically address in this book. For example, one typically says "fear of the snake causes him to run" but "I am anxious that I cannot finish my homework by tonight." In these examples, the difference seems to be a matter of degree.

  3.

    For now, we assume that other parameters, such as the starting HP values of the Self and Enemy agents, are always fixed at certain values. There are also other parameters, such as the absolute locations of the agents involved. We assume there are heuristics learned earlier, such as the one discussed in Sect. 6.4 of Chap. 6, that supply the knowledge that absolute locations typically do not affect physical processes unless the entities involved are electrically or magnetically charged.

  4.

    A question may be raised here as to whether selecting to compute ANX(Self1+Self2) is a built-in process or something the system can figure out by itself. Since earlier, in Sect. 7.4.3.1, we defined minimizing ACDR (Anxiousness of Commander) to be a goal, and ACDR is related to the summation of the anxiousness of the individual Self agents, there is a way to select to compute ANX(Self1+Self2) automatically through a high-level reasoning process (a minimal sketch of this summation appears after these notes). However, we will not delve into this in this book and will leave it for future investigations.

  5.

    Again, a question like that discussed in footnote 4 can be raised here: how can this function be learned, or derived from some other built-in knowledge? The Effort portion of this function relates to energy expenditure, and this could come from the basic knowledge that anything involving energy expenditure has to be taken into consideration in any task, as in the case of spatial movement discussed in Chap. 3, Sect. 3.1. As for Duration of Battle, this can come from earlier experiences in which too much time spent on an Attack event allowed the enemy to reinforce its agents, making the Self side more likely to lose the battle. The complete reasoning and learning process for this is left as a challenge for future investigations, but it is certainly doable within our framework or a logical extension of it. (An illustrative sketch of such a function appears after these notes.)

  6.

    This entire process of observing a trend in BATT-DIFFTY as the number of Self agents involved in a battle changes, and then triggering continued experiments by varying the number of Self agents, is certainly learnable within our causal learning framework, but it would be necessary to build some representations at a meta-level: a level at which the physical experimental process itself is explicitly represented in some scripts and/or logical forms. As mentioned at the end of Chap. 6, Sect. 6.6, "thinking and reasoning" processes are also a kind of "problem solving script." In our current simulations, we use a built-in procedure that changes this "number of agents" parameter in the spirit of the algorithm in Fig. 7.11 (a rough sketch of such a procedure appears after these notes).

  7.

    A question can be asked about the construction of the CFI portion and the attendant BATT-DIFFTY vs. Number of Self Agents graph: how does the system automatically know to construct this graph? Again, there has to be an explicit meta-level representation of the physical experimental process and the script construction process. The system has internal knowledge that it is varying the number of agents in these experiments, and it knows that computing BATT-DIFFTY is the goal of these experiments. Therefore, there would be a meta-level script representing this internal knowledge explicitly, and that script is subject to some meta-level learning process.

  8.

    This first round of simulation results was derived from a particular setting of the StarCraft environment in which the enemy side had a slight advantage. Therefore, to guarantee winning over the single Enemy agent, more Self agents are needed. Later, in Sect. 7.4.3.12, we change this setting and a different picture emerges.

  9.

    Knowing which parameters in the environment are relevant for experimentation is a perennially interesting issue. Firstly, the number of Enemy agents observed is a parameter provided by the visual system. Suppose that earlier the Commander had come across only situations with a single Enemy agent. She would form the generalization that Enemy agents always come singly. The moment there is a situation with more than one Enemy agent, she can immediately over-generalize and form the impression that Enemy agents can come in any number. This could provide the impetus for her to begin the multi-Enemy-agent simulations. This is environment-inspired experimentation. Alternatively, we can have an "imaginative" Commander who would simply take any parameter and carry out a "what-if-this-has-a-different-value" process. This requires resource commitment, of course, and the Commander must have some indication that exploration along that direction may be worthwhile.

  10.

    Again, the automatic construction of this graph would follow a process similar to that discussed in footnote 6 concerning the construction of the graph in Fig. 7.33.
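
The organization of HP OUTCOMES described in footnote 1 can be illustrated with a minimal sketch. The names below (Script, record_outcome, the outcome labels) are illustrative assumptions only, not the book's actual script representation.

    # Hypothetical sketch: a script whose counterfactual portion collects the
    # alternative HP OUTCOMES observed across many Attack events.
    from collections import Counter

    class Script:
        def __init__(self, name, trigger):
            self.name = name                  # e.g., "ATTACK-AND-KILL-SCRIPT"
            self.trigger = trigger            # e.g., "ATT -> 1" (initiate an attack event)
            self.counterfactuals = Counter()  # alternative outcomes and how often each occurred

        def record_outcome(self, enemy_hp, self_hp):
            # Sometimes the Enemy agent's HP reaches 0, sometimes the Self agent's does.
            if enemy_hp == 0:
                self.counterfactuals["Enemy HP = 0 (Enemy agent dies)"] += 1
            elif self_hp == 0:
                self.counterfactuals["Self HP = 0 (Self agent dies)"] += 1
            else:
                self.counterfactuals["battle unresolved"] += 1

    attack = Script("ATTACK-AND-KILL-SCRIPT", "ATT -> 1")
    attack.record_outcome(enemy_hp=0, self_hp=37)   # the Enemy agent was killed
    attack.record_outcome(enemy_hp=12, self_hp=0)   # the Self agent was killed instead
    print(attack.counterfactuals)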
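
Footnote 4 relies on the relation between ACDR and the anxiousness of the individual Self agents. A minimal sketch of that relation, assuming purely for illustration that ACDR is the plain sum of the individual ANX values:

    # Illustrative assumption: ACDR (Anxiousness of Commander) taken as the sum of
    # the anxiousness values of the individual Self agents, so that ANX(Self1 + Self2)
    # is the quantity relevant to the goal of minimizing ACDR.
    def anx_commander(anx_values):
        """anx_values maps each Self agent to its anxiousness, e.g. {"Self1": 0.4}."""
        return sum(anx_values.values())

    print(anx_commander({"Self1": 0.4, "Self2": 0.7}))   # ANX(Self1 + Self2) = 1.1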
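
Footnote 5 names Effort (energy expenditure) and Duration of Battle as the components of the battle-difficulty function. The additive form and the weights in the sketch below are assumptions made only to show the shape of such a function, not the book's definition of BATT-DIFFTY.

    # Purely illustrative sketch of BATT-DIFFTY as a weighted combination of the two
    # components named in footnote 5; the additive form and the weights are assumptions.
    def batt_diffty(effort, duration, w_effort=1.0, w_duration=1.0):
        return w_effort * effort + w_duration * duration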
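
Finally, the built-in experimentation procedure mentioned in footnote 6, which varies the number of Self agents and observes the trend in battle difficulty, can be sketched roughly as follows; it builds on the batt_diffty sketch above. The function toy_battle and the stopping criterion are hypothetical stand-ins, since the actual algorithm of Fig. 7.11 is not reproduced here.

    # Rough, hypothetical sketch of the experimentation loop of footnote 6: vary the
    # number of Self agents, measure battle difficulty each time, and stop once the
    # difficulty falls to an acceptable level.
    def explore_num_self_agents(run_battle, max_agents=10, acceptable=0.5):
        trend = {}
        for n in range(1, max_agents + 1):
            effort, duration = run_battle(num_self_agents=n)
            trend[n] = batt_diffty(effort, duration)   # from the sketch above
            if trend[n] <= acceptable:
                break          # enough Self agents for an easy-enough battle
        return trend           # the data behind a BATT-DIFFTY vs. number-of-agents graph

    def toy_battle(num_self_agents):
        # Toy stand-in for an actual StarCraft battle simulation: more Self agents
        # means less effort and a shorter battle.
        return 3.0 / num_self_agents, 2.0 / num_self_agents

    print(explore_num_self_agents(toy_battle))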

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Ho, SB. (2016). Application to the StarCraft Game Environment. In: Principles of Noology. Socio-Affective Computing, vol 3. Springer, Cham. https://doi.org/10.1007/978-3-319-32113-4_7
