Collective behavior of artificial intelligence population: transition from optimization to game

  • Si-Ping Zhang
  • Ji-Qiang Zhang
  • Zi-Gang Huang
  • Bing-Hui Guo
  • Zhi-Xi Wu
  • Jue Wang
Original Paper


Collective behavior in resource allocation systems has attracted much attention, since the efficiency of such a system depends intimately on the self-organized processes of the multiple agents that compose it. As artificial intelligence (AI) is now adopted ubiquitously for decision making in a wide range of settings, it becomes crucial to understand what will emerge in multi-agent AI systems for resource allocation and how we can intervene in their collective behavior, given our experience of the unexpected outcomes that collective behavior can induce. Here, we introduce a reinforcement learning (RL) algorithm into minority game (MG) dynamics, in which agents learn according to one typical RL scheme, Q-learning. We investigate the dynamical behavior of the system numerically and analytically for different game settings, combining two different types of agents to mimic diversified situations. We find that, after short-term training, the multi-agent AI system adopting the Q-learning algorithm relaxes to the optimal solution of the game. Moreover, one striking phenomenon is a transition of the interaction mechanism from self-organized optimization to game as the fraction of RL agents \(\eta _{q}\) is tuned. The critical curve for the transition between the two mechanisms in the phase diagram is obtained analytically. The adaptability of the AI agent population to a time-varying environment is also discussed. To gain further understanding of these phenomena, a theoretical framework with a mean-field approximation is also developed. Our findings from this simplified multi-agent AI system may shed new light on how reconciliation and optimization can be bred in the coming era of AI.
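As a rough illustration of the setup described above, the following Python sketch places Q-learning agents in a two-resource minority game. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the state definition, reward, learning parameters, or the second agent type, so the choices here (state = last winning resource, reward 1 for being in the minority, epsilon-greedy exploration, non-learning agents that choose at random, and the names `simulate`, `alpha`, `gamma`, `epsilon`) are all hypothetical.

```python
import random


def simulate(n_agents=101, eta_q=0.5, steps=500, alpha=0.1, gamma=0.9,
             epsilon=0.02, seed=0):
    """Return the number of agents choosing resource 1 at each step.

    eta_q is the fraction of Q-learning agents. All other parameters and
    modelling choices are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    n_q = round(eta_q * n_agents)  # number of Q-learning agents
    # One 2x2 Q-table per learner, indexed as q[agent][state][action].
    q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(n_q)]
    state = rng.randrange(2)  # assumed state: the last winning resource
    attendance = []
    for _ in range(steps):
        actions = []
        for i in range(n_agents):
            if i < n_q and rng.random() >= epsilon:
                row = q[i][state]
                # Greedy choice; ties are broken at random.
                if row[0] == row[1]:
                    a = rng.randrange(2)
                else:
                    a = int(row[1] > row[0])
            else:
                # Exploration step, or a non-learning agent (assumed random).
                a = rng.randrange(2)
            actions.append(a)
        n1 = sum(actions)
        # The less crowded resource wins; an exact tie goes to resource 0.
        winner = 1 if n1 < n_agents - n1 else 0
        next_state = winner
        for i in range(n_q):
            a = actions[i]
            reward = 1.0 if a == winner else 0.0
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[i][next_state])
            q[i][state][a] += alpha * (target - q[i][state][a])
        state = next_state
        attendance.append(n1)
    return attendance


if __name__ == "__main__":
    series = simulate(eta_q=1.0)
    tail = series[-100:]
    print(sum(tail) / len(tail))  # mean attendance of resource 1, last 100 steps
```

Varying `eta_q` between 0 and 1 in this sketch changes the mix of learning and non-learning agents; whether it reproduces the transition reported in the abstract depends on modelling details that are not given here.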


Self-organized processes; Resource allocation; Artificial intelligence; Minority game; Reinforcement learning



We thank Prof. Ying-Cheng Lai, Richong Zhang, Liang Huang and Dr. Xu-sheng Liu for helpful discussions. This work was supported by NSFC Nos. 11275003, 11575072, 61431012 and 11475074, the Science and Technology Coordination Innovation Project of Shaanxi Province (2016KTCQ01-45), and the Fundamental Research Funds for the Central Universities No. lzujbky-2016-123. ZGH gratefully acknowledges the support of K. C. Wong Education Foundation.

Compliance with ethical standards

Conflicts of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Institute of Computational Physics and Complex Systems, Lanzhou University, Lanzhou, China
  2. The Key Laboratory of Biomedical Information Engineering of Ministry of Education, National Engineering Research Center of Health Care and Medical Devices, The Key Laboratory of Neuro-informatics & Rehabilitation Engineering of Ministry of Civil Affairs, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
  3. Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China
  4. Beijing Advanced Innovation Center for Big Data and Brain Computing, LMIB and School of Mathematics and System Sciences, Beihang University, Beijing, China
