The greedy crowd and smart leaders: a hierarchical strategy selection game with learning protocol


In this paper, a general resource distribution game with a hierarchical structure on a bipartite graph is proposed. The game is divided into two interacting levels, the agent level and the group level, with negotiations taking place on both. Each agent can belong to multiple groups, so the system topology is bipartite. On the agent level, decisions follow the greedy principle, and the game is a state-based potential game. In contrast, some participants on the group level behave more “smartly” and are more likely to adopt a sophisticated strategy that maximizes their personal interest. Strategies on both levels are based on distributed protocols, and the social welfare increases as the system approaches a Nash equilibrium. The designed protocols are analyzed theoretically in terms of stability and efficiency. Furthermore, a reinforcement learning algorithm is introduced at the group level, where the smarter players refine their strategies over a multi-step decision-making process by learning from historical game outcomes. Both theory and simulations indicate that agents with the learning behavior improve not only their personal interest but also the efficiency of the systemic resource distribution.
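The abstract does not give the paper's utility functions or protocols, so the following Python sketch is only a toy illustration of the general idea of greedy agents in a potential game on a bipartite agent-group graph. Everything in it is an assumption made for illustration: the equal-split payoff `resources[g] / load`, the names `membership`, `resources`, `greedy_dynamics`, and `is_nash`, and the use of the classical Rosenthal potential of a congestion game in place of the paper's state-based potential. Each agent repeatedly switches to the group, among those it belongs to, that offers the strictly highest share; the potential rises with every switch until no agent can improve, which is a pure Nash equilibrium of the toy game.

```python
def loads_of(choice):
    """Count how many agents currently select each group."""
    loads = {}
    for g in choice.values():
        loads[g] = loads.get(g, 0) + 1
    return loads


def potential(resources, choice):
    """Rosenthal potential of the toy equal-split game (an assumed stand-in)."""
    return sum(resources[g] / k
               for g, n in loads_of(choice).items()
               for k in range(1, n + 1))


def best_group(agent, membership, resources, choice):
    """Greedy rule: the group with the strictly best share, else stay put."""
    loads = loads_of(choice)
    current = choice[agent]
    best, best_pay = current, resources[current] / loads[current]
    for g in membership[agent]:
        if g == current:
            continue
        pay = resources[g] / (loads.get(g, 0) + 1)  # share after joining g
        if pay > best_pay + 1e-12:
            best, best_pay = g, pay
    return best


def greedy_dynamics(membership, resources, choice, max_rounds=100):
    """Sequential greedy updates; returns the final choice and potential trace."""
    choice = dict(choice)
    history = [potential(resources, choice)]
    for _ in range(max_rounds):
        changed = False
        for agent in membership:
            target = best_group(agent, membership, resources, choice)
            if target != choice[agent]:
                choice[agent] = target
                history.append(potential(resources, choice))
                changed = True
        if not changed:
            break
    return choice, history


def is_nash(membership, resources, choice):
    """True if no agent can strictly gain by switching groups alone."""
    return all(best_group(a, membership, resources, choice) == choice[a]
               for a in membership)


# Hypothetical bipartite structure: agents (keys) and the groups they belong to.
membership = {"a1": ["g1", "g2"], "a2": ["g1", "g2"],
              "a3": ["g2", "g3"], "a4": ["g1"]}
resources = {"g1": 6.0, "g2": 6.0, "g3": 1.0}
initial = {"a1": "g1", "a2": "g1", "a3": "g2", "a4": "g1"}

final, history = greedy_dynamics(membership, resources, initial)
print(final, history)
```

In this toy game the potential is exact: each unilateral switch changes it by exactly the switching agent's gain, which is why the greedy updates cannot cycle. The learning behavior of the group-level players described in the abstract is not modeled here.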





This work was supported by Tianjin Natural Science Foundation (Grant Nos. 20JCYBJC01060, 20JCQNJC01450) and National Natural Science Foundation of China (Grant No. 61973175).

Author information



Corresponding author

Correspondence to Zhongxin Liu.


Cite this article

Guo, L., Liu, Z. & Chen, Z. The greedy crowd and smart leaders: a hierarchical strategy selection game with learning protocol. Sci. China Inf. Sci. 64, 132206 (2021).



Keywords

  • multi-agent system
  • reinforcement learning
  • game theory
  • complex network
  • bipartite graph