Learning Multi-agent Strategies in Multi-stage Collaborative Games

  • W. Andy Wright
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2412)


An alternative approach to learning decision strategies in multi-stage, multi-agent systems is presented here. The method uses a game-theoretic construction that is model free and does not rely on direct communication between the agents in the system. Limited experiments show that the method can find the Nash equilibrium point of a 3-player multi-stage game and converges more quickly than a comparable co-evolution method.
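The paper's learning scheme is not reproduced on this page. As a minimal illustration of the general idea of game-theoretic learning without direct communication, the sketch below runs fictitious play on an assumed two-player coordination game: each player tracks the empirical mixture of the other's past actions and best-responds to it, with no message passing between them. The payoff matrix and step count are illustrative assumptions, not taken from the paper (which treats multi-stage, three-player games).

```python
import numpy as np

# Illustrative payoffs for a 2-player coordination game with identical
# interests (a "collaborative" game): both players prefer matching actions,
# and matching on action 0 pays more than matching on action 1.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])  # row player's payoffs
B = A.copy()                # column player shares the same payoffs

def fictitious_play(steps=500):
    # Pseudo-counts of each player's observed actions (uniform prior).
    counts = [np.ones(2), np.ones(2)]
    for _ in range(steps):
        # Each player's belief is the empirical mixture of the other's play.
        belief_about_p2 = counts[1] / counts[1].sum()
        belief_about_p1 = counts[0] / counts[0].sum()
        # Best response to the current belief (no communication needed).
        a1 = int(np.argmax(A @ belief_about_p2))
        a2 = int(np.argmax(B.T @ belief_about_p1))
        counts[0][a1] += 1
        counts[1][a2] += 1
    # Empirical mixed strategies after learning.
    return [c / c.sum() for c in counts]

strategies = fictitious_play()
```

In this game both empirical strategies concentrate on the high-payoff matching action, i.e. they approach the pure-strategy Nash equilibrium (0, 0); fictitious play stands in here only as a well-known example of equilibrium-seeking learning, not as the paper's algorithm.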


Nash Equilibrium · Reinforcement Learning · Mixed Strategy · Pure Strategy · Independent Learner
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • W. Andy Wright
  1. BAE SYSTEMS (ATC), Bristol, UK
  2. Department of Mathematics, University of Bristol, Bristol
