Toward Guidelines for Modeling Learning Agents in Multiagent-Based Simulation: Implications from Q-Learning and Sarsa Agents

  • Keiki Takadama
  • Hironori Fujita
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3415)


This paper focuses on how simulation results are sensitive to agent modeling in multiagent-based simulation (MABS) and investigates this sensitivity by comparing results from agents with different learning mechanisms, i.e., Q-learning and Sarsa, in the context of reinforcement learning. Through an analysis of simulation results in a bargaining game, one of the canonical examples in game theory, the following implications are revealed: (1) even a slight difference in the learning mechanism has an essential influence on simulation results; (2) testing in static and dynamic environments reveals different tendencies in the results; and (3) three stages in both Q-learning and Sarsa agents (i.e., (a) competition; (b) cooperation; and (c) learning impossible) are found in the dynamic environment, while no such stages are found in the static environment. From these three implications, the following preliminary guidelines for modeling agents can be derived: (1) cross-element validation for specifying key factors that affect simulation results; (2) a comparison of results between the static and dynamic environments for determining candidates to be investigated in detail; and (3) sensitivity analysis for specifying the applicable range of learning agents.
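The "slight difference" between the two learning mechanisms compared in the paper comes down to a single term in the value update: Q-learning bootstraps from the greedy action in the next state, while Sarsa bootstraps from the action the agent actually takes. A minimal sketch of the two update rules (the dict-based Q-table and parameter defaults are illustrative, not taken from the paper):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Q-learning (off-policy): bootstrap from the best action in s_next."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Sarsa (on-policy): bootstrap from the action actually taken in s_next."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])
```

Because Sarsa's target depends on the exploratory behavior policy while Q-learning's does not, the two agents can diverge markedly when exploration interacts with the opponent's concurrent learning, which is one plausible reading of why the dynamic (co-learning) environment exposes differences that the static environment does not.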


Keywords: Multiagent-based simulation · sensitivity · agent modeling · learning mechanism · bargaining game





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Keiki Takadama (1, 2)
  • Hironori Fujita (3)
  1. Tokyo Institute of Technology, Yokohama, Japan
  2. ATR Network Informatics Labs, Kyoto, Japan
  3. Hitotsubashi University, Tokyo, Japan
