Learning in a Game of Strategic Experimentation with Three-Armed Exponential Bandits

  • Nicolas Klein
Chapter
Part of the Static & Dynamic Game Theory: Foundations & Applications book series (SDGTFA)

Abstract

The present article provides some additional results for the two-player game of strategic experimentation with three-armed exponential bandits analyzed in Klein (Games Econ Behav 82:636–657, 2013). Players play replica bandits, with one safe arm and two risky arms, which are known to be of opposite types. It is initially unknown, however, which risky arm is good and which is bad. A good risky arm yields lump sums at exponentially distributed times when pulled. A bad risky arm never yields any payoff. In this article, I give a necessary and sufficient condition for the state of the world eventually to be found out with probability 1 in any Markov perfect equilibrium in which at least one player’s value function is continuously differentiable. Furthermore, I provide closed-form expressions for the players’ value function in a symmetric Markov perfect equilibrium for low and intermediate stakes.
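The payoff structure described in the abstract can be illustrated with a minimal simulation sketch. It is not taken from the chapter: the parameter names `rate`, `lump_sum`, and `horizon` are hypothetical, and the sketch ignores discounting, the safe arm, and all strategic interaction between the players. It only shows that a good risky arm pays lump sums at exponentially distributed times, that a bad risky arm never pays, and that the two risky arms are of opposite types.

```python
import random


def simulate_risky_arm(is_good, rate, horizon, lump_sum, rng):
    """Total payoff from pulling one risky arm continuously on [0, horizon].

    Assumption (illustrative only): a good arm pays `lump_sum` at the jump
    times of a Poisson process with intensity `rate`, so the waiting times
    between payoffs are exponentially distributed. A bad arm pays nothing.
    """
    if not is_good:
        return 0.0
    total = 0.0
    t = rng.expovariate(rate)  # time of the first lump sum
    while t <= horizon:
        total += lump_sum
        t += rng.expovariate(rate)  # waiting time until the next lump sum
    return total


def simulate_bandit(first_arm_good, rate, horizon, lump_sum, seed=0):
    """Payoffs of both risky arms, which are of opposite types by construction."""
    rng = random.Random(seed)
    payoff_1 = simulate_risky_arm(first_arm_good, rate, horizon, lump_sum, rng)
    payoff_2 = simulate_risky_arm(not first_arm_good, rate, horizon, lump_sum, rng)
    return payoff_1, payoff_2
```

Calling `simulate_bandit(True, rate=2.0, horizon=10.0, lump_sum=1.0)` returns a pair whose second entry is always zero, because the second risky arm is then bad. A single lump sum on either arm therefore reveals the state of the world.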

References

  1. Bellman, R.: A problem in the sequential design of experiments. Sankhya Indian J. Stat. (1933–1960) 16(3/4), 221–229 (1956)
  2. Bolton, P., Harris, C.: Strategic experimentation. Econometrica 67, 349–374 (1999)
  3. Bolton, P., Harris, C.: Strategic experimentation: the undiscounted case. In: Hammond, P.J., Myles, G.D. (eds.) Incentives, Organizations and Public Economics – Papers in Honour of Sir James Mirrlees, pp. 53–68. Oxford University Press, Oxford (2000)
  4. Bradt, R., Johnson, S., Karlin, S.: On sequential designs for maximizing the sum of n observations. Ann. Math. Stat. 27, 1060–1074 (1956)
  5. Gittins, J., Jones, D.: A dynamic allocation index for the sequential design of experiments. In: Progress in Statistics, European Meeting of Statisticians, 1972, vol. 1, pp. 241–266. North-Holland, Amsterdam (1974)
  6. Keller, G., Rady, S., Cripps, M.: Strategic experimentation with exponential bandits. Econometrica 73, 39–68 (2005)
  7. Klein, N.: Strategic learning in teams. Games Econ. Behav. 82, 636–657 (2013)
  8. Klein, N., Rady, S.: Negatively correlated bandits. Rev. Econ. Stud. 78, 693–732 (2011)
  9. Robbins, H.: Some aspects of the sequential design of experiments. Bull. Am. Math. Soc. 58, 527–535 (1952)
  10. Thompson, W.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285–294 (1933)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Nicolas Klein (1)
  1. Université de Montréal and CIREQ, Département de Sciences Économiques, Montréal, Canada