Two arms, one arm known

  • Chapter
Bandit problems

Part of the book series: Monographs on Statistics and Applied Probability (MSAP)

Abstract

In this chapter we assume that there are two arms (k = 2) and that one arm, say arm 2 for definiteness, has known mean λ. The only uncertainty is embodied in F₁, now abbreviated to F, the distribution of the random measure Q₁. For arbitrary A we can, without loss of generality, assume that arm 2 always produces the known observation λ. Since G is given by the pair (F, λ), we now speak of the (F, λ; A)-bandit.
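
As a rough illustration of the setup in the abstract, the sketch below computes the optimal expected payoff of a small bandit with one unknown arm and one known arm by backward induction. It rests on several assumptions that are not in the abstract: arm 1 is taken to be Bernoulli with a Beta(a, b) prior standing in for F, A is read as a finite discount sequence (α_1, ..., α_n), and the function name bandit_value is hypothetical. It is a minimal sketch under those assumptions, not the authors' method.

```python
from functools import lru_cache


def bandit_value(a, b, lam, alphas):
    """Optimal expected discounted payoff for a two-armed bandit.

    Assumptions (illustrative only): arm 1 is Bernoulli with a
    Beta(a, b) prior on its success probability; arm 2 always pays
    the known value lam; alphas is a finite discount sequence.
    """
    alphas = tuple(alphas)
    n = len(alphas)

    @lru_cache(maxsize=None)
    def value(s, f, t):
        # s, f: successes and failures observed so far on arm 1.
        if t == n:
            return 0.0
        # Posterior mean of arm 1 after s successes and f failures.
        p = (a + s) / (a + b + s + f)
        # Pull the unknown arm: immediate expected payoff plus the
        # expected continuation value after updating the posterior.
        pull_unknown = (
            alphas[t] * p
            + p * value(s + 1, f, t + 1)
            + (1 - p) * value(s, f + 1, t + 1)
        )
        # Pull the known arm: pays lam and teaches nothing about arm 1.
        pull_known = alphas[t] * lam + value(s, f, t + 1)
        return max(pull_unknown, pull_known)

    return value(0, 0, 0)
```

For example, with a uniform prior (a = b = 1), λ = 0.5 and two undiscounted stages, the value exceeds the myopic payoff of 1.0 because one pull of the unknown arm yields information that can be exploited at the second stage.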

Copyright information

© 1985 D. A. Berry and B. Fristedt

About this chapter

Cite this chapter

Berry, D.A., Fristedt, B. (1985). Two arms, one arm known. In: Bandit problems. Monographs on Statistics and Applied Probability. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-3711-7_5

  • DOI: https://doi.org/10.1007/978-94-015-3711-7_5

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-015-3713-1

  • Online ISBN: 978-94-015-3711-7

  • eBook Packages: Springer Book Archive
