Gaussian Two-Armed Bandit and Optimization of Batch Data Processing

Kolnogorov, A. V.

doi:10.1134/S0032946018010076

Large Systems
Published: 13 April 2018

Volume 54, pages 84–100 (2018)
Problems of Information Transmission

Abstract

We consider the minimax setting for the two-armed bandit problem with normally distributed incomes whose mathematical expectations and variances are a priori unknown. This setting arises naturally in the optimization of batch data processing, where two alternative processing methods with different, a priori unknown efficiencies are available. During the control process, one must identify the more efficient method and ensure its predominant application. We use the main theorem of game theory to seek the minimax strategy and minimax risk as the Bayesian ones corresponding to the worst-case prior distribution. To find them, we derive a recursive integro-difference equation. We show that batch data processing barely increases the minimax risk if the number of batches is large enough.
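
The batch setting described in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the minimax strategy of the article: it assumes a common, known standard deviation `sigma` for both methods (the article treats the variances as unknown), and it uses a simple greedy rule that follows the method with the larger sample mean. The function name `batch_bandit_regret` and all of its parameters are hypothetical.

```python
import random


def batch_bandit_regret(means, sigma, n_batches, batch_size, seed=0):
    """Simulate batch processing with two Gaussian methods (arms).

    Returns (pseudo_regret, pulls). Requires n_batches >= 2.
    Assumptions for illustration only: known common sigma, greedy rule.
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0]  # cumulative income per method
    pulls = [0, 0]       # number of items processed per method
    best = max(means)
    regret = 0.0
    for k in range(n_batches):
        if k < 2:
            # Try each method on one full batch first.
            arm = k
        else:
            # Greedy rule: follow the larger sample mean (not the minimax strategy).
            arm = 0 if totals[0] / pulls[0] >= totals[1] / pulls[1] else 1
        # A whole batch is processed by one method; only its total income is used.
        income = sum(rng.gauss(means[arm], sigma) for _ in range(batch_size))
        totals[arm] += income
        pulls[arm] += batch_size
        # Expected loss relative to always applying the better method.
        regret += batch_size * (best - means[arm])
    return regret, pulls


regret, pulls = batch_bandit_regret(
    means=(0.0, 0.5), sigma=1.0, n_batches=50, batch_size=100
)
print(regret, pulls)
```

Here the method is switched only between batches, so the strategy sees 50 aggregated observations instead of 5000 individual ones. Rerunning with the same total number of items split into more or fewer batches gives a rough feel for how batching affects the loss of this particular rule.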

Author information

Authors and Affiliations

Department of Applied Mathematics and Information Science, Yaroslav-the-Wise Novgorod State University, Veliky Novgorod, Russia
A. V. Kolnogorov

Corresponding author

Correspondence to A. V. Kolnogorov.

Additional information

Original Russian Text © A.V. Kolnogorov, 2018, published in Problemy Peredachi Informatsii, 2018, Vol. 54, No. 1, pp. 93–111.

About this article

Cite this article

Kolnogorov, A.V. Gaussian Two-Armed Bandit and Optimization of Batch Data Processing. Probl Inf Transm 54, 84–100 (2018). https://doi.org/10.1134/S0032946018010076

Received: 11 September 2017
Accepted: 24 November 2017
Published: 13 April 2018
Issue Date: January 2018
DOI: https://doi.org/10.1134/S0032946018010076
