Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case

Gasnikov, A. V.; Krymova, E. A.; Lagunovskaya, A. A.; Usmanova, I. N.; Fedorenko, F. A.

doi:10.1134/S0005117917020035

Stochastic Systems, Queueing Systems
Published: 12 February 2017

Volume 78, pages 224–234 (2017)

Automation and Remote Control

A. V. Gasnikov^1,2, E. A. Krymova^2, A. A. Lagunovskaya^3,1, I. N. Usmanova^1,2 & F. A. Fedorenko^1

Abstract

This paper proposes a gradient-free modification of the mirror descent method for convex stochastic online optimization problems. The crucial assumption of the problem setting is that function realizations are observed with small noise. The aim is to derive the convergence rate of the proposed methods and to determine the noise level that does not significantly affect this rate.
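As a rough illustration of what a gradient-free mirror descent step can look like, the following sketch runs entropic mirror descent on the probability simplex using only noisy function values. It is a minimal sketch under stated assumptions, not necessarily the authors' exact scheme: it assumes two-point feedback, a random direction drawn uniformly from the unit Euclidean sphere, an entropy prox-function, and an objective that can be evaluated slightly outside the simplex. The names two_point_mirror_descent, make_noisy_linear, run_example, COSTS, and all parameter values are hypothetical choices made for this example.

```python
import math
import random

# Hypothetical linear test objective f(x) = <COSTS, x>; its minimum over the
# simplex is min(COSTS), attained at the vertex of the cheapest coordinate.
COSTS = [0.9, 0.2, 0.7, 0.5, 0.8]


def make_noisy_linear(costs, noise_level, rng):
    """Return an oracle giving f(x) plus bounded noise in [-noise_level, noise_level]."""
    def noisy_f(x):
        value = sum(c * xi for c, xi in zip(costs, x))
        return value + noise_level * rng.uniform(-1.0, 1.0)
    return noisy_f


def two_point_mirror_descent(noisy_f, n, steps, step_size, tau, rng):
    """Entropic mirror descent on the simplex driven by two noisy function values per step."""
    x = [1.0 / n] * n   # start at the center of the simplex
    avg = [0.0] * n     # running average of the iterates
    for _ in range(steps):
        # Random direction e, uniform on the unit Euclidean sphere.
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in e))
        e = [v / norm for v in e]
        # Two noisy function values at symmetric perturbations of x
        # (assumption: f is defined in a tau-neighborhood of the simplex).
        f_plus = noisy_f([xi + tau * ei for xi, ei in zip(x, e)])
        f_minus = noisy_f([xi - tau * ei for xi, ei in zip(x, e)])
        # Finite-difference surrogate for the gradient.
        scale = n * (f_plus - f_minus) / (2.0 * tau)
        g = [scale * ei for ei in e]
        for i in range(n):
            avg[i] += x[i] / steps
        # Entropic (multiplicative) mirror step, then renormalize onto the simplex.
        w = [xi * math.exp(-step_size * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return avg


def run_example():
    rng = random.Random(0)  # fixed seed so the run is reproducible
    noisy_f = make_noisy_linear(COSTS, noise_level=1e-3, rng=rng)
    return two_point_mirror_descent(
        noisy_f, n=len(COSTS), steps=5000, step_size=0.05, tau=0.1, rng=rng
    )
```

Calling run_example() returns the averaged iterate, which should place most of its mass on the coordinate with the smallest cost. In this sketch the noise enters the gradient surrogate scaled by n / (2 * tau), which gives an informal sense of why the admissible noise level is tied to the smoothing parameter.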

Author information

Authors and Affiliations

Moscow Institute of Physics and Technology (State University), Moscow, Russia
A. V. Gasnikov, A. A. Lagunovskaya, I. N. Usmanova & F. A. Fedorenko
Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russia
A. V. Gasnikov, E. A. Krymova & I. N. Usmanova
Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, Moscow, Russia
A. A. Lagunovskaya

Corresponding author

Correspondence to A. V. Gasnikov.

Additional information

Original Russian Text © A.V. Gasnikov, E.A. Krymova, A.A. Lagunovskaya, I.N. Usmanova, F.A. Fedorenko, 2017, published in Avtomatika i Telemekhanika, 2017, No. 2, pp. 36–49.

This paper was recommended for publication by P.S. Shcherbakov, a member of the Editorial Board.

About this article

Cite this article

Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A. et al. Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom Remote Control 78, 224–234 (2017). https://doi.org/10.1134/S0005117917020035

Received: 16 October 2014
Published: 12 February 2017
Issue Date: February 2017
DOI: https://doi.org/10.1134/S0005117917020035
