Abstract
This paper proposes a gradient-free modification of the mirror descent method for convex stochastic online optimization problems. The crucial assumption of the problem setting is that function realizations are observed with small noise. The aim is to derive the convergence rate of the proposed method and to determine the noise level that does not significantly affect this rate.
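To make the setting concrete, the following is a minimal sketch of a gradient-free mirror descent loop. It is not the paper's exact method. It assumes two-point feedback, the entropic (exponentiated-gradient) mirror step on the probability simplex, and losses that can be evaluated in a small neighbourhood of the feasible set. The linear loss and the parameters `h`, `tau`, `sigma`, `delta` are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(z):
    """Map log-weights z to a point on the probability simplex."""
    w = np.exp(z - z.max())  # shift for numerical stability
    return w / w.sum()


def two_point_gradient_estimate(f_noisy, x, tau, rng):
    """Estimate a gradient from two noisy function values (no gradient access)."""
    n = x.size
    e = rng.standard_normal(n)
    e /= np.linalg.norm(e)  # uniform direction on the Euclidean unit sphere
    diff = f_noisy(x + tau * e) - f_noisy(x - tau * e)
    return (n / (2.0 * tau)) * diff * e


def make_noisy_round(c, sigma, delta, rng):
    """Hypothetical loss for one round: <c + xi, x>, observed with noise in [-delta, delta]."""
    xi = sigma * rng.standard_normal(c.size)  # one realization, shared by both queries

    def f_noisy(x):
        return float((c + xi) @ x) + rng.uniform(-delta, delta)

    return f_noisy


def run_sketch(c, T=2000, h=0.05, tau=1e-2, sigma=0.1, delta=1e-4, seed=0):
    """Run T rounds and return the average of the played points."""
    rng = np.random.default_rng(seed)
    z = np.zeros(c.size)  # log-weights; the played point is softmax(z)
    x_avg = np.zeros(c.size)
    for _ in range(T):
        x = softmax(z)
        x_avg += x / T
        f_noisy = make_noisy_round(c, sigma, delta, rng)
        g = two_point_gradient_estimate(f_noisy, x, tau, rng)
        z -= h * g  # entropic mirror descent step in log-weight form
    return x_avg


c = np.array([1.0, 0.2, 0.8, 0.9, 0.7])  # assumed mean loss vector
print(run_sketch(c))
```

In this sketch the observation noise enters the gradient estimate scaled by roughly `n * delta / tau`, which illustrates why the admissible noise level is tied to the smoothing parameter and the dimension.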
Additional information
Original Russian Text © A.V. Gasnikov, E.A. Krymova, A.A. Lagunovskaya, I.N. Usmanova, F.A. Fedorenko, 2017, published in Avtomatika i Telemekhanika, 2017, No. 2, pp. 36–49.
This paper was recommended for publication by P.S. Shcherbakov, a member of the Editorial Board.
About this article
Cite this article
Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A. et al. Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom Remote Control 78, 224–234 (2017). https://doi.org/10.1134/S0005117917020035