Abstract
We propose an approach to constructing robust non-Euclidean iterative algorithms for convex composite stochastic optimization that is based on truncation of the stochastic gradients. For such algorithms, we establish sub-Gaussian confidence bounds under weak assumptions on the tails of the noise distribution, in both the convex and the strongly convex settings. We also propose robust accuracy estimates for general stochastic algorithms.
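The core device, truncating the stochastic gradient before the descent step, can be illustrated with a minimal sketch. This is not the paper's mirror descent method: for brevity it uses the Euclidean proximal setup (plain averaged SGD), and the truncation level `lam`, the step schedule, and the toy heavy-tailed objective are all illustrative assumptions.

```python
import numpy as np

def truncate(g, lam):
    """Clip a stochastic gradient to Euclidean norm at most lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)

def truncated_sgd(grad_oracle, x0, steps=1000, lam=10.0, lr=0.1):
    """Averaged SGD with truncated stochastic gradients.

    grad_oracle(x) returns an unbiased but possibly heavy-tailed gradient
    estimate; truncation keeps rare huge samples from derailing the iterates.
    """
    x = np.array(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = truncate(grad_oracle(x), lam)
        x = x - (lr / np.sqrt(t)) * g   # decaying step size
        avg += (x - avg) / t            # running average of iterates
    return avg

# Toy problem: minimize f(x) = E ||x - z||^2 / 2, where z = theta + noise
# and the noise is heavy-tailed (Student t with 2.5 degrees of freedom).
rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0])
oracle = lambda x: x - (theta + rng.standard_t(df=2.5, size=2))
x_hat = truncated_sgd(oracle, x0=[0.0, 0.0])
```

Without truncation, a single extreme noise draw can throw the iterate far from the optimum; clipping the gradient norm is what lets the averaged iterate concentrate around `theta` despite the heavy tails.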
Russian Text © The Author(s), 2019, published in Avtomatika i Telemekhanika, 2019, No. 9, pp. 64–90.
Cite this article
Nazin, A.V., Nemirovsky, A.S., Tsybakov, A.B. et al. Algorithms of Robust Stochastic Optimization Based on Mirror Descent Method. Autom Remote Control 80, 1607–1627 (2019). https://doi.org/10.1134/S0005117919090042