A Framework of Convergence Analysis of Mini-batch Stochastic Projected Gradient Methods
In this paper, we establish a unified framework for studying the almost sure global convergence and the expected convergence rates of a class of mini-batch stochastic (projected) gradient (SG) methods, including two popular variants of SG: SG with diminishing stepsizes and SG with increasing batch sizes. We also show that the standard assumption of uniformly bounded variance, which is frequently used in the literature to investigate the convergence of SG, is in fact not required when the gradient of the objective function is Lipschitz continuous. Finally, we show that our framework can also be used to analyze the convergence of a mini-batch stochastic extragradient method for stochastic variational inequalities.
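As a concrete illustration of the methods the abstract names, here is a minimal Python sketch, not the authors' implementation, of mini-batch stochastic projected gradient on a box-constrained least-squares problem: variant (i) uses diminishing stepsizes, variant (ii) uses increasing batch sizes, and a stochastic extragradient loop is included for the variational-inequality setting with the operator taken as the gradient. The problem data, projection set, and stepsize/batch-size schedules are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions throughout) of mini-batch stochastic
# projected gradient for min f(x) = (1/2n)||Ax - b||^2 over a box constraint.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d (a simple closed convex set)."""
    return np.clip(x, lo, hi)

def minibatch_grad(x, batch_size):
    """Unbiased mini-batch estimate of the full gradient A^T(Ax - b)/n."""
    idx = rng.integers(0, n, size=batch_size)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch_size

def sg_diminishing_stepsize(iters=500, m=10):
    """Variant (i): fixed batch size m, diminishing stepsize alpha_k = c/(k+1)."""
    x = np.zeros(d)
    for k in range(iters):
        alpha = 0.5 / (k + 1)              # illustrative schedule
        x = project(x - alpha * minibatch_grad(x, m))
    return x

def sg_increasing_batch(iters=500, alpha=0.5):
    """Variant (ii): fixed stepsize, batch size m_k growing with k."""
    x = np.zeros(d)
    for k in range(iters):
        m_k = min(n, 10 + k)               # illustrative schedule
        x = project(x - alpha * minibatch_grad(x, m_k))
    return x

def sg_extragradient(iters=500, m=10):
    """Mini-batch stochastic extragradient: an extrapolation step followed by
    an update step, each using an independent mini-batch (VI setting, F = grad f)."""
    x = np.zeros(d)
    for k in range(iters):
        alpha = 0.5 / (k + 1)
        y = project(x - alpha * minibatch_grad(x, m))   # extrapolation step
        x = project(x - alpha * minibatch_grad(y, m))   # update step
    return x

for method in (sg_diminishing_stepsize, sg_increasing_batch, sg_extragradient):
    print(method.__name__, np.linalg.norm(method() - x_true))
```

In this sketch the projection is trivial because the feasible set is a box; for a general closed convex set the `project` step would be replaced by the corresponding Euclidean projection.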
Keywords: Stochastic projected gradient method · Variance uniformly bounded · Convergence analysis

Mathematics Subject Classification: 62L20 · 90C25 · 90C15
The authors would like to thank the referees for their valuable comments, which helped improve this manuscript.