Conditional gradient type methods for composite nonlinear and stochastic optimization

  • Full Length Paper
  • Series A
  • Published in: Mathematical Programming

Abstract

In this paper, we present a conditional gradient type (CGT) method for solving a class of composite optimization problems where the objective function consists of a (weakly) smooth term and a (strongly) convex regularization term. Including the strongly convex term in the subproblems of the classical conditional gradient method improves its rate of convergence without raising the per-iteration cost to that of general proximal type algorithms. More specifically, we present a unified analysis of the CGT method in the sense that it achieves the best known rate of convergence when the weakly smooth term is nonconvex and possesses a (nearly) optimal complexity if this term turns out to be convex. While implementation of the CGT method requires explicitly estimating problem parameters such as the level of smoothness of the first term in the objective function, we also present a few variants of this method that relax such estimation. Unlike general proximal type parameter-free methods, these variants of the CGT method do not require any additional effort for computing (sub)gradients of the objective function and/or solving extra subproblems at each iteration. We then generalize these methods to the stochastic setting and present a few new complexity results. To the best of our knowledge, this is the first time that such complexity results are presented for solving stochastic weakly smooth nonconvex and (strongly) convex optimization problems.
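To make the flavor of such an iteration concrete, the following is a minimal Python sketch of a conditional-gradient-type step in which only the smooth term is linearized while a strongly convex regularizer is kept exactly in the subproblem. Everything here is an illustrative assumption rather than the paper's actual CGT scheme or parameter choices: the quadratic regularizer (mu/2)||x||^2, the box feasible set, the open-loop step size 2/(k+2), and the toy least-squares smooth term are chosen only so that the subproblem has a closed-form solution.

```python
import numpy as np

def cgt_sketch(grad_f, mu, lo, hi, x0, num_iters=200):
    """Illustrative conditional-gradient-type iteration (not the paper's exact method).

    Approximately minimizes f(x) + (mu/2)*||x||^2 over the box [lo, hi]^n.
    Each subproblem linearizes only f and keeps the strongly convex
    regularizer exactly; for this separable quadratic over a box the
    subproblem minimizer is a coordinate-wise clipping.
    """
    x = x0.copy()
    for k in range(num_iters):
        g = grad_f(x)
        # Subproblem: argmin_u <g, u> + (mu/2)*||u||^2  s.t.  lo <= u <= hi.
        u = np.clip(-g / mu, lo, hi)
        alpha = 2.0 / (k + 2.0)  # classical open-loop conditional gradient step size
        x = (1.0 - alpha) * x + alpha * u
    return x

if __name__ == "__main__":
    # Toy smooth term f(x) = 0.5*||A x - b||^2 (convex, purely for illustration).
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
    grad_f = lambda x: A.T @ (A @ x - b)
    x_out = cgt_sketch(grad_f, mu=1.0, lo=-1.0, hi=1.0, x0=np.zeros(10))
    print(np.round(x_out, 3))
```

The point of the sketch is only that each iteration solves a single linearization-plus-regularizer subproblem over the feasible set, which can be far cheaper than the projection or proximal step required by general proximal type algorithms when the set has convenient structure.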

Notes

  1. Reddi et al. [34] was released several months after the first version of this work.

References

  1. Cartis, C., Gould, N.I.M., Toint, P.L.: On the complexity of steepest descent, Newton's and regularized Newton's methods for nonconvex unconstrained optimization. SIAM J. Optim. 20(6), 2833–2852 (2010)

  2. Chapelle, O., Sindhwani, V., Keerthi, S.S.: Optimization techniques for semi-supervised support vector machines. J. Mach. Learn. Res. 9, 203–233 (2008)

  3. Dang, C.D., Lan, G.: Stochastic block mirror descent methods for nonsmooth and stochastic optimization. SIAM J. Optim. 25(2), 856–881 (2015)

  4. Devolder, O., Glineur, F., Nesterov, Y.E.: First-order methods with inexact oracle: the strongly convex case. CORE Discussion Paper 2013/16 (December 2013)

  5. Dunn, J.C.: Rates of convergence for conditional gradient algorithms near singular and nonsingular extremals. SIAM J. Control Optim. 17(2), 674–701 (1979)

  6. Dunn, J.C.: Convergence rates for conditional gradient sequences generated by implicit step length rules. SIAM J. Control Optim. 18(5), 473–487 (1980)

  7. Frank, M., Wolfe, P.: An algorithm for quadratic programming. Nav. Res. Logist. Q. 3, 95–110 (1956)

  8. Garber, D., Hazan, E.: A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization. arXiv e-prints (2013)

  9. Ghadimi, S., Lan, G., Zhang, H.: Generalized uniformly optimal methods for nonlinear programming, manuscript. Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, 32611, USA (August 2015)

  10. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for constrained nonconvex stochastic programming. Math. Program. 155, 267–305 (2016)

  11. Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic optimization. Math. Program. 156, 59–99 (2016)

  12. Ghadimi, S., Lan, G.: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23(4), 2341–2368 (2013)

  13. Ghadimi, S., Lan, G.: Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, II: shrinking procedures and optimal algorithms. SIAM J. Optim. 23, 2061–2089 (2013)

  14. Grandvalet, Y., Bengio, Y.: Semi-supervised learning by entropy minimization. In: Advances in Neural Information Processing Systems (NIPS), vol. 17 (2005)

  15. Guélat, J., Marcotte, P.: Some comments on Wolfe's 'away step'. Math. Program. 35(1), 110–119 (1986)

  16. Harchaoui, Z., Juditsky, A., Nemirovski, A.S.: Conditional gradient algorithms for machine learning. NIPS OPT Workshop (2012)

  17. Ito, M.: New results on subgradient methods for strongly convex optimization problems with a unified analysis. Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Tokyo, Japan (April 2015)

  18. Jaggi, M.: Revisiting Frank-Wolfe: projection-free sparse convex optimization. In: The 30th International Conference on Machine Learning (2013)

  19. Jensen, T., Jørgensen, J.H., Hansen, P., Jensen, S.: Implementation of an optimal first-order method for strongly convex total variation regularization. BIT Numer. Math. 52, 329–356 (2012)

  20. Jiang, B., Zhang, S.: Iteration Bounds for Finding the \(\epsilon \)-Stationary Points for Structured Nonconvex Optimization. arXiv e-prints (2014)

  21. Kakade, S.M., Shalev-Shwartz, S., Tewari, A.: Regularization techniques for learning with matrices. J. Mach. Learn. Res. 13, 1865–1890 (2012)

  22. Lan, G.: The complexity of large-scale convex programming under a linear optimization oracle. Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, 32611, USA, (June 2013). http://www.optimization-online.org

  23. Lan, G.: Bundle-level type methods uniformly optimal for smooth and non-smooth convex optimization. Math. Program. 149(1), 1–45 (2015)

  24. Lan, G., Zhou, Y.: Conditional gradient sliding for convex optimization. SIAM J. Optim. 26, 1379–1409 (2016)

  25. Luss, R., Teboulle, M.: Conditional gradient algorithms for rank one matrix approximations with a sparsity constraint. SIAM Rev. 55, 65–98 (2013)

  26. Mason, L., Baxter, J., Bartlett, P., Frean, M.: Boosting algorithms as gradient descent in function space. Proc. NIPS 12, 512–518 (1999)

  27. Nemirovski, A.S., Yudin, D.: Problem Complexity and Method Efficiency in Optimization. Wiley, New York (1983)

  28. Nemirovskii, A.S., Nesterov, Y.E.: Optimal methods for smooth convex minimization. Zh. Vichisl. Mat. Fiz. 25, 356–369 (1985). (In Russian)

  29. Nesterov, Y.E.: Complexity bounds for primal-dual methods minimizing the model of objective function. Technical Report, CORE Discussion Papers, February (2015)

  30. Nesterov, Y.E.: Universal gradient methods for convex optimization problems. Math. Program., Ser. A (2014). https://doi.org/10.1007/s10107-014-0790-0

  31. Nesterov, Y.E.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Boston (2004)

  32. Nesterov, Y.E.: Gradient methods for minimizing composite objective functions. Math. Program., Ser. B 140, 125–161 (2013)

  33. Pshenichnyi, B.N., Danilin, I.M.: Numerical Methods in Extremal Problems. Mir Publishers, Moscow (1978)

  34. Reddi, S.J., Sra, S., Poczos, B., Smola, A.: Stochastic Frank-Wolfe Methods for Nonconvex Optimization. arXiv e-prints (2016)

  35. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)

Acknowledgements

The author is very grateful to the associate editor and the anonymous referees for their valuable comments, which improved the quality and presentation of the paper.

Author information

Corresponding author

Correspondence to Saeed Ghadimi.

Additional information

This work was done while the author was working at the School of Mathematics of the Institute for Research in Fundamental Sciences (IPM), P.O. Box: 19395-5746, Tehran, Iran, and supported by a grant from IPM.

About this article

Cite this article

Ghadimi, S. Conditional gradient type methods for composite nonlinear and stochastic optimization. Math. Program. 173, 431–464 (2019). https://doi.org/10.1007/s10107-017-1225-5
