General inertial proximal gradient method for a class of nonconvex nonsmooth optimization problems

  • Zhongming Wu
  • Min Li


In this paper, we consider a general inertial proximal gradient method with constant and variable stepsizes for a class of nonconvex nonsmooth optimization problems. The proposed method incorporates two different extrapolations with respect to the previous iterates into the backward proximal step and the forward gradient step of the classic proximal gradient method. Under more general parameter constraints, we prove that the proposed method generates a convergent subsequence and that each limit point is a stationary point of the problem. Furthermore, the generated sequence is globally convergent to a stationary point if the objective function satisfies the Kurdyka–Łojasiewicz property. Local linear convergence can also be established for the proposed method with constant stepsizes under a common error bound condition. In addition, we conduct numerical experiments on nonconvex quadratic programming and SCAD penalty problems to demonstrate the advantage of the proposed method.
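The iteration described above can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the extrapolation weights `alpha` and `beta`, the test problem, and the use of a convex ℓ1-regularized least-squares objective (in place of the paper's nonconvex setting) are all assumptions made only to exercise the two-extrapolation structure, where one extrapolated point feeds the forward gradient step and another feeds the backward proximal step.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1, used here as an illustrative nonsmooth term.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inertial_prox_grad(grad_f, prox_g, x0, step, alpha, beta, iters=2000):
    """Sketch of a general inertial proximal gradient iteration:

        y_k = x_k + alpha * (x_k - x_{k-1})   # extrapolation for the gradient step
        z_k = x_k + beta  * (x_k - x_{k-1})   # extrapolation for the proximal step
        x_{k+1} = prox_g(z_k - step * grad_f(y_k), step)

    Setting alpha = beta = 0 recovers the classic proximal gradient method.
    """
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        d = x - x_prev
        y = x + alpha * d          # forward (gradient) extrapolation
        z = x + beta * d           # backward (proximal) extrapolation
        x_prev, x = x, prox_g(z - step * grad_f(y), step)
    return x

# Illustrative test problem: min 0.5*||A x - b||^2 + lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ np.array([1.0, 0.0, -2.0, 0.0, 0.5])
lam = 0.1
L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth gradient
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, t: soft_threshold(v, lam * t)

x_star = inertial_prox_grad(grad_f, prox_g, x0=np.zeros(5),
                            step=1.0 / L, alpha=0.3, beta=0.3)
```

A natural sanity check is the fixed-point (stationarity) residual `x - prox_g(x - step*grad_f(x))`, which should be near zero at a stationary point of the composite objective.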


Keywords: Nonconvex · Nonsmooth · Inertial proximal gradient method · Kurdyka–Łojasiewicz property · Global convergence



The authors are grateful to the associate editor and two anonymous referees for their helpful suggestions on improving the quality of the original manuscript. This work was supported by National Natural Science Foundation of China Grants 11771078 and 71661147004, Natural Science Foundation of Jiangsu Province Grant BK20181258, Project 333 of Jiangsu Province Grant BRA2018351, and Postgraduate Research & Practice Innovation Program of Jiangsu Province Grant KYCX18_0200.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Economics and Management, Southeast University, Nanjing, China
  2. School of Management and Engineering, Nanjing University, Nanjing, China
