Local Convergence of the Heavy-Ball Method and iPiano for Non-convex Optimization

  • Peter Ochs


A local convergence result for an abstract descent method is proved. The sequence of iterates is attracted by a local (or global) minimum, stays in its neighborhood, and converges within this neighborhood. This result allows algorithms to exploit local properties of the objective function. In particular, the abstract theory in this paper applies to the inertial forward–backward splitting method: iPiano—a generalization of the Heavy-ball method. Moreover, it reveals an equivalence between iPiano and inertial averaged/alternating proximal minimization and projection methods. Key for this equivalence is the attraction to a local minimum within a neighborhood and the fact that, for a prox-regular function, the gradient of the Moreau envelope is locally Lipschitz continuous and expressible in terms of the proximal mapping. In a numerical feasibility problem, the inertial alternating projection method significantly outperforms its non-inertial variants.
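As an illustration of the inertial alternating projection idea mentioned above, the following minimal sketch applies an iPiano-style update to a toy feasibility problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the two sets (a horizontal line A and the unit circle S), the function names, the starting point, and the parameters alpha and beta, which are simply chosen small. The smooth part is f(x) = 0.5 * dist(x, A)^2, whose gradient is x - P_A(x) because A is convex; the non-smooth part is the indicator of S, whose proximal mapping is the projection P_S. One step then reads x_next = P_S(x - alpha * (x - P_A(x)) + beta * (x - x_prev)).

```python
import math


def project_line(x, height=0.5):
    """Projection onto the horizontal line {(a, b) : b = height} (convex set A)."""
    return (x[0], height)


def project_circle(x):
    """Projection onto the unit circle (non-convex set S)."""
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        # The projection is not unique at the origin; pick one admissible point.
        return (1.0, 0.0)
    return (x[0] / norm, x[1] / norm)


def inertial_alternating_projection(x0, alpha=0.3, beta=0.3, iters=200):
    """iPiano-style iteration for the toy problem: find a point in A and S.

    alpha (step size) and beta (inertial weight) are illustrative values,
    not the admissible ranges derived in the paper.
    """
    x_prev, x = x0, x0
    for _ in range(iters):
        p = project_line(x)
        # Gradient of f(x) = 0.5 * dist(x, A)^2 for the convex set A.
        grad = (x[0] - p[0], x[1] - p[1])
        # Forward (gradient) step plus inertial term beta * (x - x_prev).
        y = (
            x[0] - alpha * grad[0] + beta * (x[0] - x_prev[0]),
            x[1] - alpha * grad[1] + beta * (x[1] - x_prev[1]),
        )
        # Backward step: the prox of the indicator of S is the projection onto S.
        x_prev, x = x, project_circle(y)
    return x


if __name__ == "__main__":
    print(inertial_alternating_projection((1.0, 0.0)))
```

Setting beta = 0 and alpha = 1 in this sketch reduces the step to plain alternating projection, x_next = P_S(P_A(x)), which is the non-inertial baseline. Started at (1, 0), the iterates stay near the intersection point (sqrt(0.75), 0.5) and converge to it, a small-scale instance of the local behavior the abstract describes.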


Keywords

Inertial forward–backward splitting · Non-convex feasibility · Prox-regularity · Gradient of Moreau envelopes · Heavy-ball method · Alternating projection · Averaged projection · iPiano

Mathematics Subject Classification

90C26 90C30 65K05 49J52 



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Mathematical Optimization Group, Saarland University, Saarbrücken, Germany
