Local Convergence of the Heavy-Ball Method and iPiano for Non-convex Optimization

Abstract

A local convergence result for an abstract descent method is proved. The sequence of iterates is attracted by a local (or global) minimum, stays in its neighborhood, and converges within this neighborhood. This result allows algorithms to exploit local properties of the objective function. In particular, the abstract theory in this paper applies to the inertial forward–backward splitting method iPiano, a generalization of the Heavy-ball method. Moreover, it reveals an equivalence between iPiano and inertial averaged/alternating proximal minimization and projection methods. Key to this equivalence are the attraction to a local minimum within a neighborhood and the fact that, for a prox-regular function, the gradient of the Moreau envelope is locally Lipschitz continuous and expressible in terms of the proximal mapping. In a numerical feasibility problem, the inertial alternating projection method significantly outperforms its non-inertial variants.
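
To make the inertial forward–backward step concrete, here is a minimal sketch, not the authors' implementation. It assumes the commonly stated iPiano update x_{k+1} = prox_{alpha g}(x_k - alpha grad f(x_k) + beta (x_k - x_{k-1})) with constant step sizes satisfying beta in [0, 1) and alpha < 2(1 - beta)/L, where L is a Lipschitz constant of grad f. The toy problem (a quadratic f plus an l1 term g), the function names `ipiano` and `soft_threshold`, and all parameter values are illustrative assumptions that do not appear in the paper.

```python
import math

def soft_threshold(v, t):
    """Proximal map of t*|.| applied to a scalar v (shrinks v toward 0 by t)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def ipiano(grad_f, prox_g, x0, alpha, beta, iters=200):
    """Run x+ = prox_g(x - alpha*grad_f(x) + beta*(x - x_prev), alpha) for `iters` steps."""
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        g = grad_f(x)
        # Forward (gradient) step plus inertial term beta*(x - x_prev)
        y = [xi - alpha * gi + beta * (xi - pi) for xi, gi, pi in zip(x, g, x_prev)]
        # Backward (proximal) step
        x_prev, x = x, prox_g(y, alpha)
    return x

# Hypothetical toy problem: f(x) = 0.5*||x - c||^2 (so L = 1), g(x) = lam*||x||_1
c = [3.0, -0.5, 1.2, -2.0]
lam = 1.0
L = 1.0
beta = 0.5
alpha = 0.9 * 2 * (1 - beta) / L  # stays strictly below the assumed bound 2(1 - beta)/L

grad_f = lambda x: [xi - ci for xi, ci in zip(x, c)]
prox_g = lambda y, a: [soft_threshold(yi, a * lam) for yi in y]

x_star = ipiano(grad_f, prox_g, [0.0] * len(c), alpha, beta)
print(x_star)
```

For this convex toy problem the minimizer is the soft-thresholding of c at level lam, so the iterates can be checked against a closed form. Setting beta = 0 recovers plain forward–backward splitting, and taking g = 0 (identity prox) gives the Heavy-ball iteration.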

Keywords

Inertial forward–backward splitting · Non-convex feasibility · Prox-regularity · Gradient of Moreau envelopes · Heavy-ball method · Alternating projection · Averaged projection · iPiano

Mathematics Subject Classification

90C26 90C30 65K05 49J52 

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Mathematical Optimization Group, Saarland University, Saarbrücken, Germany
