
Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators

  • Full Length Paper
  • Series B
  • Published in: Mathematical Programming

Abstract

We study the behavior of the trajectories of a second-order differential equation with vanishing damping, governed by the Yosida regularization of a maximally monotone operator with time-varying index, along with a new Regularized Inertial Proximal Algorithm obtained by means of a convenient finite-difference discretization. These systems are the counterpart to accelerated forward–backward algorithms in the context of maximally monotone operators. A proper tuning of the parameters allows us to prove the weak convergence of the trajectories to zeroes of the operator, and to estimate the rate at which the speed and acceleration vanish. We also study the effect of perturbations and computational errors, showing that they leave the convergence properties unchanged, and we analyze a growth condition under which strong convergence can be guaranteed. A simple example shows the criticality of the assumptions on the Yosida approximation parameter, and allows us to illustrate the behavior of these systems compared with some of their close relatives.


Notes

  1. The idea of regularizing, by means of Moreau envelopes, an inertial dynamic governed by a nonsmooth operator was already used in the modeling of elastic shocks in [5].

References

  1. Álvarez, F.: On the minimizing property of a second-order dissipative system in Hilbert spaces. SIAM J. Control Optim. 38(4), 1102–1119 (2000)

  2. Álvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9(1–2), 3–11 (2001)

  3. Attouch, H., Cabot, A.: Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity. J. Differ. Equ. 263(9), 5412–5458 (2017)

  4. Attouch, H., Cabot, A.: Convergence rates of inertial forward–backward algorithms. HAL-01453170 (2017)

  5. Attouch, H., Cabot, A., Redont, P.: The dynamics of elastic shocks via epigraphical regularization of a differential inclusion. Adv. Math. Sci. Appl. 12(1), 273–306 (2002)

  6. Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing damping. Math. Program. (to appear). https://doi.org/10.1007/s10107-016-0992-8

  7. Attouch, H., Maingé, P.E.: Asymptotic behavior of second order dissipative evolution equations combining potential with non-potential effects. ESAIM Control Optim. Calc. Var. 17(3), 836–857 (2011)

  8. Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward–backward method is actually faster than \(\frac{1}{k^2}\). SIAM J. Optim. 26(3), 1824–1834 (2016)

  9. Attouch, H., Peypouquet, J., Redont, P.: Fast convergence of regularized inertial dynamics for nonsmooth convex optimization. Working paper (2017)

  10. Attouch, H., Soueycatt, M.: Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert spaces. Applications to games, PDEs and control. Pac. J. Optim. 5(1), 17–37 (2009)

  11. Attouch, H., Wets, R.: Epigraphical processes: laws of large numbers for random LSC functions. Sem. Anal. Convexe Montp. 20, 13–29 (1990)

  12. Attouch, H., Wets, R.: Quantitative stability of variational systems: I. The epigraphical distance. Trans. Am. Math. Soc. 328(2), 695–729 (1991)

  13. Attouch, H., Wets, R.: Quantitative stability of variational systems: II. A framework for nonlinear conditioning. SIAM J. Optim. 3, 359–381 (1993)

  14. Bauschke, H., Combettes, P.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, Berlin (2011)

  15. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  16. Brézis, H.: Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution. Lecture Notes 5. North-Holland (1972)

  17. Brézis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, Berlin (2011)

  18. Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165(2), 471–507 (2017)

  19. Cabot, A., Engler, H., Gadat, S.: On the long time behavior of second order differential equations with asymptotically small dissipation. Trans. Am. Math. Soc. 361, 5983–6017 (2009)

  20. Chambolle, A., Dossal, Ch.: On the convergence of the iterates of the fast iterative shrinkage thresholding algorithm. J. Optim. Theory Appl. 166, 968–982 (2015)

  21. Haraux, A.: Systèmes dynamiques dissipatifs et applications. RMA 17. Masson (1991)

  22. Jendoubi, M.A., May, R.: Asymptotics for a second-order differential equation with nonautonomous damping and an integrable source term. Appl. Anal. 94(2), 436–444 (2015)

  23. Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159(1–2), Ser. A, 81–107 (2016)

  24. May, R.: Asymptotic for a second order evolution equation with convex potential and vanishing damping term. Turk. J. Math. 41(3), 681–685 (2017)

  25. Matet, S., Rosasco, L., Villa, S., Vu, B.C.: Don’t relax: early stopping for convex regularization. arXiv:1707.05422v1 [math.OC] (2017)

  26. Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)

  27. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization, vol. 87. Kluwer Academic Publishers, Boston (2004)

  28. Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)

  29. Peypouquet, J.: Convex Optimization in Normed Spaces: Theory, Methods and Examples. With a foreword by Hedy Attouch. SpringerBriefs in Optimization, p. xiv+124. Springer, Cham (2015)

  30. Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17(3–4), 1113–1163 (2010)

  31. Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)

  32. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951)

  33. Rockafellar, R.T.: Monotone operators associated with saddle-functions and minimax problems. In: Browder, F.E. (ed.) Nonlinear Operators and Nonlinear Equations of Evolution in Banach Spaces, Proceedings of Symposia in Pure Mathematics, vol. 18, pp. 241–250. American Mathematical Society (1976)

  34. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1, 97–116 (1976)

  35. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Grundlehren der mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)

  36. Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: Advances in Neural Information Processing Systems (NIPS) (2011)

  37. Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. Neural Inf. Process. Syst. 27, 2510–2518 (2014)

  38. Villa, S., Salzo, S., Baldassarre, L., Verri, A.: Accelerated and inexact forward–backward algorithms. SIAM J. Optim. 23(3), 1607–1633 (2013)


Acknowledgements

The authors thank P. Redont for his careful and constructive reading of the paper.

Author information


Correspondence to Hedy Attouch.

Additional information

H. Attouch: Effort sponsored by the Air Force Office of Scientific Research, Air Force Material Command, USAF, under Grant Number F49550-15-1-0500. J. Peypouquet: Supported by Fondecyt Grant 1140829, Millennium Nucleus ICM/FIC RC130003 and Basal Project CMM Universidad de Chile.

Auxiliary results

1.1 Yosida regularization of an operator A

Given a maximally monotone operator A and \(\lambda >0\), the resolvent of A with index \(\lambda \) and the Yosida regularization of A with parameter \(\lambda \) are defined by

$$\begin{aligned} J_{\lambda A} = \left( I + \lambda A \right) ^{-1}\qquad \hbox {and}\qquad A_{\lambda } = \frac{1}{\lambda } \left( I- J_{\lambda A} \right) , \end{aligned}$$

respectively. The operator \(J_{\lambda A}: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is everywhere defined and nonexpansive (indeed, firmly nonexpansive). Moreover, \(A_{\lambda }\) is \(\lambda \)-cocoercive: for all \(x, y \in {\mathcal {H}}\) we have

$$\begin{aligned} \langle A_{\lambda }y - A_{\lambda }x, y-x\rangle \ge \lambda \Vert A_{\lambda }y - A_{\lambda }x \Vert ^2 . \end{aligned}$$

This property immediately implies that \(A_{\lambda }: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is \(\frac{1}{\lambda }\)-Lipschitz continuous. Another property that proves useful is the resolvent equation (see, for example, [16, Proposition 2.6] or [14, Proposition 23.6])

$$\begin{aligned} (A_\lambda )_{\mu }= A_{(\lambda +\mu )}, \end{aligned}$$
(84)

which is valid for any \(\lambda , \mu >0\). This property makes it possible to compute the resolvent of \(A_\lambda \) explicitly: for any \(\lambda , \mu >0\),

$$\begin{aligned} J_{\mu A_\lambda } = \frac{\lambda }{\lambda + \mu }I + \frac{\mu }{\lambda + \mu }J_{(\lambda + \mu )A}. \end{aligned}$$

Also note that for any \(x \in {\mathcal {H}}\), and any \(\lambda >0\)

$$\begin{aligned} A_\lambda (x) \in A (J_{\lambda A}x)= A( x - \lambda A_\lambda (x)). \end{aligned}$$

Finally, for any \(\lambda >0\), A and \(A_{\lambda }\) have the same set of zeroes: \(S:=A_{\lambda }^{-1} (0) = A^{-1}(0)\). For a detailed account of the properties of maximally monotone operators and the Yosida approximation, the reader may consult [14] or [16].
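For a concrete illustration of these formulas (our own numerical sketch, not part of the paper; the choice \(A=\partial |\cdot |\) on the real line is an assumption made for simplicity), note that the resolvent of \(A=\partial |\cdot |\) is the soft-thresholding map, so the resolvent equation (84) can be checked directly:

```python
import math

def soft(x, lam):
    # Resolvent J_{lam A} of A = subdifferential of |.| on R: soft-thresholding.
    return math.copysign(max(abs(x) - lam, 0.0), x)

def yosida(x, lam):
    # Yosida regularization A_lam = (I - J_{lam A}) / lam.
    return (x - soft(x, lam)) / lam

def resolvent_of_yosida(x, lam, mu):
    # Closed formula J_{mu A_lam} = lam/(lam+mu) I + mu/(lam+mu) J_{(lam+mu) A}.
    return lam / (lam + mu) * x + mu / (lam + mu) * soft(x, lam + mu)

lam, mu = 0.5, 0.8
for x in (-2.0, -0.3, 0.0, 0.7, 5.0):
    # Resolvent equation (84): (A_lam)_mu = A_{lam+mu}.
    lhs = (x - resolvent_of_yosida(x, lam, mu)) / mu
    assert abs(lhs - yosida(x, lam + mu)) < 1e-12
```

In this example \(A_\lambda \) coincides with \(x\mapsto \min \{\max \{x/\lambda ,-1\},1\}\), which makes its \(1/\lambda \)-Lipschitz continuity visible.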

1.2 Existence and uniqueness of solution in the presence of a source term

Let us first establish the existence and uniqueness of the solution of the Cauchy problem associated with the continuous regularized dynamic (1) in the presence of a source term.

Lemma A.1

Take \(t_0>0\). Suppose that \(\lambda : [t_0, +\infty [ \rightarrow {\mathbb {R}}^+\) is a measurable function such that \(\lambda (t) \ge \underline{\lambda }\) for every \(t\ge t_0\), for some \(\underline{\lambda }>0\). Suppose also that \(f \in L^1 ([t_0, T], {\mathcal {H}})\) for all \(T \ge t_0\). Then, for any \(x_0 \in {\mathcal {H}}, \ v_0 \in {\mathcal {H}} \), there exists a unique strong global solution \(x: [t_0,+\infty [ \rightarrow {\mathcal {H}}\) of the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{ll} \ddot{x}(t) + \frac{\alpha }{t} \dot{x}(t) +A_{\lambda (t)}(x(t)) = f(t)\\ x(t_0)= x_0, \ \dot{x}(t_0)=v_0. \end{array}\right. \end{aligned}$$
(85)

Proof

The argument is standard, and consists in writing (85) as a first-order system in the phase space. By setting

$$\begin{aligned} X(t)= \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix}, \quad F (t,u,v)= \begin{pmatrix} v \\ -\frac{\alpha }{t} v - A_{\lambda (t)}u + f(t) \end{pmatrix} \text{ and } X_0= \begin{pmatrix} x_0 \\ v_0 \end{pmatrix}, \end{aligned}$$

the system can be written as

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{X}(t) = F (t,X(t))\\ X(t_0)= X_0 . \end{array}\right. \end{aligned}$$
(86)

Using the \(\frac{1}{\lambda }\)-Lipschitz continuity of \(A_{\lambda }\), together with the bound \(\lambda (t)\ge \underline{\lambda }>0\), one can easily verify that the conditions of the Cauchy–Lipschitz theorem are satisfied: \(F(t,\cdot )\) is Lipschitz continuous with constant at most \(1+\alpha /t_0+1/\underline{\lambda }\), uniformly in t. Precisely, we can apply the non-autonomous version of this theorem given in [21, Proposition 6.2.1]. Thus, we obtain a strong solution, that is, \(t\mapsto \dot{x}(t)\) is locally absolutely continuous. If, moreover, the functions \(\lambda (\cdot )\) and f are continuous, then the solution is a classical solution of class \({\mathcal {C}}^2\). \(\square \)
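The phase-space formulation (86) also lends itself to direct numerical integration. The sketch below is our own illustration, not the paper's algorithm: the operator \(A=\partial |\cdot |\), the constant index \(\lambda (t)\equiv 1\), the values \(\alpha =3\), \(f\equiv 0\), and the semi-implicit Euler scheme are all assumptions. It integrates (85) and checks that the trajectory approaches the zero of A while the energy \(\frac{1}{2}\dot{x}(t)^2 + \varPhi _\lambda (x(t))\) (with \(\varPhi _\lambda \) the Moreau envelope of \(|\cdot |\)) decays:

```python
# Semi-implicit Euler integration of the phase-space system (86) with
# A = subdifferential of |.|, lam(t) = 1, alpha = 3, f = 0.
# All parameter choices here are illustrative assumptions.
alpha, lam = 3.0, 1.0
t, x, v = 1.0, 1.0, 0.0        # t_0 = 1, x(t_0) = 1, x'(t_0) = 0
dt = 1e-3

def yosida(u, lam):
    # A_lam(u) = clip(u / lam, -1, 1) for A = subdifferential of |.|
    return max(-1.0, min(1.0, u / lam))

def energy(x, v, lam):
    # 0.5*v^2 + Moreau envelope of |.| (the Huber function); this quantity
    # is nonincreasing along solutions of (85) with f = 0.
    huber = abs(x) - lam / 2 if abs(x) > lam else x * x / (2 * lam)
    return 0.5 * v * v + huber

w0 = energy(x, v, lam)          # = 0.5 at the initial point
while t < 100.0:
    v += dt * (-alpha / t * v - yosida(x, lam))
    x += dt * v
    t += dt

assert energy(x, v, lam) < 0.1 < w0   # the energy has decayed
assert abs(x) < 0.5                   # x(t) approaches A^{-1}(0) = {0}
```

Printing \(|\dot{x}(t)|\) along the run makes the vanishing of the speed, quantified in the body of the paper, visible.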

1.3 Opial’s Lemma

The following results are often referred to as Opial’s Lemma [28]. To our knowledge, it was first written in this form in Baillon’s thesis. See [30] for a proof.

Lemma A.2

Let S be a nonempty subset of \({\mathcal {H}}\) and let \(x: [0,+\infty [ \rightarrow {\mathcal {H}}\). Assume that

  1. (i)

    for every \(z\in S\), \(\lim _{t\rightarrow \infty }\Vert x(t)-z\Vert \) exists;

  2. (ii)

    every weak sequential limit point of x(t), as \(t\rightarrow \infty \), belongs to S.

Then x(t) converges weakly as \(t\rightarrow \infty \) to a point in S.

Its discrete version is

Lemma A.3

Let S be a nonempty subset of \({\mathcal {H}}\), and let \((x_k)\) be a sequence of elements of \({\mathcal {H}}\). Assume that

  1. (i)

    for every \(z\in S\), \(\lim _{k\rightarrow +\infty }\Vert x_k-z\Vert \) exists;

  2. (ii)

    every weak sequential limit point of \((x_k)\), as \(k\rightarrow \infty \), belongs to S.

Then \(x_k\) converges weakly as \(k\rightarrow \infty \) to a point in S.

1.4 Variation of the function \(\gamma \mapsto \gamma A_{\gamma }x\)

Lemma A.4

Let \(\gamma , \delta >0\), and \(x, y\in {\mathcal {H}}\). Then, for each \(z\in S= A^{-1} (0)\), we have

$$\begin{aligned} \Vert \gamma A_{\gamma }x - \delta A_{\delta }y\Vert \le 2 \Vert x-y \Vert + 2 \Vert x-z \Vert \frac{|\gamma - \delta |}{\gamma } \end{aligned}$$
(87)

Proof

We use successively the definition of the Yosida approximation, the resolvent identity [14, Proposition 23.28 (i)], and the nonexpansive property of the resolvent, to obtain

$$\begin{aligned} \Vert \gamma A_{\gamma }x - \delta A_{\delta }y\Vert&\le \Vert x-y\Vert + \Vert J_{\gamma A}x - J_{\delta A}y \Vert \\&= \Vert x-y\Vert + \Vert J_{\delta A}\left( \frac{\delta }{\gamma }x + \left( 1- \frac{\delta }{\gamma } \right) J_{\gamma A}x\right) - J_{\delta A}y \Vert \\&\le \Vert x-y\Vert + \Vert \frac{\delta }{\gamma }x + \left( 1- \frac{\delta }{\gamma } \right) J_{\gamma A}x -y\Vert \\&\le 2\Vert x-y\Vert + |1- \frac{\delta }{\gamma } | \Vert J_{\gamma A}x -x \Vert . \end{aligned}$$

Since \(J_{\gamma A}z =z\) for \(z\in S\), and using again the nonexpansive property of the resolvent, we deduce that

$$\begin{aligned} \Vert \gamma A_{\gamma }x - \delta A_{\delta }y\Vert&\le 2\Vert x-y\Vert + \left| 1- \frac{\delta }{\gamma } \right| \Vert (J_{\gamma A}x -J_{\gamma A}z) + (z -x) \Vert \\&\le 2\Vert x-y\Vert + 2 \Vert x-z \Vert \frac{|\gamma - \delta |}{\gamma }, \end{aligned}$$

which gives the claim. \(\square \)
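Inequality (87) can be sanity-checked numerically. In the sketch below (our own illustration; the choice \(A=\partial |\cdot |\), whose unique zero is \(z=0\), is an assumption), the bound is verified on a grid of points and parameter pairs:

```python
import itertools
import math

def soft(x, lam):
    # Resolvent of A = subdifferential of |.|: soft-thresholding.
    return math.copysign(max(abs(x) - lam, 0.0), x)

def gAg(x, gam):
    # gamma * A_gamma(x) = x - J_{gamma A} x
    return x - soft(x, gam)

z = 0.0  # the unique zero of A = subdifferential of |.|
grid = (-3.0, -1.0, -0.1, 0.0, 0.4, 2.5)
for x, y in itertools.product(grid, grid):
    for gam, delta in ((0.5, 2.0), (1.0, 1.0), (3.0, 0.2)):
        lhs = abs(gAg(x, gam) - gAg(y, delta))
        rhs = 2 * abs(x - y) + 2 * abs(x - z) * abs(gam - delta) / gam
        assert lhs <= rhs + 1e-12   # inequality (87)
```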

1.5 On integration and decay

Lemma A.5

Let \(w,\eta :[t_0,+\infty [\rightarrow [0,+\infty [\) be absolutely continuous functions such that \(\eta \notin L^1 (t_0, +\infty )\),

$$\begin{aligned} \int _{t_0}^{+ \infty } w(t)\,\eta (t)\,dt < + \infty , \end{aligned}$$

and \(|\dot{w}(t)| \le \eta (t)\) for almost every \(t>t_0\). Then, \(\lim _{t\rightarrow +\infty } w(t) =0\).

Proof

First, for almost every \(t>t_0\), we have

$$\begin{aligned} \left| \frac{d}{dt} w^2(t)\right| = 2\left| \frac{d}{dt} w(t)\right| w(t) \le 2w(t)\,\eta (t). \end{aligned}$$

Therefore, \(|\frac{d}{dt} w^2|\) belongs to \(L^1\). This implies that \(\lim _{t\rightarrow +\infty } w^2(t) \) exists. Since w is nonnegative, it follows that \(\lim _{t\rightarrow +\infty } w(t) \) exists as well. This limit is necessarily zero: if \(w(t)\rightarrow c>0\), then \(w(t)\,\eta (t)\ge \frac{c}{2}\,\eta (t)\) for all large t, and \(\eta \notin L^1\) would contradict the integrability of \(w\,\eta \). \(\square \)
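The hypothesis \(\eta \notin L^1\) cannot be dropped: with \(w\equiv 1\) and \(\eta (t)=1/t^2\), the remaining assumptions hold but \(w\not\rightarrow 0\). The following sketch (our own illustration; the choices \(w(t)=(1+\sin t)/t\) and \(\eta (t)=2/t\) on \([2,+\infty [\) are assumptions) checks the hypotheses and the conclusion pointwise for an oscillating example:

```python
import math

def w(t):
    # w(t) = (1 + sin t)/t: nonnegative, oscillating, tending to 0.
    return (1 + math.sin(t)) / t

def dw(t):
    # w'(t) = cos(t)/t - (1 + sin t)/t^2
    return math.cos(t) / t - (1 + math.sin(t)) / t**2

def eta(t):
    # eta(t) = 2/t is NOT integrable on [2, +inf), as the lemma requires.
    return 2.0 / t

# |w'(t)| <= 1/t + 2/t^2 <= 2/t = eta(t) for t >= 2:
for k in range(2, 2000):
    t = 2.0 + k / 2.0
    assert abs(dw(t)) <= eta(t) + 1e-12

# w * eta = 2(1 + sin t)/t^2 is dominated by 4/t^2, hence integrable,
# and w(t) -> 0, as the lemma guarantees:
assert w(1e6) < 1e-5
assert w(1e9) < 1e-8
```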

1.6 On boundedness and anchoring

Lemma A.6

Let \(t_0>0\), and let \(w: [t_0, +\infty [ \rightarrow \mathbb {R}\) be a continuously differentiable function which is bounded from below. Given a nonnegative function \(\theta \), let us assume that

$$\begin{aligned} t\ddot{w}(t) + \alpha \dot{w}(t) + \theta (t)\le k(t), \end{aligned}$$
(88)

for some \(\alpha > 1\), almost every \(t>t_0\), and some nonnegative function \(k\in L^1 (t_0, +\infty )\). Then, the positive part \([\dot{w}]_+\) of \(\dot{w}\) belongs to \(L^1(t_0,+\infty )\), and \(\lim _{t\rightarrow +\infty }w(t)\) exists. Moreover, we have \(\int _{t_0}^{+\infty } \theta (t) dt < + \infty \).

Proof

Multiply (88) by \(t^{\alpha -1}\) to obtain

$$\begin{aligned} \frac{d}{dt} \big (t^{\alpha } \dot{w}(t)\big )+ t^{\alpha -1} \theta (t)\le t^{\alpha -1} k (t). \end{aligned}$$

Integrating from \(t_0\) to t, dividing by \(t^{\alpha }\), and bounding \(\dot{w}(t_0)\) by \(|\dot{w}(t_0)|\), we obtain

$$\begin{aligned} \dot{w}(t) + \frac{1}{t^{\alpha } } \int _{t_0}^t s^{\alpha -1}\theta (s)ds \le \frac{{t_0}^{\alpha }|\dot{w}(t_0)|}{t^{\alpha } } + \frac{1}{t^{\alpha } } \int _{t_0}^t s^{\alpha -1}k (s)ds. \end{aligned}$$
(89)

Hence,

$$\begin{aligned}{}[\dot{w}]_{+}(t) \le \frac{{t_0}^{\alpha }|\dot{w}(t_0)|}{t^{\alpha } } + \frac{1}{t^{\alpha } } \int _{t_0}^t s^{\alpha -1} k(s)ds, \end{aligned}$$

and so,

$$\begin{aligned} \int _{t_0}^{\infty } [\dot{w}]_{+}(t) dt \le \frac{{t_0}^{\alpha }|\dot{w}(t_0)|}{(\alpha -1) t_0^{\alpha -1}} + \int _{t_0}^{\infty }\frac{1}{t^{\alpha }} \left( \int _{t_0}^t s^{\alpha -1} k(s) ds\right) dt. \end{aligned}$$

Applying Fubini’s Theorem, we deduce that

$$\begin{aligned} \int _{t_0}^{\infty }\frac{1}{t^{\alpha }} \left( \int _{t_0}^t s^{\alpha -1} k(s) ds\right) dt = \int _{t_0}^{\infty } \left( \int _{s}^{\infty } \frac{1}{t^{\alpha }} dt\right) s^{\alpha -1} k(s) ds = \frac{1}{\alpha -1} \int _{t_0}^{\infty }k(s) ds. \end{aligned}$$

As a consequence,

$$\begin{aligned} \int _{t_0}^{\infty } [\dot{w}]_{+}(t) dt \le \frac{{t_0}^{\alpha }|\dot{w}(t_0) |}{(\alpha -1) t_0^{\alpha -1}} + \frac{1}{\alpha -1} \int _{t_0}^{\infty }k(s) ds < + \infty . \end{aligned}$$

Since \([\dot{w}]_{+}\in L^1(t_0,+\infty )\) and w is bounded from below, the function \(t\mapsto w(t)-\int _{t_0}^t [\dot{w}]_{+}(s)\,ds\) is nonincreasing and bounded from below, so \(\lim _{t\rightarrow +\infty }w(t)\) exists. Going back to (89), integrating from \(t_0\) to t, using Fubini’s Theorem again, and then letting t tend to \(+\infty \), we obtain

$$\begin{aligned} \lim _{t\rightarrow +\infty } w(t) - w(t_0) + \frac{1}{\alpha -1} \int _{t_0}^{\infty }\theta (s) ds \le \frac{{t_0}^{\alpha }|\dot{w}(t_0) |}{(\alpha -1) t_0^{\alpha -1}} + \frac{1}{\alpha -1} \int _{t_0}^{\infty }k(s) ds < + \infty . \end{aligned}$$

Hence \(\int _{t_0}^{\infty }\theta (s) ds < + \infty \). \(\square \)

1.7 A summability result for real sequences

Lemma A.7

Let \(\alpha >1\), and let \((h_k)\) be a sequence of real numbers which is bounded from below, and such that

$$\begin{aligned} (h_{k+1} - h_{k}) - \left( 1- \frac{\alpha }{k}\right) (h_{k} - h_{k-1}) + \omega _k \le \theta _k \end{aligned}$$
(90)

for all \(k\ge 1\). Suppose that \((\omega _k)\) and \((\theta _k)\) are two sequences of nonnegative numbers such that \(\sum _k k\theta _{k} <+\infty \). Then

$$\begin{aligned} \sum _{k \in \mathbb {N}} [h_k - h_{k-1} ]_{+}< +\infty \quad \text{ and } \quad \sum _{k \in \mathbb {N}} k\omega _k < +\infty . \end{aligned}$$

Proof

Since \((\omega _k)\) is nonnegative, we have

$$\begin{aligned} (h_{k+1} - h_{k}) - \left( 1- \frac{\alpha }{k}\right) (h_{k} - h_{k-1}) \le \theta _k. \end{aligned}$$

Writing \(b_k := [h_k - h_{k-1} ]_{+}\) for the positive part of \(h_k - h_{k-1}\), we immediately infer that

$$\begin{aligned} b_{k+1} \le \left( 1- \frac{\alpha }{k}\right) b_k + \theta _k \end{aligned}$$

for all \(k\ge 1\). Multiplying by k and rearranging the terms, we obtain

$$\begin{aligned} (\alpha -1)b_k\le (k-1)b_k-kb_{k+1}+k\theta _k. \end{aligned}$$

Summing for \(k=1,\dots , K\), and using the telescopic property, along with the fact that \(Kb_{K+1}\ge 0\), we deduce that

$$\begin{aligned} (\alpha -1)\sum _{k=1}^Kb_k\le \sum _{k=1}^Kk\theta _k, \end{aligned}$$

which gives

$$\begin{aligned} \sum _{k \in \mathbb {N}} [h_k - h_{k-1} ]_{+} < +\infty . \end{aligned}$$

Let us now prove that \(\sum _{k \in \mathbb {N}} k\omega _k < +\infty \), which is the most delicate part of the proof. To this end, write \(\delta _k= h_{k} - h_{k-1}\), and \(\alpha _k =\left( 1- \frac{\alpha }{k}\right) \), so that (90) becomes

$$\begin{aligned} \delta _{k+1} + \omega _k \le \alpha _k\delta _k+\theta _k. \end{aligned}$$

A straightforward induction shows that

$$\begin{aligned} \delta _{k+1} +\sum _{i=1}^k\left[ \left( \prod _{j=i+1}^k \alpha _j\right) \omega _i\right] \le \left( \prod _{j=1}^k \alpha _j\right) \delta _1+\sum _{i=1}^k\left[ \left( \prod _{j=i+1}^k \alpha _j\right) \theta _i\right] , \end{aligned}$$

with the convention \(\prod _{j=k+1}^k \alpha _j=1\). To simplify the notation, write \(A_{i}^k=\prod _{j=i}^k \alpha _j\). Sum the above inequality for \(k=1,\dots , K\) to deduce that

$$\begin{aligned} h_{K+1}-h_1 +\sum _{k=1}^{K}\sum _{i=1}^kA_{i+1}^k\omega _i \le \delta _1\sum _{k=1}^{K}A_1^k +\sum _{k=1}^{K}\sum _{i=1}^kA_{i+1}^k\theta _i. \end{aligned}$$
(91)

Now, using Fubini’s Theorem, we obtain

$$\begin{aligned} h_{K+1}-h_1 +\sum _{i=1}^{K}\left[ \omega _i\sum _{k=i}^KA_{i+1}^k\right] \le \delta _1\sum _{k=1}^{K}A_{1}^k +\sum _{i=1}^{K}\left[ \theta _i\sum _{k=i}^KA_{i+1}^k\right] . \end{aligned}$$
(92)

Simple computations (comparing sums with integrals) show that

$$\begin{aligned} \left( \frac{i}{k}\right) ^{\alpha }\le A_{i+1}^k\le \left( \frac{i+1}{k+1}\right) ^{\alpha }, \end{aligned}$$

and

$$\begin{aligned} \frac{i}{\alpha -1}\le \sum _{k=i}^\infty A_{i+1}^k\le \frac{i}{\alpha -1}\left( \frac{i+1}{i}\right) ^{\alpha } \end{aligned}$$

(see also [4] for further details). Letting \(K\rightarrow +\infty \) in (92), we deduce that

$$\begin{aligned} \sum _{i=1}^{\infty }i\omega _i\le C+D\sum _{i=1}^\infty i\theta _i<+\infty \end{aligned}$$

for appropriate constants C and D. \(\square \)

1.8 A discrete Gronwall lemma

Lemma A.8

Let \(c\ge 0\) and let \((a_k)\) and \((\beta _j )\) be nonnegative sequences such that \((\beta _j )\) is summable and

$$\begin{aligned} a_k^2 \le c^2 + \sum _{j=1}^k \beta _j a_j \end{aligned}$$

for all \(k\in \mathbb {N}\). Then, \(\displaystyle a_k \le c + \sum _{j=1}^{\infty } \beta _j\) for all \(k\in \mathbb {N}\).

Proof

For \(k\in \mathbb {N}\), set \(A_k := \max _{1\le m \le k} a_m \). Then, for \(1\le m\le k\), we have

$$\begin{aligned} a_m^2 \le c^2 + \sum _{j=1}^m \beta _j a_j \le c^2 + A_k \sum _{j=1}^{\infty } \beta _j. \end{aligned}$$

Taking the maximum over \(1\le m\le k\), we obtain

$$\begin{aligned} A_k^2 \le c^2 + A_k \sum _{j=1}^{\infty } \beta _j. \end{aligned}$$

Since \(A_k^2 - A_k \sum _{j=1}^{\infty } \beta _j - c^2 \le 0\), \(A_k\) is bounded by the larger root of the corresponding quadratic equation:
$$\begin{aligned} A_k \le \frac{1}{2}\left( \sum _{j=1}^{\infty } \beta _j + \sqrt{\Big (\sum _{j=1}^{\infty } \beta _j\Big )^2 + 4c^2}\,\right) \le c + \sum _{j=1}^{\infty } \beta _j. \end{aligned}$$
Since \(a_k \le A_k\), the result follows. \(\square \)
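A quick numerical check of Lemma A.8 (our own illustration; the geometric choice \(\beta _j = 2^{-j}\) and the value \(c=2\) are assumptions) builds a sequence that saturates the hypothesis and verifies the conclusion:

```python
import math

c = 2.0
beta = [1.0 / 2**j for j in range(1, 40)]   # summable: sum(beta) < 1
B = sum(beta)

# Build a sequence with a_k^2 = c^2 + sum_{j<k} beta_j * a_j, which in
# particular satisfies the hypothesis a_k^2 <= c^2 + sum_{j<=k} beta_j * a_j.
a, S = [], 0.0
for b in beta:
    ak = math.sqrt(c * c + S)
    a.append(ak)
    S += b * ak

# Conclusion of Lemma A.8: every a_k is bounded by c + sum_j beta_j.
assert all(ak <= c + B + 1e-12 for ak in a)
```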


About this article


Cite this article

Attouch, H., Peypouquet, J. Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators. Math. Program. 174, 391–432 (2019). https://doi.org/10.1007/s10107-018-1252-x
