Abstract
We analyze the worst-case complexity of a proximal augmented Lagrangian (Proximal AL) framework for nonconvex optimization with nonlinear equality constraints. When an approximate first-order (second-order) optimal point is obtained in each subproblem, an \(\epsilon \)-first-order (\(\epsilon \)-second-order) optimal point for the original problem can be guaranteed within \({\mathcal {O}}(1/ \epsilon ^{2 - \eta })\) outer iterations (where \(\eta \) is a user-defined parameter with \(\eta \in [0,2]\) for the first-order result and \(\eta \in [1,2]\) for the second-order result), provided that the proximal term coefficient \(\beta \) and the penalty parameter \(\rho \) satisfy \(\beta = {\mathcal {O}}(\epsilon ^\eta )\) and \(\rho = \varOmega (1/\epsilon ^\eta )\), respectively. We also investigate the total iteration complexity and operation complexity when a Newton-conjugate-gradient algorithm is used to solve the subproblems. Finally, we discuss an adaptive scheme for determining a value of the parameter \(\rho \) that satisfies the requirements of the analysis.
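As an illustration of the tradeoff governed by \(\eta \) in the first-order result, substituting the two endpoint values of \(\eta \) into the bounds above gives
\[
\eta = 0: \;\; \beta = {\mathcal {O}}(1), \;\; \rho = \varOmega (1), \;\; {\mathcal {O}}(1/\epsilon ^{2}) \text{ outer iterations}; \qquad \eta = 2: \;\; \beta = {\mathcal {O}}(\epsilon ^{2}), \;\; \rho = \varOmega (1/\epsilon ^{2}), \;\; {\mathcal {O}}(1) \text{ outer iterations}.
\]
Larger values of \(\eta \) thus reduce the number of outer iterations at the price of a larger penalty parameter and a smaller proximal term coefficient.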
Notes
1. Circumstances under which the penalty parameter sequence of ALGENCAN is bounded are discussed in [1, Section 5].
References
1. Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: On augmented Lagrangian methods with general lower-level constraints. SIAM J. Optim. 18(4), 1286–1309 (2008). https://doi.org/10.1137/060654797
2. Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: Second-order negative-curvature methods for box-constrained and general constrained optimization. Comput. Optim. Appl. 45(2), 209–236 (2010). https://doi.org/10.1007/s10589-009-9240-y
3. Andreani, R., Fazzio, N., Schuverdt, M., Secchin, L.: A sequential optimality condition related to the quasi-normality constraint qualification and its algorithmic consequences. SIAM J. Optim. 29(1), 743–766 (2019). https://doi.org/10.1137/17M1147330
4. Andreani, R., Haeser, G., Ramos, A., Silva, P.J.S.: A second-order sequential optimality condition associated to the convergence of optimization algorithms. IMA J. Numer. Anal. 37(4), 1902–1929 (2017)
5. Andreani, R., Martínez, J.M., Ramos, A., Silva, P.J.S.: A cone-continuity constraint qualification and algorithmic consequences. SIAM J. Optim. 26(1), 96–110 (2016). https://doi.org/10.1137/15M1008488
6. Andreani, R., Secchin, L., Silva, P.: Convergence properties of a second order augmented Lagrangian method for mathematical programs with complementarity constraints. SIAM J. Optim. 28(3), 2574–2600 (2018). https://doi.org/10.1137/17M1125698
7. Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, Cambridge (2014)
8. Bian, W., Chen, X., Ye, Y.: Complexity analysis of interior point algorithms for non-Lipschitz and nonconvex minimization. Math. Program. 149(1), 301–327 (2015). https://doi.org/10.1007/s10107-014-0753-5
9. Birgin, E.G., Floudas, C.A., Martínez, J.M.: Global minimization using an augmented Lagrangian method with variable lower-level constraints. Math. Program. 125(1), 139–162 (2010). https://doi.org/10.1007/s10107-009-0264-y
10. Birgin, E.G., Gardenghi, J., Martínez, J.M., Santos, S.A., Toint, P.L.: Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models. Math. Program. 163(1–2), 359–368 (2017)
11. Birgin, E.G., Haeser, G., Ramos, A.: Augmented Lagrangians with constrained subproblems and convergence to second-order stationary points. Comput. Optim. Appl. 69(1), 51–75 (2018). https://doi.org/10.1007/s10589-017-9937-2
12. Birgin, E.G., Martínez, J.M.: Complexity and performance of an augmented Lagrangian algorithm. Optim. Methods Softw. (2020). https://doi.org/10.1080/10556788.2020.1746962
13. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). https://doi.org/10.1561/2200000016
14. Cartis, C., Gould, N., Toint, P.: On the evaluation complexity of composite function minimization with applications to nonconvex nonlinear programming. SIAM J. Optim. 21(4), 1721–1739 (2011)
15. Cartis, C., Gould, N., Toint, P.: Complexity bounds for second-order optimality in unconstrained optimization. J. Complex. 28(1), 93–108 (2012)
16. Cartis, C., Gould, N.I.M., Toint, P.L.: On the evaluation complexity of cubic regularization methods for potentially rank-deficient nonlinear least-squares problems and its relevance to constrained nonlinear optimization. SIAM J. Optim. 23(3), 1553–1574 (2013). https://doi.org/10.1137/120869687
17. Cartis, C., Gould, N.I.M., Toint, P.L.: On the complexity of finding first-order critical points in constrained nonlinear optimization. Math. Program. Ser. A 144, 93–106 (2014)
18. Cartis, C., Gould, N.I.M., Toint, P.L.: Optimization of orders one to three and beyond: characterization and evaluation complexity in constrained nonconvex optimization. J. Complex. 53, 68–94 (2019)
19. Curtis, F.E., Jiang, H., Robinson, D.P.: An adaptive augmented Lagrangian method for large-scale constrained optimization. Math. Program. 152(1), 201–245 (2015). https://doi.org/10.1007/s10107-014-0784-y
20. Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156(1), 59–99 (2016). https://doi.org/10.1007/s10107-015-0871-8
21. Grapiglia, G.N., Nesterov, Y.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. 27(1), 478–506 (2017). https://doi.org/10.1137/16M1087801
22. Grapiglia, G.N., Yuan, Y.X.: On the complexity of an augmented Lagrangian method for nonconvex optimization. arXiv e-prints arXiv:1906.05622 (2019)
23. Haeser, G., Liu, H., Ye, Y.: Optimality condition and complexity analysis for linearly-constrained optimization without differentiability on the boundary. Math. Program. (2018). https://doi.org/10.1007/s10107-018-1290-4
24. Hajinezhad, D., Hong, M.: Perturbed proximal primal-dual algorithm for nonconvex nonsmooth optimization. Math. Program. (2019). https://doi.org/10.1007/s10107-019-01365-4
25. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303–320 (1969). https://doi.org/10.1007/BF00927673
26. Hong, M., Hajinezhad, D., Zhao, M.M.: Prox-PDA: The proximal primal-dual algorithm for fast distributed nonconvex optimization and learning over networks. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 1529–1538. PMLR (2017). http://proceedings.mlr.press/v70/hong17a.html
27. Jiang, B., Lin, T., Ma, S., Zhang, S.: Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis. Comput. Optim. Appl. 72(1), 115–157 (2019). https://doi.org/10.1007/s10589-018-0034-y
28. Liu, K., Li, Q., Wang, H., Tang, G.: Spherical principal component analysis. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 387–395 (2019). https://doi.org/10.1137/1.9781611975673.44
29. Nouiehed, M., Lee, J.D., Razaviyayn, M.: Convergence to second-order stationarity for constrained non-convex optimization. arXiv e-prints arXiv:1810.02024 (2018)
30. O’Neill, M., Wright, S.J.: A log-barrier Newton-CG method for bound constrained optimization with complexity guarantees. IMA J. Numer. Anal. (2020). https://doi.org/10.1093/imanum/drz074
31. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Optimization (Sympos., Univ. Keele, Keele, 1968), pp. 283–298. Academic Press, London (1969)
32. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976). https://doi.org/10.1287/moor.1.2.97
33. Royer, C.W., O’Neill, M., Wright, S.J.: A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization. Math. Program. (2019)
34. Sun, J., Qu, Q., Wright, J.: Complete dictionary recovery over the sphere. In: 2015 International Conference on Sampling Theory and Applications (SampTA), pp. 407–410 (2015)
35. Zhang, J., Luo, Z.Q.: A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM J. Optim. 30(3), 2272–2302 (2020). https://doi.org/10.1137/19M1242276
Acknowledgements
Research supported by Award N660011824020 from the DARPA Lagrange Program; NSF Awards 1628384, 1634597, and 1740707; and Subcontract 8F-30039 from Argonne National Laboratory.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Proofs of Elementary Results
Proof of Theorem 1
Since \(x^*\) is a local minimizer of (1), it is the unique global solution of
\[
\min _x \; f(x) + \frac{1}{4} \Vert x - x^* \Vert ^4 \quad \text{subject to} \quad c(x) = 0, \;\; \Vert x - x^* \Vert \le \delta , \qquad (68)
\]
for \(\delta > 0\) sufficiently small. For the same \(\delta \), we define \(x_k\) to be the global solution of
\[
\min _x \; f(x) + \frac{\rho _k}{2} \Vert c(x) \Vert ^2 + \frac{1}{4} \Vert x - x^* \Vert ^4 \quad \text{subject to} \quad \Vert x - x^* \Vert \le \delta , \qquad (69)
\]
for a given \(\rho _k\), where \(\{ \rho _k \}_{k \ge 1}\) is a positive sequence such that \(\rho _k \rightarrow +\infty \). Note that \(x_k\) is well defined because the feasible region is compact and the objective is continuous. Suppose that z is any accumulation point of \(\{ x_k \}_{k \ge 1}\), that is, \(x_k \rightarrow z\) for \(k \in {\mathcal {K}}\), for some subsequence \({\mathcal {K}}\). Such a z exists because \(\{ x_k \}_{k \ge 1}\) lies in a compact set, and moreover, \(\Vert z - x^* \Vert \le \delta \). We want to show that \(z = x^*\). By the definition of \(x_k\) and the feasibility of \(x^*\) for (69), we have for any \(k \ge 1\) that
\[
f(x_k) + \frac{\rho _k}{2} \Vert c(x_k) \Vert ^2 + \frac{1}{4} \Vert x_k - x^* \Vert ^4 \le f(x^*) + \frac{\rho _k}{2} \Vert c(x^*) \Vert ^2 = f(x^*). \qquad (70)
\]
By taking the limit over \({\mathcal {K}}\) in (70), and dropping the nonnegative penalty term, we have \(f(x^*) \ge f(z) + \frac{1}{4} \Vert z - x^* \Vert ^4\). Rearranging (70), we also have
\[
\Vert c(x_k) \Vert ^2 \le \frac{2}{\rho _k} \left( f(x^*) - f(x_k) \right) \le \frac{2}{\rho _k} \left( f(x^*) - \inf _{j \ge 1} f(x_j) \right) . \qquad (71)
\]
Since \(\rho _k \rightarrow +\infty \) and \(\inf _{j \ge 1} f(x_j)\) is finite (by continuity of f on a compact set), taking limits over \({\mathcal {K}}\) in (71) gives \(c(z) = 0\). Thus z is feasible for (68) and satisfies \(f(z) + \frac{1}{4} \Vert z - x^* \Vert ^4 \le f(x^*)\), so z is a global solution of (68), and by uniqueness \(z = x^*\).
Since every accumulation point of \(\{ x_k \}_{k \ge 1}\) equals \(x^*\), we have \(x_k \rightarrow x^*\), and by discarding finitely many initial terms we can assume that \(\Vert x_k - x^* \Vert < \delta \) for all k. The constraint in (69) is then inactive at \(x_k\), so the first- and second-order optimality conditions for (69) yield
\[
\nabla f(x_k) + \rho _k \sum _{i=1}^m c_i(x_k) \nabla c_i(x_k) + \Vert x_k - x^* \Vert ^2 (x_k - x^*) = 0 \qquad (72)
\]
and
\[
\nabla ^2 f(x_k) + \rho _k \sum _{i=1}^m \left( c_i(x_k) \nabla ^2 c_i(x_k) + \nabla c_i(x_k) \nabla c_i(x_k)^T \right) + \Vert x_k - x^* \Vert ^2 I + 2 (x_k - x^*)(x_k - x^*)^T \succeq 0. \qquad (73)
\]
Define \(\lambda _k \triangleq \rho _k c(x_k)\) and \(\epsilon _k \triangleq \max \{ \Vert x_k - x^* \Vert ^3, 3 \Vert x_k - x^* \Vert ^2, \sqrt{ 2( f(x^*) - \inf _{k \ge 1} f(x_k) )/\rho _k} \}\). Then by (71), (72), (73) and Definition 2, \(x_k\) is \(\epsilon _k\)-2o. Note that \(x_k \rightarrow x^*\) and \(\rho _k \rightarrow +\infty \), so \(\epsilon _k \rightarrow 0^+\). \(\square \)
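To make the last step concrete, the three components of \(\epsilon _k\) correspond to the three requirements on an \(\epsilon \)-2o point. The paraphrase of Definition 2 used below (approximate feasibility, approximate stationarity, and an approximate curvature bound along directions orthogonal to the constraint gradients) is an assumption made here for illustration; the precise form appears in the main text. Under that reading, (71), (72), and (73) give
\[
\Vert c(x_k) \Vert \le \sqrt{\tfrac{2}{\rho _k} \big( f(x^*) - \inf _{j \ge 1} f(x_j) \big)} \le \epsilon _k ,
\qquad
\Big \Vert \nabla f(x_k) + \sum _{i=1}^m (\lambda _k)_i \nabla c_i(x_k) \Big \Vert = \Vert x_k - x^* \Vert ^3 \le \epsilon _k ,
\]
\[
d^T \Big ( \nabla ^2 f(x_k) + \sum _{i=1}^m (\lambda _k)_i \nabla ^2 c_i(x_k) \Big ) d \ge - 3 \Vert x_k - x^* \Vert ^2 \Vert d \Vert ^2 \ge - \epsilon _k \Vert d \Vert ^2 \quad \text{whenever } \nabla c_i(x_k)^T d = 0 \text{ for all } i,
\]
where the second relation follows directly from (72), and the third uses \(\Vert \, \Vert x_k - x^* \Vert ^2 I + 2 (x_k - x^*)(x_k - x^*)^T \, \Vert = 3 \Vert x_k - x^* \Vert ^2\) together with the fact that \(d^T \nabla c_i(x_k) \nabla c_i(x_k)^T d = 0\) for such directions d.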
Proof of Lemma 1
We prove the result by contradiction. If it did not hold, we could select a sequence \(\{ x_k \}_{k \ge 1} \subseteq S_{\alpha }^0\) such that \(f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 < - k\) for every k. Let \(x^*\) be an accumulation point of \(\{ x_k \}_{k \ge 1}\) (which exists by compactness of \(S_\alpha ^0\)), and choose an index K large enough that \(f(x^*) + \frac{\rho _0}{2} \Vert c(x^*) \Vert ^2 \ge -K + 1 > f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 + 1\) for all \(k \ge K\) (the second inequality holds because \(f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 < -k \le -K\)). Since a subsequence of \(\{ x_k \}_{k \ge 1}\) converges to \(x^*\), this contradicts the continuity of \(f(x) + \frac{\rho _0}{2} \Vert c(x) \Vert ^2\). \(\square \)
About this article
Cite this article
Xie, Y., Wright, S.J. Complexity of Proximal Augmented Lagrangian for Nonconvex Optimization with Nonlinear Equality Constraints. J Sci Comput 86, 38 (2021). https://doi.org/10.1007/s10915-021-01409-y
Keywords
- Optimization with nonlinear equality constraints
- Nonconvex optimization
- Proximal augmented Lagrangian
- Complexity analysis
- Newton-conjugate-gradient
Mathematics Subject Classification
- 68Q25
- 90C06
- 90C26
- 90C30
- 90C60