Abstract
The performance of branch-and-bound algorithms for deterministic global optimization is strongly dependent on the ability to construct tight and rapidly convergent schemes of lower bounds. One metric of the efficiency of a branch-and-bound algorithm is the convergence order of its bounding scheme. This article develops a notion of convergence order for lower bounding schemes for constrained problems, and defines the convergence order of convex relaxation-based and Lagrangian dual-based lower bounding schemes. It is shown that full-space convex relaxation-based lower bounding schemes can achieve first-order convergence under mild assumptions. Furthermore, such schemes can achieve second-order convergence at KKT points, at Slater points, and at infeasible points when second-order pointwise convergent schemes of relaxations are used. Lagrangian dual-based full-space lower bounding schemes are shown to have at least as high a convergence order as convex relaxation-based full-space lower bounding schemes. Additionally, it is shown that Lagrangian dual-based full-space lower bounding schemes achieve first-order convergence even when the dual problem is not solved to optimality. The convergence order of some widely applicable reduced-space lower bounding schemes is also analyzed, and it is shown that such schemes can achieve first-order convergence under suitable assumptions. Furthermore, such schemes can achieve second-order convergence at KKT points, at unconstrained points in the reduced-space, and at infeasible points under suitable assumptions when the problem exhibits a specific separable structure. The importance of constraint propagation techniques in boosting the convergence order of reduced-space lower bounding schemes (and helping mitigate clustering in the process) for problems which do not possess such a structure is demonstrated.
Acknowledgements
The authors would like to thank Garrett Dowdy and Peter Stechlinski for helpful discussions.
Additional information
The authors gratefully acknowledge financial support from BP. This work was conducted as a part of the BP-MIT conversion research program.
Proofs
1.1 Proof of Proposition 1
Proposition 1
Consider Problem (P) with \(m_E = 0\). Suppose f and \(g_j, \forall j \in \{1,\ldots ,m_I\}\), are Lipschitz continuous on \(X \times Y\). Furthermore, suppose \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}) \in X \times Y\) is such that \({\mathbf {g}}({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}) < {\mathbf {0}}\) (i.e. \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }})\) is a Slater point). Then the dual lower bounding scheme has arbitrarily high convergence order at \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }})\).
Proof
The arguments below are closely related to the proof of Corollary 2.
Since we wish to prove that the dual lower bounding scheme has arbitrarily high convergence order at the feasible point \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }})\), it suffices to show that for each \(\beta > 0\) there exist \(\tau \ge 0\) and \(\delta > 0\) such that, for every \(Z \in {\mathbb {I}}(X \times Y)\) with \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}) \in Z\) and \(w(Z) \le \delta \), the gap between the minimum of f on the feasible portion of Z and the dual lower bound on Z is at most \(\tau w(Z)^{\beta }\); the desired result then follows by analogy to Lemma 5 by observing that the dual lower bounding scheme is at least first-order convergent at \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }})\).
Let \(g_j({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}) = -\varepsilon _j < 0, \forall j \in \{1,\ldots ,m_I\}\). Since \(g_j\) is continuous for each \(j \in \{1,\ldots ,m_I\}\), there exists \(\delta _j > 0, \forall j \in \{1,\ldots ,m_I\}\), such that \({||}({\mathbf {x}},{\mathbf {y}}) - ({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}){||}_{\infty } < \delta _j\) implies \({|}g_j({\mathbf {x}},{\mathbf {y}}) - g_j({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}){|} < \frac{\varepsilon _j}{2}\) (see Lemma 2).
Define \(\delta := {\mathop {\min }\limits _{j \in \{1,\ldots ,m_I\}}}{\delta _j}\), and note that \(\delta > 0\). Consider \(Z \in {\mathbb {I}}(X \times Y)\) such that \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}) \in Z\) and \(w(Z) \le \delta \). For each \(({\mathbf {x}},{\mathbf {y}}) \in Z\) and \(j \in \{1,\ldots ,m_I\}\), we have \({|}g_j({\mathbf {x}},{\mathbf {y}}) - g_j({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }}){|} < \frac{\varepsilon _j}{2}\). Therefore, for each \(j \in \{1,\ldots ,m_I\}\), \(g_j({\mathbf {x}},{\mathbf {y}})< -\frac{\varepsilon _j}{2} < 0, \,\forall ({\mathbf {x}},{\mathbf {y}}) \in Z\). Consequently, the dual lower bound on Z coincides with the minimum of f on Z, since Problem (P) is effectively unconstrained over such small intervals Z around \(({\mathbf {x}}^{\text{ S }},{\mathbf {y}}^{\text{ S }})\), which implies that \(\tau = 0\) and \(\delta = \underset{j \in \{1,\ldots ,m_I\}}{\min }{\delta _j}\) satisfy the requirements. \(\square \)
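The mechanism behind this proof can be illustrated numerically. The following sketch uses a hypothetical one-dimensional toy problem (not from the article): at a Slater point, every sufficiently small box contains no active constraint, so the optimal dual multiplier is zero and the Lagrangian dual lower bound matches the feasible minimum exactly, i.e. the bounding gap is zero on every small box.

```python
# Hypothetical toy problem (not from the article):
#   minimize f(x) = x^2  subject to  g(x) = x - 2 <= 0.
# x_S = 0 is a Slater point since g(0) = -2 < 0. On a small box
# Z = [-d, d] around x_S the constraint is strictly inactive, so the
# optimal dual multiplier is 0 and the dual bound equals the feasible
# minimum: the gap is zero, consistent with arbitrarily high order.

def f(x):
    return x * x

def g(x):
    return x - 2.0

def dual_lower_bound(d, n=2001):
    """Crude grid-search Lagrangian dual bound on Z = [-d, d]."""
    xs = [-d + 2.0 * d * i / (n - 1) for i in range(n)]
    best = float("-inf")
    mu = 0.0
    while mu <= 5.0:  # search dual multipliers mu >= 0
        inner = min(f(x) + mu * g(x) for x in xs)
        best = max(best, inner)
        mu += 0.05
    return best

def feasible_min(d, n=2001):
    xs = [-d + 2.0 * d * i / (n - 1) for i in range(n)]
    return min(f(x) for x in xs if g(x) <= 0.0)

for d in (0.5, 0.1, 0.01):
    gap = feasible_min(d) - dual_lower_bound(d)
    assert abs(gap) < 1e-12  # zero gap on every small box around x_S
```

The grid search is deliberately naive; any dual bound would do, since the inner minimum is maximized at \(\mu = 0\) whenever the constraint is strictly negative on the box.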
1.2 Proof of Proposition 2
Proposition 2
Consider Problem (P) with \(m_E = 0\). Suppose the functions f and \(g_j, j = 1,\ldots ,m_I\), are each of the form (W). Let \((f^{\text{ cv }}_{X(Z) \times Z})_{Z \in {\mathbb {I}}Y}\) denote a continuous scheme of convex relaxations of f in Y with convergence order \(\beta ^{\text{ cv }}_f > 0\) and corresponding constant \(\tau ^{\text{ cv }}_f\), and let \((g^{\text{ cv }}_{j,X(Z) \times Z})_{Z \in {\mathbb {I}}Y}\), \(j = 1,\ldots , m_I\), denote continuous schemes of convex relaxations of \(g_1,\ldots ,g_{m_I}\), respectively, in Y with pointwise convergence orders \(\gamma ^{\text{ cv }}_{g,1}> 0, \ldots , \gamma ^{\text{ cv }}_{g,m_I} > 0\) and corresponding constants \(\tau ^{\text{ cv }}_{g,1}, \ldots , \tau ^{\text{ cv }}_{g,m_I}\).
Suppose \({\mathbf {y}}^{\text{ S }} \in Y\) is an unconstrained point in the reduced-space, and the scheme of lower bounding problems \(({\mathscr {L}}(Z))_{Z \in {\mathbb {I}}Y}\), obtained by minimizing \(f^{\text{ cv }}_{X(Z) \times Z}\) over \(X(Z) \times Z\) subject to the relaxed constraints \(g^{\text{ cv }}_{j,X(Z) \times Z} \le 0, j = 1,\ldots ,m_I\), has convergence of order \(\beta \in (0,\beta ^{\text{ cv }}_f]\) at \({\mathbf {y}}^{\text{ S }}\). Then the scheme of lower bounding problems \(({\mathscr {L}}(Z))_{Z \in {\mathbb {I}}Y}\) is at least \(\beta ^{\text{ cv }}_f\)-order convergent at \({\mathbf {y}}^{\text{ S }}\).
Proof
The proof is similar to the proof of Corollary 2.
Since \({\mathbf {y}}^{\text{ S }}\) is an unconstrained point in the reduced-space and \(g_j\) is continuous for each \(j \in \{1,\ldots ,m_I\}\) by virtue of Assumption 1, \(\exists \delta > 0\) such that \(\forall {\mathbf {z}}\in Y\) with \({||}{\mathbf {z}}- {\mathbf {y}}^{\text{ S }}{||}_{\infty } \le \delta \) (see Lemma 2), we have \({\mathbf {g}}({\mathbf {x}},{\mathbf {z}}) < {\mathbf {0}}, \forall {\mathbf {x}}\in X\).
Consider \(Z \in {\mathbb {I}}Y\) with \({\mathbf {y}}^{\text{ S }} \in Z\) and \(w(Z) \le \delta \). We have \(\overline{{\mathbf {g}}}(X(Z) \times Z) \subset {\mathbb {R}}^{m_I}_{-}\) and \(\overline{{\mathbf {g}}}^{\text {cv}}_{X(Z) \times Z}(X(Z) \times Z) \subset {\mathbb {R}}^{m_I}_{-}\). Consequently, the constraints and their relaxations are inactive on \(X(Z) \times Z\), and the lower bounding problem \({\mathscr {L}}(Z)\) reduces to the minimization of \(f^{\text{ cv }}_{X(Z) \times Z}\) over \(X(Z) \times Z\). The desired result follows by analogy to Lemma 5 based on the assumption that \(({\mathscr {L}}(Z))_{Z \in {\mathbb {I}}Y}\) is at least \(\beta \)-order convergent at \({\mathbf {y}}^{\text{ S }}\). \(\square \)
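The role of the relaxation scheme's convergence order can be made concrete with a small experiment (hypothetical example, not from the article): an \(\alpha\)BB-style underestimator of \(f(y) = \sin (y)\) with \(\alpha = 1 \ge \max |f''|\) yields a scheme of convex relaxations whose lower bounding gap shrinks at second order, which can be verified empirically on shrinking intervals around an unconstrained minimizer.

```python
import math

# Hypothetical illustration (not from the article): an alpha-BB style
# convex underestimator of f(y) = sin(y) on Z = [yl, yu],
#   fcv(y) = sin(y) - 0.5 * alpha * (y - yl) * (yu - y),
# is convex for alpha >= max f'' = 1, and the resulting scheme of lower
# bounds min_Z fcv is second-order convergent: the bounding gap
# min_Z f - min_Z fcv shrinks like w(Z)^2.

def f(y):
    return math.sin(y)

def fcv(y, yl, yu, alpha=1.0):
    return math.sin(y) - 0.5 * alpha * (y - yl) * (yu - y)

def bound_gap(yl, yu, n=4001):
    """Gap between min of f and min of its relaxation on [yl, yu] (grid)."""
    ys = [yl + (yu - yl) * i / (n - 1) for i in range(n)]
    return min(f(y) for y in ys) - min(fcv(y, yl, yu) for y in ys)

center = 1.5 * math.pi  # unconstrained minimizer of sin
widths = [0.4, 0.2, 0.1, 0.05]
gaps = [bound_gap(center - w / 2, center + w / 2) for w in widths]
# empirical convergence order from successive interval halvings;
# each entry is approximately 2
orders = [math.log2(gaps[i] / gaps[i + 1]) for i in range(len(gaps) - 1)]
```

Here the gap is essentially \(w(Z)^2/8\), so each halving of the interval width divides the gap by four, matching the second-order pointwise convergence assumed for the constraint relaxations in the proposition.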
1.3 Proof of Proposition 3
Proposition 3
Consider Problem (P) with \(m_E = 0\). Suppose \({\mathbf {y}}^{\text{ S }} \in Y\) is an unconstrained point in the reduced-space. Furthermore, suppose the reduced-space dual lower bounding scheme has convergence of order \(\beta > 0\) at \({\mathbf {y}}^{\text{ S }}\). Then the reduced-space dual lower bounding scheme has arbitrarily high convergence order at \({\mathbf {y}}^{\text{ S }}\).
Proof
The proof is closely related to the proof of Proposition 1.
Since \({\mathbf {y}}^{\text{ S }}\) is an unconstrained point in the reduced-space and \(g_j\) is continuous for each \(j \in \{1,\ldots ,m_I\}\) by virtue of Assumption 1, there exists \(\delta > 0\) such that \(\forall {\mathbf {z}}\in Y\) satisfying \({||}{\mathbf {z}}- {\mathbf {y}}^{\text{ S }}{||}_{\infty } \le \delta \) (see Lemma 2), we have \({\mathbf {g}}({\mathbf {x}},{\mathbf {z}}) < {\mathbf {0}}, \forall {\mathbf {x}}\in X\).
Consider \(Z \in {\mathbb {I}}Y\) with \({\mathbf {y}}^{\text{ S }} \in Z\) and \(w(Z) \le \delta \). Since \(\overline{{\mathbf {g}}}(X(Z) \times Z) \subset {\mathbb {R}}^{m_I}_{-}\), Problem (P) is effectively unconstrained on \(X(Z) \times Z\), and the dual lower bound on Z is therefore bounded from below by the minimum of f over \(X(Z) \times Z\). The desired result follows by analogy to Lemma 5 and the assumption that the dual lower bounding scheme is at least \(\beta \)-order convergent at \({\mathbf {y}}^{\text{ S }}\). \(\square \)
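In outline, the key step can be sketched as follows (the displayed equations are omitted in this version, so the notation here is approximate): because \(\overline{{\mathbf {g}}} < {\mathbf {0}}\) on \(X(Z) \times Z\), the term \(\boldsymbol{\mu }^{\mathrm {T}} {\mathbf {g}}\) is nonpositive for every \(\boldsymbol{\mu } \ge {\mathbf {0}}\) and vanishes at \(\boldsymbol{\mu } = {\mathbf {0}}\), so

```latex
\sup_{\boldsymbol{\mu} \ge \mathbf{0}} \,
\min_{\mathbf{x} \in X(Z),\, \mathbf{y} \in Z}
  \left[ f(\mathbf{x},\mathbf{y})
       + \boldsymbol{\mu}^{\mathrm{T}} \mathbf{g}(\mathbf{x},\mathbf{y}) \right]
= \min_{\mathbf{x} \in X(Z),\, \mathbf{y} \in Z} f(\mathbf{x},\mathbf{y}).
```

That is, on such Z the reduced-space dual lower bound coincides with the (unconstrained) minimum of f over \(X(Z) \times Z\), and the bounding gap vanishes.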
1.4 Proof of arguments in Example 18
Proof
We first show that \(x^{\text{ f }}_Z - x^* \ge 0.2\varepsilon + o\left( \varepsilon \right) \).
Next, we derive an expression for \({\bar{x}}_Z(y) - x^*(y)\).
We next establish the dependence of the different terms in Eq. (11) on \(\varepsilon \). We first derive an expression for \(\exp (3 - y^{\text{ L }}) + \exp (3 - y) + 4x^{\text{ U }}_Z\).
Next, we derive an expression for \(4x^{\text{ U }}_Z \left( \exp (3 - y^{\text{ L }}) - \exp (3 - y) \right) \).
Finally, we consider \(\sqrt{\left( \exp (3 - y^{\text{ L }}) \right) ^2 + 40 + 4x^{\text{ U }}_Z \left( \exp (3 - y^{\text{ L }}) - \exp (3 - y) \right) } + \displaystyle \sqrt{\left( \exp (3-y) \right) ^2 + 40}\).
Substituting the above expressions into Eq. (11), we get
with
for some \({\hat{\tau }} \ge 0\) since \(y \in Z = [y^{\text{ L }}, y^{\text{ U }}]\) with \(w(Z) = O(\varepsilon )\) and each term in the expression for \(\alpha \) is \(O(\varepsilon )\). Note that \(\alpha \ge 0\) (since \(\bar{x}_Z(y) \ge x^*(y), \, \forall y \in Z\)). \(\square \)
Cite this article
Kannan, R., Barton, P.I. Convergence-order analysis of branch-and-bound algorithms for constrained problems. J Glob Optim 71, 753–813 (2018). https://doi.org/10.1007/s10898-017-0532-y
Keywords
- Global optimization
- Constrained optimization
- Convergence order
- Convex relaxation
- Lagrangian dual
- Branch-and-bound
- Lower bounding scheme
- Reduced-space