1 Introduction

During the last few decades, interior-point methods for convex programming have made a significant impact on the modern optimization landscape and have grown into a mature and well-established area of mathematical programming. Beginning in the late 1970s, the development of Khachiyan’s ellipsoid method [38] as the first polynomial-time algorithm for linear programming (LP) and of Karmarkar’s projective-scaling method [37] as the first provably polynomial-time interior-point method (IPM) blurred the long-perceived border between linear and nonlinear optimization and revealed that “the great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity” [15, 62]. Since then, hundreds of researchers have joined this “interior-point revolution” [69] and contributed numerous new results and trends to the still-growing area of conic convex optimization and its applications in a host of disciplines including optimal control, design, statistics, finance, and many other engineering and management sciences. The reader is referred to the various chapters in this handbook, which provide ample evidence for the breadth and depth of research that has evolved in recent years, and more specifically to the review articles [55, 69] for accounts of the historical and technical development of IPMs.

Forming the foundation for the initial development of conic optimization [54] in general and semidefinite programming (SDP) in particular, the study of different classes of IPMs was already highlighted in the “old” handbook of SDP [68], in which three of the four chapters on algorithms [3, 53, 67] give a detailed overview of some of the most important potential-reduction, primal-dual, and path-following IPM variants. The more recent paradigm of self-regularity, developed since then, is described in the new Chap. 15 of this Handbook. In addition, algorithmic implementations of some of the most successful methods are now provided in sophisticated and user-friendly optimization software, as exemplified in several of the software chapters and discussed more comprehensively in the survey by Mittelmann in this new rendition of the Handbook.

While the theory and algorithms of IPMs are by now relatively well understood and widely used in practice, their theoretical efficiency and proven polynomial-time complexity do not (and were not expected to) guarantee ultimate and universal success in solving every corresponding problem instance without remaining challenges or difficulties. Similar to the LP case, in which the simplex method continues to be of major importance in practice and is often superior especially for specially structured problems, alternative algorithms also exist for SDP and some of its related variants. Most notably, bundle methods for nonsmooth optimization can be used to solve certain classes of trace-constrained SDPs as equivalent eigenvalue problems; they are described in more detail in the fourth chapter on algorithms in the earlier handbook [33] and in the slightly newer review article [52]. Unlike IPMs, whose second-order nature limits their applicability especially to large-scale problems, bundle methods are able to exploit sparsity and special structures and thus perform better on problems whose geometry is sufficiently well understood, for example, SDP relaxations of many combinatorial problems.

Nevertheless, in contrast to the size limitations of general second-order methods, the rich theory of IPMs continues to provide a strong tool for the analysis and design of new algorithms that is typically unavailable for the majority of other, lower-order methods. In addition, and more importantly for the motivation of the present chapter, IPMs can also be computationally advantageous due to their overall robustness when solving problems whose underlying structure or geometry is less well studied or completely unknown. This independence from the specific problem instance and application is rapidly gaining significance as SDP approaches and models become more widely used and need to be routinely solved by researchers and practitioners without the time or resources to customize a first-order method. Hence, the main objective of this contribution is to review some of the recent enhancements of IPMs, with a particular emphasis on cutting-plane algorithms and warm-starting strategies for LP and SDP relaxations of combinatorial optimization problems. Some preliminaries for this discussion are briefly provided in the following Sect. 17.2. Other means of improvement, such as the exploitation of sparsity and parallelization, are omitted here but treated as subjects in some of the other handbook chapters.

The subsequent central discussion first focuses on various cutting-plane approaches that replace an original, difficult problem with a sequence of approximate, more easily solvable relaxations; these relaxations are iteratively improved by adding new violated inequalities that remove infeasible iterates until the solution becomes feasible also for the initial problem. This basic scheme depends on the suitable selection or generation of separating inequalities and on possible ways to accelerate the solution of subsequent relaxations. Some new enhancements of methods based on analytic centers and related approaches are described in Sect. 17.3, which also highlights some potential uses of primal-dual interior-point techniques for the early addition and possible removal of cutting planes to expedite convergence and keep each subproblem at a manageable size, thus reducing the overall computation time of the algorithm. The second topic, which has received significant attention in recent years, is interior-point warm starts to overcome the inherent challenge that optimal solutions are necessarily non-interior and thus not suited as initial starting points for IPMs. Although these techniques have predominantly been studied for LP, we present possible generalizations for SDP in Sect. 17.4. A practical implementation of some of these ideas with representative computational results is briefly discussed in Sect. 17.5, and the concluding Sect. 17.6 offers summarizing remarks and describes current directions of further research.

2 Preliminaries and Notation

Let C ∈ ℝ^{n×n} and A_i ∈ ℝ^{n×n}, i = 1, …, m, be real symmetric matrices, and let b ∈ ℝ^m be a real vector. A semidefinite program in primal-dual standard form is given by

$$\begin{array}{rcl} & & \min \ C \bullet X\text{ s.t. }\mathcal{A}(X) = b,\ X \succcurlyeq 0\end{array}$$
(17.1a)
$$\begin{array}{rcl}& & \max \ {b}^{T}y\text{ s.t. }S = C -{\mathcal{A}}^{T}(y) \succcurlyeq 0\end{array}$$
(17.1b)

where X ≽ 0 constrains the matrix X ∈ ℝ^{n×n} to be symmetric and positive semidefinite, y ∈ ℝ^m is the unrestricted dual variable, S ∈ ℝ^{n×n} is the symmetric and positive semidefinite dual slack matrix, and

$$\begin{array}{rcl} & & C \bullet X = Tr (CX) =\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{n}{C}_{ij}{X}_{ij},\end{array}$$
(17.2a)
$$\begin{array}{rcl}& & \mathcal{A} : {\mathbb{R}}^{n\times n} \rightarrow {\mathbb{R}}^{m} : X\mapsto {({A}_{ 1} \bullet X,\ldots ,{A}_{m} \bullet X)}^{T},\end{array}$$
(17.2b)
$$\begin{array}{rcl}& & {\mathcal{A}}^{T} : {\mathbb{R}}^{m} \rightarrow {\mathbb{R}}^{n\times n} : y\mapsto \sum\limits_{ i=1}^{m}{y}_{ i}{A}_{i}. \end{array}$$
(17.2c)

It is easy to see that problem (17.1) includes linear programs as a special case if we let C = Diag(c) and A_i = Diag(a_i) be the diagonal matrices of c ∈ ℝ^n and of the rows of A = (a_1, …, a_m)^T ∈ ℝ^{m×n}, respectively. The problem pair satisfies weak duality

$$C \bullet X - {b}^{T}y = X \bullet S \geq 0$$
(17.3)

for every feasible primal-dual iterate (X, y, S), and it satisfies strong duality if there exist strictly feasible (interior) points X ≻ 0 and S ≻ 0, i.e., points for which X and S are positive definite. In that case, (X, y, S) is optimal if and only if it is feasible and satisfies the complementarity condition XS = 0, or equivalently, X ∙ S = 0. For simplicity, we assume throughout this chapter that strictly feasible points exist.

2.1 Primal-Dual Interior-Point Methods

The general scheme of most primal-dual methods is to start from an initial interior point (X^0, y^0, S^0), which may be feasible or infeasible, and to compute in each iteration a direction (ΔX^k, Δy^k, ΔS^k) and a step length α^k > 0 so that the sequence of iterates

$$({X}^{k+1},{y}^{k+1},{S}^{k+1}) = ({X}^{k},{y}^{k},{S}^{k}) + {\alpha }^{k}(\Delta {X}^{k},\Delta {y}^{k},\Delta {S}^{k})$$
(17.4)

remains interior but converges to an optimal solution of (17.1). The central idea for the computation of directions stems from the concept of the central path that can either be motivated from a perturbed set of first-order optimality conditions of problem (17.1), or derived from the optimality conditions of its primal (or dual) barrier problem

$$\min \ C \bullet X - \mu \log \det (X)\text{ s.t. }\mathcal{A}(X) = b\ (X \succ 0).$$
(17.5)

The log-determinant term \(\log \det (X) =\log \prod\limits_{i=1}^{n}{\lambda }_{i}(X) =\sum\limits_{i=1}^{n}\log {\lambda }_{i}(X)\) forms a self-concordant barrier function for the set of symmetric positive definite matrices, whose eigenvalues satisfy λ_i(X) > 0, i = 1, …, n, and μ > 0 is the barrier parameter. Substituting \(S = \mu {X}^{-1}\) in the optimality conditions of problem (17.5), the parametrized solutions (X(μ), y(μ), S(μ)) define the primal-dual central path

$$\begin{array}{rcl} \mathcal{C} =& & \{(X(\mu ),y(\mu ),S(\mu )) :\ X(\mu ) \succ 0,\ S(\mu ) \succ 0,\ \mu > 0,\qquad \end{array}$$
(17.6a)
$$\begin{array}{rcl} & & \mathcal{A}(X(\mu )) = b,\ {\mathcal{A}}^{T}(y(\mu )) + S(\mu ) = C,\ X(\mu )S(\mu ) = \mu E\}\qquad \end{array}$$
(17.6b)

where E ∈ ℝ^{n×n} is the identity matrix, and (X(μ), y(μ), S(μ)) is unique for every μ > 0.

Path-following IPMs compute iterates in a neighborhood of \(\mathcal{C}\) by taking steps into a direction that solves a suitable linearization of the associated Newton system

$$\begin{array}{rcl} \mathcal{A}(\Delta X)& =& b -\mathcal{A}(X)\end{array}$$
(17.7a)
$$\begin{array}{rcl}{ \mathcal{A}}^{T}(\Delta y) + \Delta S& =& C -{\mathcal{A}}^{T}(y) - S\end{array}$$
(17.7b)
$$\begin{array}{rcl} X(\Delta S) + (\Delta X)S& =& \mu E - XS\end{array}$$
(17.7c)

and successively reduce the barrier parameter μ toward zero. Because the left-hand side of (17.7c) is typically not symmetric, unlike in the LP case, system (17.7) is generally overdetermined and must therefore be symmetrized. The reader is referred to the old handbook [53, 67] for a discussion of the different symmetrization strategies and the resulting families of search directions. The generic primal-dual path-following IPM outlined above is given in Algorithm 1.

Like every other IPM, primal-dual algorithms usually terminate at approximate solutions within a specified tolerance ε of a feasible and optimal solution. The second parameter σ in Algorithm 1 controls the reduction of the barrier parameter in each iteration and, particularly for the class of predictor-corrector methods, can be varied between 0 and 1 to achieve the best known iteration complexity of \(O(\sqrt{n}L\log (1/\epsilon ))\) of any IPM, where L denotes the bit length of the (rational) problem data.
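Because LP is the special case of (17.1) with diagonal data, the generic scheme of Algorithm 1 can be illustrated compactly in that setting. The following is a minimal sketch, not the chapter’s actual implementation: it assumes dense data, a full-row-rank A, and the simple starting point (x, y, s) = (e, 0, e); the function name and all parameter defaults are hypothetical.

```python
import numpy as np

def pd_path_following_lp(A, b, c, eps=1e-8, sigma=0.2, max_iter=200):
    """Minimal primal-dual path-following IPM for the LP special case
    min c'x s.t. Ax = b, x >= 0 of (17.1), where X = Diag(x) and
    S = Diag(s), so that XS = mu*E reduces to x_i*s_i = mu for all i."""
    m, n = A.shape
    x, s, y = np.ones(n), np.ones(n), np.zeros(m)  # infeasible interior start
    for _ in range(max_iter):
        mu = x @ s / n                    # duality measure
        r_p = b - A @ x                   # primal residual, cf. (17.7a)
        r_d = c - A.T @ y - s             # dual residual, cf. (17.7b)
        if mu < eps and np.linalg.norm(r_p) < eps and np.linalg.norm(r_d) < eps:
            break
        # Newton step for the perturbed complementarity x_i*s_i = sigma*mu;
        # in LP no symmetrization is needed, and eliminating (dx, ds)
        # yields the normal equations A D A' dy = rhs with D = Diag(x/s).
        d = x / s
        r_c = sigma * mu - x * s
        rhs = r_p + A @ (d * r_d) - A @ (r_c / s)
        dy = np.linalg.solve(A @ (d[:, None] * A.T), rhs)
        ds = r_d - A.T @ dy
        dx = (r_c - x * ds) / s
        # fraction-to-boundary rule keeps the iterate strictly interior
        alpha = 1.0
        for v, dv in ((x, dx), (s, ds)):
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, 0.995 * np.min(-v[neg] / dv[neg]))
        x, y, s = x + alpha * dx, y + alpha * dy, s + alpha * ds
    return x, y, s
```

For SDP the same loop applies with X = Diag(x) and S = Diag(s) replaced by full matrices, except that the complementarity equation (17.7c) must first be symmetrized as discussed above.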

2.2 Handling of Linear Inequalities

For reference in later sections, we consider and briefly discuss two non-standard forms of semidefinite programs using inequality constraints and unrestricted variables in the primal problem, respectively. The first situation arises naturally in cutting-plane methods for combinatorial problems, which modify the standard formulation (17.1) by additional linear inequalities on the primal side and a new associated dual variable

$$\begin{array}{rcl} \min \ & C \bullet X& \text{ s.t. }\mathcal{A}(X) = b,\ \mathcal{P}(X) \geq q,\ X \succcurlyeq 0\end{array}$$
(17.8a)
$$\begin{array}{rcl} \max \ & {b}^{T}y + {q}^{T}z& \text{ s.t. }S = C -{\mathcal{A}}^{T}(y) -{\mathcal{P}}^{T}(z) \succcurlyeq 0,\ z \geq 0.\end{array}$$
(17.8b)

Here \(\mathcal{P}\) and \({\mathcal{P}}^{T}\) are defined analogously to \(\mathcal{A}\) and \({\mathcal{A}}^{T}\) in (17.2b) and (17.2c) using real symmetric n ×n matrices Pi, q is a real vector, and z is the additional nonnegative dual variable. We let \(w = \mathcal{P}(X) - q \geq 0\) denote the primal nonnegative slack variable and for every feasible primal-dual iterate (X, w, y, z, S) obtain a new duality gap of

$$C \bullet X - {b}^{T}y - {q}^{T}z = X \bullet S + {w}^{T}z \geq 0.$$
(17.9)

In the central path equations (17.6) and the corresponding augmented Newton system, (17.7a) and (17.7c) carry over unchanged as (17.10a) and (17.10d), (17.7b) changes to (17.10c), and (17.10b) and (17.10e) account for the new constraint and its complementarity condition

$$\begin{array}{rcl} \mathcal{A}(\Delta X)& =& b -\mathcal{A}(X)\end{array}$$
(17.10a)
$$\begin{array}{rcl} \mathcal{P}(\Delta X) - \Delta w& =& q -\mathcal{P}(X) + w\end{array}$$
(17.10b)
$$\begin{array}{rcl}{ \mathcal{A}}^{T}(\Delta y) + {\mathcal{P}}^{T}(\Delta z) + \Delta S& =& C -{\mathcal{A}}^{T}(y) -{\mathcal{P}}^{T}(z) - S\end{array}$$
(17.10c)
$$\begin{array}{rcl} X(\Delta S) + (\Delta X)S& =& \mu E - XS\end{array}$$
(17.10d)
$$\begin{array}{rcl} W(\Delta z) + Z(\Delta w)& =& \mu e - Wz\end{array}$$
(17.10e)

where W = Diag(w), Z = Diag(z), and e = (1, …, 1)^T is the vector of all ones. In particular, unlike X and S, the two diagonal matrices W and Z commute and thus cause no new computational difficulties other than the often significant increase in problem size if the number of inequalities is large. Moreover, whereas all equality constraints are satisfied at optimality and hence active by definition, in the context of relaxations many inequalities are likely to remain inactive and could in principle be removed if known in advance. Typically, the redundancy of a primal constraint is indicated by a zero dual variable, but since IPMs always terminate at approximate solutions that are still interior, the complementarity condition WZ = 0, or equivalently, w_i z_i = 0, is never satisfied exactly. For the common attempt to fix primal or dual variables to zero, so that they can be removed to reduce the problem size, a useful concept is the notion of an indicator function [19].

Let (X^k, w^k, y^k, z^k, S^k) be a sequence of iterates defined by (17.4) and

$$({w}^{k+1},{z}^{k+1}) = ({w}^{k},{z}^{k}) + {\alpha }^{k}(\Delta {w}^{k},\Delta {z}^{k})$$
(17.11)

that converges to an optimal solution (X^*, w^*, y^*, z^*, S^*) of problem (17.8). Denote by \({\mathcal{Z}}_{0}^{{_\ast}} =\{ i : {z}_{i}^{{_\ast}} = 0\}\) the set of zero dual variables at optimality. An indicator χ for \({\mathcal{Z}}_{0}^{{_\ast}}\) can be defined as a function that assigns to each (w^k, z^k, Δw^k, Δz^k) a vector of extended reals χ(w^k, z^k, Δw^k, Δz^k) for which there exist vectors θ and ϕ such that

$${ \lim }_{k\rightarrow \infty }{\chi }_{i}({w}^{k},{z}^{k},\Delta {w}^{k},\Delta {z}^{k}) = \left \{\begin{array}{@{}l@{\quad }l@{}} {\theta }_{i}\quad &\text{ if }i \in {\mathcal{Z}}_{0}^{{_\ast}} \\ {\phi }_{i}\quad &\text{ if }i\notin {\mathcal{Z}}_{0}^{{_\ast}} \end{array} \right.$$
(17.12)

and max_i θ_i < min_i ϕ_i. It is said to satisfy a sharp or uniform separation property if

$$\max \{{\theta }_{i} : i \in {\mathcal{Z}}_{0}^{{_\ast}}\}\ll \min \{ {\phi }_{ i} : i\notin {\mathcal{Z}}_{0}^{{_\ast}}\},\text{ or if }{\theta }_{ i} = \theta \text{ and }{\phi }_{i} = \phi $$

for some θ and ϕ for all \(i \in {\mathcal{Z}}_{0}^{{_\ast}}\) and \(i\notin {\mathcal{Z}}_{0}^{{_\ast}}\), respectively. Common functions include a variable indicator χ^var, a primal-dual indicator χ^prd, and the Tapia indicator χ^tap

$$\begin{array}{rcl}{ \chi }_{i}^{\mathrm{var}}({w}^{k},{z}^{k},\Delta {w}^{k},\Delta {z}^{k}) =& {z}_{ i}^{k},& {\theta }_{ i}^{\mathrm{var}} = {\phi }_{ i}^{\mathrm{var}} = {z}_{ i}^{{_\ast}},\end{array}$$
(17.13a)
$$\begin{array}{rcl}{ \chi }_{i}^{\mathrm{prd}}({w}^{k},{z}^{k},\Delta {w}^{k},\Delta {z}^{k}) =& {z}_{ i}^{k}/{w}_{ i}^{k},& {\theta }^{\mathrm{prd}} = 0,{\phi }^{\mathrm{prd}} = \infty ,\end{array}$$
(17.13b)
$$\begin{array}{rcl}{ \chi }_{i}^{\mathrm{tap}}({w}^{k},{z}^{k},\Delta {w}^{k},\Delta {z}^{k}) =& ({z}_{ i}^{k} + \Delta {z}_{ i}^{k})/{z}_{ i}^{k},& {\theta }^{\mathrm{tap}} = 0,{\phi }^{\mathrm{tap}} = 1.\end{array}$$
(17.13c)

Both the primal-dual and the Tapia indicator satisfy sharp and uniform separation under strict complementarity, while only the Tapia indicator is scale independent, and its identification test χ_i^{tap} < τ ⇒ z_i^* = 0 is relatively insensitive to the choice of the threshold parameter τ ∈ (0, 1). The variable indicator, although the simplest and most popular technique, does not satisfy these properties and thus may not give sufficiently quick and accurate information to improve the performance of an algorithm. A discussion of further properties and indicators is given in the review article [19].
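The three indicators in (17.13) are inexpensive to evaluate along the iterates (17.4) and (17.11); the following is a minimal sketch (the function name and the example threshold are hypothetical):

```python
import numpy as np

def indicators(w, z, dz):
    """Evaluate the variable, primal-dual, and Tapia indicators (17.13)
    at the current iterate (w, z) with dual search direction dz.
    (The primal direction dw is not needed by these three indicators.)"""
    chi_var = z.copy()        # converges to z_i^*: no guaranteed separation
    chi_prd = z / w           # -> 0 on Z_0^*, -> infinity elsewhere
    chi_tap = (z + dz) / z    # -> 0 on Z_0^*, -> 1 elsewhere
    return chi_var, chi_prd, chi_tap

# Example identification test with a hypothetical threshold tau in (0, 1):
# indices whose Tapia indicator is small are predicted to have z_i^* = 0,
# i.e., the corresponding primal inequalities are predicted to be redundant.
# chi_var, chi_prd, chi_tap = indicators(w, z, dz)
# redundant = np.where(chi_tap < 0.5)[0]
```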

2.3 Handling of Free Variables

The second non-standard form of relevance for some approaches in our later discussion uses a new vector u of unconstrained variables in the primal equality constraints

$$\begin{array}{rcl} & & \min \ C \bullet X + {d}^{T}u\text{ s.t. }\mathcal{A}(X) + Bu = b,\ X \succcurlyeq 0\end{array}$$
(17.14a)
$$\begin{array}{rcl} & & \max \ {b}^{T}y\text{ s.t. }{B}^{T}y = d,\ S = C -{\mathcal{A}}^{T}(y) \succcurlyeq 0\end{array}$$
(17.14b)

where d and B are the primal objective vector and constraint matrix associated with u, respectively. Although this extended formulation fits easily into the general theoretical framework of primal-dual IPMs, specifically with the same duality gap

$$C \bullet X + {d}^{T}u - {b}^{T}y = X \bullet S \geq 0$$
(17.15)

as the standard formulation (17.1), some practical consequences affect the computation of search directions (ΔX, Δu, Δy, ΔS) from the corresponding new Newton system

$$\begin{array}{rcl} \mathcal{A}(\Delta X) + B\Delta u& =& b -\mathcal{A}(X) - Bu\end{array}$$
(17.16a)
$$\begin{array}{rcl}{ B}^{T}\Delta y& =& d - {B}^{T}y\end{array}$$
(17.16b)
$$\begin{array}{rcl}{ \mathcal{A}}^{T}(\Delta y) + \Delta S& =& C -{\mathcal{A}}^{T}(y) - S\end{array}$$
(17.16c)
$$\begin{array}{rcl} X(\Delta S) + (\Delta X)S& =& \mu E - XS.\end{array}$$
(17.16d)

As discussed by Anjos and Burer [6], this Newton system remains invertible but now becomes indefinite and thus more difficult to solve in a quick, stable, and accurate manner. Furthermore, the often proposed approach to replace u by the difference of two nonnegative variables \(u = {u}^{+} - {u}^{-}\), u^+ ≥ 0, u^− ≥ 0, and then solve an associated symmetric quasi-definite system has the simultaneous effect that the dual equality B^T y = d is split into B^T y ≥ d and B^T y ≤ d, which can be problematic due to degeneracy, unboundedness of the primal optimal solution set, and loss of an interior for the dual. In consideration of alternative approaches for LP [44] and SDP [39], they use ideas by Mészáros [44] and provide theoretical and computational support for handling free variables by regularization techniques that perturb equation (17.16b) to

$${B}^{T}\Delta y - \delta \Delta u = d - {B}^{T}y$$
(17.17)

where δ > 0 is a positive regularization parameter. This technique results in a definite system that can also maintain the structure and sparsity of the matrices A_i and B. Although the new search direction may deviate from the original direction by O(δ), a careful choice of the regularization parameter, such that δ^k‖Δu^k‖ ≤ βσμ/2 in every iteration for some constant β > 0, guarantees the global convergence of a primal-dual path-following IPM to an optimal solution of the original problem (17.14).
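The effect of this regularization can be illustrated on a generic saddle-point system: replacing the zero (2,2) block by −δE turns an indefinite and possibly singular matrix into a symmetric quasi-definite one that admits a stable LDL^T factorization. The following minimal sketch uses random data purely for illustration and is not tied to the specific Newton system (17.16):

```python
import numpy as np
from scipy.linalg import ldl

rng = np.random.default_rng(0)
n, m = 6, 3
H = rng.standard_normal((n, n))
H = H @ H.T + np.eye(n)                   # symmetric positive definite block
B = rng.standard_normal((n, m))           # rectangular coupling block
delta = 1e-6                              # regularization parameter

# The unregularized matrix [[H, B], [B', 0]] is indefinite; the block
# -delta*I makes it symmetric quasi-definite, so an LDL' factorization
# with a strictly diagonal D exists for any symmetric permutation.
K = np.block([[H, B], [B.T, -delta * np.eye(m)]])
L, D, perm = ldl(K)                       # stable factorization
residual = np.linalg.norm(L @ D @ L.T - K)   # ~0 up to rounding error
```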

3 Interior-Point Cutting-Plane Algorithms

The study of cutting-plane methods based on interior-point algorithms has received long-standing attention, and the reader is advised to consult the excellent tutorial chapter by Mitchell [48] for a recent survey and more detailed review of the relevant literature, which also covers more general column-generation schemes, the use of IPMs in cutting-plane and cutting-surface methods for LP and SDP, and some related subgradient approaches for nonsmooth optimization. Mitchell was also among the first to explore specifically Karmarkar’s algorithm [37] for the solution of LP relaxations of combinatorial problems [45, 51], whereas improvements by SDP relaxations and their solvability with more general IPMs were largely prepared by Alizadeh [1, 2] and have since led to a rich methodology and applicability of IPMs combined with cutting planes, branch-and-bound, or branch-and-cut. Another useful account in which much of the former material is comprehensively described is the recent, highly recommended survey chapter by Krishnan and Terlaky [42].

Computational studies of IPMs for LP and SDP relaxations with cutting planes have shown encouraging results [7, 34, 46, 49] but also revealed some practical limitations: in the context of LP, active-set and simplex-based methods benefit from their superior warm-start capabilities, and for SDP, bundle methods offer better ways to exploit structure and sparsity [24, 32]. The few attempts to design simplex-type methods for SDP were conceptually satisfying but have not yet been practically successful [59]. In particular, unlike simplex or bundle methods, so far only IPMs provide theoretical performance guarantees and are shown to run in polynomial time also in the worst case. Besides their importance for proving convergence and analyzing the complexity of algorithms, from a practical perspective it has been argued that IPMs can compete with simplex methods if many constraints are added at once, or if it is necessary to stabilize a column-generation process [48]. An early observation by Bixby et al. [14] states that, especially in the situation of multiple dual solutions, the tendency of IPMs to produce points in the center of the optimal face typically facilitates the generation of stronger cutting planes than corresponding simplex-based methods, whose solutions coincide with vertices of that face and may give much weaker cuts. Consequently, some implementations use IPMs to expedite convergence and gain stability when adding large numbers of inequalities in earlier iterations, and only switch to a simplex-based approach once the process is close to optimality, so that only a few minor changes remain necessary [14, 50].

Primarily focusing on SDP relaxations, the remainder of this section describes cutting-plane methods that are based solely on IPMs and designed to exploit their strengths: the use of interior points, and specifically analytic centers, for stronger separation, as well as the stability of primal-dual methods for a dynamic mechanism to add and remove possibly large numbers of inequalities already at intermediate iterates.

3.1 Basic SDP Relaxation and Cutting Planes

Many SDP relaxations in combinatorial optimization arise from unconstrained or constrained binary quadratic programs in one of three essentially equivalent forms

$$\begin{array}{rcl} \min \ & {u}^{T}Qu + {q}^{T}u& \text{ s.t. }{u}_{i} \in \{ 0,1\}\ \forall \ i\ (u \in \Omega (u))\end{array}$$
(17.18a)
$$\begin{array}{rcl} \min \ & {v}^{T}Qv + 2{c}^{T}v& \text{ s.t. }{v}_{i} \in \{-1,1\}\ \forall \ i\ (v \in \Omega (v))\end{array}$$
(17.18b)
$$\begin{array}{rcl} \min \ & {x}^{T}Cx& \text{ s.t. }{x}_{i} \in \{-1,1\}\ \forall \ i\ (x \in \Omega (x)).\end{array}$$
(17.18c)

Using the variable transformation \(v = 2u - e\) and a suitable definition of the objective vector \(c = Qe + q\), the binary quadratic (0, 1)-program (17.18a) is equivalent to the (−1, 1)-program (17.18b), which in turn can be written as (17.18c) by increasing its dimension by one, augmenting the matrix Q with the vector c in a new row and column, \(C = \left (\begin{matrix}\scriptstyle Q &\scriptstyle c \\ \scriptstyle {c}^{T}&\scriptstyle 0\end{matrix}\right )\), and introducing an additional variable whose value can be fixed arbitrarily to either plus or minus 1 [9, 17, 31, 34]. For the converse, it is clear that the last formulation is subsumed in (17.18b) by letting Q = C and c = 0; furthermore, it is well known to be equivalent to the maximum-cut problem if \(C = (W - Diag (We))/4\), where W is the weighted adjacency matrix of the underlying graph. Finally, in each formulation the set Ω may denote a set of additional polyhedral equalities or inequalities.
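These reductions are simple data transformations; the following minimal numpy sketch maps the (0, 1)-data (Q, q) to the augmented max-cut matrix C (additive constants and positive scalings of the objective, which do not affect the minimizer and are also suppressed in the text, are omitted; the function name is hypothetical):

```python
import numpy as np

def binary_to_maxcut(Q, q):
    """Map the (0,1)-program (17.18a) to the (-1,1)-form (17.18c) via
    the substitution v = 2u - e: the linear term c = Qe + q is absorbed
    into an extra row and column, and the added last coordinate of x
    may be fixed arbitrarily to +1 or -1."""
    n = Q.shape[0]
    c = Q @ np.ones(n) + q                 # objective vector as in the text
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = Q
    C[:n, n] = C[n, :n] = c                # symmetric bordering by c
    return C
```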

Starting from the max-cut formulation (17.18c), the typical SDP relaxation follows from the observation that \({x}^{T}Cx = C \bullet x{x}^{T} = C \bullet X\), where X = xx^T for some (−1, 1)-vector x if and only if X belongs to the cut polytope ℰ¹, which is defined as the set of symmetric positive semidefinite matrices that have unit diagonal and rank 1. The elliptope ℰ relaxes ℰ¹ by dropping the non-convex rank-1 constraint and can be tightened by additional triangle inequalities that produce the metric polytope ℰ²

$$\begin{array}{rcl} \mathcal{E} =\{ X \succcurlyeq 0 : diag (X) = e\}& \qquad {\mathcal{E}}^{2} = \mathcal{E}\cap \{ X :& {X}_{ij} + {X}_{ik} + {X}_{jk} \geq -1, \\ {\mathcal{E}}^{1} = \mathcal{E}\cap \{ X : rank (X) = 1\}& & {X}_{ ij} - {X}_{ik} - {X}_{jk} \geq -1\} \\ \end{array}$$

where ℰ¹ ⊆ ℰ² ⊆ ℰ. In particular, the resulting SDP relaxations over the elliptope or the metric polytope match either the standard form (17.1) with \(\mathcal{A}(X) = diag (X)\) and b = e

$$\min C \bullet X\text{ s.t. }diag (X) = e,\ X \succcurlyeq 0$$
(17.19)

or the non-standard form (17.8), respectively. Because the large number of O(n³) triangle inequalities in ℰ² cannot all be included at once, however, a relevant subset is typically chosen and added successively within the general framework of a cutting-plane method. Finally, several other, exponentially large classes of valid inequalities for the cut polytope ℰ¹ include the more general hypermetric and odd-cycle inequalities

$$\begin{array}{rcl} a{a}^{T} \bullet X \geq 1\text{ where }{a}_{ i} \in \{ 0,1\}{,\ \min }_{{x}_{i}\in \{-1,1\}}\left \vert {a}^{T}x\right \vert = 1,\ {e}^{T}a\text{ odd}\qquad & &\end{array}$$
(17.20a)
$$\begin{array}{rcl} X(C \setminus F) - X(F) \leq \left \vert C\right \vert - 2\text{ where }C\text{ is cycle},\ F \subseteq C,\ \left \vert F\right \vert \text{ odd}\qquad & &\end{array}$$
(17.20b)

which are described and analyzed in more detail in a classic monograph by Deza and Laurent [18]. The basic scheme of cutting-plane methods is outlined in Algorithm 2.

Largely independent of the specific algorithm used in Step 1, violated cutting planes in Step 2 are often found by complete enumeration, especially for triangle inequalities and other polynomially sized classes of cuts. For exponentially large classes of cutting planes, such as clique or odd-cycle inequalities, other strategies are to extend already known cuts by growing previously selected cliques or cycles, or to use one of several other separation routines or cleverly chosen heuristics. Specific details for Steps 1 and 3, including warm starts and indicators of which inequalities to drop, are discussed later.
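For the triangle inequalities of the metric polytope ℰ², the separation step reduces to enumerating all O(n³) index triples; a minimal sketch that returns the most violated cuts follows (the function name, tolerance, and cut limit are hypothetical choices):

```python
import numpy as np
from itertools import combinations

def separate_triangle_cuts(X, tol=1e-6, max_cuts=200):
    """Enumerate the triangle inequalities of the metric polytope and
    return the most violated ones.  Each cut reads
        s1*X_ij + s2*X_ik + s3*X_jk >= -1,
    where the sign pattern is either all +1 or contains exactly one +1,
    cf. (17.35b) and (17.35c)."""
    patterns = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    cuts = []
    n = X.shape[0]
    for i, j, k in combinations(range(n), 3):
        vals = (X[i, j], X[i, k], X[j, k])
        for s in patterns:
            violation = -1.0 - sum(si * vi for si, vi in zip(s, vals))
            if violation > tol:                # inequality is violated
                cuts.append((violation, s, (i, j, k)))
    cuts.sort(reverse=True)                    # most violated cuts first
    return cuts[:max_cuts]
```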

3.2 Analytic-Center and Cutting-Surface Methods

Similar in spirit to the ellipsoid (non-interior-point) method by Khachiyan [38], the basic idea of an analytic-center cutting-plane method (ACCPM) is to generate a sequence of approximate descriptions of some set of interest until a point is found that belongs to this set within some tolerance ε. Often employed as an initial step in a larger interior-point scheme, these methods can also be used for the purpose of optimization, e.g., if the set \(\mathcal{C}\) is chosen as the set of optimal solutions to some LP or SDP relaxation with a large number of inequalities whose direct solution is computationally intractable. The study and analysis of the related algorithms is commonly based on a formulation like the following generic convex feasibility problem [48]:

Given a convex set \(\mathcal{C}\subseteq {\mathbb{R}}^{m}\) , either prove that \(\mathcal{C}\) is empty or find a point y ∈ ℝ m such that the Euclidean distance from y to \(\mathcal{C}\) is no greater than ε.

In its basic scheme, an ACCPM for problems of the form (17.8) typically starts from an initial outer approximation of the dual feasible region \(\{y \in {\mathbb{R}}^{m} : S = C -{\mathcal{A}}^{T}(y) \succcurlyeq 0\}\) and subsequently computes the new analytic center of each current relaxation, which is initially characterized by the corresponding pair of primal and dual barrier problems

$$\begin{array}{rcl} & & \min \ C \bullet X -\log \det (X)\text{ s.t. }\mathcal{A}(X) = 0\ (X \succ 0)\end{array}$$
(17.21a)
$$\begin{array}{rcl} & & \max \ -\log \det ({S}^{-1})\text{ s.t. }{\mathcal{A}}^{T}(y) + S = C\ (S \succ 0).\end{array}$$
(17.21b)

Under the common assumption that the solution set \(\mathcal{C}\) is bounded and contained in some box around the origin, the ACCPM then calls an oracle subroutine that either confirms that the new center belongs to \(\mathcal{C}\), in which case the process terminates, or returns a separating inequality that is violated by the current point and defines a new cut containing \(\mathcal{C}\) in one of its two halfspaces. This basic scheme is summarized in Algorithm 3; it concludes that the solution set \(\mathcal{C}\) is empty if no feasible point has been found after some permissible maximum number of iterations.

For the special case of LP, corresponding algorithms are well studied, with a complexity of O(m log(1/ε)²) whose proof largely depends on the use of efficient IPMs for the repeated computation of analytic centers. In particular, ACCPM variants that use approximate or volumetric centers improve further both practically and theoretically, reducing the complexity to O(m log(1/ε)). Good reviews focused primarily on LP methods are given by Goffin and Vial [27] and Mitchell [47]. The above extension to SDP, where the initial relaxation is defined by a semidefiniteness constraint and linear constraints are added as cutting planes, remains solvable in finitely many iterations that are polynomial in the number of initial constraints and the maximum number of constraints added at each iteration [16, 64, 66]. The use of linear cut inequalities has also been extended to second-order cuts [57, 58], whose new analytic centers can be recovered within O(p log(p + 1)) Newton steps, where p is the number of cuts added, and to semidefinite cuts [56]; related ACCPM approaches exist for LP in combination with column generation and branch-and-price [20] as well as for feasibility problems in more general conic optimization [10, 11].

The extension from adding linear inequalities as cutting planes to second-order or semidefinite cuts as cutting surfaces can be exemplified by the SDP formulation

$$\begin{array}{rcl} & & \min \ C \bullet PV {P}^{T}\text{ s.t. }\mathcal{A}(PV {P}^{T}) = b,\ V \succcurlyeq 0\end{array}$$
(17.22a)
$$\begin{array}{rcl} & & \max \ {b}^{T}y\text{ s.t. }{\mathcal{A}}^{T}(y) + S = C,\ {P}^{T}SP \succcurlyeq 0\end{array}$$
(17.22b)

where the primal matrix X = PVP^T is parametrized by a new variable matrix V and a matrix P that simultaneously relaxes positive semidefiniteness in the dual. In particular, if rank(P) = n, then this pair of problems is equivalent to the original primal-dual pair (17.1); otherwise it makes it possible to relax the dual by imposing additional restrictions on V. For example, the LP-like requirement that V be a diagonal matrix implies that the positive semidefiniteness constraint in the dual problem is relaxed to mere nonnegativity of the diagonal entries of P^T SP. Similarly, it is possible to impose a block-diagonal structure on V, which in effect relaxes the dual because only the corresponding blocks of P^T SP then need to be positive semidefinite [56].

Other classes of cutting-plane or cutting-surface methods for SDP relaxations, specifically over the set of constant-trace matrices \(\mathcal{X} =\{ X \succcurlyeq 0 : E \bullet X = 1\}\), where E is the identity and the right-hand side is set to one without loss of generality, arise from the Lagrangean dual and its formulation as an equivalent eigenvalue optimization problem

$$\begin{array}{rcl} {\max }_{y}\Phi (y) & =& \max {}_{y}{\min }_{X\in \mathcal{X}}\ {b}^{T}y + (C -{\mathcal{A}}^{T}(y)) \bullet X\end{array}$$
(17.23a)
$$\begin{array}{rcl} & =& {\max }_{y}{b}^{T}y + {\lambda }_{\min }(C -{\mathcal{A}}^{T}(y)). \end{array}$$
(17.23b)

For the inner minimization subproblem over the set \(\mathcal{X}\) in (17.23a), an optimal solution is given by X = uu^T, where u is a unit vector from the eigenspace associated with the smallest (possibly negative) eigenvalue \({\lambda }_{\min }(C -{\mathcal{A}}^{T}(y))\) of the dual slack matrix \(S = C -{\mathcal{A}}^{T}(y)\). In particular, the positive semidefiniteness of S in the original dual is then equivalent to the nonsmooth constraint λ_min(S) ≥ 0, which can be relaxed and successively used to generate semidefinite cutting surfaces of the form u^T Su ≥ 0.
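In this spectral setting the separation oracle reduces to a single eigenvalue computation on the dual slack matrix; a minimal sketch follows (function name and tolerance are hypothetical):

```python
import numpy as np

def spectral_cut(C, A_list, y, tol=1e-8):
    """Check lambda_min(S) >= 0 for the dual slack S = C - A^T(y); if
    violated, return a unit eigenvector u for the smallest eigenvalue,
    yielding the cutting surface u'(C - A^T(y))u >= 0 that is valid for
    all dual feasible points but cuts off the current y."""
    S = C - sum(y_i * A_i for y_i, A_i in zip(y, A_list))
    lam, U = np.linalg.eigh(S)      # eigenvalues in ascending order
    if lam[0] >= -tol:
        return None                 # y is (nearly) dual feasible: no cut
    return U[:, 0]                  # u with u'Su = lambda_min(S) < 0
```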

A good survey and further discussion of the above methods and several others is also given by Krishnan and Mitchell [41] who present a unifying framework of several cutting-plane methods for trace-constrained SDPs using the augmented formulation

$$\max \ {y}_{0} + {b}^{T}y - (\gamma /2){\left \Vert y -\hat{ y}\right \Vert }^{2}\text{ s.t. }{P}_{ i}^{T}(C -{\mathcal{A}}^{T}(y)){P}_{ i} \succcurlyeq {y}_{0}E$$
(17.24)

where y_0 and y are the optimization variables, ŷ is the current iterate, and γ ≥ 0 is a parameter that is adjusted dynamically based on the progress of the algorithm. The variable y_0 corresponds to the primal constant-trace constraint E ∙ X = 1, and the matrices (or vectors) P_i are the cuts that are generated in each iteration. Depending on how the bundle \(\mathcal{P} =\{ {P}_{1},{P}_{2},\ldots \}\) is treated, e.g., aggregated into a single matrix or added as separate constraints, and whether the parameter γ is zero or positive, formulation (17.24) reduces to various other known cutting-plane methods, including the classical polyhedral cutting-plane method (γ = 0, P_i a vector), a primal active-set technique (γ = 0, P_i a matrix), or the classical polyhedral or spectral bundle method (γ > 0, P_i a vector or matrix, respectively), among others.

3.3 Approaches Based on Primal-Dual IPMs

The efficiency of IPMs in computing (approximate) optimal solutions or analytic centers in each iteration of the basic cutting-plane scheme in Algorithm 2 or the ACCPM in Algorithm 3 is often countered by their limited ability to effectively use previous iterates to warm start or otherwise facilitate the solution of successive relaxations. In addition to the different warm-starting strategies described in the next section, this observation has also motivated investigating the effects of adding violated inequalities much earlier, at intermediate iterates that are on the one hand still relatively far from the optimal solution but on the other hand sufficiently interior to possibly produce much stronger cuts [34, 49]. In particular, the stability of primal-dual methods may make it possible to continue the algorithm with only minor adjustments to the current iterate and without the need to restart from a new initial point.

The principal success of such approaches based on a primal-dual path-following method for SDP relaxations of binary quadratic programs was first demonstrated by Helmberg and Rendl [34], who distinguish small and large adds at intermediate and final iterates of different rounds, respectively. Small adds allow the early addition of violated inequalities and are compensated by only a minor adjustment that pushes the current iterate back into the interior along the line segment toward the analytic center, so that newly added violated inequalities are again satisfied. After every small add, typically three regular iterations suffice until a full Newton step occurs, which thereby recovers feasibility. After a total of ten small adds, no further cuts are added until an optimal solution of the current relaxation is obtained; this is then followed by a large add, the removal of redundant inequalities as indicated by small dual variables, and a full restart of the next relaxation round from a newly chosen initial point.

For the selection of violated cuts and the detection of redundant inequalities, two common criteria are the amount of violation and the value of the associated dual variable, respectively, which essentially coincide with the variable indicators described in Sect. 17.2.2. Fortunately, the drawbacks of this indicator for the detection of zero variables are largely irrelevant for the detection of new inequalities, which are not yet included in the problem and therefore not constrained by barrier terms that prevent their violation. Its lack of the sharp and uniform separation property and its scale dependence are more significant for the removal of unnecessary inequalities, especially at intermediate iterates, where the barrier parameter may still be too large for the dual variables to converge to zero, inhibiting their early recognition [34]. In the context of LP, Mitchell [46] found that the costs of more sophisticated tests for the removal of inequalities outweigh the benefits of the reduction in relaxation size. For SDP, some encouraging approaches based on the more promising primal-dual and Tapia indicators, combined with enhanced strategies for warm-starting, are currently being explored in a new implementation [23]. These strategies, their implementation, and computational results are part of the discussion in the next two sections.

4 Interior-Point Warm-Starting Strategies

By warm-starting, we mean exploiting information or a known optimal point of some initial optimization problem to accelerate or otherwise facilitate the solution of a closely related problem instance. Of major importance for the solution of successive relaxations in cutting-plane or column-generation approaches as well as in branch-and-bound or branch-and-cut, warm starts also play an important role in many other situations and practical applications. For example, many engineering designs or financial services are subject to frequently changing product specifications or market prices, and one typically anticipates only minor adjustments to accommodate such changes, so that the previous product still provides a good baseline or starting point for the re-optimization process. Repeated changes in the problem data also arise when solving large-scale LPs using Benders or Dantzig–Wolfe decomposition, and especially after few changes it is expected that information from the previously solved problem can be used to simplify the solution of the updated problem instance.

Motivated by the general success of IPMs in providing competitive and at times superior alternatives to simplex algorithms in the case of LP, the problem of warm-starting IPMs has sparked significant interest and seen encouraging progress especially during the last few years. Computational savings are still, and will likely remain, inferior to warm starts using a dual simplex method, which benefits from the indirect characterization of solutions in terms of their sets of basic variables and is particularly effective if these sets are not too different. Nevertheless, the perception that interior-point warm starts are generally not possible seems rectified, although the topic deserves and requires further investigation, especially for SDP, for which IPM warm starts are frequently without alternative. Because the current literature is restricted to LP, each of the following strategies is the author’s often straightforward generalization to SDP; the reader is pointed to the references for their original LP formulations.

4.1 Restoration of Interiority Using Shifted Barriers

It is widely recognized that one of the major challenges in warm-starting IPMs is that optimal or near-optimal solutions are often not suited as initial iterates, and that IPMs tend to perform better when (re)started from well-centered interior points in proximity to the central path or analytic center rather than close to the boundary of the underlying cone. While several ad-hoc ideas exist to correct or adjust solutions that are close to this boundary, most prominently the concept of an iterate pool that keeps track of intermediate points or approximate analytic centers so as to choose not simply the last solution but a previous, still sufficiently interior iterate [28], more recent approaches have shown encouraging results based on an earlier idea to relax interiority by one of several modified or shifted-barrier formulations [60].

First considering the SDP in standard form (17.1), the arguably simplest and most intuitive of these formulations is based on the shifted-barrier method by Freund [25]

$$\min \ C \bullet X - \mu \log \det (X + \mu H)\text{ s.t. }\mathcal{A}(X) = b\ (X + \mu H \succ 0)$$
(17.25)

where H ≻ 0 is a constant, positive definite shift matrix that relaxes the positive definiteness of X, and μ is the regular barrier parameter, which now also controls the amount of relaxation; the relaxation fully vanishes only as μ tends to zero. As discussed in the original reference for LP, the close resemblance of this formulation to the original barrier problem (17.5) facilitates its theoretical analysis, which shows that for suitable H and a proper choice of the initial parameter value μ^0, the number of iterations required to achieve μ ≤ ε is still bounded by \(O(\sqrt{n}L\log (1/\epsilon ))\). Since the choices of both H and the initial μ^0 require a priori knowledge of an approximate analytic center of the dual feasible region, however, these results are mostly theoretical, and the approach in general is not fully satisfactory for practical implementation.

During the last few years, several new approaches have been formulated that adopt the essential idea underlying these shifted-barrier methods: to relax the explicit positive semidefiniteness of the primal matrix X and to ensure this necessary condition for feasibility indirectly through some other means. Furthermore, and unlike the first formulation (17.25), especially those methods that relax both X and the dual matrix S within a primal-dual framework have shown good computational performance. Originally formulated for LP, the following SDP formulation is the straightforward extension of the primal-dual penalty approach by Benson and Shanno [13]

$$\begin{array}{rcl} & \min \quad \qquad C \bullet X + D \bullet {X}^{{\prime}} & \\ & \text{ s.t. }\quad \qquad \mathcal{A}(X) = b,\ U \geq X \succcurlyeq -{X}^{{\prime}},\ {X}^{{\prime}}\geq 0&\end{array}$$
(17.26a)
$$\begin{array}{rcl} & \max \quad \qquad {b}^{T}y - U \bullet {S}^{{\prime}} & \\ & \text{ s.t. }\quad \qquad {\mathcal{A}}^{T}(y) + S = C,\ D \geq S + {S}^{{\prime}}\succcurlyeq 0,\ {S}^{{\prime}}\geq 0&\end{array}$$
(17.26b)

which introduces two auxiliary nonnegative matrices X′ and S′ that relax the positive semidefiniteness of X and S, respectively, but are simultaneously penalized in the objective function so as to vanish again at optimality and thereby confer this condition back onto the original matrices. The associated inequalities require that U − X and X′ are nonnegative and that X + X′ is positive semidefinite for the primal, and that \(D - S - {S}^{{\prime}}\) and S′ are nonnegative and that S + S′ is positive semidefinite for the dual. Hence, for this formulation to be exact, it is important that D and U are chosen sufficiently large to force X′ and S′ to zero but also to provide admissible upper bounds on the optimal matrices X and S. Adaptive strategies to choose and dynamically update these penalties for LP have shown good success and have already led to promising computational performance, both by themselves and combined with additional techniques to identify blocking components of the resulting search direction using sensitivity analysis [30]. In particular, because X and S are now unrestricted matrices with free variable entries, a regularized version of this method can conveniently be treated using the framework of Sect. 17.2.3.

A second approach [21] removes the dependency of formulation (17.26) on additional parameters and utilizes the possibility of IPMs to start from infeasible iterates while achieving positive semidefiniteness of X and S via their associated slack matrices

$$\begin{array}{rcl} \min \ & C \bullet X& \text{ s.t. }\mathcal{A}(X) = b,\ X - {X}^{{\prime}} = 0,\ {X}^{{\prime}}\succcurlyeq 0\end{array}$$
(17.27a)
$$\begin{array}{rcl} \max \ & {b}^{T}y& \text{ s.t. }{\mathcal{A}}^{T}(y) + S = C,\ S - {S}^{{\prime}} = 0,\ {S}^{{\prime}}\succcurlyeq 0.\end{array}$$
(17.27b)

The redundancy in this formulation, which is always exact and in fact an equivalent reformulation of the original SDP (17.1) in a higher-dimensional space, can be exploited computationally for the efficient solution of the resulting Newton system

$$\begin{array}{rcl} \mathcal{A}(\Delta X)& =& b -\mathcal{A}(X)\end{array}$$
(17.28a)
$$\begin{array}{rcl}{ \mathcal{A}}^{T}(\Delta y) + \Delta S& =& C -{\mathcal{A}}^{T}(y) - S\end{array}$$
(17.28b)
$$\begin{array}{rcl} \Delta X - \Delta {X}^{{\prime}}& =& {X}^{{\prime}}- X\end{array}$$
(17.28c)
$$\begin{array}{rcl} \Delta S - \Delta {S}^{{\prime}}& =& {S}^{{\prime}}- S\end{array}$$
(17.28d)
$$\begin{array}{rcl}{ X}^{{\prime}}(\Delta {S}^{{\prime}}) + (\Delta {X}^{{\prime}}){S}^{{\prime}}& =& \mu E - {X}^{{\prime}}{S}^{{\prime}}\end{array}$$
(17.28e)

which analogously to (17.7) can be expressed in terms of (ΔX, Δy, ΔS) by combining the last three equations into a single, regularized complementary slackness condition

$${X}^{{\prime}}(\Delta S) + (\Delta X){S}^{{\prime}} = \mu E - {X}^{{\prime}}S - X{S}^{{\prime}} + {X}^{{\prime}}{S}^{{\prime}}.$$
(17.29)

Supported by a theoretical complexity analysis that establishes the efficiency of infeasible primal-dual IPMs for the associated LP formulation [22], this method also shows encouraging computational performance in warm-starting perturbed problems and SDP relaxations of binary quadratic programs combined with cutting-plane schemes [23].

4.2 Modified Shifted Barriers in Cutting-Plane Schemes

Similar to Sect. 17.2.2, which extends the standard-form SDP (17.1) by additional inequalities to problems of the form (17.8), we now describe some more specific approaches and show how to modify the previous shifted-barrier methods for warm starts after the addition of cutting planes. Restricting this discussion to linear cuts of the form \(\mathcal{P}(X) \geq q\), one consequence of including violated inequalities is that the associated slacks \(w = \mathcal{P}(X) - q\) are negative and thus not within the interior of the nonnegative orthant as the underlying cone. Possible remedies include either simply initializing w to be strictly positive and accepting initial infeasibility of the cut inequality, or attempting to preserve information from the current iterate and, at least temporarily, shifting the underlying barrier so as to accept an initial point some of whose entries are negative.

First proposed for LP by Mitchell and Todd [51], an alternative approach to deal with the addition of violated cutting planes in an IPM framework is not to relax the nonnegativity of the associated slack w but to modify the system of inequality constraints by an additional variable z and solve the auxiliary phase-I-type problem

$$\min \ z\text{ s.t. }\mathcal{A}(X) = b,\ \mathcal{P}(X) - w + (\beta - \eta )z = q,\ X \succcurlyeq 0,\ (w,z) \geq 0$$
(17.30)

where β > 0 is a vector and \(\eta = \mathcal{P}({X}^{0}) - q\) the initial violation at some iterate X0, so that (X, w, z) = (X0, β, 1) is an interior feasible point that fully utilizes the previous iterate. Implementations and computational success with this approach are reported using the primal projective standard-form variant of Karmarkar’s algorithm within an LP cutting-plane method applied to matching problems [51], and using the dual affine method combined with a branch-and-bound code for linear ordering [49].
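The interior feasible point for (17.30) can be written down directly from the violated iterate; the following minimal sketch verifies the construction numerically (the function name and the default choice β = e are hypothetical):

```python
import numpy as np

def phase1_start(PX0, q, beta=None):
    """Build the data of the phase-I problem (17.30) from the values
    PX0 = P(X^0) of the cut operator at the current iterate X^0, so that
    (X, w, z) = (X^0, beta, 1) is interior feasible:
        P(X^0) - beta + (beta - eta)*1 = P(X^0) - eta = q."""
    eta = PX0 - q                     # initial violation (negative entries
                                      # correspond to violated cuts)
    if beta is None:
        beta = np.ones_like(q, dtype=float)   # any beta > 0 works
    w0, z0 = beta.copy(), 1.0
    # sanity check of feasibility for the modified constraint:
    assert np.allclose(PX0 - w0 + (beta - eta) * z0, q)
    return eta, beta, w0, z0
```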

A similar approach that is closely related to the original shifted-barrier methods [25, 60] generalizes the parametrized infeasible-start barrier problem by Freund [26]

$$\begin{array}{rcl} & \min \ & (C + \epsilon ({\mathcal{A}}^{T}(y) + {\mathcal{P}}^{T}(z) + S - C)) \bullet X - \mu \epsilon \log \det (X)\qquad \end{array}$$
(17.31a)
$$\begin{array}{rcl} & \text{ s.t. }& \mathcal{A}(X) = b,\ \mathcal{P}(X) - w = q + \epsilon (\mathcal{P}(X) - w - q)\ (X \succ 0).\qquad \end{array}$$
(17.31b)

This formulation is feasible for ε = 1 independent of the initial value of the slack w, which therefore can also be set to be strictly positive, and it reduces to the original problem as the parameter ε and the product εμ are decreased to zero. While we are unaware of any practical implementation, theoretical results for the LP case show that primal-dual path-following IPMs maintain an iteration complexity of O(nL log(1/ε)) or \(O(\sqrt{n}L\log (1/\epsilon ))\) for suitable infeasible or feasible starting points, respectively [26].

Finally, the following two approaches adjust strategies (17.26) and (17.27) for warm starts after the addition of cutting planes by relaxing the nonnegativity of the slack variable w associated with the violated cut. The primal-dual penalty approach [13]

$$\begin{array}{rcl} & & \min \quad C \bullet X + {d}^{T}\xi \\ & & \text{ s.t. }\quad \mathcal{A}(X) = b,\ \mathcal{P}(X) - w = q,\ u \geq w \geq -\xi ,\ \xi \geq 0,\ X \succcurlyeq 0 \\ & & \max \quad {b}^{T}y + {q}^{T}z - {u}^{T}\psi \\ & & \text{ s.t. }\quad {\mathcal{A}}^{T}(y) + {\mathcal{P}}^{T}(z) + S = C,\ d \geq z + \psi \geq 0,\ \psi \geq 0,\ S \succcurlyeq 0\end{array}$$

utilizes auxiliary variables ξ and ψ to relax the nonnegativity of w and its corresponding dual variable z, and works toward nonnegativity by adding these variables with penalties d and u to the two objectives. The alternative slack approach [21, 22]

$$\begin{array}{rcl} & & \min \quad C \bullet X \\ & & \text{ s.t. }\quad \mathcal{A}(X) = b,\ \mathcal{P}(X) - w = q,\ w - \xi = 0,\ \xi \geq 0,\ X \succcurlyeq 0\end{array}$$
(17.32a)
$$\begin{array}{rcl} & & \max \quad {b}^{T}y + {q}^{T}z \\ & & \text{ s.t. }\quad {\mathcal{A}}^{T}(y) + {\mathcal{P}}^{T}(z) + S = C,\ z - \psi = 0,\ \psi \geq 0,\ S \succcurlyeq 0\end{array}$$
(17.32b)

uses ξ and ψ slightly differently to regularize the complementarity condition (17.10e) analogously to (17.29). Some results with these approaches are reviewed in Sect. 17.5.

4.3 Restoration of Feasibility After Perturbations

The problem of restoring feasibility and related questions of sensitivity with respect to problem perturbations have led to additional results and techniques in the context of warm-starting IPMs. Although it is in principle sufficient to take one Newton step of unit length in the direction (ΔX, Δy, ΔS) obtained from system (17.7) to fully recover primal and dual feasibility, the linear approximation of the complementarity equation may cause the resulting iterates X + ΔX and S + ΔS to become badly centered, or to no longer be positive (semi)definite. For a pure Newton adjustment

$$\begin{array}{rcl} \mathcal{A}(\Delta X)& =& b -\mathcal{A}(X) \\ {\mathcal{A}}^{T}(\Delta y) + \Delta S& =& C -{\mathcal{A}}^{T}(y) - S \\ X(\Delta S) + (\Delta X)S& =& 0 \end{array}$$
(17.33c)

the analysis by Gondzio and Grothey [29] first provides theoretical bounds on the permissible perturbations so that all resulting infeasibilities can be absorbed by one full Newton step, allowing a primal-dual path-following IPM to continue from the new iterate while staying within some prescribed neighborhood of the central path. The applicability of their theoretical results is then demonstrated using an infeasible path-following method, which shows promising performance also in practice on several large-scale, specially structured LPs. The relevance of a sensitivity analysis is also highlighted in the unblocking warm-starting approach by the same authors [30].

A similar reoptimization strategy for the restoration of feasibility after problem perturbations is proposed and analyzed by Yildirim and Wright [70], who use primal and dual scaling matrices Σ and Λ for variants of a least-squares adjustment (LSA)

$$\begin{array}{rcl} {\min }_{\Delta X}\left \Vert \Sigma \Delta X\right \Vert & \text{ s.t. }\mathcal{A}(X + \Delta X) = b,\ X + \Delta X \succeq 0&\end{array}$$
(17.34a)
$$\begin{array}{rcl} {\min }_{\Delta y,\Delta S}\left \Vert \Lambda \Delta S\right \Vert & \text{ s.t. }{\mathcal{A}}^{T}(y + \Delta y) + (S + \Delta S) = C,\ S + \Delta S \succeq 0.&\end{array}$$
(17.34b)

Some specific choices for Σ and Λ include the identity E for a “plain” LSA, the inverses X^{−1} and S^{−1} for a “weighted” LSA, and \({X}^{-1/2}{S}^{1/2}\) and \({X}^{1/2}{S}^{-1/2}\) for a jointly weighted LSA, respectively. The proposed warm-starting strategy is again based on the idea of an iterate pool: all intermediate iterates of the initial problem are stored, and one of these points is selected as the starting point for the perturbed problem instance. Similar to the former approach, several theoretical bounds can be established to estimate the size of perturbations that can be absorbed in only one Newton step by points that are sufficiently interior. The applicability and computational promise of this approach is further demonstrated in a later paper [36].
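For the “plain” choice Σ = E and a perturbation of the right-hand side b only, the minimum-norm adjustment in (17.34a) has a closed form via the normal equations; the following minimal sketch illustrates the LP (vector) case and checks the cone constraint separately (the function name is hypothetical):

```python
import numpy as np

def plain_lsa(A, x, b_new):
    """Plain least-squares adjustment, cf. (17.34a), in the LP case:
    minimize ||dx|| subject to A(x + dx) = b_new.  The minimum-norm
    solution is dx = A'(AA')^{-1}(b_new - Ax); whether the adjusted
    point stays in the cone (here x + dx >= 0) is verified separately."""
    dx = A.T @ np.linalg.solve(A @ A.T, b_new - A @ x)
    x_adj = x + dx
    return x_adj, bool(np.all(x_adj >= 0))
```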

5 Implementation and Computational Results

The vast majority of the papers cited in this chapter, and many of the references therein, give computational results that demonstrate the applicability of the various methods and formulations included in our preceding discussion. To exemplify the recently achieved progress in using IPMs for cutting-plane schemes and warm-starting, this section first describes a specific implementation developed for one of our own papers [23], which combines a primal-dual interior-point cutting-plane method as described in Sect. 17.3.3 with the slack warm-starting strategy (17.32) and shows its computational success in solving SDP relaxations of maximum cut and the single-row facility layout problem in less time than alternative methods. In addition, we also collect some specific results for warm-starting perturbations of the Netlib LP test problems as reported in four recent papers on warm-starting IPMs [13, 22, 30, 36].

5.1 An Interior-Point Cutting-Plane Implementation

The new interior-point cutting-plane method outlined in Algorithm 4 combines several of the previous approaches and ideas for solving SDP relaxations of the form (17.8) by integrating a cutting-plane scheme into the infeasible primal-dual path-following IPM of Algorithm 1. In particular, this method does not require solving successive relaxations to optimality but adds and removes cuts dynamically at intermediate iterates, using some of the feasibility indicators (17.13) of Sect. 17.2.2 applied to the cut violation \(v = q -\mathcal{P}(X)\), together with the shifted-barrier warm-starting technique (17.32) for the initialization of the associated primal-dual slacks w and z. The resulting algorithm basically works like a regular IPM that takes steps in different spaces in different iterations k, based on the set of inequalities I_k ⊆ I currently added to the problem.

In Step 1, indicator functions are applied to the cut violations \({v}_{i} = {q}_{i} - {P}_{i} \bullet X\) to determine which inequalities remain inactive (v_i < 0), tend to become active (v_i = 0), or are violated (v_i > 0). New cuts are added aggressively, if either the variable or the Tapia indicator shows a cut violation, and removed more conservatively, only if their violation is eliminated and their redundancy is suggested by both the Tapia and the primal-dual indicator. Although this cautious strategy may result in keeping too many inequalities in the problem and still offers room for further improvement, it seems to guarantee that no single inequality is repeatedly removed and re-added, which otherwise would have to be prevented by permanently enforcing an inequality after some finite number of swaps. Unlike in the discussion in Sect. 17.2.2, it should also be observed that the indicators are not directly applied to the primal-dual variables w and z, which are converted into free variables by introducing the auxiliary slacks ξ and ψ as shown in (17.32) and are thus handled using the techniques of Sect. 17.2.3.

The initialization of these new variables in Step 2 is motivated by the theoretical analysis of, and computational experience with, the slack warm-starting approach: the seemingly redundant slacks can mitigate the impact of violated cuts and are sufficiently decoupled from the other constraints to allow much smaller initial values than would otherwise be necessary when initializing the original variables (w, z) ≫ 0. The particular values in Algorithm 4 are chosen so as to preserve feasibility of all original constraints and to continue the algorithm, for the computation of the new search direction in Step 3, with the same barrier parameter value as obtained from the previous iterate. The last three steps are also essentially identical to those in Algorithm 1, and without any other changes the current implementation simply removes the associated variables (w_i, z_i, ξ_i, ψ_i) whenever a previously added inequality i ∈ I is dropped.

Finally, for some increased flexibility when testing the method, the frequency parameter κ in Algorithm 4 can be chosen so as to add new cuts before every Newton iteration (if κ = 1) or to allow some intermediate steps in which the current set of cuts remains unchanged (if κ > 1). In practice, the best computational results are typically achieved for κ = 2, i.e., when updating the set I every other iteration. Thus, different from the approach by Helmberg and Rendl [34], which is described in more detail in Sect. 17.3.3 and adds cuts only once feasibility is fully established, this new implementation also exploits the computational efficiency and stability of infeasible IPMs, which the slack warm-starting approach uses advantageously to accommodate violated inequalities more quickly and to handle the resulting infeasible iterates without any additional correction steps or the need to restart.

5.2 Results on Maximum Cut and Facility Layout

Based on an implementation of Algorithm 4 within the primal-dual path-following IPM of the software package SDPT3 [65], Table 17.1 shows representative results for solving SDP relaxations of max-cut instances (17.19) using triangle inequalities as cuts:

$$\begin{array}{rcl} \min & C \bullet X & \text{ s.t. }\ \mathrm{diag}(X) = e,\ X \succcurlyeq 0\end{array}$$
(17.35a)
$$X_{ij} + X_{jk} + X_{ik} \geq -1\ \text{ for } i < j < k$$
(17.35b)
$$X_{ij} - X_{jk} - X_{ik} \geq -1\ \text{ for distinct } (i,j,k).$$
(17.35c)

All test instances were generated using RUDY [61] and include small and medium-sized random graphs of varying density, as well as 2D and 3D spin glasses with Gaussian or uniformly distributed ± 1 bonds. For each problem type and size the results are averaged over 10 instances, and the CPU times (in seconds) are used to compute a competitiveness ratio γ of the dynamic detection of cuts in Algorithm 4 over the direct solution of the final relaxation containing only the static subset of cuts that remain active at optimality and whose dual variables and primal-dual indicators are bounded away from zero. Inequalities were checked every other iteration (κ = 2 in Algorithm 4), and the number of new cuts was restricted to at most 200 at a time.
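For concreteness, a simple separation routine for the triangle inequalities (17.35b) and (17.35c) could look as follows; this self-contained numpy sketch is our own illustration, not the implementation of [23], and its cubic enumeration of all triples is practical only for small and medium-sized instances.

```python
import numpy as np
from itertools import combinations

def separate_triangle_cuts(X, max_cuts=200, tol=1e-6):
    """Return the (at most) max_cuts most violated triangle inequalities.

    X is the current primal iterate as a symmetric numpy matrix."""
    cuts = []                                # entries (violation, (i, j, k), type)
    n = X.shape[0]
    for i, j, k in combinations(range(n), 3):
        a, b, c = X[i, j], X[j, k], X[i, k]
        # the four triangle inequalities on (X_ij, X_jk, X_ik), cf. (17.35b/c)
        for t, lhs in enumerate((a + b + c, a - b - c, -a + b - c, -a - b + c)):
            violation = -1.0 - lhs           # positive iff lhs < -1
            if violation > tol:
                cuts.append((violation, (i, j, k), t))
    cuts.sort(reverse=True)                  # most violated first
    return cuts[:max_cuts]
```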

The clear tendency to find relevant inequalities and an optimal solution faster than the “static-cut” approach, which essentially assumes an oracle and solves only the final yet immediately much larger relaxation, can also be confirmed on SDP relaxations of the single-row facility layout problem (SRFLP), which is further highlighted by Anjos and Liers in Chap. ??. In its standard formulation, each SRFLP instance consists of n one-dimensional facilities with given positive lengths \(\ell_i\), i = 1, …, n, and pairwise weights \(c_{ij}\), with the objective to arrange the facilities so as to minimize the total weighted sum of their center-to-center distances. In particular, if all facilities have the same length, then SRFLP becomes an instance of the linear ordering problem, which is itself a special case of the quadratic assignment problem, also showing that SRFLP is NP-hard in general. Formulated as a constrained binary quadratic program, the following SDP relaxation by Anjos and Yen [8] again uses the triangle inequalities as cuts:

$$\begin{array}{rcl} \min & & \sum\limits_{i<j<k}\left (c_{ik}\ell_{j}X_{ij,jk} - c_{jk}\ell_{i}X_{ij,ik} - c_{ij}\ell_{k}X_{ik,jk}\right ) \\ \text{s.t.} & & \sum\limits_{k\neq i,j}\left (X_{ij,jk} - X_{ij,ik} - X_{ik,jk}\right ) = -(n - 2),\ \mathrm{diag}(X) = e,\ X \succcurlyeq 0, \\ & & X_{i_1 i_2,\, j_1 j_2} + X_{i_1 i_2,\, k_1 k_2} + X_{j_1 j_2,\, k_1 k_2} \geq -1\ \text{ for ordered } (i_1 i_2, j_1 j_2, k_1 k_2), \\ & & X_{i_1 i_2,\, j_1 j_2} - X_{i_1 i_2,\, k_1 k_2} - X_{j_1 j_2,\, k_1 k_2} \geq -1\ \text{ for distinct } (i_1 i_2, j_1 j_2, k_1 k_2).\end{array}$$
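To make the underlying combinatorial objective concrete, the small sketch below (again our own illustration, not taken from [8] or [23]) evaluates the SRFLP cost of a given arrangement.

```python
import numpy as np

def srflp_cost(pi, ell, c):
    """Total weighted center-to-center distance of the arrangement pi,
    where pi lists the facility indices from left to right, ell gives the
    facility lengths, and c the pairwise weights."""
    # left edge of each facility in the order pi, then its center point
    starts = np.concatenate(([0.0], np.cumsum([ell[p] for p in pi])[:-1]))
    centers = {p: s + ell[p] / 2.0 for p, s in zip(pi, starts)}
    n = len(pi)
    return sum(c[i][j] * abs(centers[i] - centers[j])
               for i in range(n) for j in range(i + 1, n))

# e.g., three facilities with lengths (1, 2, 3) and unit weights:
# srflp_cost([2, 0, 1], [1, 2, 3], [[0, 1, 1], [1, 0, 1], [1, 1, 0]])
```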

Using several small and medium-sized instances from the SRFLP library [40], the representative results in Table 17.2 are obtained and interpreted analogously to those in Table 17.1. The full list of results and its discussion are contained in the original paper [23].

Table 17.1 Representative results for RUDY-generated max-cut instances

5.3 Warm-Starting Results After Perturbations

The third set of results, in Table 17.3, is collected from three warm-starting approaches [13, 21, 30] that report numerical results specifically for LP instances of the Netlib suite after data perturbations. Although only those instances are included for which results were available for all three approaches, the full list looks very similar [22]. Results with the methods in Sect. 17.4.3 on the same instances have also been prepared by John and Yildirim [36], but they are presented in a different form and thus not repeated here.

Table 17.2 Representative results for single-row facility layout instances

For each method, the results in Table 17.3 show the average number of iterations required to solve randomly perturbed instances of every original Netlib problem, first using a warm start from the known optimal solution of the unperturbed problem instance, and then again from a cold start using the default initial point specified by the chosen optimization solver. Although all methods use the same perturbation scheme, which is explained in each of the relevant papers, the observed differences between their results are more likely a consequence of the randomly perturbed instances and the specific implementations of the underlying algorithms than an indication that any single approach performs better than the other two. Hence, while these results do not identify a “winner” among the three approaches, they demonstrate that warm-starting IPMs is possible in general and offers reductions in the number of iterations of over 50% on average.

6 Conclusion

We reviewed some recent progress in using interior-point algorithms in the context of cutting-plane methods and semidefinite relaxations of combinatorial optimization problems. Today, IPMs are well understood in theory and already widely used in practice, but their further improvement, especially for large-scale and specially-structured problems, remains a central topic of active research. The second-order nature of IPMs guarantees their exceptional overall stability but often restricts the size of problems that can be handled in practice. This challenge gains relevance as LP relaxations are nowadays often replaced by SDP relaxations to provide tighter bounds for many important applications. In particular, in the last few years analytic-center cutting-plane methods for convex feasibility problems have been extended from sets with polyhedral descriptions to sets characterized by positive semidefiniteness constraints and linear matrix inequalities. Moreover, the well-studied approach of tightening successive descriptions using linear inequalities has been enhanced by second-order and semidefinite cuts, and further generalizations to conic optimization remain under current investigation. Alternative approaches based on primal-dual methods exploit the stability of IPMs for the dynamic addition and removal of cuts at intermediate iterates and are of particular interest when combined with indicator functions and enhanced strategies for warm-starting after the addition of cuts or after data perturbations. Computational results demonstrate the potential of these approaches.

Table 17.3 Representative results for warm-starting LP perturbations

We conclude with some ideas to promote further progress in this area. One promising avenue with only few current contributions is the question whether it is possible to define good second-order cone relaxations for binary quadratic programs that compromise between the solvability of LP relaxations and the tighter bounds of SDP relaxations, together with the development of new or improved solution strategies using more efficient IPMs or alternative methods. Moreover, combined interior-point/simplex schemes, which seem to perform very well in the case of LP, are currently unavailable for SDP unless significant progress can be made in designing practical simplex-type approaches for SDP and conic optimization in general. Another possibility, the development of better warm-starting approaches for SDP, is prepared in this chapter, but it is important to reemphasize that these formulations are extended from LP and that their computational impact currently remains unexplored for the addition of nonlinear cuts and for general perturbations of SDPs. Finally, the development of a meaningful suite of reoptimization test problems and a common testing environment for the different approaches remains another urgent necessity to gain further insight and to stimulate and accelerate new ideas and progress.