Branch-and-Sandwich: a deterministic global optimization algorithm for optimistic bilevel programming problems. Part I: Theoretical development
Abstract
We present a global optimization algorithm, Branch-and-Sandwich, for optimistic bilevel programming problems that satisfy a regularity condition in the inner problem. The functions involved are assumed to be nonconvex and twice continuously differentiable. The proposed approach can be interpreted as the exploration of two solution spaces (corresponding to the inner and the outer problems) using a single branch-and-bound tree. A novel branching scheme is developed such that classical branch-and-bound is applied to both spaces without violating the hierarchy in the decisions and the requirement for (global) optimality in the inner problem. To achieve this, the well-known features of branch-and-bound algorithms are customized appropriately. For instance, two pairs of lower and upper bounds are computed: one for the outer optimal objective value and the other for the inner value function. The proposed bounding problems do not grow in size during the algorithm and are obtained from the corresponding problems at the parent node.
Keywords
Bilevel programming · Nonconvex inner problem · Branch and bound

1 Introduction
Real-life decisions are often made hierarchically. For instance, the allocation of resources by a federal government is a multi-level allocation process where one level of government distributes resources to several lower levels of government [20]. The overall objective of such a process is to achieve the goals of the federal government. However, each lower level of government will react independently according to what best serves its own interests; these (rational) reactions may be gainful or detrimental to the overall objective. Hence, decision making within such a framework requires careful analysis in order to ensure the best possible outcome for all the decision makers.
The concept of hierarchical decision making, in the presence of two decision makers, namely one leader and one follower, dates back to 1952, when von Stackelberg introduced the basic leader/follower strategy in a duopoly setting. Since then, hierarchical decision making has found application in numerous practical problems across various disciplines, such as economics, management, agriculture, transportation and engineering [19, 39, 48, 53, 65]. Of particular interest are hierarchical systems in parameter estimation [59, 66], environmental policies in biofuel production [10] and chemical equilibria [21, 22].
In this work, we employ the well-known mathematical formulation of a two-level decision-making process, known as the bilevel programming problem, and propose a new solution strategy for finding global solution(s). Special cases of this problem have been studied extensively and many algorithms have been proposed in the literature [9, 33, 42]. However, the general nonconvex form is a very challenging problem for which, to the best of our knowledge, only two algorithms exist: the first method for general (nonconvex) bilevel problems, developed by Mitsos et al. [60], and the approximation method introduced by Tsoukalas et al. [74].
Severe implications stem from the hierarchical structure of bilevel problems, such as nonconvex, disconnected or even empty feasible regions, especially in the presence of outer constraints and nonconvex inner problems. In this paper, we propose a deterministic global optimization algorithm, Branch-and-Sandwich (B&S), based on the refinement of two sets of convergent lower and upper bounds: valid lower and upper bounds are computed simultaneously for the outer and inner objective values. Their convergence to the corresponding optimal values is proved within a branch-and-bound framework.
To this end, we first introduce a novel branching scheme such that the hierarchy in the decisions is maintained and the requirement for (global) optimality in the inner problem is not violated. As for the bounding, two pairs of lower and upper bounds are computed: one for the outer optimal objective value and the other for the inner value function. KKT-based relaxations are used to construct the inner upper bounding problem and the outer lower bounding problem. The inner upper bound serves as a constant bound cut in the outer lower bounding problem. These two problems result in convergent bounds on the inner and the outer optimal objective values; they are both nonconvex and must be solved globally, i.e., with classical global optimization techniques,^{1} e.g., [37, 44, 63]. Well-known convexification techniques are employed to construct a convex inner lower bounding problem whose value is used in the selection operation and in fathoming. The outer upper bounding problem is motivated by Mitsos et al. [60], but flexibility is added in that convex relaxations of the original inner problem, i.e., problem (2), over refined subsets of the inner space can be solved. The proposed bounding problems do not grow in size during the course of the algorithm and are obtained from the corresponding problems of the parent node.
The paper is organized as follows: Sect. 2 is devoted to background theory and a discussion of the challenges that need to be addressed. Our bounding ideas are presented in Sect. 3 without considering any branching. Connections to the relevant semi-infinite programming literature are also pointed out in this section. The branching scheme is introduced in Sect. 4 and the proposed bounding problems are modified to allow branching. The overall algorithm is presented in Sect. 5 and conclusions are given in Sect. 6.
2 Background theory
2.1 Definitions, notations and properties
To start with, the terms relaxation and restriction are briefly explained below.
Definition 1
A relaxation (restriction) of a minimization problem is another minimization problem with (i) an objective function that is always lower than or equal to (greater than or equal to) the original objective function,^{2} and (ii) a feasible region that is a superset (subset) of the original problem's feasible region.
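To make the definition concrete, here is a toy numeric check in Python; the problem and the numbers are ours, purely illustrative, with a grid search standing in for exact minimization. A relaxation combines a lower objective with a larger feasible set, a restriction shrinks the feasible set, and the two optimal values bracket the original one.

```python
def grid(lo, hi, n=201):
    # Uniform sample of [lo, hi] including both endpoints.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

# Original toy problem: min x^2 over [1, 2]; optimal value 1.
original = min(x * x for x in grid(1.0, 2.0))

# Relaxation: objective everywhere lower (x^2 - 0.5) and a larger
# feasible set ([0, 2] is a superset of [1, 2]).
relaxation = min(x * x - 0.5 for x in grid(0.0, 2.0))

# Restriction: same objective over a smaller feasible set ([1.5, 2]).
restriction = min(x * x for x in grid(1.5, 2.0))

# Def. 1 guarantees: relaxation <= original <= restriction.
```

Here the relaxation value is -0.5 and the restriction value is 2.25, sandwiching the original optimal value 1 as the definition requires.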
The limitation of the optimistic approach is that it violates the basic assumption of non-cooperation. However, there are several applications where limited cooperation is permitted and \(\varepsilon \)-optimal solutions, using the optimistic approach, are appropriate, such as applications in the management of multi-divisional firms [6]. On the other hand, there exist instances for which cooperation is ruled out and the use of the pessimistic approach is more realistic, such as applications in production planning [8]. Alternatively, if one wishes to avoid using either approach, a regularization approach can be employed that bypasses the difficulties of non-uniquely determined lower-level solutions [27]. Tackling the pessimistic or regularized approaches is outside the scope of this work. However, we briefly mention that the extension of B&S to the pessimistic approach does not appear to be straightforward for two main reasons: (i) the use of the equivalent optimal value reformulation is no longer possible in the pessimistic approach; (ii) the proposed branching scheme, which entails branching on \(y\), may have serious implications for the worst-case (\(\max \) with respect to \(y\)) outer objective and needs to be thoroughly investigated. On the other hand, it appears that B&S can trivially be applied to the regularized approach.
Finally, problem (8) is related to many well-known optimization problems, such as multiobjective optimization problems [53], max-min problems [4, 34, 35, 73, 74, 78], MPECs [25, 62], and (usual or generalized) semi-infinite optimization problems [14, 15, 38, 61, 67, 74]. Many of these problems are known to be \(\textit{NP}\)-hard, which implies that the bilevel problem is also \(\textit{NP}\)-hard [32].
2.2 Optimistic bilevel programming problems
 A1:
All the functions involved, i.e., \(F(\cdot ,\cdot )\), \(G(\cdot ,\cdot )\), \(f(\cdot ,\cdot )\), \(g(\cdot ,\cdot )\), are lower semicontinuous (l.s.c.) on \(X\times Y\); and,
 A2:
the set \(\{(x,y) \in X \times Y \mid f(x,y) \le w(x)\}\) is compact.
Definition 2
Remark 1
For an \(\varepsilon \)-optimal pair, as defined in Def. 2, we also use the notation \(f^*\) to refer to the corresponding inner optimal objective value, namely, \(f^* = w(x^*)\) in (15).
Remark 2
Notice that \(\varepsilon _f\) is an optimality tolerance from the perspective of the inner problem. However, it can also be interpreted as a feasibility tolerance (for one of the constraints) from the perspective of the overall problem (3).
Remark 3
In this paper, our aim is to compute \(\varepsilon \)-optimal solutions as defined in Def. 2. This implies that we apply \(\varepsilon _f\)-optimality in the inner problem throughout the paper.
The following assumptions are also made throughout the paper.
Assumption 1
All the functions involved, i.e., \(F(\cdot ,\cdot )\), \(G(\cdot ,\cdot )\), \(f(\cdot ,\cdot )\), \(g(\cdot ,\cdot )\), are continuous on \(X\times Y\).
Assumption 2
The sets \(X\) and \(Y\) are compact.
Assumption 3
All the functions involved, i.e., \(F(\cdot ,\cdot )\), \(G(\cdot ,\cdot )\), \(f(\cdot ,\cdot )\), \(g(\cdot ,\cdot )\), are twice continuously differentiable on \(X\times Y\).
Assumption 4
A constraint qualification holds for the inner problem (2) for all values of \(x\).
Remark 4
With regard to Assumption 4, if, for instance, the MFCQ is assumed (without reference to a point), then we assume that it is valid at each local minimizer of problem (2). Then, \(y \in O(x)\) implies that there exist vectors \(\mu ,\lambda ,\nu \) such that \((x,y,\mu ,\lambda ,\nu ) \in {\varOmega }_\mathrm{KKT}\) [28].
Remark 5
Assumption 4 is a mild but essential assumption. Violation of this assumption (a constraint qualification may not be satisfied for all values of \(x\)) may result in an undesirable outcome, such as an infeasible outer lower bounding problem (\(\mathrm{LB}_{0}\)) (cf. Sect. 3) whilst the original bilevel problem possesses an optimal solution [30]. The implications of lack of regularity are discussed in detail in [46].
In the present work and the companion paper [47], all test problems considered satisfy Assumption 4. In particular, either the Abadie or the linear/concave constraint qualification holds [12, 58].
2.3 Challenges
There are several important challenges one needs to overcome when considering bilevel optimization problems with a nonconvex inner problem. One challenge is related to the difficulty of creating convergent lower bounding problems, since deriving an equivalent one-level formulation, as is possible with convex inner problems, is no longer feasible. The second important challenge stems from the need to identify correctly the global solution(s) of a nonconvex inner problem and thus derive valid bounds on its optimal objective value. For instance, solving a nonconvex inner problem locally, or partitioning the inner space and treating each subdomain independently of the others, may lead to invalid solutions/bounds. An analytical discussion of these issues can be found in [57]. Here, we consider an example from [58] to visualize these challenges.
Example 1
To deal with the challenge of generating a convergent overall lower bounding problem, we introduce a constant upper bound cut on the inner value function \(w(x)\). The right-hand side of the cut is tightened when appropriate, resulting in only one cut each time the lower bounding problem is solved. Such a formulation results in a significant simplification compared to the lower bounding problem presented in [60].
3 Bounding scheme: initial node
The proposed bounding scheme applies to formulation (3) and is based on the fact that a relaxed inner problem yields a restriction of the overall problem (3) and a restricted inner problem yields a relaxation of the overall problem. In view of the above, we create two auxiliary inner problems, i.e., the auxiliary relaxed and restricted inner problems, whose objective values are always lower and upper bounds, respectively, on the inner optimal value for the domain of interest. We then apply appropriate techniques to solve these problems efficiently. In particular, for the auxiliary relaxed inner problem we apply classical convexification techniques [36, 71] to construct and solve a convex problem. For the auxiliary restricted inner problem, we apply generalized semi-infinite programming techniques [41, 69] to reduce it to a finite problem. The resulting problem is nonconvex; hence, we can either solve a convex relaxation or solve it directly to global optimality.
The resulting bounds on the inner optimal objective value are employed in the classical sense to prune branches from the branchandbound tree. Furthermore, the inner upper bound is also used in the bounding scheme for the outer problem. Specifically, it is used to construct a constant upper bound cut to augment the overall lower bounding problem.
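The sandwiching idea can be visualized with a small grid-based sketch. The inner objective below is a hypothetical toy example of ours, and the brute-force grid search merely stands in for the convexification and semi-infinite programming techniques cited above: a single constant lower bound on the inner value function comes from minimizing over the whole domain, while a single constant upper bound comes from the max-min value.

```python
import math

def f(x, y):
    # Hypothetical toy inner objective, for illustration only.
    return x * x * y + math.sin(y)

def grid(lo, hi, n=201):
    # Uniform sample of [lo, hi] including both endpoints.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

X, Y = grid(0.0, 1.0), grid(1.0, 6.0)

# Inner value function w(x) = min_y f(x, y), sampled on a grid.
w = [min(f(x, y) for y in Y) for x in X]

# Constant lower bound (relaxed inner problem over all of X x Y) and
# constant upper bound (max-min over X) sandwich w(x) everywhere.
inner_lb = min(w)
inner_ub = max(w)
```

By construction, inner_lb <= w(x) <= inner_ub for every sampled x; in B&S these two constants are what the auxiliary relaxed and restricted inner problems deliver.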
Remark 6
Recall that in this section, we develop our bounding ideas over the whole \(X\times Y\) space; hence, we use the zero subscript in the problem names to denote that the calculations are done over the whole \(X\times Y\) domain, i.e., at the root node. The proposed bounding problems of this section are employed in Step 1 of the B&S algorithm (cf. Sect. 5). To allow branching, modifications of these problems over subdomains of \(X \times Y\) are discussed in Sect. 4, after our branching scheme is introduced.
3.1 Inner problem bounding scheme
Remark 7
We support our choice of minimizing with respect to both the inner and outer variables in (\(\mathrm{ILB}_{0}\)) by showing that fixing the outer variables and minimizing with respect to the inner variables would yield an invalid lower bound. Let \(f(x,y) = x^2 y + \sin y\) for \(x \in [0,1]\) and \(y \in [1,6]\). The optimal point is \((x^*,y^*)=(0,4.7124)\) with (inner) objective value \(f^* = -1\). As Fig. 2a illustrates, if we fix \(x\), e.g., \(\bar{x}=1\), and compute the minimum of the function \(f(\bar{x},y)\), the resulting optimal objective value is \(f'= 1.8415\) at \(y=1\), which is not a valid lower bound for all \(x \in [0,1]\).
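This computation can be reproduced numerically; the grid search below is illustrative only (not part of B&S) and recovers the values quoted in the remark.

```python
import math

def f(x, y):
    # Inner objective from the remark: f(x, y) = x^2 * y + sin(y).
    return x * x * y + math.sin(y)

def grid(lo, hi, n=401):
    # Uniform sample of [lo, hi] including both endpoints.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

X, Y = grid(0.0, 1.0), grid(1.0, 6.0)

# Valid inner lower bound: minimize jointly over x and y, as in (ILB_0).
f_joint = min(f(x, y) for x in X for y in Y)   # close to f* = -1

# Invalid alternative: fix x = 1 and minimize over y only.
f_fixed = min(f(1.0, y) for y in Y)            # = 1 + sin(1), about 1.8415

# w(0) = min_y sin(y) = -1 lies below f_fixed, so f_fixed is not a valid
# lower bound on the inner value function for all x in [0, 1].
w0 = min(f(0.0, y) for y in Y)
```

The check confirms the remark: the joint minimum (about -1) underestimates w(x) everywhere, while the fixed-x value 1.8415 exceeds w(0) = -1.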
Remark 8
Remark 9
The choice of maximizing with respect to the \(x\) variables in (\(\mathrm{IUB}_{0}\)) is supported by showing that using a feasible point \((\bar{x},\bar{y}) \in {\varOmega }\), such that \(\bar{y} \in O(\bar{x})\), would not yield a valid inner upper bound. Let \(F(x,y)=x^2+y\) and \(f(x,y) =\frac{y^2}{x}\), where \(x \in [1,4]\) and \(y \in [2,4]\). The optimal point is \((x^*,y^*)=(1,2)\) with (inner) objective value \(f^* = 4\). As Fig. 2b demonstrates, if we fix \(x\), e.g., \(\bar{x} = 2\), implying \(\bar{y}=2\), the resulting objective value is \(f'= 2\), which is not a valid inner upper bound.
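Again, a small grid-based check (illustrative only, not part of the algorithm) recovers the quoted values and confirms that the fixed-point value underestimates the max-min inner upper bound.

```python
def f(x, y):
    # Inner objective from the remark: f(x, y) = y^2 / x.
    return y * y / x

def grid(lo, hi, n=301):
    # Uniform sample of [lo, hi] including both endpoints.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

X, Y = grid(1.0, 4.0), grid(2.0, 4.0)

def w(x):
    # Inner value function: for this f, the minimum over y is at y = 2.
    return min(f(x, y) for y in Y)

# Valid inner upper bound: the max-min value over X, as in (IUB_0).
f_ub = max(w(x) for x in X)      # attained at x = 1, equals 4

# Invalid alternative: the inner value at the fixed feasible point
# (x, y) = (2, 2) considered in the remark.
f_fixed = f(2.0, 2.0)            # equals 2
```

Since w(1) = 4 > 2, the fixed-point value 2 fails to bound the inner value function from above for all x, exactly as the remark argues.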
3.2 Overall problem bounding scheme
The issue of a potentially infeasible restricted approximate problem has also been encountered when applying convexification/approximation techniques to the nonconvex lower-level problem of a semi-infinite programming (SIP) problem [14, 15, 38, 56, 61]. In these works, the existence of SIP-Slater points (cf. [56, Def. A.1]) is assumed in order for the approximate problem to be guaranteed to provide a feasible point of the original SIP problem after a suitably refined subdivision of the inner space.
Although subdivision is required in the existing approaches for the approximate lower-level problem to converge to the actual problem, no branching with respect to \(y\), as the lower-level variable, is permitted. Hence, the whole inner space is always considered and no (inner) subregion can be fathomed. As a result, a growth in problem complexity with the number of subdivisions is unavoidable.
In our approach, it is possible to consider only some subregions of \(Y\) and eliminate others via fathoming. This is discussed in the next section.
4 Branching scheme and bounding on subdomains
Employing the bounding scheme discussed in Sect. 3, we develop a branch-and-bound algorithm that yields valid bounds for both the inner problem (2) and the overall problem (3) over successively refined partitions of the \(X\times Y\) space. Necessarily, the branching framework for bilevel problems must differ from that for single-level problems due to their hierarchical structure. In particular, it has been argued that branching on the inner variable may yield an invalid solution [57, 60, 72]. This is due to the sequential decision making underlying bilevel optimization, which implies that the leader commits first and then the whole space \(Y\) must be considered to ensure global optimality in the inner problem [57, 72]. However, Mitsos et al. [60] support branching on \(y\) by introducing dummy variables \(z\) in place of \(y\) in the inner problem. Thus, branching is permitted on \(y\), the outer variable, but not on \(z\), the inner variable.
In this work, we pursue a different strategy in order to permit branching on \(y\) without the need to introduce artificial inner variables. This is achieved by branching on the inner variable as normal, namely making no distinction between the inner and outer variables during branching, while at the same time considering the “whole” \(Y\) via appropriate management of nodes (cf. Sect. 4.1 on list management). To this end, we introduce a novel branching scheme that allows exhaustive partitioning of the domain \(X\times Y\) for general bilevel problems. The main idea is to maintain several (sub)lists of nodes with special properties in addition to the classical universal list of unfathomed nodes \({\fancyscript{L}}\). These lists allow us to examine the “whole” \(Y\) for different subsets of \(X\) despite \(Y\) also being partitioned.
In addition to the management of the lists of nodes, most properties of a classical branch-and-bound method are affected by allowing partitioning of the inner space. In the forthcoming subsections, we develop the key steps of the B&S algorithm and show how these ensure that (i) the subdivision process is exhaustive, (ii) the proposed overall lower bounding problem is monotonically improving, (iii) the node fathoming rules discard redundant regions safely, and (iv) the incumbent solution corresponds to a feasible point of (1).
4.1 List management
The crux of the proposed branching scheme is the use of additional lists of nodes over and above the classical list of open nodes \({\fancyscript{L}}\). By “open” nodes we refer to nodes that can be explored further. In the context of our solution method, open nodes in \({\fancyscript{L}}\) correspond to the outer (overall) problem. Nodes that are removed from this list because they are known not to contain the global solution of the bilevel problem are either fully fathomed or outer fathomed within the B&S algorithmic framework. In the former case, these nodes need not be explored further in any space, outer or inner; hence, they are deleted from \({\fancyscript{L}}\). In the latter case, the nodes are no longer needed for the outer problem but cannot yet be discarded fully from the branch-and-bound tree, because they may contain a globally optimal solution of the inner problem for a subdomain of \(X\) that has not yet been fathomed. For those nodes that are discarded from the overall problem but remain open with respect to the inner problem, we maintain another list of nodes, the so-called list of inner open nodes \({\fancyscript{L}}_\mathrm{In}\), which contains nodes that may be explored further with respect to the inner space only. We also refer to list \({\fancyscript{L}}_\mathrm{In}\) as the list of outer-fathomed nodes, implying that these nodes have been deleted from \({\fancyscript{L}}\). The rules used to determine which nodes are fully fathomed or outer fathomed are stated in Sect. 4.6.
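The following minimal Python sketch illustrates this bookkeeping. The class and function names are ours, not the paper's notation, and a real implementation would also store bounds and subdomains on each node; the point is only the movement of nodes between the two core lists.

```python
# Illustrative sketch: nodes move from the open list L to the inner-open
# list L_In when outer-fathomed, and disappear entirely when fully
# fathomed. Names are hypothetical, not from the paper's pseudocode.

class Node:
    def __init__(self, node_id):
        self.id = node_id

open_nodes = {}        # list L: open w.r.t. both outer and inner problems
inner_open_nodes = {}  # list L_In: open w.r.t. the inner problem only

def add_node(node):
    open_nodes[node.id] = node

def outer_fathom(node_id):
    # Node no longer needed for the outer problem, but it may still hold
    # the inner global solution for some unfathomed subdomain of X.
    inner_open_nodes[node_id] = open_nodes.pop(node_id)

def fully_fathom(node_id):
    # Node discarded from both spaces.
    open_nodes.pop(node_id, None)
    inner_open_nodes.pop(node_id, None)

def invariant_holds():
    # The invariant of Remark 10: L and L_In never share nodes.
    return not (open_nodes.keys() & inner_open_nodes.keys())
```

Outer fathoming here is a move, not a deletion, which is exactly why the two lists can never intersect.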
Remark 10
A node that has not yet been fully fathomed belongs either to list \({\fancyscript{L}}\) or to list \({\fancyscript{L}}_\mathrm{In}\). Lists \({\fancyscript{L}}\) and \({\fancyscript{L}}_\mathrm{In}\) never have common nodes, i.e., \({{\fancyscript{L}}} \cap {\fancyscript{L}}_\mathrm{In}=\emptyset \).
Definition 3
Independent lists appearing in each partitioning example shown in Fig. 3
\({{\fancyscript{L}}}^{1}\)  \({{\fancyscript{L}}}^{2}\)  \({{\fancyscript{L}}}^{3}\)  

Fig. 3a  \(\{{{\fancyscript{L}}}^1_{1}, {{\fancyscript{L}}}^1_{2} \}\)  –  – 
Fig. 3b  \(\{{{\fancyscript{L}}}^1_{1}\}\)  \(\{{{\fancyscript{L}}}^2_{1}\}\)  – 
Fig. 3c  \(\{{{\fancyscript{L}}}^1_{1} , {{\fancyscript{L}}}^1_{2}\}\)  \(\{ {{\fancyscript{L}}}^2_{1}\}\)  – 
Fig. 3d  \(\{{{\fancyscript{L}}}^1_{1}\}\)  \(\{ {{\fancyscript{L}}}^2_{1}\}\)  \(\{{{\fancyscript{L}}}^3_{1}\}\) 
Sublists appearing in each partitioning example shown in Fig. 3
\({{\fancyscript{X}}}_1\)  \({{\fancyscript{X}}}_2\)  \({{\fancyscript{X}}}_3\)  

Fig. 3a  \({{\fancyscript{L}}}^1_{1} = \{1,2\}, {{\fancyscript{L}}}^1_{2} = \{2,3,4\}\)  –  – 
Fig. 3b  \({{\fancyscript{L}}}^1_{1} = \{1,5\}\)  \({{\fancyscript{L}}}^2_{1} = \{3,4,6\}\)  – 
Fig. 3c  \({{\fancyscript{L}}}^1_{1} = \{1,7,8\}, {{\fancyscript{L}}}^1_{2} = \{1,9\}\)  \({{\fancyscript{L}}}^2_{1} = \{3,4,6\}\)  – 
Fig. 3d  \({{\fancyscript{L}}}^1_{1} = \{7,8,10,11\}\)  \({{\fancyscript{L}}}^2_{1} = \{3,4,6\}\)  \({{\fancyscript{L}}}^3_{1} = \{9,12\}\) 
This requirement can be visualized in Fig. 3, where for a member set \({{\fancyscript{X}}}_p, p \in P\), more than one sublist may satisfy (32)–(34), i.e., \(s_p\ge 1\), and a given node can appear in more than one sublist. In particular, in Fig. 3a, \({{\fancyscript{X}}}_1\) has two sublists, \({{\fancyscript{L}}}^1_{1}\) and \({{\fancyscript{L}}}^1_{2}\), and node \(2\) belongs to both sublists. This is necessary to ensure that each sublist represents a \(Y\) partition.
Remark 11
In other words, in the B&S framework, the set \(\{{{\fancyscript{X}}}_p:p \in P\}\) cannot be any \(X\) partition, but has to satisfy the requirement that for all \(x \in {{\fancyscript{X}}}_p\) the “whole” \(Y\) is maintained. This requirement is met via the management of the independent lists (30) and their sublists (32).
Remark 12
Remark 13
Note that in our approach, lists \({{\fancyscript{L}}}\) and \({{\fancyscript{L}}}_\mathrm{In}\) are core lists as opposed to lists \({{{\fancyscript{L}}}_p}\), \(p \in P\), which are auxiliary. The purpose of the first two lists is to indicate which nodes may require further exploration (branching). The union of these two lists corresponds to the overall list of open nodes in a classical branchandbound method. On the other hand, the purpose of the auxiliary lists \({{{\fancyscript{L}}}_p}\), \(p \in P\), is to classify all nodes in the union of \({{\fancyscript{L}}}\) and \({{\fancyscript{L}}}_\mathrm{In}\) according to \(X\) and \(Y\) partitioning.
Having introduced the key partitioning concepts, we now present the bounding problems at some node \(k \ne 1\) in the branchandbound tree. These are derived from the bounding problems for the root node presented in Sect. 3.
4.2 Subdivision process
Definition 4
Bisection has been shown to be an exhaustive subdivision process [45, Prop. IV.2]. However, it is essential to show that its exhaustiveness with respect to the subdivision of \(X\) is not compromised in the context of the B&S algorithm. This is because the \(X\) partition is managed by the independent lists (30), and the fewer the independent lists, the less refined the \(X\) partition. To guarantee that independent lists are replaced by new independent lists covering refined \(X\) subdomains, we maintain a symmetry in branches across nodes of the same level in the branch-and-bound tree by using the lowest index rule.
Definition 5
Namely, if more than one coordinate \(i\), \(i=1,\ldots ,n+m\), satisfies the longest-edge requirement, then the one with the smallest index is selected to be subdivided using bisection.
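A sketch of this rule, with boxes represented as lists of intervals (our own minimal representation, not the paper's pseudocode):

```python
# Branching-variable choice in the spirit of Def. 5: among all n + m
# coordinates of the box X x Y, pick the longest edge, breaking ties by
# the smallest index, and bisect it.

def select_branching_variable(box):
    # box: list of (lo, hi) intervals over the x- then y-coordinates.
    widths = [hi - lo for (lo, hi) in box]
    longest = max(widths)
    return widths.index(longest)  # list.index returns the smallest index on ties

def bisect(box):
    i = select_branching_variable(box)
    lo, hi = box[i]
    mid = 0.5 * (lo + hi)
    left = box[:i] + [(lo, mid)] + box[i + 1:]
    right = box[:i] + [(mid, hi)] + box[i + 1:]
    return i, left, right
```

For a box with two edges of equal length, the tie is broken toward the coordinate with index 0, which is what keeps branches symmetric across nodes of the same level.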
Definition 6
 (1)
Apply the lowest index selection rule of Def. 5 to select a branching variable.
 (2)
Partition \(k\) and create two new nodes, \(k_1,k_2\), based on Def. 4.
 (2.1)
If branching on a \(y\)-variable, every sublist \({{\fancyscript{L}}}^{p}_s\) containing \(k\) is modified:$$\begin{aligned} {{\fancyscript{L}}}^{p}_s = ({{\fancyscript{L}}}^{p}_s \setminus \{k\}) \cup \{k_1,k_2\}. \end{aligned}$$(42)
 (2.2)
If branching on an \(x\)-variable, every sublist \({{\fancyscript{L}}}^{p}_s\) containing \(k\) is replaced by one or two new sublists: for \(i=1,2\), if \(\mathrm {ri}(X^{(k_i)}) \cap \mathrm {ri}(X^{(j)}) \ne \emptyset \ \forall j \in {{\fancyscript{L}}}^{p}_s\setminus \{k\}\), create:$$\begin{aligned} {{\fancyscript{L}}}_{s_i}^{p}= ({{\fancyscript{L}}}^{p}_s\setminus \{k\})\cup \{k_i\}. \end{aligned}$$(43)
 (3)
If (IC) is true, replace \({{\fancyscript{L}}}^{p}\) with two new independent lists \({{\fancyscript{L}}}^{p_1}\) and \({{\fancyscript{L}}}^{p_2}\) as in (36)–(37) and increase \(P\) as in (38).
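The sublist update (42) for a branch on a \(y\)-variable can be sketched as follows; sublists are modeled as plain sets of node ids, and the function is illustrative only.

```python
# Sketch of update (42): in every sublist containing the parent node k,
# the parent is replaced by both children k1 and k2, so each sublist
# still represents a partition of Y over its X subdomain.

def update_sublists_y_branch(sublists, k, k1, k2):
    # sublists: list of sets of node ids (the L^p_s of one independent list)
    return [
        (s - {k}) | {k1, k2} if k in s else set(s)
        for s in sublists
    ]

sublists = [{1, 2}, {2, 3, 4}]        # node 2 appears in both sublists
updated = update_sublists_y_branch(sublists, 2, 5, 6)
# node 2 is replaced by nodes 5 and 6 in every sublist that contained it
```

Note that a node appearing in several sublists (like node 2 above, cf. Fig. 3a) is replaced in all of them, so every sublist keeps covering the "whole" \(Y\).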
Theorem 1
The subdivision process of the BranchandSandwich algorithm is exhaustive.
Proof
The proof is provided in Part II of this work [47]. \(\square \)
Remark 14
Bisection is a typical branching strategy, but it has been shown to be non-optimal compared to hybrid approaches that combine different subdivision strategies [43]. Nevertheless, exploring branching strategies is beyond the scope of this article.
4.3 Inner lower bound
4.4 Best inner upper bound
Remark 15
The set defined above differs from the set \({\varOmega }_\mathrm{KKT}\) in (\(\mathrm{InKKT}_{0}\)) due to the different bound constraints. However, the linearity of these constraints, resulting in nonzero gradients for all \(y\) values,^{12} ensures that the constraint set over each subregion \(Y^{(k)}\subset Y\) still satisfies appropriate constraint qualifications provided that Assumption 4 is satisfied. For instance, if the concave constraint qualification [13] is satisfied for problem (2) over \(Y\), then it also holds for the same problem over any subregion \(Y^{(k)}\).
In what follows, based on the inner upper bounds computed, let \(f^{\mathrm{UB},p}\) denote the best inner upper bound value assigned to each independent list \({{\fancyscript{L}}}^{p}\), \(p\in P\).
Definition 7
Example 2
Independent lists of the tree in Fig. 4, at different stages of the exploration of the branchandbound tree
#Nodes in the tree  \(X\) partition  Independent list(s)  Best inner upper bound 

1  \({{\fancyscript{X}}}_1 = [-1,1]\)  \({{\fancyscript{L}}}^{1} = \{1\}\)  \(f^{\mathrm{UB},1} = 0\) 
3  \({{\fancyscript{X}}}_1 = [-1,1]\)  \({{\fancyscript{L}}}^{1} = \{2,3\}\)  \(f^{\mathrm{UB},1} = -0.0352\) 
5  \({{\fancyscript{X}}}_1 = [-1,1]\)  \({{\fancyscript{L}}}^{1} = \{\{3,4\},\{3,5\}\}\)  \(f^{\mathrm{UB},1} = -0.0352\) 
7  \({{\fancyscript{X}}}_1 = [-1,1]\)  \({{\fancyscript{L}}}^{1} = \{\{3,4\},\{3,6,7\}\}\)  \(f^{\mathrm{UB},1} = -0.0352\) 
9  \({{\fancyscript{X}}}_1 = [-1,0]\)  \({{\fancyscript{L}}}^{1} = \{4,8\}\)  \(f^{\mathrm{UB},1} = -0.0542\) 
\({{\fancyscript{X}}}_2 = [0,1]\)  \({{\fancyscript{L}}}^{2} = \{6,7,9\}\)  \(f^{\mathrm{UB},2} = -0.0352\) 
4.5 Monotonically improving outer lower bound
Lemma 1
Proof
 Branching on \(y\): within list \({{\fancyscript{L}}}^{p}\), every sublist \({{\fancyscript{L}}}^{p}_s\) that contains \(k\) is modified as in (42). This modification results in no change to the overall \({{\fancyscript{X}}}_p\) subdomain covered; hence,$$\begin{aligned} p' := p. \end{aligned}$$This also means that in each sublist \({{\fancyscript{L}}}^{p'}_s\) containing \(k_1\) and \(k_2\), the underlying subdivision of \(Y\) is now refined, since at each successor node we have \(Y^{(k_i)} \subset Y^{(k)}\), \(i=1,2\). Based on the properties of (RIUB), this implies:$$\begin{aligned} \min \{\bar{f}^{(k_1)}, \bar{f}^{(k_2)}\} \le \bar{f}^{(k)}. \end{aligned}$$(46)As a result, from (44) (cf. Def. 7) and (46), it is clear that \(f^{\mathrm{UB},p'} \le f^{\mathrm{UB},p}\).
 Branching on \(x\): within list \({{\fancyscript{L}}}^{p}\), every sublist \({{\fancyscript{L}}}^{p}_s\) that contains \(k\) is replaced by one or two new sublists as in (43). At each successor node, we have \(X^{(k_i)} \subset X^{(k)}\), \(i=1,2\), and by formulation (RIUB):$$\begin{aligned} \bar{f}^{(k_i)} \le \bar{f}^{(k)},\ i=1,2. \end{aligned}$$(47)Then, if (IC) is not satisfied, list \({{\fancyscript{L}}}^{p}\) is preserved, i.e.,$$\begin{aligned} p' := p, \end{aligned}$$(A)and \(k_1,k_2 \in {{\fancyscript{L}}}^{p'}\). Otherwise, if (IC) is satisfied, list \({{\fancyscript{L}}}^{p}\) is replaced by two new independent lists \({{\fancyscript{L}}}^{p_1}\) and \({{\fancyscript{L}}}^{p_2}\) based on (36)–(37) such that:$$\begin{aligned} k_i \in {{\fancyscript{L}}}^{p_i},\ i=1,2, \end{aligned}$$where \({{{\fancyscript{X}}}_{p_i}} \subset {{{\fancyscript{X}}}_{p}}\). Without loss of generality, let$$\begin{aligned} p' := p_1. \end{aligned}$$(B)Then, in both cases, (A) and (B), by (44) and (47) we have \(f^{\mathrm{UB},p'} \le f^{\mathrm{UB},p}\).
Remark 16
Observe that \(f^{\mathrm{UB},p'}\) is at least as tight in case (B) as in case (A), because \({{\fancyscript{X}}}_{p'} \subset {{\fancyscript{X}}}_{p}\) in the former as opposed to \({{\fancyscript{X}}}_{p'} = {{\fancyscript{X}}}_{p}\) in the latter.
Theorem 2
Proof
4.6 Node fathoming rules
The success of branch-and-bound methods depends on the ability to discard subregions of the original domain thanks to guarantees that the global optimal solution cannot be found there. This process is known as fathoming. In the tree of Fig. 4, no fathoming rule was employed and no \(X\) or \(Y\) subregion was deleted. Thus, in this case each independent list held the whole \(Y\), i.e., \({{\fancyscript{Y}}}_p = Y\), \(p=1,\ldots ,P\), via a number of \(Y\) partitions.
In what follows, we describe how classical fathoming rules, such as the “fathom-by-infeasibility” and “fathom-by-value-dominance” rules, are incorporated in our algorithm such that subregions of both the \(X\) and \(Y\) spaces are discarded safely. As a result, each independent list may no longer hold the whole \(Y\), and \({{\fancyscript{Y}}}_p \subseteq Y\), \(p =1,\ldots ,P\).
To start with, we highlight the implicit use of two trees during the course of our algorithm, one for the inner problem and one for the outer problem. In the algorithm, we apply the classical fathoming rules to both trees independently. For brevity, we refer to these as the inner- and outer-fathoming rules.
Definition 8
 (1)
\({{\underline{f}}}^{(k)}=\infty \),^{14}
 (2)
\({{\underline{f}}}^{(k)}> f^{\mathrm{UB},p}\),
Definition 9
 (1)
\({{\underline{F}}}^{(k)}=\infty \),
 (2)
\({{\underline{F}}}^{(k)} \ge F^\mathrm{UB}-\varepsilon _F\),
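The two infeasibility/value-dominance tests can be sketched as simple predicates; this is an illustrative reading of Defs. 8 and 9 with our own function names, where infinity encodes an infeasible bounding problem.

```python
# Sketch of the fathoming tests: a node is fathomed when its bounding
# problem is infeasible (lower bound +inf) or when its lower bound is
# dominated by the relevant upper bound.

INF = float("inf")

def inner_fathomed(f_lower_k, f_ub_p):
    # In the spirit of Def. 8: infeasibility, or inner value dominance
    # against the best inner upper bound f^{UB,p} of the list.
    return f_lower_k == INF or f_lower_k > f_ub_p

def outer_fathomed(F_lower_k, F_ub, eps_F):
    # In the spirit of Def. 9: infeasibility, or outer value dominance
    # against the incumbent F^{UB} within tolerance eps_F.
    return F_lower_k == INF or F_lower_k >= F_ub - eps_F
```

Note the asymmetry: the inner test compares against a per-list bound \(f^{\mathrm{UB},p}\), while the outer test compares against the single incumbent value \(F^\mathrm{UB}\) minus the tolerance \(\varepsilon _F\).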
Moreover, if a sublist contains outer-fathomed nodes only, i.e., it no longer contains any nodes in \({\fancyscript{L}}\) that are open from the perspective of the overall problem, then it can be deleted. This may lead to full fathoming of the corresponding nodes, as long as they do not appear in other sublists of the same independent list. The rules are summarized below.
Definition 10
 1.
If \({{\fancyscript{L}}}^{p}_i \cap {{\fancyscript{L}}} = \emptyset \) and \({{\fancyscript{L}}}^{p}_i \cap {{\fancyscript{L}}}^{p}_j = \emptyset \) for all \(j \ne i\), \(j \in \{1,\ldots ,s_p\}\), then fully fathom all nodes \(k \in {{\fancyscript{L}}}^{p}_i\), i.e., delete them from \({{\fancyscript{L}}_\mathrm{In}}\) and \({{\fancyscript{L}}}^{p}\). Also delete sublist \({{\fancyscript{L}}}^{p}_i\) and decrease \(s_p\).
 2.
If \({{\fancyscript{L}}}^{p}_i \cap {{\fancyscript{L}}} = \emptyset \) and \({{\fancyscript{L}}}^{p}_i \cap {{\fancyscript{L}}}^{p}_j \ne \emptyset \) for some \(j \ne i\), \(j \in \{1,\ldots ,s_p\}\), then delete sublist \({{\fancyscript{L}}}^{p}_i\) and decrease \(s_p\).
 3.
If \(s_p=0\), delete list \({{\fancyscript{L}}}^{p}\) and decrease \(P\).
 1.
Open nodes: those in \({{\fancyscript{L}}} \cap {{\fancyscript{L}}}^{p}\) for some \(p \in P\). We continue exploration of these nodes with respect to both the outer and inner problems.
 2.
Outer-fathomed nodes: those in \({{\fancyscript{L}}_\mathrm{In}}\cap {{\fancyscript{L}}}^{p}\) for some \(p \in \{1,\ldots ,P\}\). We continue exploration of these nodes with respect to the inner problem only.
 3.
Fathomed nodes: deleted from all the lists. No further exploration of these nodes is required.
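The bookkeeping behind Definition 10 and the three node categories can be sketched with Python sets. The data layout and function name below are illustrative choices of ours, not part of the algorithm's formal statement.

```python
def prune_sublists(L, L_in, sublists):
    """Sketch of Def. 10. L: open nodes; L_in: outer-fathomed nodes;
    sublists: the sublists of one independent list, each a set of node ids.
    Sublists with no open nodes are deleted (rules 1-2); their nodes are
    fully fathomed when they appear in no other sublist (rule 1)."""
    kept = [S for S in sublists if S & L]        # sublists still holding open nodes
    for S in sublists:
        if S & L:
            continue                             # this sublist survives
        others = set().union(*[T for T in sublists if T is not S])
        L_in = L_in - (S - others)               # rule 1: fully fathom unique nodes
    return kept, L_in                            # rule 3: whole list dies if kept == []
```

For example, with open node 3, outer-fathomed nodes 1 and 2, and sublists \(\{1,2\}\) and \(\{2,3\}\), the first sublist is dead: node 1 is fully fathomed (it appears nowhere else), while node 2 survives in the second sublist.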
Lemma 2
Proof
4.7 Valid outer upper bound and incumbent solution
Theorem 3
Consider a node \(k \in {{\fancyscript{L}}}^{p}\). Set \(\bar{x}=x^{(k)}\), where \((x^{(k)},y^{(k)})\) is the solution of the lower bounding problem (LB) at node \(k\). Solve problem (RISP) over all nodes \(j \in {{\fancyscript{L}}}^{p}\) such that \(\bar{x} \in X^{(j)}\). Find \(k' \in {{\fancyscript{L}}}^{p}\) based on (MinRISP). Then, (UB), if feasible, computes a valid upper bound on the optimal objective value of (1).
Proof
Remark 18
Recall that in this work, we employ the \(\alpha \)BB convexification techniques [1, 2] whenever we want to construct convex relaxations. As a result, over refined subsets of \(Y\), the convex relaxed problems (RISP) approximate the inner subproblems (\(\mathrm{ISP}\)) with a quadratic convergence rate [17, 52], which helps to achieve near-equality in (60) and to identify a valid upper bound (cf. also [38, Prop. 4.1(ii)]).
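For illustration, the basic \(\alpha\)BB underestimator of [1, 2] augments a nonconvex function with a separable quadratic term that vanishes at the bounds. The toy function and the value of \(\alpha\) below are our own choices, not taken from the paper.

```python
import math

def abb_underestimator(f, alpha, yL, yU):
    """alpha-BB underestimator of a scalar function f on [yL, yU]:
    f(y) + alpha*(yL - y)*(yU - y). Convex and valid whenever
    alpha >= max(0, -0.5 * min f'') over the interval."""
    return lambda y: f(y) + alpha * (yL - y) * (yU - y)

f = math.sin                                      # toy nonconvex inner objective
conv = abb_underestimator(f, 0.5, 0.0, math.pi)   # f'' = -sin >= -1, so alpha = 0.5 is valid
```

The maximal gap between \(f\) and its underestimator is \(\alpha (y^{\mathrm U}-y^{\mathrm L})^2/4\), which shrinks quadratically as the \(Y\) subdomain is refined; this is the convergence behaviour invoked above.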
4.8 Selection operation
Definition 11
 (i)
find a node in \({\fancyscript{L}}\) with lowest overall lower bound: \(k^\mathrm{LB} = \mathop {\mathrm {arg\, min}}\limits _{j \in {{\fancyscript{L}}}} \{{{\underline{F}}}^{(j)}\}\);
 (ii)
find the corresponding \({{\fancyscript{X}}}_p\) subdomain, \(p\in \{1,\ldots ,P\}\), such that \(k^\mathrm{LB} \in {{\fancyscript{L}}}^{p}\);
 (iii)
select a node \(k \in {{\fancyscript{L}}} \cap {{\fancyscript{L}}}^{p}\) and a node \(k_\mathrm{In} \in {{\fancyscript{L}}_\mathrm{In}} \cap {{\fancyscript{L}}}^{p}\), if nonempty, using (ISR).
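A minimal sketch of the selection operation, with dictionaries standing in for the lists and bound records; the tie-breaking rule (ISR) is replaced here by a simple minimum, purely for illustration.

```python
def select_nodes(L, L_in, indep_lists, F_lb):
    """L, L_in: sets of open / outer-fathomed node ids; indep_lists:
    dict p -> set of node ids in list L^p; F_lb: node id -> outer lower
    bound. Returns (p, k, k_in) following steps (i)-(iii) of Def. 11."""
    k_lb = min(L, key=lambda j: F_lb[j])                        # (i) lowest bound
    p = next(q for q, Lp in indep_lists.items() if k_lb in Lp)  # (ii) its X-subdomain
    k = min(L & indep_lists[p], key=lambda j: F_lb[j])          # (iii), sketched
    inner = L_in & indep_lists[p]
    k_in = min(inner) if inner else None                        # stand-in for (ISR)
    return p, k, k_in
```

With \({\fancyscript{L}}=\{1,2\}\), \({\fancyscript{L}}_\mathrm{In}=\{3\}\), lists \({\fancyscript{L}}^1=\{1,3\}\), \({\fancyscript{L}}^2=\{2\}\) and bounds \(0.5 < 1.0\), this picks list 1, open node 1 and outer-fathomed node 3.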
5 Branch-and-Sandwich algorithm
Given outer and inner objective tolerances \(\varepsilon _F\), \(\varepsilon _f\), respectively, the proposed global optimization algorithm for nonconvex bilevel problems is as follows.
Algorithm 1
 Step 0:
 Initialization Initialize the lists: \({{\fancyscript{L}}}:={{\fancyscript{L}}}_\mathrm{In}:=\emptyset \). Set the incumbent:$$\begin{aligned} (x^\mathrm{UB},y^\mathrm{UB}):=\emptyset \ \text {and}\ F^\mathrm{UB} := \infty , \end{aligned}$$the iteration counter \(\mathrm{Iter}:=0\), and the node counter \(n_\mathrm{node}:=1\), corresponding to the whole domain \(X\times Y\).
 Step 1:
 Inner and outer bounds
 Step 1.1:

Solve the auxiliary relaxed inner problem (\(\mathrm{RILB}_{0}\)) to compute \({{\underline{f}}}^{(1)}\). If infeasible, go to Step 2.
 Step 1.2:

Solve the auxiliary restricted inner problem (\(\mathrm{RIUB}_{0}\)). Set \(f_X^\mathrm{UB} := \bar{f}^{(1)}\).
 Step 1.3:
 Solve the lower bounding problem (LB\(_0\)) globally to obtain \({{\underline{F}}}^{(1)}\). If infeasible, go to Step 2. Otherwise, if a feasible solution \((x^{(1)},y^{(1)})\) is computed, add node 1 to the universal list:$$\begin{aligned} {{\fancyscript{L}}}:=\{1\}, \end{aligned}$$with properties \(({{\underline{f}}}^{(1)},\bar{f}^{(1)},{{\underline{F}}}^{(1)}, x^{(1)}, l^{(1)})\), where \(l^{(1)}:=0\). Initialize the partition of \(X\), i.e., \(p:=1\) and \({{\fancyscript{X}}}_1:= X\), and generate the first independent list:$$\begin{aligned} {{\fancyscript{L}}}^{1}:=\{1\}. \end{aligned}$$Set the best inner upper bound for \({{\fancyscript{X}}}_1\):$$\begin{aligned} f^{\mathrm{UB},1}:=f_X^\mathrm{UB}. \end{aligned}$$
 Step 1.4:
 Set \(\bar{x}:=x^{(1)}\) and compute \({{\underline{w}}}(\bar{x})\) using (RISP\(_0\)). Then, solve (\(\mathrm{UB}_{0}\)) locally to obtain \(\bar{F}^{(1)}\). If a feasible solution \((x_{f},y_{f})\) is obtained, update the incumbent:$$\begin{aligned} (x^\mathrm{UB},y^\mathrm{UB})=(x_{f},y_{f}) \ \text {and}\ F^\mathrm{UB} = \bar{F}^{(1)}. \end{aligned}$$
 Step 2:

Node(s) Selection If \({\fancyscript{L}}=\emptyset \), terminate and report the incumbent solution and value. Otherwise, increase the iteration counter, \(\mathrm{Iter}=\mathrm{Iter}+1\), and select a list \({{\fancyscript{L}}}^{p}\), a node \(k \in {{\fancyscript{L}}}^{p} \cap {{\fancyscript{L}}}\) and a node \(k_\mathrm{In} \in {{\fancyscript{L}}}^{p} \cap {\fancyscript{L}}_\mathrm{In}\), if \({{\fancyscript{L}}}^{p} \cap {\fancyscript{L}}_\mathrm{In} \ne \emptyset \), based on Def. 11. Remove \(k\) from \({\fancyscript{L}}\) and \(k_\mathrm{In}\) from \({\fancyscript{L}}_\mathrm{In}\).
 Step 3:
 Branching
 Step 3.1:
 Apply steps (1)–(4) of Def. 6 on node \(k\) to create two new nodes, i.e., \(n_\mathrm{node} + 1\) and \(n_\mathrm{node} + 2\). Set \(n_\mathrm{new}:=2\) and$$\begin{aligned} \begin{array}{lllll} {{\underline{f}}}^{(n_\mathrm{node} + 1)} &{}:=&{} {{\underline{f}}}^{(n_\mathrm{node} + 2)} &{}:=&{} {{\underline{f}}}^{(k)};\\ \bar{f}^{(n_\mathrm{node} + 1)} &{}:=&{} \bar{f}^{(n_\mathrm{node} + 2)} &{}:=&{} \bar{f}^{(k)};\\ {{\underline{F}}}^{(n_\mathrm{node} + 1)} &{}:=&{} {{\underline{F}}}^{(n_\mathrm{node} + 2)} &{}:=&{} {{\underline{F}}}^{(k)}; \\ x^{(n_\mathrm{node} + 1)} &{}:=&{} x^{(n_\mathrm{node} + 2)} &{}:=&{} x^{(k)}; \\ l^{(n_\mathrm{node} + 1)} &{}:=&{} l^{(n_\mathrm{node} + 2)} &{}:=&{} l^{(k)} + 1. \end{array} \end{aligned}$$
 Step 3.2:
 If a node \(k_\mathrm{In}\) is selected, apply steps (1)–(4) of Def. 6 on \(k_\mathrm{In}\) to create two new (outer-fathomed) nodes, i.e., \(n_\mathrm{node} + 3\) and \(n_\mathrm{node} + 4\). Set \(n_\mathrm{new}:=4\) and$$\begin{aligned} \begin{array}{lllll} {{\underline{f}}}^{(n_\mathrm{node} + 3)} &{}:=&{} {{\underline{f}}}^{(n_\mathrm{node} + 4)} &{}:=&{} {{\underline{f}}}^{(k_\mathrm{In})};\\ \bar{f}^{(n_\mathrm{node} + 3)} &{}:=&{} \bar{f}^{(n_\mathrm{node} + 4)} &{}:=&{} \bar{f}^{(k_\mathrm{In})};\\ l^{(n_\mathrm{node} + 3)} &{}:=&{} l^{(n_\mathrm{node} + 4)} &{}:=&{} l^{(k_\mathrm{In})} + 1. \\ \end{array} \end{aligned}$$
 Step 3.3:

List management: For \(i = n_\mathrm{node} + 1,\ldots ,n_\mathrm{node} + n_\mathrm{new}\), find the corresponding subdomain \({{\fancyscript{X}}}_{p_i}\) such that \(i \in {{\fancyscript{L}}}^{p_i}\) and set/update \(f^{\mathrm{UB},p_i}\). Apply the inner-value-dominance node fathoming rule (cf. Def. 8).
 Step 4:

Inner lower bound If there is no \(i \in \{n_\mathrm{node} + 1,\ldots ,n_\mathrm{node} + n_\mathrm{new}\}\) such that \(i \in {{\fancyscript{L}}} \cup {\fancyscript{L}}_\mathrm{In}\), apply the list-deletion fathoming rules (cf. Def. 10) and go to Step 2. Otherwise, for each \(i \in \{n_\mathrm{node} + 1,\ldots ,n_\mathrm{node} + n_\mathrm{new}\}\) such that \(i \in {{\fancyscript{L}}}^{p_i}\), solve the auxiliary relaxed inner problem (RILB) to compute \({{\underline{f}}}^{(i)}\). If feasible and \({{\underline{f}}}^{(i)} \le f^{\mathrm{UB},p_i}\), then:
 if \(i \in \{n_\mathrm{node} + 1, n_\mathrm{node}+2\}\), add node \(i\) to the list \({\fancyscript{L}}\) with properties$$\begin{aligned}({{\underline{f}}}^{(i)},\bar{f}^{(i)},{{\underline{F}}}^{(i)}, x^{(i)}, l^{(i)});\end{aligned}$$
 else if \(n_\mathrm{new}=4\) and \(i \in \{ n_\mathrm{node} + 3, n_\mathrm{node}+4\}\), add node \(i\) to the list \({\fancyscript{L}}_\mathrm{In}\) with properties$$\begin{aligned}({{\underline{f}}}^{(i)},\bar{f}^{(i)}, l^{(i)}).\end{aligned}$$
 Step 5:

Inner upper bound If there is no \(i \in \{n_\mathrm{node} + 1,\ldots ,n_\mathrm{node} + n_\mathrm{new}\}\) such that \(i \in {{\fancyscript{L}}} \cup {\fancyscript{L}}_\mathrm{In}\), go to Step 2. Otherwise, for each \(i \in \{n_\mathrm{node} + 1,\ldots ,n_\mathrm{node} + n_\mathrm{new}\}\) such that \(i \in {{\fancyscript{L}}} \cup {\fancyscript{L}}_\mathrm{In}\), solve the auxiliary restricted inner problem (RIUB) to compute \(\bar{f}^{(i)}\). Update \(\bar{f}^{(i)}\) in \({\fancyscript{L}}\) or in \({\fancyscript{L}}_\mathrm{In}\) and then update \(f^{\mathrm{UB},p_i}\) using Eq. (44) (cf. Def. 7). Apply the inner-value-dominance node fathoming rule (cf. Def. 8) and, if necessary, the list-deletion procedure (cf. Def. 10).
 Step 6:

Outer lower bound If there is no \(i \in \{n_\mathrm{node} + 1,n_\mathrm{node} + 2\}\) such that \(i \in {{\fancyscript{L}}}\), go to Step 2. Otherwise, for each \(i \in \{n_\mathrm{node} + 1,n_\mathrm{node} + 2\}\) such that \(i \in {{\fancyscript{L}}}\), solve the lower bounding problem (\(\mathrm{LB}\)) globally to obtain \({{\underline{F}}}^{(i)}\). If a feasible solution \((x^{(i)},y^{(i)})\) is obtained with \({{\underline{F}}}^{(i)} \le F^\mathrm{UB} + \varepsilon _F\), update \({{\underline{F}}}^{(i)}\) and \(x^{(i)}\) in \({\fancyscript{L}}\). If \({{\underline{F}}}^{(i)} \ge F^\mathrm{UB} - \varepsilon _F\), move \(i\) from \({\fancyscript{L}}\) to the list \({\fancyscript{L}}_\mathrm{In}\) with properties \(({{\underline{f}}}^{(i)},\bar{f}^{(i)}, l^{(i)})\) and apply the list-deletion procedure (cf. Def. 10).
 Step 7:

Outer upper bound If there is no \(i \in \{n_\mathrm{node} + 1,n_\mathrm{node} + 2\}\) such that \(i \in {{\fancyscript{L}}}\), go to Step 2. Otherwise, for each \(i \in \{n_\mathrm{node} + 1,n_\mathrm{node} + 2\}\) such that \(i \in {{\fancyscript{L}}}\), do:
 Step 7.1:

Set \(\bar{x}:=x^{(i)}\) and, using (RISP), compute \({{\underline{w}}}^{(j)}(\bar{x})\) for all \(j \in {{\fancyscript{L}}}^{p_i}\), such that \(\bar{x} \in X^{(j)}\). Set \(i'\) based on (MinRISP).
 Step 7.2:
 Solve (UB) (locally) to obtain \(\bar{F}^{(i')}\). If a feasible solution \((x^{(i')}_{f},y^{(i')}_{f})\) is obtained with \(\bar{F}^{(i')} < F^\mathrm{UB}\), update the incumbent:$$\begin{aligned} (x^\mathrm{UB},y^\mathrm{UB})=(x^{(i')}_{f},y^{(i')}_{f}) \ \text {and}\ F^\mathrm{UB} = \bar{F}^{(i')}. \end{aligned}$$Move from the list \({\fancyscript{L}}\) to the list \({\fancyscript{L}}_\mathrm{In}\) all nodes \(j\) such that \({{\underline{F}}}^{(j)} \ge F^\mathrm{UB} - \varepsilon _F\) and apply the list-deletion procedure (cf. Def. 10). Increase the node counter, i.e., \(n_\mathrm{node} = n_\mathrm{node} + n_\mathrm{new}\), and go to Step 2.
Note that in Steps 4–5, we start by testing whether the new nodes for inner and outer exploration, created in Step 3, still exist following the application of node fathoming in Step 3.3 and in Step 4. As more node fathoming may take place in Step 5 (resp. Step 6), we start Step 6 (resp. Step 7) by testing whether there still exist any new nodes for outer exploration. A detailed step-by-step application of the algorithm on two suitable test problems is presented in Part II of this work [47], which focuses on the application of the algorithm to problems from the literature as well as on its convergence properties.
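The control flow of Steps 2–7 follows the classical branch-and-bound pattern. The runnable sketch below shows that pattern for a single-level, one-dimensional problem only; Branch-and-Sandwich layers the inner bounds, the sublists and the fathoming rules of Defs. 8–10 on top of this skeleton. The objective and its interval lower bound are illustrative choices of ours.

```python
def bnb_min(f, f_lb, lo, hi, eps=1e-3):
    """Classical branch-and-bound for min f on [lo, hi], given a rigorous
    lower bound f_lb(a, b) <= min of f over [a, b]. Returns an eps-optimal value."""
    best = min(f(lo), f(hi))                 # incumbent (upper bound)
    nodes = [(lo, hi)]                       # list of open nodes
    while nodes:                             # cf. Step 2: stop when the list is empty
        a, b = nodes.pop()
        if f_lb(a, b) >= best - eps:         # fathom by value dominance
            continue
        m = 0.5 * (a + b)                    # exact bisection (cf. Def. 6)
        best = min(best, f(m))               # cheap incumbent update
        nodes += [(a, m), (m, b)]            # branch
    return best

f = lambda y: (y - 1.0) ** 2                                   # toy objective
f_lb = lambda a, b: 0.0 if a <= 1.0 <= b else min(f(a), f(b))  # valid interval bound
```

Fathoming by value dominance is what keeps the node list finite: once the incumbent is \(\varepsilon\)-close to the bound of a node, the node is discarded without further branching.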
6 Conclusions
We presented a branch-and-bound scheme, the B&S algorithm, for the solution of optimistic bilevel programming problems that satisfy an appropriate regularity condition in the inner problem. The novelty of our scheme is threefold as it: (i) encompasses implicitly two branch-and-bound trees, (ii) introduces a simple outer lower bounding problem with the useful feature that it is always obtained from the lower bounding problem of the parent node, and (iii) allows branch-and-bound with respect to the inner and the outer variables without distinction, but at the same time it keeps track of the partitioning of \(Y\) for successively refined subdomains of \(X\). The convergence properties of the algorithm are explored in Part II of this work and finite convergence to an \(\varepsilon \)-optimal global solution is proved [47]. The success of the proposed method depends on having good inner upper bounds for value dominance fathoming and for (outer) fathoming by outer infeasibility. The application of the algorithm to numerical examples is also considered in Part II.
Footnotes
 1.
A hybrid approach combining global and local optimization techniques, when appropriate [64], can also be investigated.
 2.
Notice that in a maximization framework, the relation between the objective function of a relaxation/restriction and the original objective function is reversed, e.g., a relaxation of a maximization problem has an objective function always greater than or equal to the original objective function.
 3.
If there are no \((x,y)\in {\varOmega }\), then the bilevel program (1) is infeasible.
 4.
The satisfaction of a constraint qualification ensures that the KKT conditions are necessary for a given point to be a minimizer.
 5.
Assuming regularity refers to assuming the satisfaction of an appropriate constraint qualification [40].
 6.
The non-branching version has also been extended in [55] to solve the mixed-integer nonlinear bilevel programming problem.
 7.
 8.
The Lagrange function of (22) is:$$\begin{aligned} L(x_0,x,y,\mu ,\lambda ,\nu ) = x_0 + f(x,y) + {\mu }^{\mathrm {T}}g(x,y) + {\lambda }^{\mathrm {T}}(y^\mathrm{L}-y) + {\nu }^{\mathrm {T}}(y-y^\mathrm{U}). \end{aligned}$$If we differentiate it with respect to \(y\), \(x_0\) vanishes, i.e.:$$\begin{aligned} \nabla _y L(x_0,x,y,\mu ,\lambda ,\nu ) = \nabla _y f(x,y) + \nabla _y{g}^{\mathrm {T}}(x,y)\mu - \lambda + \nu . \end{aligned}$$
 9.
Overall, we have lists \({\fancyscript{L}}\), \({\fancyscript{L}}_\mathrm{In}\) and \(\{{{\fancyscript{L}}}^{1},\ldots ,{{\fancyscript{L}}}^{P}\}\).
 10.
Recall that we use quotes to refer to the “whole” inner space because some \(Y\) subdomains are eliminated at some point due to fathoming (cf. Sect. 4.6). As a result, we may have \({{\fancyscript{Y}}}_p \subset Y\) for some \(p \in P\).
 11.
For subdividing a given node \(k\), one can use the so-called bisection of ratio \(r\). Node \(k\) is divided into subrectangles such that the ratio of the volume of the smaller subrectangle to that of \(k\) equals \(r\), where \(0<r\le 1/2\). When \(r=1/2\), the bisection is called exact [76].
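As a small illustration of this footnote (our own helper, acting on one edge of the box):

```python
def bisect_ratio(a, b, r):
    """Split [a, b] at a + r*(b - a), so that the smaller piece has a
    fraction r of the length; r = 1/2 gives the exact bisection."""
    assert 0.0 < r <= 0.5 and a < b
    cut = a + r * (b - a)
    return (a, cut), (cut, b)    # smaller piece first
```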
 12.
The gradients of the bound constraints are as follows:$$\begin{aligned} \begin{array}{llcll} \nabla _y\left[ y^{(k,\mathrm L)}-y\right] &{}=&{} -1_m &{}\ne 0 &{}\forall y,\\ \nabla _y\left[ y-y^{(k,\mathrm U)}\right] &{}=&{} 1_m &{}\ne 0 &{}\forall y, \end{array} \end{aligned}$$where \(1_m\) denotes a vector of ones in \(\mathrm{I\!R}^m\).
 13.
The outer lower bounding problem is not solved for nodes in list \({\fancyscript{L}}_\mathrm{In}\).
 14.
The rule \(\bar{f}^{(k)}=\infty \) could also be added to the inner-fathoming rules, but only in cases where the set \({\varOmega }_\mathrm{KKT}\) is independent of the bounds on \(y\).
Notes
Acknowledgments
We gratefully acknowledge funding by the Leverhulme Trust through the Philip Leverhulme Prize and by the EPSRC through a Leadership Fellowship [EP/J003840/1]. We also thank two anonymous referees for their thorough and insightful comments that helped us to improve the paper.
References
 1. Adjiman, C.S., Dallwig, S., Floudas, C.A., Neumaier, A.: A global optimization method, \(\alpha \)BB, for general twice-differentiable constrained NLPs—I. Theoretical advances. Comput. Chem. Eng. 22(9), 1137–1158 (1998)
 2. Adjiman, C.S., Androulakis, I.P., Floudas, C.A.: A global optimization method, \(\alpha \)BB, for general twice-differentiable constrained NLPs—II. Implementation and computational results. Comput. Chem. Eng. 22(9), 1159–1179 (1998)
 3. Androulakis, I.P., Maranas, C.D., Floudas, C.A.: \(\alpha {\rm BB}\): a global optimization method for general constrained nonconvex problems. J. Global Optim. 7(4), 337–363 (1995). State of the art in global optimization: computational methods and applications (Princeton, NJ, 1995)
 4. Audet, C., Hansen, P., Jaumard, B., Savard, G.: On the linear max-min and related programming problems. In: Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.) Multilevel Optimization: Algorithms and Applications, pp. 181–208. Kluwer, Dordrecht (1998)
 5. Bard, J.F.: An algorithm for solving the general bilevel programming problem. Math. Oper. Res. 8(2), 260–272 (1983)
 6. Bard, J.F.: Coordination of a multidivisional organization through two levels of management. Omega 11(5), 457–468 (1983)
 7. Bard, J.F.: Convex two-level optimization. Math. Program. 40(1, Ser. A), 15–27 (1988)
 8. Bard, J.F.: Production planning with variable demand. Omega 18(1), 35–42 (1990)
 9. Bard, J.F.: Practical Bilevel Optimization, Nonconvex Optimization and its Applications, vol. 30. Kluwer, Dordrecht (1998)
 10. Bard, J.F., Plummer, J., Sourie, J.C.: A bilevel programming approach to determining tax credits for biofuel production. Eur. J. Oper. Res. 120(1), 30–46 (2000)
 11. Bazaraa, M.S., Goode, J.J., Shetty, C.M.: Constraint qualifications revisited. Manag. Sci. 18, 567–573 (1972)
 12. Bazaraa, M.S., Shetty, C.M.: Nonlinear Programming. Wiley, New York (1979)
 13. Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Belmont (1995)
 14. Bhattacharjee, B., Green Jr, W.H., Barton, P.I.: Interval methods for semi-infinite programs. Comput. Optim. Appl. 30(1), 63–93 (2005)
 15. Bhattacharjee, B., Lemonidis, P., Green Jr, W.H., Barton, P.I.: Global solution of semi-infinite programs. Math. Program. 103(2, Ser. B), 283–307 (2005)
 16. Blankenship, J.W., Falk, J.E.: Infinitely constrained optimization problems. J. Optim. Theory Appl. 19(2), 261–281 (1976)
 17. Bompadre, A., Mitsos, A.: Convergence rate of McCormick relaxations. J. Global Optim. 52(1), 1–28 (2012)
 18. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
 19. Bracken, J., McGill, J.T.: Defense applications of mathematical programs with optimization problems in the constraints. Oper. Res. 22(5), 1086–1096 (1974)
 20. Cassidy, R.G., Kirby, M.J.L., Raike, W.M.: Efficient distribution of resources through three levels of government. Manag. Sci. 17(8), B462–B473 (1971)
 21. Clark, P.A.: Bilevel programming for steady-state chemical process design—I. Fundamentals and algorithms. Comput. Chem. Eng. 14(1), 87–97 (1990)
 22. Clark, P.A.: Bilevel programming for steady-state chemical process design—II. Performance study for nondegenerate problems. Comput. Chem. Eng. 14(1), 99–109 (1990)
 23. Colson, B., Marcotte, P., Savard, G.: An overview of bilevel optimization. Ann. Oper. Res. 153, 235–256 (2007)
 24. Csendes, T., Ratz, D.: Subdivision direction selection in interval methods for global optimization. SIAM J. Numer. Anal. 34(3), 922–938 (1997)
 25. Demiguel, V., Friedlander, M.P., Nogales, F.J., Scholtes, S.: A two-sided relaxation scheme for mathematical programs with equilibrium constraints. SIAM J. Optim. 16(2), 587–609 (2005)
 26. Dempe, S.: First-order necessary optimality conditions for general bilevel programming problems. J. Optim. Theory Appl. 95(3), 735–739 (1997)
 27. Dempe, S.: A bundle algorithm applied to bilevel programming problems with non-unique lower level solutions. Comput. Optim. Appl. 15(2), 145–166 (2000)
 28. Dempe, S.: Foundations of Bilevel Programming, Nonconvex Optimization and its Applications, vol. 61. Kluwer, Dordrecht (2002)
 29. Dempe, S.: Annotated bibliography on bilevel programming and mathematical programs with equilibrium constraints. Optimization 52(3), 333–359 (2003)
 30. Dempe, S., Dutta, J.: Is bilevel programming a special case of a mathematical program with complementarity constraints? Math. Program. 131(1–2), 37–48 (2012)
 31. Dempe, S., Zemkoho, A.B.: The generalized Mangasarian–Fromowitz constraint qualification and optimality conditions for bilevel programs. J. Optim. Theory Appl. 148(1), 46–68 (2011)
 32. Deng, X.: Complexity issues in bilevel linear programming. In: Multilevel Optimization: Algorithms and Applications, Nonconvex Optim. Appl., vol. 20, pp. 149–164. Kluwer, Dordrecht (1998)
 33. Faísca, N.P., Dua, V., Rustem, B., Saraiva, P.M., Pistikopoulos, E.N.: Parametric global optimisation for bilevel programming. J. Global Optim. 38(4), 609–623 (2007)
 34. Falk, J.E.: A linear max-min problem. Math. Program. 5, 169–188 (1973)
 35. Falk, J.E., Hoffman, K.: A nonconvex max-min problem. Naval Res. Logist. Quart. 24(3), 441–450 (1977)
 36. Floudas, C.A.: Deterministic Global Optimization, Nonconvex Optimization and its Applications, vol. 37. Kluwer, Dordrecht (2000)
 37. Floudas, C.A., Akrotirianakis, I.G., Caratzoulas, S., Meyer, C.A., Kallrath, J.: Global optimization in the 21st century: advances and challenges. Comput. Chem. Eng. 29(6), 1185–1202 (2005)
 38. Floudas, C.A., Stein, O.: The adaptive convexification algorithm: a feasible point method for semi-infinite programming. SIAM J. Optim. 18(4), 1187–1208 (2007)
 39. Fortuny-Amat, J., McCarl, B.: A representation and economic interpretation of a two-level programming problem. J. Oper. Res. Soc. 32(9), 783–792 (1981)
 40. Gauvin, J.: A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Math. Program. 12(1), 136–138 (1977)
 41. Guerra Vázquez, F., Rückmann, J.J., Stein, O., Still, G.: Generalized semi-infinite programming: a tutorial. J. Comput. Appl. Math. 217(2), 394–419 (2008)
 42. Gümüş, Z.H., Floudas, C.A.: Global optimization of nonlinear bilevel programming problems. J. Global Optim. 20(1), 1–31 (2001)
 43. Horst, R.: Bisection by global optimization revisited. J. Optim. Theory Appl. 144(3), 501–510 (2010)
 44. Horst, R., Pardalos, P.M., Thoai, N.V.: Introduction to Global Optimization, Nonconvex Optimization and its Applications, vol. 48, 2nd edn. Kluwer, Dordrecht (2000)
 45. Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches, 3rd edn. Springer, Berlin (1996)
 46. Jongen, H.Th., Shikhman, V.: Bilevel optimization: on the structure of the feasible set. Math. Program. 136(1, Ser. B), 65–89 (2012)
 47. Kleniati, P.M., Adjiman, C.S.: Branch-and-Sandwich: a deterministic global optimization algorithm for optimistic bilevel programming problems. Part II: convergence analysis and numerical results. J. Global Optim. (2014). doi:10.1007/s10898-013-0120-8
 48. LeBlanc, L.J., Boyce, D.E.: A bilevel programming algorithm for exact solution of the network design problem with user-optimal flows. Transp. Res. B 20(3), 259–265 (1986)
 49. Loridan, P., Morgan, J.: Approximate solutions for two-level optimization problems. In: Trends in Mathematical Optimization (Irsee, 1986), Internat. Schriftenreihe Numer. Math., vol. 84, pp. 181–196. Birkhäuser, Basel (1988)
 50. Loridan, P., Morgan, J.: Weak via strong Stackelberg problem: new results. J. Global Optim. 8(3), 263–287 (1996). Hierarchical and bilevel programming
 51. Lucchetti, R., Mignanego, F., Pieri, G.: Existence theorems of equilibrium points in Stackelberg games with constraints. Optimization 18(6), 857–866 (1987)
 52. Maranas, C.D., Floudas, C.A.: Global minimum potential energy conformations of small molecules. J. Global Optim. 4(2), 135–170 (1994)
 53. Migdalas, A.: Bilevel programming in traffic planning: models, methods and challenge. J. Global Optim. 7(4), 381–405 (1995)
 54. Mirrlees, J.A.: The theory of moral hazard and unobservable behaviour: Part I. Rev. Econ. Stud. 66(1), 3–21 (1999)
 55. Mitsos, A.: Global solution of nonlinear mixed-integer bilevel programs. J. Global Optim. 47(4), 557–582 (2010)
 56. Mitsos, A.: Global optimization of semi-infinite programs via restriction of the right-hand side. Optimization 60(10–11), 1291–1308 (2011)
 57. Mitsos, A., Barton, P.I.: Issues in the development of global optimization algorithms for bilevel programs with a nonconvex inner program. Tech. rep., Massachusetts Institute of Technology (2006)
 58. Mitsos, A., Barton, P.I.: A test set for bilevel programs. Tech. rep., Massachusetts Institute of Technology (2010). http://yoric.mit.edu/sites/default/files/bileveltestset.pdf
 59. Mitsos, A., Bollas, G.M., Barton, P.I.: Bilevel optimization formulation for parameter estimation in liquid–liquid phase equilibrium problems. Chem. Eng. Sci. 64(3), 548–559 (2009)
 60. Mitsos, A., Lemonidis, P., Barton, P.I.: Global solution of bilevel programs with a nonconvex inner program. J. Global Optim. 42(4), 475–513 (2008)
 61. Mitsos, A., Lemonidis, P., Lee, C.K., Barton, P.I.: Relaxation-based bounds for semi-infinite programs. SIAM J. Optim. 19(1), 77–113 (2008)
 62. Outrata, J., Kočvara, M., Zowe, J.: Nonsmooth Approach to Optimization Problems with Equilibrium Constraints: Theory, Applications and Numerical Results, Nonconvex Optimization and its Applications, vol. 28. Kluwer, Dordrecht (1998)
 63. Pardalos, P.M., Romeijn, H.E., Tuy, H.: Recent developments and trends in global optimization. J. Comput. Appl. Math. 124(1–2), 209–228 (2000)
 64. Pereira, F.E., Jackson, G., Galindo, A., Adjiman, C.S.: The HELD algorithm for multicomponent, multiphase equilibrium calculations with generic equations of state. Comput. Chem. Eng. 36, 99–118 (2012)
 65. Ryu, J.H., Dua, V., Pistikopoulos, E.N.: A bilevel programming framework for enterprise-wide process networks under uncertainty. Comput. Chem. Eng. 28(6–7), 1121–1129 (2004)
 66. Sahin, K.H., Ciric, A.R.: A dual temperature simulated annealing approach for solving bilevel programming problems. Comput. Chem. Eng. 23(1), 11–25 (1998)
 67. Stein, O., Still, G.: On generalized semi-infinite optimization and bilevel optimization. Eur. J. Oper. Res. 142(3), 444–462 (2002)
 68. Stein, O., Still, G.: Solving semi-infinite optimization problems with interior point techniques. SIAM J. Control Optim. 42(3), 769–788 (2003)
 69. Still, G.: Generalized semi-infinite programming: numerical aspects. Optimization 49(3), 223–242 (2001)
 70. Still, G.: Solving generalized semi-infinite programs by reduction to simpler problems. Optimization 53(1), 19–38 (2004)
 71. Tawarmalani, M., Sahinidis, N.V.: Convexification and Global Optimization in Continuous and Mixed-Integer Nonlinear Programming, Nonconvex Optimization and its Applications, vol. 65. Kluwer, Dordrecht (2002)
 72. Tsoukalas, A.: Global Optimization Algorithms for Multilevel and Generalized Semi-infinite Problems. Ph.D. thesis, Imperial College London (2009)
 73. Tsoukalas, A., Parpas, P., Rustem, B.: A smoothing algorithm for finite min-max-min problems. Optim. Lett. 3(1), 49–62 (2009)
 74. Tsoukalas, A., Rustem, B., Pistikopoulos, E.N.: A global optimization algorithm for generalized semi-infinite, continuous minimax with coupled constraints and bilevel problems. J. Global Optim. 44(2), 235–250 (2009)
 75. Tsoukalas, A., Wiesemann, W., Rustem, B.: Global optimisation of pessimistic bilevel problems. In: Lectures on Global Optimization, Fields Inst. Commun., vol. 55, pp. 215–243 (2009)
 76. Tuy, H.: Convex Analysis and Global Optimization, Nonconvex Optimization and its Applications, vol. 22. Kluwer, Dordrecht (1998)
 77. Vicente, L.N., Calamai, P.H.: Bilevel and multilevel programming: a bibliography review. J. Global Optim. 5(3), 291–306 (1994)
 78. Vicente, L.N., Calamai, P.H.: Geometry and local optimality conditions for bilevel programs with quadratic strictly convex lower levels. In: Du, D., Pardalos, P.M. (eds.) Minimax and Applications. Kluwer, Dordrecht (1995)
 79. Visweswaran, V., Floudas, C.A., Ierapetritou, M.G., Pistikopoulos, E.N.: A decomposition-based global optimization approach for solving bilevel linear and quadratic programs. In: State of the Art in Global Optimization (Princeton, NJ, 1995), Nonconvex Optim. Appl., vol. 7, pp. 139–162. Kluwer, Dordrecht (1996)
 80. Ye, J.J.: Constraint qualifications and KKT conditions for bilevel programming problems. Math. Oper. Res. 31(4), 811–824 (2006)
 81. Yezza, A.: First-order necessary optimality conditions for general bilevel programming problems. J. Optim. Theory Appl. 89(1), 189–219 (1996)
Copyright information
Open AccessThis article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.