A Partition-Based Optimization Approach for Level Set Approximation: Probabilistic Branch and Bound

Chapter in Women in Industrial and Systems Engineering, part of the book series Women in Engineering and Science (WES)

Abstract

We present a partition-based random search optimization algorithm, called probabilistic branch and bound (PBnB), to approximate a level set that achieves a user-defined target. Complex systems are often modeled with computer simulations, both deterministic and stochastic, and a decision-maker may desire a set of near-optimal solutions, such as solutions with performance metrics in the best 10% overall, instead of a single estimate of a global optimum. Our approach is valid for black-box, ill-structured, noisy function evaluations, involving both integer and real-valued decision variables. PBnB iteratively maintains, prunes, or branches subregions of a bounded solution space based on an updated confidence interval of a target quantile. Finite-time probability bounds are derived on the maximum volume of incorrectly maintained or incorrectly pruned regions. Thus, the user has a statistical quantification of the output. For example, with probability greater than 0.9, the final maintained subregion is inside the target level set with the volume of incorrectly maintained points less than 2% of the volume of the initial set. Numerical results on noisy and noise-free test functions demonstrate the performance of the PBnB algorithm. Tests on a sphere function, with spherical level sets, allow a comparison between the theoretical bounds and numerical results. PBnB has been applied in several areas, including weather impacts on air traffic flow management; policy decisions on screening and treatment budget allocation for hepatitis C; combining portable ultrasound machines with reserved MRI usage for orthopedic care; and optimizing water distribution networks using simulation.
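The maintain/prune mechanism described above can be illustrated with a small self-contained sketch (our own simplification for illustration, not the authors' implementation): partition the domain of a 2-D sphere test function into strips, sample each uniformly, estimate the target quantile empirically, and prune strips whose best sampled value lies above the estimate. All function names and parameter values below are illustrative choices.

```python
import random

def f(x):
    return x[0] ** 2 + x[1] ** 2  # sphere test function; level sets are disks

def uniform_in(box, rng):
    return [rng.uniform(lo, hi) for lo, hi in box]

def prune_once(boxes, delta=0.1, n_samples=200, rng=None):
    """One PBnB-style pruning pass: estimate the delta-quantile of f from
    uniform samples, then prune boxes whose best observed value exceeds it."""
    rng = rng or random.Random(0)
    all_vals, best_in_box = [], []
    for box in boxes:
        vals = [f(uniform_in(box, rng)) for _ in range(n_samples)]
        all_vals.extend(vals)
        best_in_box.append(min(vals))
    all_vals.sort()
    y_hat = all_vals[int(delta * len(all_vals))]  # empirical quantile estimate
    kept = [b for b, best in zip(boxes, best_in_box) if best <= y_hat]
    return kept, y_hat

# Partition [-1, 1]^2 into four vertical strips; only the two strips
# touching the origin intersect the best-10% level set.
strips = [[(-1.0, -0.5), (-1.0, 1.0)], [(-0.5, 0.0), (-1.0, 1.0)],
          [(0.0, 0.5), (-1.0, 1.0)], [(0.5, 1.0), (-1.0, 1.0)]]
kept, y_hat = prune_once(strips)
```

The two outer strips have f ≥ 0.25 everywhere, above the true 10% quantile 0.4/π ≈ 0.127, so they are pruned; the full algorithm adds the confidence intervals, error bounds, and branching analyzed in the appendix.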


References

  • Ali MM, Khompatraporn C, Zabinsky ZB (2005) A numerical evaluation of several stochastic algorithms on selected continuous global optimization test problems. J Glob Optim 31:635–672

  • Bechhofer RE, Dunnett CW, Sobel M (1954) A two-sample multiple decision procedure for ranking means of normal populations with a common unknown variance. Biometrika 41:170–176

  • Chen CH, He D (2005) Intelligent simulation for alternatives comparison and application to air traffic management. J Syst Sci Syst Eng 14(1):37–51

  • Chen CH, He D, Fu M, Lee LH (2008) Efficient simulation budget allocation for selecting an optimal subset. INFORMS J Comput 20(4):579–595

  • Chen CH, Yucesan E, Dai L, Chen HC (2010) Efficient computation of optimal budget allocation for discrete event simulation experiment. IIE Trans 42(1):60–70

  • Chen CH, Lee LH (2011) Stochastic simulation optimization: an optimal computing budget allocation. World Scientific, Singapore

  • Conover WJ (1999) Practical nonparametric statistics, 3rd edn. Wiley, New York

  • Csendes T, Pintér J (1993) A new interval method for locating the boundary of level sets. Int J Comput Math 49(1–2):53–59

  • Fu MC, Chen CH, Shi L (2008) Some topics for simulation optimization. In: Proceedings of the 40th conference on winter simulation. IEEE, Piscataway, p 27–38

  • Fu MC (2015) Handbook of simulation optimization. Springer, New York

  • Ho YC, Cassandras CG, Chen CH, Dai L (2000) Ordinal optimisation and simulation. J Oper Res Soc 51:490–500

  • Ho YC, Zhao QC, Jia QS (2007) Ordinal optimization: soft optimization for hard problems. Springer, Berlin

  • Hu J, Fu MC, Marcus SI (2007) A model reference adaptive search method for global optimization. Oper Res 55(3):549–568

  • Huang H (2016) Discrete-event simulation and optimization to improve the performance of a healthcare system. Ph.D. Thesis, University of Washington

  • Huang H, Zabinsky ZB (2013) Adaptive probabilistic branch and bound with confidence intervals for level set approximation. In: Pasupathy R, Kim SH, Tolk A, Hill R, Kuhl ME (eds) Proceedings of the 2013 conference on winter simulation. IEEE, Piscataway, p 980–991

  • Huang H, Zabinsky ZB (2014) Multiple objective probabilistic branch and bound for Pareto optimal approximation. In: Proceedings of the 2014 conference on winter simulation. IEEE, Piscataway, p 3916–3927

  • Huang H, Zabinsky ZB, Heim JA, Fishman P (2015) Simulation optimization for medical imaging resource allocation. In: Extended abstract of the 2015 conference on INFORMS healthcare. Nashville

  • Huang H, Zabinsky ZB, Li Y, Liu S (2016) Analyzing hepatitis C screening and treatment strategies using probabilistic branch and bound. In: Roeder TMK, Frazier PI, Szechtman R, Zhou E, Huschka T, Chick SE (eds) Proceedings of the 2016 conference on winter simulation. IEEE Press, Piscataway, p 2076–2086

  • Kim SH, Nelson BL (2001) A fully sequential procedure for indifference-zone selection in simulation. ACM Trans Model Comput Simul 11(3):251–273

  • Linz D, Huang H, Zabinsky ZB (2015) Partition based optimization for updating sample allocation strategy using lookahead. In: Yilmaz L, Chan WKV, Roeder TMK, Moon I, Macal C, Rossetti MD (eds) Proceedings of the 2015 conference on winter simulation. IEEE Press, Huntington Beach

  • Nelson BL, Swann J, Goldsman D, Song W (2001) Simple procedures for selecting the best simulated system when the number of alternatives is large. Oper Res 49(6):950–963

  • Ólafsson S (2004) Two-stage nested partitions method for stochastic optimization. Methodol Comput Appl Probab 6:5–27

  • Pintér J (1990) Globally optimized calibration of environmental models. Ann Oper Res 25(1):211–221

  • Prasetio Y (2005) Simulation-based optimization for complex stochastic systems. Ph.D. Thesis, University of Washington

  • Rinott Y (1978) On two-stage selection procedures and related probability-inequalities. Commun Stat Theory Methods 7(8):799–811

  • Shi L, Ólafsson S (2000) Nested partitions method for stochastic optimization. Methodol Comput Appl Probab 2(3):271–291

  • Shi L, Ólafsson S (2009) Nested partitions method, theory and applications. Springer, New York

  • Tsai YA, Pedrielli G, Mathesen L, Huang H, Zabinsky ZB, Candelieri A, Perego R (2018) Stochastic optimization for feasibility determination: an application to water pump operation in water distribution networks. In: Rabe M, Juan AA, Mustafee N, Skoogh A, Jain S, Johansson B (eds) Under review in the proceedings of the 2018 conference on winter simulation. IEEE, Piscataway

  • Wang W (2011) Adaptive random search for noisy and global optimization. Ph.D. Thesis, University of Washington

  • Xu WL, Nelson BL (2013) Empirical stochastic branch-and-bound for optimization via simulation. IIE Trans 45(7):685–698

  • Xu J, Zhang S, Huang E, Chen CH, Lee LH, Celik N (2016) MO2TOS: multi-fidelity optimization with ordinal transformation and optimal sampling. Asia Pac J Oper Res 33(3):1650017

  • Zabinsky ZB (1998) Stochastic methods for practical global optimization. J Glob Optim 13:433–444

  • Zabinsky ZB, Wang W, Prasetio Y, Ghate A, Yen JW (2011) Adaptive probabilistic branch and bound for level set approximation. In: Jain S, Creasey RR, Himmelspach J, White KP, Fu M (eds) Proceedings of the 2011 conference on winter simulation. IEEE, Piscataway, p 46–57


Acknowledgements

This work has been funded in part by the Department of Laboratory Medicine at Seattle Children’s Hospital, and by National Science Foundation (NSF) grants CMMI-1235484 and CMMI-1632793.

Author information

Correspondence to Zelda B. Zabinsky.

Appendix

Proof of Theorem 1

Proof

We consider the iterative effect on \(\delta_k\) as subregions are pruned or maintained. We use the superscript k to denote the iteration at which subregions are pruned \(\{\sigma^k_i:{P}_i = 1\}\) or maintained \(\{\sigma^k_i:{M}_i = 1\}\). By (6.15) in the algorithm, we have

$$\displaystyle \begin{aligned} \delta_k&=\frac{\delta_{k-1} v\left(\widetilde{\varSigma}^C_{k-1}\right) - \sum_{i:{M}_i = 1}v\left(\sigma^{k-1}_i\right)} {v\left(\widetilde{\varSigma}^C_{k-1}\right) -\sum_{i:{P}_i = 1}v\left(\sigma^{k-1}_i\right) - \sum_{i:{M}_i = 1} v\left(\sigma^{k-1}_i\right)} \end{aligned} $$

and removing the pruned and maintained subregions from \(\widetilde {\varSigma }^C_{k-1}\) yields the next current set of subregions \(\widetilde {\varSigma }^C_{k}\), used in the denominator, then

$$\displaystyle \begin{aligned} &=\frac{\delta_{k-1} v\left(\widetilde{\varSigma}^C_{k-1}\right) - \sum_{i:{M}_i = 1}v\left(\sigma^{k-1}_i\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)} \end{aligned} $$

and invoking (6.15) in the algorithm again to replace \(\delta_{k-1}\) with its equivalent expression in terms of \(\delta_{k-2}\) (assuming that the maintained regions are in the level set and pruned regions are out of the level set), we have

$$\displaystyle \begin{aligned} &=\frac{\frac{\delta_{k-2}v\left(\widetilde{\varSigma}^C_{k-2}\right)- \sum_{i:{M}_i = 1}v\left(\sigma^{k-2}_i\right)}{v\left(\widetilde{\varSigma}^C_{k-1}\right)} v\left(\widetilde{\varSigma}^C_{k-1}\right) - \sum_{i:{M}_i = 1}v\left(\sigma^{k-1}_i\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)}\\ &=\frac{\delta_{k-2} v\left(\widetilde{\varSigma}^C_{k-2}\right) - \sum_{l=k-2}^{k-1}\sum_{i:{M}_i = 1}v\left(\sigma^{l}_i\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)}\\ &\quad \vdots \\ &=\frac{\delta_{1} v\left(\widetilde{\varSigma}^C_{1}\right) - \sum_{l=1}^{k-1}\sum_{i:{M}_i = 1}v\left(\sigma^{l}_i\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)} \end{aligned} $$

and by the initial settings \(\delta_1 = \delta\) and \(\widetilde{\varSigma}^C_1 = S\),

$$\displaystyle \begin{aligned} &=\frac{\delta v(S) - \sum_{l=1}^{k-1}\sum_{i:{M}_i = 1}v\left(\sigma^{l}_i\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)} \end{aligned} $$

and \(\sum_{l=1}^{k-1}\sum_{i:{M}_i = 1}v\left(\sigma^{l}_i\right)=v\left(\widetilde{\varSigma}^M_{k}\right)\) since it is the volume of all maintained subregions at the end of iteration \(k-1\), therefore,

$$\displaystyle \begin{aligned} &=\frac{\delta v(S) - v\left(\widetilde{\varSigma}^M_{k}\right)}{v\left(\widetilde{\varSigma}^C_{k}\right)}. \end{aligned} $$
(6.44)

Based on the definition of a quantile, when X is uniformly distributed on S, we have

$$\displaystyle \begin{aligned} y(\delta, S)&=\mathop{\arg\min}\limits_{y\in \{f(x):x \in S\}} \left\{P(f(X)\leq y | X \in S)\geq \delta \right\} \\ &=\mathop{\arg\min}\limits_{y\in \{f(x):x \in S\}} \left\{\frac{v(\{x\in S: f(x)\leq y\})}{v(S)}\geq \delta \right\} \end{aligned} $$

and multiplying both sides by \(v(S)\), subtracting \(v\left(\widetilde{\varSigma}^M_{k}\right)- \epsilon^M_k+\epsilon^P_k\) from both sides, and dividing both sides by \(v\left(\widetilde{\varSigma}^C_k\right)\),

$$\displaystyle \begin{aligned} &=\mathop{\arg\min}\limits_{y\in \{f(x):x \in S\}} \left\{\frac{v\left(\{x\in S: f(x)\leq y\}\right)-v\left(\widetilde{\varSigma}^M_{k}\right)+ \epsilon^M_k-\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)}\right.\\ &\quad \qquad \qquad \qquad \geq \left. \frac{\delta v(S)-v\left(\widetilde{\varSigma}^M_{k}\right)+ \epsilon^M_k-\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)} \right\} \end{aligned} $$

and by (6.44), also \(v\left (\{x\in \widetilde {\varSigma }^P_k:f(x)<y\}\right )=\epsilon ^P_k\) and \(v\left (\{x\in \widetilde {\varSigma }^M_k:f(x)<y\}\right )= v\left (\widetilde {\varSigma }^M_{k}\right )-\epsilon ^M_k\),

$$\displaystyle \begin{aligned} &=\mathop{\arg\min}\limits_{y\in \{f(x):x \in S\}} \left\{\frac{v\left(\left\{x\in S\setminus \left\{\widetilde{\varSigma}^P_k \cup \widetilde{\varSigma}^M_{k}\right\}: f(x)\leq y\right\}\right)}{v\left(\widetilde{\varSigma}^C_k\right)}\right.\\ &\left.\quad \qquad \qquad \qquad \geq \delta_k + \frac{\epsilon^M_k}{v\left(\widetilde{\varSigma}^C_k\right)}- \frac{\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)}\right\} \end{aligned} $$

and since \(\widetilde{\varSigma}^C_k = S \setminus \left\{\widetilde{\varSigma}^P_k \cup \widetilde{\varSigma}^M_{k}\right\}\), and X is uniformly distributed in \(\widetilde{\varSigma}^C_k\) with \(\widetilde{\varSigma}^C_k \subseteq S\),

$$\displaystyle \begin{aligned} &=\mathop{\arg\min}\limits_{y\in \{f(x):x \in \widetilde{\varSigma}^C_k\}} \left\{P\left(f(X)\leq y \bigg| X \in \widetilde{\varSigma}^C_k\right)\geq \delta_k + \frac{\epsilon^M_k}{v\left(\widetilde{\varSigma}^C_k\right)}- \frac{\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)}\right\}\\ &= y\left(\delta_k + \frac{\epsilon^M_k}{v\left(\widetilde{\varSigma}^C_k\right)}- \frac{\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)}, \widetilde{\varSigma}^C_k\right). \end{aligned} $$

Since \(0 \leq \epsilon^M_k \leq \frac{\epsilon v\left(\widetilde{\varSigma}^M_k\right)}{v(S)}\) and \(0 \leq \epsilon^P_k \leq \frac{\epsilon v\left(\widetilde{\varSigma}^P_k\right)}{v(S)}\), an upper bound of \(y\left(\delta_k + \frac{\epsilon^M_k}{v\left(\widetilde{\varSigma}^C_k\right)}- \frac{\epsilon^P_k}{v\left(\widetilde{\varSigma}^C_k\right)}, \widetilde{\varSigma}^C_k\right)\) is achieved when \(\epsilon^P_k=0\) and \(\epsilon^M_k=\frac{\epsilon v\left(\widetilde{\varSigma}^M_{k}\right)}{v(S)}\), yielding

$$\displaystyle \begin{aligned} y(\delta, S) \leq y\left(\delta_k + \frac{\epsilon v\left(\widetilde{\varSigma}^M_{k}\right)}{v(S)v\left(\widetilde{\varSigma}^C_k\right)}, \widetilde{\varSigma}^C_k\right) = y\left(\delta_{ku}, \widetilde{\varSigma}^C_k\right).\end{aligned} $$
(6.45)

Similarly, we have a lower bound when \(\epsilon ^M_k=0\) and \(\epsilon ^P_k=\frac {\epsilon v(\widetilde {\varSigma }^P_{k})}{v(S)}\), yielding

$$\displaystyle \begin{aligned} y(\delta, S) \geq y\left(\delta_k - \frac{\epsilon v\left(\widetilde{\varSigma}^P_{k}\right)}{v(S)v\left(\widetilde{\varSigma}^C_k\right)}, \widetilde{\varSigma}^C_k\right) = y\left(\delta_{kl}, \widetilde{\varSigma}^C_k\right).\end{aligned} $$
(6.46)

Note, if \(\epsilon ^M_k=0\) and \(\epsilon ^P_k=0\), that is, there is no error in pruning and maintaining, then \(y(\delta , S)=y\left (\delta _k, \widetilde {\varSigma }^C_{k}\right )\).
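The quantile \(y(\delta, S)\) used throughout can also be checked empirically by uniform sampling; a toy sketch of our own (not from the chapter), with f(x) = x on S = [0, 1], where \(y(\delta, S) = \delta\) exactly:

```python
import random

# For X ~ U[0, 1] and f(x) = x, P(f(X) <= y) = y, so the delta-quantile
# y(delta, S) equals delta. Estimate it from the sorted sampled values.
rng = random.Random(42)
delta = 0.3
vals = sorted(rng.random() for _ in range(20000))  # sampled f(X) = X values
y_hat = vals[int(delta * len(vals))]  # smallest y with empirical CDF >= delta
```

Here y_hat matches the true quantile 0.3 up to sampling error of order \(1/\sqrt{20000}\).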

At the beginning of any iteration k, the current set \(\widetilde{\varSigma}^C_k\) is uniformly sampled for \(N_k = c_k\) samples. Since the samples are independent and uniformly distributed in the current set \(\widetilde{\varSigma}^C_{k}\), each sample acts as a Bernoulli trial that falls in the \(\delta_{kl}\) or \(\delta_{ku}\) level set with probability \(\delta_{kl}\) or \(\delta_{ku}\), respectively. Therefore, using properties of a binomial distribution, we can build a \(1-\alpha_k\) quantile confidence interval \(f(z_{(r)}) \leq y(\delta, S) \leq f(z_{(s)})\) (Conover 1999) with \(y\left(\delta_{kl}, \widetilde{\varSigma}^C_{k}\right)\) and \(y\left(\delta_{ku}, \widetilde{\varSigma}^C_{k}\right)\) based on (6.45) and (6.46), where \(f(z_{(r)})\) and \(f(z_{(s)})\) are the rth and sth order samples that have the following binomial properties

$$\displaystyle \begin{aligned} & P\left(f(z_{(r)})>y\left(\delta_{kl}, \widetilde{\varSigma}^C_{k}\right)\right) \leq \sum^{r-1}_{i=0}{{N_k}\choose{i}}(\delta_{kl})^i\left(1-\delta_{kl}\right)^{N_k-i}{} \end{aligned} $$
(6.47)
$$\displaystyle \begin{aligned} & P\left(f(z_{(s)})\geq y\left(\delta_{ku}, \widetilde{\varSigma}^C_{k}\right)\right) \geq \sum^{s-1}_{i=0}{{N_k}\choose{i}}(\delta_{ku})^i(1-\delta_{ku})^{N_k-i}.{} \end{aligned} $$
(6.48)

The \(1-\alpha_k\) confidence interval can be approximated by two one-sided intervals. We split \(\alpha_k\) into two halves and allocate one half to each probability bound. Therefore, we find the maximum r for which (6.47) is less than or equal to \(\frac{\alpha_k}{2}\) and the minimum s for which (6.48) is greater than or equal to \(1-\frac{\alpha_k}{2}\), that is,

$$\displaystyle \begin{aligned} \max r: \sum^{r-1}_{i=0}{{N_k}\choose{i}}(\delta_{kl})^i(1-\delta_{kl})^{N_k-i} \leq \frac{\alpha_k}{2}\ \text{ and } {} \end{aligned} $$
(6.49)
$$\displaystyle \begin{aligned} \min s: \sum^{s-1}_{i=0}{{N_k}\choose{i}}(\delta_{ku})^i(1-\delta_{ku})^{N_k-i} \geq 1-\frac{\alpha_k}{2} . {} \end{aligned} $$
(6.50)

Combining (6.47)–(6.50), as in Conover (1999), we have

$$\displaystyle \begin{aligned} P\left(f(z^k_{(r)}) \leq y\left(\delta_{kl}, \widetilde{\varSigma}^C_k\right)\leq y\left(\delta_{ku}, \widetilde{\varSigma}^C_k\right) \leq f\left(z^k_{(s)}\right)\right)\geq 1-\alpha_k. \end{aligned} $$
(6.51)

When there is no noise, with \(0 \leq \epsilon^P_k \leq \frac{\epsilon v\left(\widetilde{\varSigma}^P_{k}\right)}{v(S)}\) and \(0 \leq \epsilon^M_k \leq \frac{\epsilon v\left(\widetilde{\varSigma}^M_{k}\right)}{v(S)}\), the \(1-\alpha_k\) confidence interval of \(y(\delta, S)\) is given by \(\left[f(z_{(r)}), f(z_{(s)})\right]\) based on (6.45), (6.46), and (6.51), that is,

$$\displaystyle \begin{aligned} P\left(f(z_{(r)})\leq y(\delta, S) \leq f(z_{(s)})\right) \geq 1-\alpha_k.\end{aligned} $$
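The order-statistic indices r and s in (6.49) and (6.50) can be computed directly from binomial tail sums; a minimal sketch, with illustrative values of \(N_k\), \(\delta_{kl}\), \(\delta_{ku}\), and \(\alpha_k\) chosen by us for the example:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); an empty sum (k < 0) is 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def ci_order_indices(n_k, delta_kl, delta_ku, alpha_k):
    """Largest r with sum_{i<r} b(i; N_k, delta_kl) <= alpha_k/2, as in (6.49),
    and smallest s with sum_{i<s} b(i; N_k, delta_ku) >= 1 - alpha_k/2, (6.50)."""
    r = max(j for j in range(n_k + 1)
            if binom_cdf(j - 1, n_k, delta_kl) <= alpha_k / 2)
    s = min(j for j in range(1, n_k + 2)
            if binom_cdf(j - 1, n_k, delta_ku) >= 1 - alpha_k / 2)
    return r, s

# Example: N_k = 50 samples, delta_kl = 0.1, delta_ku = 0.2, alpha_k = 0.1.
r, s = ci_order_indices(50, 0.1, 0.2, 0.1)  # then [f(z_(r)), f(z_(s))] bounds y(delta, S)
```

Both tail sums are nondecreasing in the order index, so the max and min are well defined; s = N_k + 1 acts as a sentinel when the upper sum never reaches the threshold below N_k.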

Proof of Theorem 2

Proof

We note that the event \(v\left (L(\delta , S)\cap \hat {\sigma }^k_p\right )\leq D^P_k\epsilon _k\) is equivalent to the event \(v\left (\left \{x: f(x) \leq y(\delta , S), x\in \hat {\sigma }^k_p\right \}\right )\leq D^P_k\epsilon _k\) by the definition of L(δ, S), and therefore, the probability of that event, that is, that the volume of the incorrectly pruned region is less than or equal to \(D^P_k\epsilon _k\), can be expressed as

$$\displaystyle \begin{aligned} P\left(v\left(L(\delta , S)\cap \hat{\sigma}^k_p\right)\leq D^P_k\epsilon_k \big| A_k\right)&=P\left(v\left(\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\}\right)\right.\\ &\quad \qquad \left.\leq D^P_k\epsilon_k \big| A_k\right). \end{aligned} $$
(6.52)

Now, consider the probability expression of the quantile in (6.4), and let \(\delta_p=\frac{D^P_k\epsilon_k}{v\left(\hat{\sigma}^k_p\right)}\). We first prove the theorem under the special case that \(y(\delta, S)\) is continuous in δ and \(y(\delta_p,\hat{\sigma}^k_p)\) is continuous in \(\delta_p\), which implies that \(v\left(\left\{x: f(x) = y, x\in \hat{\sigma}^k_p\right\}\right)=0, \forall y\) and that (6.4) holds at equality. When X is a uniform sample in \(\hat{\sigma}^k_p\), we have

$$\displaystyle \begin{aligned} & P\left(f(X)<y\left(\delta_p,\hat{\sigma}^k_p\right)\right)=\frac{v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right)}{v\left(\hat{\sigma}^k_p\right)} = \delta_p =\frac{D^P_k\epsilon_k}{v\left(\hat{\sigma}^k_p\right)}, \end{aligned} $$

then multiplying both sides by \(v\left(\hat{\sigma}^k_p\right)\), we have

$$\displaystyle \begin{aligned} & \qquad v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) = D^P_k\epsilon_k. \end{aligned} $$

Hence, we have

$$\displaystyle \begin{aligned} D^P_k\epsilon_k &= v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \\ &=v\left(\left\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right)\notag\\ &\quad - v\left(\left\{x: f(x) = y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \end{aligned} $$

and in the special case that \(v\left (\{x: f(x) = y, x\in \hat {\sigma }^k_p\}\right )=0, \forall y\), we have

$$\displaystyle \begin{aligned} &\qquad = v\left(\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\}\right). {} \end{aligned} $$
(6.53)

We substitute the expression for \(D^P_k\epsilon_k\) from (6.53) into the probability expression in (6.52), yielding

$$\displaystyle \begin{aligned} & P\left(v\left(\left\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\right\}\right)\leq D^P_k\epsilon_k \Big| A_k\right) \\ &\quad = P\left(v\left(\left\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\right\}\right) \leq v\left(\left\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \Big| A_k\right) \end{aligned} $$

and from the properties of level sets, if \(y(\delta ,S) \leq y\left (\delta _p , \hat {\sigma }^k_p\right )\), then \( \{x: f(x) \leq y(\delta , S), x\in \hat {\sigma }^k_p\} \subseteq \{x: f(x) \leq y\left (\delta _p , \hat {\sigma }^k_p\right ), x\in \hat {\sigma }^k_p\} \), therefore,

$$\displaystyle \begin{aligned} &\quad = P\left(\left. y(\delta, S) \leq y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right) \end{aligned} $$

and in the special case that \(v\left (\left \{x: f(x) = y, x\in \hat {\sigma }^k_p\right \}\right )=0, \forall y\), we have that

$$\displaystyle \begin{aligned} &\quad = P\left(\left. y(\delta, S) < y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right), \end{aligned} $$

and by the condition \(A_k\) and the pruning assumption, we have \(y(\delta, S) \leq y\left(\delta_{ku}, \widetilde{\varSigma}_k^C\right) \leq f(z_{(s)}) < f(x_{(p),(1)})\), where \(x_{(p),(1)}\) is the best sample out of \(D^P_k N^p_k\) independent samples in \(\hat{\sigma}^k_p\), therefore,

$$\displaystyle \begin{aligned} &\quad \geq P\left(\left.f(x_{(p),(1)}) \leq y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right)\\ &\quad =1-P\left(\left.f(x_{(p),(1)}) > y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right), \end{aligned} $$

and since each of the \(D^P_k N^p_k\) independent uniform samples X in \(\hat{\sigma}^k_p\) satisfies \(P\left(f(X) > y\left(\delta_p , \hat{\sigma}^k_p\right)\right) \leq 1- \delta_p\), we have

$$\displaystyle \begin{aligned} &\quad \geq 1-\left(1-\delta_p\right)^{D^P_k N^p_k}. {} \end{aligned} $$
(6.54)

Since \(N^p_k =\left\lceil \frac{\ln{\alpha_k}}{\ln\left(1-\frac{\epsilon_k}{v(\sigma_i)}\right)}\right\rceil\) in Step 4, and \(\delta_p=\frac{D^P_k \epsilon_k}{v(\hat{\sigma}_p)}=\frac{D^P_k \epsilon_k}{D^P_k v(\sigma_i)}=\frac{\epsilon_k}{v(\sigma_i)}\), where \(\sigma_i\) is a subregion pruned at the kth iteration, and \(D^P_k \geq 1\), we know \(N^p_k \geq \frac{\ln{\alpha_k}}{\ln\left(1-\frac{\epsilon_k}{v(\sigma_i)}\right)}=\frac{\ln{\alpha_k}}{\ln{(1-\delta_p)}} \Rightarrow \ln{\left(1-\delta_p\right)^{D^P_k N^p_k}}\leq \ln{\alpha_k} \Rightarrow \left(1-\delta_p\right)^{D^P_k N^p_k}\leq \alpha_k\). Multiplying both sides by \(-1\) and adding one to both sides, the inequality becomes \(1-\left(1-\delta_p\right)^{D^P_k N^p_k}\geq 1-\alpha_k\), hence

$$\displaystyle \begin{aligned} & P\left(v\left(\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\}\right)\leq D^P_k\epsilon_k |A_k\right) \\ &\quad \geq 1-\alpha_k. \end{aligned} $$
(6.55)

This and (6.52) yield the theorem statement in (6.21) in the special case.
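The Step-4 sample size can be checked numerically: \(N^p_k = \left\lceil \ln\alpha_k / \ln(1-\delta_p)\right\rceil\) with \(D^P_k \geq 1\) guarantees \(1-(1-\delta_p)^{D^P_k N^p_k} \geq 1-\alpha_k\), as in (6.54). A quick sketch with illustrative \(\alpha_k\), \(\delta_p\) values of our own choosing:

```python
from math import ceil, log

def min_samples(alpha_k, delta_p):
    """Smallest N with (1 - delta_p)^N <= alpha_k, matching the Step-4 choice."""
    return ceil(log(alpha_k) / log(1 - delta_p))

# Verify the bound 1 - (1 - delta_p)^N >= 1 - alpha_k over a small grid.
for alpha_k in (0.1, 0.05, 0.01):
    for delta_p in (0.01, 0.05, 0.1):
        n = min_samples(alpha_k, delta_p)
        assert 1 - (1 - delta_p) ** n >= 1 - alpha_k
```

For instance, \(\alpha_k = 0.1\) and \(\delta_p = 0.01\) require 230 samples, illustrating how the required sample size grows as the pruning tolerance \(\delta_p\) shrinks.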

Now, in the more general case where \(y(\delta, S)\) and \(y(\delta_p,\hat{\sigma}_p^k)\) may have discontinuities, the volume \(v\left(\left\{x: f(x) = y, x\in \hat{\sigma}^k_p\right\}\right)\) may be positive for some y. The flow of the proof is the same; however, the possibility of discontinuities changes equalities to inequalities while accounting for \(v\left(\left\{x: f(x) = y, x\in \hat{\sigma}^k_p\right\}\right)\), as follows.

When X is a uniform sample in \(\hat{\sigma}^k_p\), the probability expression of the quantile in (6.4) with \(\delta_p=\frac{D^P_k\epsilon_k}{v\left(\hat{\sigma}^k_p\right)}\) can now be expressed as

$$\displaystyle \begin{aligned} & P\left(f(X)<y\left(\delta_p,\hat{\sigma}^k_p\right)\right)=\frac{v\left(\left\{x: f(x) < y(\delta_p , \hat{\sigma}^k_p), x\in \hat{\sigma}^k_p\right\}\right)}{v(\hat{\sigma}^k_p)}\leq \delta_p =\frac{D^P_k\epsilon_k}{v(\hat{\sigma}^k_p)}, \end{aligned} $$

then multiplying both sides by \(v\left(\hat{\sigma}^k_p\right)\), we have

$$\displaystyle \begin{aligned} & v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right)\leq D^P_k\epsilon_k. \end{aligned} $$

Hence, we have

$$\displaystyle \begin{aligned} & D^P_k\epsilon_k \geq v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \\ & \qquad =v\left(\left\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \\ & \qquad \qquad - v\left(\left\{x: f(x) = y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right). {} \end{aligned} $$
(6.56)

We substitute the expression for \(D^P_k\epsilon _k\) from (6.56) into the probability expression in (6.52), and due to the possibility of discontinuities, this is a stricter event, yielding an inequality in the probability as follows:

$$\displaystyle \begin{aligned} & P\left(v\left(\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\}\right)\leq D^P_k\epsilon_k |A_k\right) \\ &\quad \geq P\left(\left.v\left(\left\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\right\}\right) \right.\right.\\ &\quad \leq v\left(\left\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \\ &\qquad \left.\left. - v\left(\left\{x: f(x) = y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) \right| A_k\right) \end{aligned} $$

and since \(v\left(\left\{x: f(x) \leq y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) - v\left(\left\{x: f(x) = y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right) = v\left(\left\{x: f(x) < y\left(\delta_p , \hat{\sigma}^k_p\right), x\in \hat{\sigma}^k_p\right\}\right)\),

and now comparing the level sets associated with y(δ, S) and \(y\left (\delta _p , \hat {\sigma }^k_p\right )\), we see that, if \(y(\delta , S)<y\left (\delta _p , \hat {\sigma }^k_p\right )\), then even in the presence of discontinuities, \( \{x: f(x) \leq y(\delta , S), x\in \hat {\sigma }^k_p\} \subseteq \{x: f(x) < y\left (\delta _p , \hat {\sigma }^k_p\right ), x\in \hat {\sigma }^k_p\}\), so we have the following

$$\displaystyle \begin{aligned} &\quad \geq P\left(\left. y(\delta, S) < y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right), \end{aligned} $$

and by the condition \(A_k\) and the pruning assumption, we have \(y(\delta, S) \leq y\left(\delta_{ku}, \widetilde{\varSigma}_k^C\right) \leq f(z_{(s)}) < f(x_{(p),(1)})\), where \(x_{(p),(1)}\) is the best sample out of \(D^P_k N^p_k\) independent samples in \(\hat{\sigma}^k_p\), therefore,

$$\displaystyle \begin{aligned} &\quad \geq P\left(\left.f(x_{(p),(1)}) \leq y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right)\\ &\quad =1-P\left(\left.f(x_{(p),(1)}) > y\left(\delta_p , \hat{\sigma}^k_p\right) \right| A_k\right), \end{aligned} $$

and since each of the \({D^P_k N^p_k}\) independent uniform samples X in \({\hat {\sigma }^k_p}\) satisfies

$$\displaystyle \begin{aligned} P\left(f(X) > y\left(\delta_p , \hat{\sigma}^k_p\right) \right)=1-P\left(f(X) \leq y\left(\delta_p , \hat{\sigma}^k_p\right) \right) \leq 1- \delta_p, \end{aligned} $$

we have

$$\displaystyle \begin{aligned} &\quad \geq 1-\left(1-\delta_p\right)^{D^P_k N^p_k} \end{aligned} $$
(6.57)

which is the same inequality as in (6.54).

As in the special case, since \(N^p_k =\left\lceil \frac{\ln{\alpha_k}}{\ln\left(1-\frac{\epsilon_k}{v(\sigma_i)}\right)}\right\rceil\) in Step 4, and \(\delta_p=\frac{D^P_k \epsilon_k}{v(\hat{\sigma}_p)}=\frac{D^P_k \epsilon_k}{D^P_k v(\sigma_i)}=\frac{\epsilon_k}{v(\sigma_i)}\), where \(\sigma_i\) is a subregion pruned at the kth iteration, and \(D^P_k \geq 1\), we have that

$$\displaystyle \begin{aligned} & P(v(\{x: f(x) \leq y(\delta , S), x\in \hat{\sigma}^k_p\})\leq D^P_k\epsilon_k |A_k) \\ &\quad \geq 1-\alpha_k\end{aligned} $$
(6.58)

which yields the theorem statement in (6.21) in the general case, too. □


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Zabinsky, Z.B., Huang, H. (2020). A Partition-Based Optimization Approach for Level Set Approximation: Probabilistic Branch and Bound. In: Smith, A. (eds) Women in Industrial and Systems Engineering. Women in Engineering and Science. Springer, Cham. https://doi.org/10.1007/978-3-030-11866-2_6


  • DOI: https://doi.org/10.1007/978-3-030-11866-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11865-5

  • Online ISBN: 978-3-030-11866-2

  • eBook Packages: Engineering, Engineering (R0)
