Abstract
We consider the following semi-infinite linear programming problems: \(\max \) (resp., \(\min \)) \(c^Tx\) s.t. \(y^TA_ix+(d^i)^Tx \le b_i\) (resp., \(y^TA_ix+(d^i)^Tx \ge b_i\)), for all \(y \in {{\mathcal {Y}}}_i\), for \(i=1,\ldots ,N\), where \({{\mathcal {Y}}}_i\subseteq {\mathbb {R}}^{m_i}_+\) are given compact convex sets and \(A_i\in {\mathbb {R}}^{m_i\times n}_+\), \(b=(b_1,\ldots ,b_N)\in {\mathbb {R}}_+^N\), \(d^i\in {\mathbb {R}}_+^n\), and \(c\in {\mathbb {R}}_+^n\) are given non-negative matrices and vectors. This general framework is useful for modeling many interesting problems. For example, it can represent a sub-class of robust optimization in which the coefficients of the constraints are drawn from convex uncertainty sets \({{\mathcal {Y}}}_i\), and the goal is to optimize the objective function for the worst-case choice in each \({{\mathcal {Y}}}_i\). When the uncertainty sets \({{\mathcal {Y}}}_i\) are ellipsoids, we obtain a sub-class of second-order cone programming. We show how to extend the multiplicative weights update method to derive approximation schemes for the above packing and covering problems. When the sets \({{\mathcal {Y}}}_i\) are simple, such as ellipsoids or boxes, this yields substantial improvements in running time over general convex programming solvers. We also consider the mixed packing/covering problem, in which both packing and covering constraints are given and the objective is to find an approximately feasible solution.
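As a hedged illustration of the ellipsoidal case mentioned in the abstract (toy data we assume for this sketch, not an instance from the paper), the following checks the standard closed form for the worst-case value of a robust constraint over an ellipsoid \(E=\{y_0+Bu:\Vert u\Vert \le 1\}\), namely \(\max _{y\in E} y^T(Ax) = y_0^T(Ax)+\Vert B^T(Ax)\Vert \), which is exactly what turns the robust constraint into a second-order cone constraint.

```python
import numpy as np

# Toy instance (assumed data): one robust constraint max_{y in E} y^T A x
# over the ellipsoid E = { y0 + B u : ||u|| <= 1 }.  The worst case has the
# closed form  y0^T (A x) + || B^T (A x) ||,  a second-order cone expression.

A = np.array([[1.0, 2.0], [0.5, 1.5]])   # constraint matrix A_i (assumed)
x = np.array([0.3, 0.7])                 # a candidate solution
y0 = np.array([1.0, 1.0])                # ellipsoid center
B = np.array([[0.2, 0.0], [0.1, 0.3]])   # ellipsoid shape matrix

v = A @ x
closed_form = y0 @ v + np.linalg.norm(B.T @ v)

# Sanity check against a dense grid of unit directions u (m = 2).
thetas = np.linspace(0.0, 2.0 * np.pi, 100001)
us = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
sampled = np.max((y0 + us @ B.T) @ v)

assert sampled <= closed_form + 1e-9   # grid max never exceeds closed form
assert closed_form - sampled <= 1e-6   # and matches it to high accuracy
```

The grid maximum agrees with the closed form because \(\max _{\Vert u\Vert \le 1} u^Tw=\Vert w\Vert \) is attained at \(u=w/\Vert w\Vert \).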
Notes
This is already part of the problem definition, but we repeat it here for ease of reference in the rest of the paper.
\({\tilde{O}}(\cdot )\) suppresses polylogarithmic factors that depend on m, N, and \(\frac{1}{\epsilon }\).
Note that the definition of these oracles comes naturally from the corresponding algorithms for packing/covering LPs; whether weaker oracles suffice is an interesting open question.
A typical example is the so-called Gaussian kernel, where \(q_{ij}=e^{-\Vert z^i-z^j\Vert ^2/(2\sigma ^2)}\).
That is, \(\log p\) is concave.
Throughout “\(\log \)” denotes the natural logarithm.
\({\tilde{O}}(\cdot )\) suppresses polylogarithmic factors that depend on m, N, and \(\epsilon \).
We assume here the natural logarithm.
In fact, in the packing algorithm, \(y^TA_ix(t)\) is bounded from above by \(T/(1-\epsilon _3)\) by the property of the oracle MinVec and the fact that \(M(t)<T\), while in the covering algorithm, it is bounded by T since the integral is taken over the active subset \({{\mathcal {Y}}}_i(t)\).
References
Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. 95(1), 3–51 (2003)
Allen-Zhu, Z., Lee, Y.T., Orecchia, L.: Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1824–1831 (2016). http://dl.acm.org/citation.cfm?id=2884435.2884562
Allen-Zhu, Z., Orecchia, L.: Nearly-linear time positive LP solver with faster convergence rate. In: Proceedings of the 47th Symposium on Theory of Computing (STOC), pp. 229–236 (2015). https://doi.org/10.1145/2746539.2746573
Arora, S., Hazan, E., Kale, S.: Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In: Proceedings of the 46th Symposium on Foundations of Computer Science (FOCS), pp. 339–348 (2005)
Arora, S., Kale, S.: A combinatorial, primal–dual approach to semidefinite programs. In: Proceedings of the 39th Symposium on Theory of Computing (STOC), pp. 227–236 (2007)
Bartal, Y., Byers, J., Raz, D.: Fast, distributed approximation algorithms for positive linear programming with applications to flow control. SIAM J. Comput. 33(6), 1261–1279 (2004)
Beck, A., Teboulle, M.: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3), 167–175 (2003). https://doi.org/10.1016/S0167-6377(02)00231-6
Ben-Tal, A., Hazan, E., Koren, T., Mannor, S.: Oracle-based robust optimization via online learning. Oper. Res. 63(3), 628–638 (2015)
Ben-Tal, A., Nemirovski, A.: Robust optimization: methodology and applications. Math. Program. 92(3), 453–480 (2002)
Bertsimas, D., Brown, D.B., Caramanis, C.: Theory and applications of robust optimization. SIAM Rev. 53(3), 464–501 (2011)
Bertsimas, D., Thiele, A.: A robust optimization approach to supply chain management. In: Integer Programming and Combinatorial Optimization, 10th International IPCO Conference, New York, NY, USA, 7–11 June 2004, Proceedings, pp. 86–100 (2004)
Bhalgat, A., Gollapudi, S., Munagala, K.: Optimal auctions via the multiplicative weight method. In: Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pp. 73–90. ACM, New York (2013)
Blum, A.: On-line algorithms in machine learning. In: Developments from a June 1996 Seminar on Online Algorithms: The State of the Art, pp. 306–325. Springer, London (1998)
Brönnimann, H., Goodrich, M.T.: Almost optimal set covers in finite VC-dimension. Discrete Comput. Geom. 14(4), 463–479 (1995)
Brown, G.W.: Iterative solution of games by fictitious play. Activity Anal. Prod. Allocation 13(1), 374–376 (1951)
Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3–4), 231–357 (2015). https://doi.org/10.1561/2200000050
Chau, C.K., Elbassioni, K., Khonji, M.: Truthful mechanisms for combinatorial ac electric power allocation. In: Proceedings of the 2014 International Conference on Autonomous Agents and Multi-agent Systems, pp. 1005–1012. International Foundation for Autonomous Agents and Multiagent Systems (2014)
Chazelle, B.: The Discrepancy Method: Randomness and Complexity. Cambridge University Press, New York (2000)
Caramanis, C., Mannor, S., Xu, H.: Robust Optimization in Machine Learning, pp. 369–402. MIT Press, Cambridge (2012)
Daskalakis, C., Deckelbaum, A., Kim, A.: Near-optimal no-regret algorithms for zero-sum games. In: Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’11, pp. 235–254. SIAM (2011)
Diedrich, F., Jansen, K.: Faster and simpler approximation algorithms for mixed packing and covering problems. Theor. Comput. Sci. 377(1–3), 182–204 (2007)
Elbassioni, K., Makino, K., Mehlhorn, K., Ramezani, F.: On randomized fictitious play for approximating saddle points over convex sets. Algorithmica 73(2), 441–459 (2015). https://doi.org/10.1007/s00453-014-9902-8
Elbassioni, K., Nguyen, T.T.: Approximation schemes for binary quadratic programming problems with low CP-rank decompositions. arXiv preprint arXiv:1411.5050 (2014)
Freund, Y., Schapire, R.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29(1–2), 79–103 (1999)
Garg, N., Khandekar, R.: Fractional covering with upper bounds on the variables: solving LPs with negative entries. In: Proceedings of the 14th European Symposium on Algorithms (ESA), pp. 371–382 (2004)
Garg, N., Könemann, J.: Faster and simpler algorithms for multicommodity flow and other fractional packing problems. In: Proceedings of the 39th Symposium on Foundations of Computer Science (FOCS), pp. 300–309 (1998)
Goldfarb, D., Iyengar, G.: Robust portfolio selection problems. Math. Oper. Res. 28(1), 1–38 (2003)
Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control. Lecture Notes in Control and Information Sciences, pp. 95–110. Springer, Berlin (2008)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1 (2014)
Grigoriadis, M., Khachiyan, L.: Approximate solution of matrix games in parallel. In: Advances in Optimization and Parallel Computing, pp. 129–136 (1992)
Grigoriadis, M., Khachiyan, L.: A sublinear-time randomized approximation algorithm for matrix games. Oper. Res. Lett. 18(2), 53–58 (1995)
Grigoriadis, M., Khachiyan, L.: Coordination complexity of parallel price-directive decomposition. Math. Oper. Res. 21(2), 321–340 (1996)
Grigoriadis, M.D., Khachiyan, L.G., Porkolab, L., Villavicencio, J.: Approximate max–min resource sharing for structured concave optimization. SIAM J. Optim. 11(4), 1081–1091 (2001)
Hazan, E.: Efficient algorithms for online convex optimization and their application. Ph.D. thesis, Princeton University, USA (2006)
Helmbold, D.P., Schapire, R.E., Singer, Y., Warmuth, M.K.: On-line portfolio selection using multiplicative updates. In: Machine Learning, Proceedings of the Thirteenth International Conference (ICML ’96), Bari, Italy, 3–6 July 1996, pp. 243–251 (1996)
Jain, R., Yao, P.: A parallel approximation algorithm for positive semidefinite programming. In: IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, 22–25 October 2011, pp. 463–471 (2011)
Kale, S.: Efficient algorithms using the multiplicative weights update method. Ph.D. thesis, Princeton University, USA (2007)
Khandekar, R.: Lagrangian relaxation based algorithms for convex programming problems. Ph.D. thesis, Indian Institute of Technology, Delhi (2004)
Koufogiannakis, C., Young, N.: Beating simplex for fractional packing and covering linear programs. In: Proceedings of the 48th Symposium on Foundations of Computer Science (FOCS), pp. 494–504 (2007)
Littlestone, N., Warmuth, M.: The weighted majority algorithm. Inf. Comput. 108(2), 212–261 (1994)
Lobo, M.S., Vandenberghe, L., Boyd, S., Lebret, H.: Applications of second-order cone programming. Linear Algebra Appl. 284(1–3), 193–228 (1998)
López, M., Still, G.: Semi-infinite programming. Eur. J. Oper. Res. 180(2), 491–518 (2007)
Lorenz, R., Boyd, S.: Robust minimum variance beamforming. IEEE Trans. Signal Process. 53(5), 1684–1696 (2005)
Lovász, L., Vempala, S.: Fast algorithms for logconcave functions: sampling, rounding, integration and optimization. In: Proceedings of the 47th Symposium on Foundations of Computer Science (FOCS), pp. 57–68 (2006)
Luby, M., Nisan, N.: A parallel approximation algorithm for positive linear programming. In: Proceedings of the 25th Symposium on Theory of Computing (STOC), pp. 448–457 (1993)
Magaril-Il’yaev, G.G., Tikhomirov, V.M.: Convex Analysis: Theory and Applications, vol. 222. American Mathematical Society, Providence (2003)
Nemirovskii, A., Yudin, D.: Problem Complexity and Method Efficiency in Optimization. A Wiley-Interscience Publication. Wiley, London (1983)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate O(1/k^2). Sov. Math. Dokl. 27, 372–376 (1983)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course, vol. 87. Springer, Berlin (2013)
Peng, R., Tangwongsan, K.: Faster and simpler width-independent parallel algorithms for positive semidefinite programming. In: Proceedings of the Twenty-Fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’12, pp. 101–108. ACM, New York (2012)
Plotkin, S., Shmoys, D., Tardos, É.: Fast approximation algorithms for fractional packing and covering problems. In: Proceedings of the 32nd Symposium on Foundations of Computer Science (FOCS), pp. 495–504 (1991)
Robinson, J.: An iterative method of solving a game. Ann. Math. 54(2), 296–301 (1951)
Shivaswamy, P.K., Bhattacharyya, C., Smola, A.J.: Second order cone programming approaches for handling missing and uncertain data. J. Mach. Learn. Res. 7, 1283–1314 (2006)
Tsang, I.W., Kwok, J.T.: Efficient hyperkernel learning using second-order cone programming. IEEE Trans. Neural Netw. 17(1), 48–58 (2006)
Tulabandhula, T., Rudin, C.: Robust optimization using machine learning for uncertainty sets. In: International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014, Fort Lauderdale, FL, USA, 6–8 January 2014 (2014)
Wang, D., Rao, S., Mahoney, M.W.: Unified acceleration method for packing and covering problems via diameter reduction. arXiv preprint arXiv:1508.02439 (2015)
Young, N.: Sequential and parallel algorithms for mixed packing and covering. In: Proceedings of the 42nd Symposium on Foundations of Computer Science (FOCS), pp. 538–546 (2001)
Zass, R., Shashua, A.: A unifying approach to hard and probabilistic clustering. In: Tenth IEEE International Conference on Computer Vision, 2005. ICCV 2005, vol. 1, pp. 294–301. IEEE (2005)
Zhu, Z.A., Orecchia, L.: A novel, simple interpretation of Nesterov’s accelerated method as a combination of gradient and mirror descent. CoRR arXiv:1407.1537 (2014)
Appendices
Efficient Implementations of the Oracle Integral
In this section we discuss how to efficiently implement the oracle Integral\((p,f,\epsilon ,\sigma ,{{\mathcal {Y}}})\) for a given compact convex set \({{\mathcal {Y}}}\), a log-concave function \(p:{{\mathcal {Y}}}\rightarrow {\mathbb {R}}_+\), a function \(f:{{\mathcal {Y}}}\rightarrow {\mathbb {R}}^k\) which is either \(f(y):=1\) or \(f(y):=y\), and \(\epsilon ,\sigma \in [0,1)\).
Write \(F(y):=f(y)p(y)\) for \(y\in {{\mathcal {Y}}}\). Then \(F_i(y)\) is log-concave for all \(i\in [k]\). Thus, the question is how to efficiently approximate \(\int _{{{\mathcal {Y}}}}F_i(y)\,dy\).
1.1 General Convex Sets
Lovász and Vempala show how to implement the integration oracle in \({\tilde{O}}(m^4)\) time. More precisely, they prove the following result.
Theorem 4
[44] Let \(F:Y\rightarrow {\mathbb {R}}\) be a log-concave function over a compact convex set \(Y\subseteq {\mathbb {R}}^m\). Assume also the availability of a point \(\tilde{y}\) that (approximately) maximizes F over Y. Given \(\epsilon ,\sigma >0\), a number A can be computed such that with probability \(1-\sigma \),
in \(O\left( \displaystyle \frac{m^4}{\epsilon ^2}\log ^7\frac{m}{\epsilon \sigma }\right) \) oracle calls, each involving either a membership oracle for Y or an evaluation of F.
1.2 Reduction to Volume Computation
We consider here the particular case that we actually need in our algorithms, where \(F_i(y)\) is either \((1\pm \epsilon )^{a^Ty}\) or \(y_i(1\pm \epsilon )^{a^Ty}\), for some \(a\in {\mathbb {R}}_+^m\), and where \(a^Ty\le T/m\) for all \(y \in {{\mathcal {Y}}}\subseteq {\mathbb {R}}_+^m\). The standard idea is to slice \({{\mathcal {Y}}}\) perpendicular to the direction of a and to approximate the value of \(a^Ty\) on each slice by its value at the slice boundary.
For simplicity let us consider the case when \(F_i(y)=F(y)=(1+\epsilon )^{a^Ty}\), and we can compute the maximum and minimum of a linear function over Y exactly. Let \(y_{\min }\in {\text {argmin}}_{y\in {{\mathcal {Y}}}}a^Ty\), \(y_{\max }\in {\text {argmax}}_{y\in {{\mathcal {Y}}}}a^Ty\), and define the set of points \(y_{\min }=y^0,y^1,\ldots ,y^r=y_{\max }\) in Y, such that \(a^Ty^i=a^Ty^{i-1}+\eta \) for \(i=1,\ldots ,r\), where \(\eta :=\frac{\log (1+\epsilon _1)}{\log (1+\epsilon )}\) and \(r:=\lceil \frac{a^Ty_{\max }-a^Ty_{\min }}{\eta }\rceil \le \lceil \frac{T}{m\eta }\rceil \). Note that \((1+\epsilon )^{a^Ty^i}=(1+\epsilon _1)(1+\epsilon )^{a^Ty^{i-1}}\). Note also that, once we have \(y_{\min }\) and \(y_{\max }\), we can easily compute \(y^i=y_{\min }+\lambda _i(y_{\max }-y_{\min })\), where \(\lambda _i:=\frac{\eta \cdot i}{a^Ty_{\max }-a^Ty_{\min }}\), for \(i=1,\ldots ,r\).
For \(i=1,\ldots ,r\), define the slice \(Y(i):=\{y\in {{\mathcal {Y}}}:~a^Ty^{i-1}\le a^Ty\le a^Ty^i\}\). Then it is easy to see that
satisfies (33). Indeed,
If \(f(y)=y\), then \({\text {vol}}(Y(i))\) in (34) should be replaced by \(\int _{Y(i)}y\,dy\).
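A minimal one-dimensional sketch of this slicing scheme (a toy instance with assumed parameters): on each slice the integrand varies by at most a factor \(1+\epsilon _1\), so summing the boundary value times the slice volume approximates the integral within that factor.

```python
import math

# Toy 1-D instance (assumed data): approximate int_0^1 (1+eps)^{a y} dy by
# slicing [0, 1] so that the integrand grows by exactly (1+eps1) per slice,
# and summing value-at-lower-boundary * slice volume (here, slice length).

eps, eps1 = 0.5, 0.01
a = 3.0
y_min, y_max = 0.0, 1.0

eta = math.log(1 + eps1) / math.log(1 + eps)   # step in the value of a*y
r = math.ceil((a * y_max - a * y_min) / eta)   # number of slices

approx = 0.0
for i in range(1, r + 1):
    lo = y_min + (i - 1) * eta / a
    hi = min(y_min + i * eta / a, y_max)
    approx += (1 + eps) ** (a * lo) * (hi - lo)  # boundary value * slice vol

# Exact value of int_0^1 (1+eps)^{a y} dy in closed form.
exact = ((1 + eps) ** a - 1) / (a * math.log(1 + eps))

# The slice sum under-estimates by at most a factor (1+eps1).
assert approx <= exact <= (1 + eps1) * approx
```

The same bookkeeping carries over to higher dimensions, with slice lengths replaced by slice volumes \({\text {vol}}(Y(i))\).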
1.3 Euclidean Balls
For the case of Euclidean balls (and, similarly, ellipsoids), the integral computation can be further reduced to numerical integration over a single variable.
For \(y^0\in {\mathbb {R}}^m\) and \(\rho \in {\mathbb {R}}_+\), let \({\mathbf {B}}_m(y^0,\rho ):=\{y\in {\mathbb {R}}^m:~\Vert y-y^0\Vert \le \rho \}\) be the ball of radius \(\rho \) centered at \(y^0\). Denote by \({\text {vol}}_m({\mathbf {B}}_m(y^0,\rho ))\) the volume of the ball \({\mathbf {B}}_m(y^0,\rho )\) in \({\mathbb {R}}^m\).
Lemma 24
Let \(Y:={\mathbf {B}}_m(y^0,\rho )\), \(a\in {\mathbb {R}}^m\) be a given vector, and \(F(y):=y(1+\epsilon )^{a^Ty}\) for \(y\in {\mathbb {R}}^m\). Let \({\hat{y}}:={\text {argmin}}\{a^Ty:~y\in {{\mathcal {Y}}}\}\). Then
where \(v_{m-1} (h)\) is the volume of an \((m-1)\)-dimensional ball of radius \(\sqrt{h(2\rho -h)}\).
Proof
To simplify matters, we use the change of variable \(z=Uy\), where \(U\in {\mathbb {R}}^{m\times m}\) is a rotation transformation (i.e., \(U^TU=I\)) such that \(Ua = \Vert a\Vert \varvec{1}_m\). Hence, \(dy = \prod _i dy_i = \det U \prod _i dz_i = dz\). Note that this transformation maps the ball \({\mathbf {B}}_m(y^0,\rho )\) to the ball \(Z:={\mathbf {B}}_m(Uy^0,\rho )\), that is, \(U {{\mathcal {Y}}}=Z\).
For a given \(h\in [0,2\rho ]\), let \({\mathbf {B}}(h):={\mathbf {B}}_{m-1}({\hat{z}}+h\varvec{1}_m,\sqrt{h(2\rho -h)})\) be the lower-dimensional ball centered at \(U \left( {\hat{y}}+ha/||a||\right) = {\hat{z}}+h\varvec{1}_m\) with radius \(\sqrt{\rho ^2-(\rho -h)^2}=\sqrt{h(2\rho -h)}\). For any vector \(z \in {\mathbb {R}}^m\), we denote by \(z_{{\overline{m}}}\) the vector of the first \(m-1\) components of z, and by \(z_m\) its mth component.
The second-to-last equality holds because, for \(i=1,\ldots ,m-1\), \(\int _{{\mathbf {B}}(h)} z_i\,dz_{\overline{m}}={\hat{z}}_i{\text {vol}}_{m-1} ({\mathbf {B}}(h))\) by symmetry, while \(\int _{{\mathbf {B}}(h)} ({\hat{z}}_m+h)\,dz_{{\overline{m}}}=({\hat{z}}_m+h){\text {vol}}_{m-1} ({\mathbf {B}}(h))\). \(\square \)
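The reduction in Lemma 24 can be checked numerically in the scalar case \(F(y)=(1+\epsilon )^{a^Ty}\) for \(m=2\) (a sketch with assumed toy data; the \((m-1)\)-dimensional volume of a slice of radius r is here simply the chord length 2r):

```python
import numpy as np

# Numerical sanity check (m = 2, toy data) of the slicing reduction over a
# ball: with yhat the minimizer of a^T y over the ball, slicing perpendicular
# to a at height h in [0, 2*rho] gives
#   int_B (1+eps)^{a^T y} dy
#     = int_0^{2 rho} (1+eps)^{a^T yhat + ||a|| h} * 2*sqrt(h*(2 rho - h)) dh.

eps = 0.5
a = np.array([1.0, 2.0])
y0 = np.array([0.5, -0.3])   # ball center
rho = 0.8                    # ball radius

na = np.linalg.norm(a)
yhat = y0 - rho * a / na     # minimizer of a^T y over the ball

# Right-hand side: trapezoidal quadrature of the 1-D integral over h.
hs = np.linspace(0.0, 2.0 * rho, 20001)
chord = 2.0 * np.sqrt(np.clip(hs * (2.0 * rho - hs), 0.0, None))
tau = (1 + eps) ** (a @ yhat + na * hs) * chord
rhs = np.sum(0.5 * (tau[1:] + tau[:-1])) * (hs[1] - hs[0])

# Left-hand side: direct 2-D grid quadrature over the disk.
g = np.linspace(-rho, rho, 1201)
X, Y = np.meshgrid(y0[0] + g, y0[1] + g)
inside = (X - y0[0]) ** 2 + (Y - y0[1]) ** 2 <= rho ** 2
vals = (1 + eps) ** (a[0] * X + a[1] * Y) * inside
lhs = vals.sum() * (g[1] - g[0]) ** 2

assert abs(lhs - rhs) / rhs < 2e-2   # agree up to quadrature error
```

Only the one-dimensional integral over h needs to be computed numerically, which is the point of the reduction.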
Corollary 5
Suppose that there is an algorithm that approximates the integral \(\int _{h_1}^{h_2}\tau (h)dh\) to within an additive error \(\epsilon \), in time \(q(\tau _{\max },\frac{1}{\epsilon })\), where \(\tau _{\max }:=\max _{h\in [h_1,h_2]}\log \tau (h)\). Then there is an algorithm that approximates the integral in (35) within an additive error of \(\epsilon \) in time \(q(O(m+\frac{T}{m}+H)\log m,\frac{1}{\epsilon })\), where H is the maximum number of bits needed to represent any of the components of a, \(\rho \), and \(y^0\).
Proof
The function \(\tau (h)\) inside the integral on the R.H.S. of (35) can be written as
where \(\varGamma \) is Euler’s gamma function, and \(\Vert a\Vert h\le T/m\). It can be easily verified that \(\tau _{\max }=O((m+\frac{T}{m}+H)\log m)\). \(\square \)
Proofs Omitted from Section 5.1
Lemma 25
\(\varPhi (t+1) \le \displaystyle \varPhi (t) \exp \Big ( -{\epsilon }\delta (t)\sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} \frac{p_i(y,t)}{\varPhi (t)} y^TA_i\mathbf {1}_{j(t)}dy \Big )\).
Proof
First, we note that
The last inequality is due to the fact that updates of x(t) are non-negative, and hence, \(g_i(y)^Tx(t)\le g_i(y)^Tx(t)+\delta (t) g_i(y)^T\varvec{1}_{j(t)}=g_i(y)^Tx(t+1)\), implying that \({{\mathcal {Y}}}_i(t+1) \subseteq {{\mathcal {Y}}}_i(t)\) and \(I(t+1)\subseteq I(t)\).
By the definition of the oracle MaxVec, the exponent of \((1-\epsilon )\) satisfies
Recalling that for \(z \in [0,1]\) and \(a\in [0,1]\), \((1-a)^z \le 1-az\), we have
Finally, using \(1-z \le e^{-z}\) for all z, we get:
\(\square \)
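The two elementary inequalities invoked in this proof can be spot-checked numerically (a quick sketch over a grid of values):

```python
import numpy as np

# Check the two elementary inequalities used in the proof of Lemma 25:
#   (1-a)^z <= 1 - a*z   for a, z in [0, 1]   (convexity of z -> (1-a)^z),
#   1 - z   <= exp(-z)   for all real z.

a = np.linspace(0.0, 0.999, 500)   # stop short of a = 1 to avoid 0^0 corners
z = np.linspace(0.0, 1.0, 500)
A, Z = np.meshgrid(a, z)

assert np.all((1 - A) ** Z <= 1 - A * Z + 1e-12)

z2 = np.linspace(-5.0, 5.0, 1001)
assert np.all(1 - z2 <= np.exp(-z2) + 1e-12)
```

The first inequality holds because \(z\mapsto (1-a)^z\) is convex and agrees with the chord \(1-az\) at \(z=0\) and \(z=1\).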
Define \(1-{\bar{\epsilon }}:=\frac{(1-\epsilon _2)(1-\epsilon _1)}{1+\epsilon _1}\).
Lemma 26
Let \(\kappa (t):=\sum _{t'=0}^{t-1} \delta (t')\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\mathbf {1}_{j(t')}dy\). Then with probability at least \(1-2N\sigma t\), \(\kappa (t) \ge \frac{(1-{\bar{\epsilon }})\varvec{1}^Tx(t)}{z^*}\) for all t.
Proof
Let \(x^*\) be an optimal solution for (Covering). Then by the feasibility of \(x^*\) for (Covering), \(g_i(y)^Tx^* \ge 1\) for all \(y \in {{\mathcal {Y}}}_i\) and for all \(i\in [N]\). Multiplying both sides by \(p_i(y,t')\), integrating over the respective active sets, and summing over all \(i\in I(t')\), we get
Therefore,
For \({i\in I(t')}\), let \({\bar{y}}^i(t')\) and \({\bar{\phi }}^i(t')\) be the outputs of the oracles \({\textsc {Integral}}(p_i(y,t'),f(y):=y,\epsilon _1,\sigma ,{{\mathcal {Y}}}_i(t'))\) and \({\textsc {Integral}}(p_i(y,t'), f(y):=1,\epsilon _1,\sigma , {{\mathcal {Y}}}_i(t'))\) in steps 15 and 16 of iteration \(t'\), respectively. Then we have by the union bound,
which together imply
These together with the setting of \(j(t')\) in step 18 give that with probability at least \( 1-2N\sigma \),
It follows that with probability at least \( 1-2N\sigma \),
Multiplying both sides of (40) by \(x_j^*/\varPhi (t')\) (a non-negative quantity), summing over all \(j\in [n]\), and recalling that \(z^* = \varvec{1}^Tx^* = \sum _j x_j^*\), we get that with probability at least \( 1-2N\sigma \),
where the last inequality follows from (39). Using this result, we can conclude that with probability at least \( 1-2N\sigma \),
Here, the last equality is a result of the update formula for \(x(t+1)\) in step 23 of Algorithm 4. \(\square \)
Lemma 27
For all t, with probability at least \( 1-2N\sigma t\), it holds that
Proof
By repeated application of (38) and using Lemma 9 we can bound \(\varPhi (t)\) as follows:
\(\square \)
Lemma 28
Suppose \(\varPhi (t) \le \gamma \) for some \(\gamma >0\) and some iteration t of the algorithm. Assume also that there is \(v_i>0\) such that \({\text {vol}}({{\mathcal {Y}}}_i(t))\ge v_i\) for all \(i\in I(t)\). In addition, suppose that for given \(T>0\) and \(\epsilon _3 >0\) there exists an \(\alpha \) such that
where \(\alpha _0:=2T\). Then \((1-\epsilon )^{g_i(y)^Tx(t)} \le \gamma (1-\epsilon )^{-\alpha }\) for all \(y \in {{\mathcal {Y}}}_i(t)\), for all \(i \in I(t)\).
Proof
Towards a contradiction, suppose that there exist \(i\in I(t)\) and \(y^* \in {{\mathcal {Y}}}_i(t)\) such that \((1-\epsilon )^{g_i({y^*})^Tx(t)} > \gamma (1-\epsilon )^{-\alpha }\) in iteration t. Then define \(\mu ^* = \frac{\alpha }{\alpha _0}\in (0,1)\). Also, define the following sets:
Claim 7
\(y^*+\mu '(y-y^*) \in {{\mathcal {Y}}}_i^{++}(t)\) for all \(\mu ' \in [0,\frac{1}{\mu ^*}]\) and \(y \in {{\mathcal {Y}}}_i^+(t)\). In particular, \({{\mathcal {Y}}}_i^+(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\).
Proof
\(y^*\) is in \({{\mathcal {Y}}}_i^+(t)\) since \((y^*-y^*)^TA_ix(t)=0\le \alpha /2\). \({{\mathcal {Y}}}_i^+(t)\) is the intersection of a half-space and \({{\mathcal {Y}}}_i(t)\), which are both convex. Thus, \({{\mathcal {Y}}}_i^+(t)\) is convex. Therefore, for all \(\mu ' \in [0,\frac{1}{\mu ^*}]\) and for any \(y \in {{\mathcal {Y}}}_i^+(t)\), we have
Consider the point \(y^*+\mu '\mu ^*(y-y^*)\). Since this point is in \({{\mathcal {Y}}}_i^+(t)\), we can substitute it into the definition of \({{\mathcal {Y}}}_i^{++}(t)\) to get that
In particular for \(\mu '=1\), we have \(y\in {{\mathcal {Y}}}_i^{++}(t)\), implying the claim. \(\square \)
Claim 8
\({\text {vol}}({{\mathcal {Y}}}_i^{++}(t)) = \displaystyle \left( \frac{1}{\mu ^*}\right) ^{m_i} {\text {vol}}({{\mathcal {Y}}}_i^+(t))\)
Proof
Immediate from the definition in (45). \(\square \)
Claim 9
\({{\mathcal {Y}}}_i(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\).
Proof
Suppose for a contradiction that there exists a point \(y \in {{\mathcal {Y}}}_i(t)\backslash {{\mathcal {Y}}}_i^{++}(t)\). Then by Claim 7, y is also outside \({{\mathcal {Y}}}_i^+(t)\). Now define:
(Note that the maximum exists since \({{\mathcal {Y}}}_i^+(t)\) is closed and bounded.) Define also \(y^+ = y^* + \mu ^+(y-y^*)\). By the definition of \(\mu ^+\), \(y^+\) is on the boundary of \({{\mathcal {Y}}}_i^+(t)\), and the segment joining \(y\not \in {{\mathcal {Y}}}_i^+(t)\) and \(y^*\in {{\mathcal {Y}}}_i^+(t)\) crosses the hyperplane \(\{z\in {\mathbb {R}}^{m_i}: (z-y^*)^TA_ix(t)=\frac{\alpha }{2}\}\) at \(z=y^+\). This implies that \((y^+-y^*)^TA_ix(t) = \frac{\alpha }{2}\). Therefore,
Thus,
where the first inequality is due to the non-negativity of \(y^{*T}A_ix(t)\) (by (A1)) and \((d^i)^Tx(t)\), the second is because \(\mu ^+ < \mu ^*\) (by Claim 7, as \(y^*+\frac{1}{\mu ^+}(y^+-y^*)\not \in {{\mathcal {Y}}}_i^{++}(t)\)), and the third is because \(y \in {{\mathcal {Y}}}_i(t)\) and thus \(g_i(y)^Tx(t) \le T\) by the definition of \({{\mathcal {Y}}}_i(t)\). But then plugging \(\mu ^*=\frac{\alpha }{\alpha _0}\) in (46) gives \(\alpha _0<2T\), contradicting the definition of \(\alpha _0\). Thus, no points \(y \in {{\mathcal {Y}}}_i(t) \backslash {{\mathcal {Y}}}_i^{++}(t)\) exist, and \({{\mathcal {Y}}}_i(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\). \(\square \)
Recalling that \(y^TA_ix(t) \le {y^*}^TA_ix(t)+\alpha /2\), and hence, \(g_i(y)^Tx(t)\le g_i(y^*)^Tx(t)+\alpha /2\), for all y in \({{\mathcal {Y}}}_i^+(t)\), we have
From the claims above, the volumes of the sets \({{\mathcal {Y}}}_i(t)\), \({{\mathcal {Y}}}_i^+(t)\) and \({{\mathcal {Y}}}_i^{++}(t)\) are related by \({\text {vol}}({{\mathcal {Y}}}_i^+(t)) = \left( \mu ^*\right) ^{m_i}{\text {vol}}({{\mathcal {Y}}}_i^{++}(t)) \ge \left( {\mu ^*}\right) ^{m_i}{\text {vol}}({{\mathcal {Y}}}_i(t))\ge \left( {\mu ^*}\right) ^{m_i} v_i\). Therefore,
The point \(y^*\) is one such that \((1-\epsilon )^{g_i({y^*})^Tx(t)} > \gamma (1-\epsilon )^{-\alpha }\). Also, \(\alpha \) was chosen such that \((1-\epsilon )^{-\alpha /2}\left( \frac{\alpha }{\alpha _0}\right) ^{m_i}v_i>1\). Using these two facts in (47), we obtain
This contradicts the hypothesis of the lemma. \(\square \)
Elbassioni, K., Makino, K. & Najy, W. A Multiplicative Weight Updates Algorithm for Packing and Covering Semi-infinite Linear Programs. Algorithmica 81, 2377–2429 (2019). https://doi.org/10.1007/s00453-018-00539-4