
Asynchronous level bundle methods

  • Franck Iutzeler
  • Jérôme Malick
  • Welington de Oliveira
Full Length Paper, Series A

Abstract

We consider nonsmooth convex optimization problems with additive structure, featuring independent oracles (black boxes) working in parallel. Existing methods for solving these distributed problems in a general form are synchronous, in the sense that they wait for the responses of all the oracles before performing a new iteration. In this paper, we propose level bundle methods that handle asynchronous oracles. These methods rely on original upper bounds (obtained from upper-models or from scarce coordinations) to cope with asynchronicity. We prove their convergence using variational-analysis techniques and illustrate their practical performance on a Lagrangian decomposition problem.
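The abstract describes the algorithmic ingredients only at a high level. As a rough, non-authoritative illustration of the kind of scheme involved (not the authors' method), the sketch below runs a plain level bundle loop for an additively structured toy objective f(x) = sum_i f_i(x): each component oracle returns a value and a subgradient, the master keeps one cutting-plane model per component, a lower bound comes from minimizing the aggregated model over a box, and the next iterate is the projection of the stability center onto the level set of the model. Asynchrony is only emulated (a single, randomly chosen oracle "responds" per iteration), and a valid upper bound is refreshed by occasionally querying all oracles, loosely mimicking the "scarce coordinations" idea; the problem data, parameter values, and the use of NumPy/SciPy are assumptions made purely for illustration.

```python
# Minimal illustrative sketch of a level bundle loop with component-wise cuts.
# Everything below (data, delays, parameters) is made up for illustration.
import numpy as np
from scipy.optimize import linprog, minimize

rng = np.random.default_rng(0)
n, m = 2, 3                                    # dimension, number of component oracles
A = [rng.standard_normal((4, n)) for _ in range(m)]
b = [rng.standard_normal(4) for _ in range(m)]

def oracle(i, x):
    """Component oracle: value and one subgradient of f_i(x) = max_j (A_i x + b_i)_j."""
    v = A[i] @ x + b[i]
    j = int(np.argmax(v))
    return v[j], A[i][j]

box = 5.0                                      # feasible set X = [-box, box]^n
gamma = 0.5                                    # level parameter
x_hat = np.zeros(n)                            # stability center
cuts = [[] for _ in range(m)]                  # cuts[i] = list of (value, subgradient, point)
for i in range(m):                             # one initial cut per component
    cuts[i].append((*oracle(i, x_hat), x_hat))
f_up = sum(c[0][0] for c in cuts)              # exact upper bound at x_hat

for k in range(60):
    # Lower bound: minimize the aggregated cutting-plane model over X (an LP in (x, r)).
    c_lp = np.concatenate([np.zeros(n), np.ones(m)])
    A_ub, b_ub = [], []
    for i in range(m):
        for (fv, g, y) in cuts[i]:             # cut: fv + g'(x - y) <= r_i
            row = np.zeros(n + m); row[:n] = g; row[n + i] = -1.0
            A_ub.append(row); b_ub.append(g @ y - fv)
    bounds = [(-box, box)] * n + [(None, None)] * m
    lp = linprog(c_lp, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    f_low = lp.fun
    if f_up - f_low < 1e-6:
        break
    level = f_low + gamma * (f_up - f_low)

    # Next iterate: project the stability center onto {model <= level} ∩ X (a small QP).
    def model_gap(z):                          # level constraint written as model_gap(z) >= 0
        r = [max(fv + g @ (z - y) for (fv, g, y) in cuts[i]) for i in range(m)]
        return level - sum(r)
    qp = minimize(lambda z: np.sum((z - x_hat) ** 2), x_hat,
                  constraints=[{"type": "ineq", "fun": model_gap}],
                  bounds=[(-box, box)] * n, method="SLSQP")
    x_new = qp.x

    # Emulated asynchrony: only one oracle "responds" and enriches its own model.
    i = int(rng.integers(m))
    cuts[i].append((*oracle(i, x_new), x_new))

    # Scarce coordination: occasionally query all oracles at x_new to refresh a valid f_up.
    if k % 5 == 0:
        f_new = sum(oracle(j, x_new)[0] for j in range(m))
        if f_new < f_up:
            f_up, x_hat = f_new, x_new

print("best value found:", f_up)
```

In a synchronous bundle method, an exact upper bound is available for free at every iterate because all oracles are evaluated at the same point; the upper-bound management sketched in the coordination step above is, loosely speaking, what the abstract refers to when it mentions upper-models and scarce coordinations for the asynchronous setting.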

Keywords

Nonsmooth optimization · Level bundle methods · Distributed computing · Asynchronous algorithms

Mathematics Subject Classification

90C25 · 90C30 · 65K05

Acknowledgements

We are grateful to the two referees for their rich feedback on the initial version of our paper. We would like to acknowledge the partial financial support of PGMO (Gaspard Monge Program for Optimization and Operations Research) of the Hadamard Mathematics Foundation, through the project “Advanced nonsmooth optimization methods for stochastic programming”.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2019

Authors and Affiliations

  • Franck Iutzeler (1)
  • Jérôme Malick (2)
  • Welington de Oliveira (3) (corresponding author)

  1. Lab. J. Kuntzmann, UGA, Grenoble, France
  2. Lab. J. Kuntzmann, CNRS, Grenoble, France
  3. MINES ParisTech, PSL Research University, CMA - Centre de Mathématiques Appliquées, Sophia Antipolis, France
