From Global Optimization to Optimal Learning

  • Francesco Archetti
  • Antonio Candelieri
Part of the SpringerBriefs in Optimization book series (BRIEFSOPTI)


What is the relation between finding the global minimum of the function below and the learning paradigm (Fig. 2.1)? What do learning models have in common with global optimization methods? The objective of this chapter is to outline possible answers and link them to other parts of the book.



Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, Systems and Communications, University of Milano-Bicocca, Milan, Italy
