The Surrogate Model

  • Francesco Archetti
  • Antonio Candelieri
Part of the SpringerBriefs in Optimization book series (BRIEFSOPTI)


This chapter presents the first key component of Bayesian optimization (BO): the probabilistic surrogate model. Section 3.1 focuses on Gaussian processes (GPs); Sect. 3.2 introduces the sequential optimization method known as Thompson sampling, which is also based on GPs; finally, Sect. 3.3 presents other probabilistic models that can, in some cases, be a suitable alternative to GPs.
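To make the two ideas named above concrete, the following is a minimal sketch, not the chapter's own implementation, of a GP surrogate and a single Thompson sampling step. It assumes a zero-mean GP with a squared-exponential kernel, a one-dimensional search space discretized on a grid, and a minimization problem; the function names (rbf_kernel, gp_posterior, thompson_step) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and covariance of a zero-mean GP at x_test.

    The noise term is an assumed small observation variance that also
    keeps the Cholesky factorization numerically stable.
    """
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    chol = np.linalg.cholesky(k_tt)
    # alpha = (K + noise * I)^{-1} y, obtained from two triangular solves
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov


def thompson_step(x_train, y_train, x_grid, rng):
    """Draw one posterior sample path on the grid and return its minimizer."""
    mean, cov = gp_posterior(x_train, y_train, x_grid)
    cov = 0.5 * (cov + cov.T)  # symmetrize against round-off
    sample = rng.multivariate_normal(mean, cov, check_valid="ignore")
    return x_grid[np.argmin(sample)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    objective = lambda x: np.sin(3.0 * x) + x**2 - 0.7 * x  # toy function
    x_grid = np.linspace(0.0, 1.0, 50)
    x_obs = np.array([0.1, 0.5, 0.9])
    y_obs = objective(x_obs)
    for _ in range(5):
        x_next = thompson_step(x_obs, y_obs, x_grid, rng)
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    print("best observed x:", x_obs[np.argmin(y_obs)])
```

Each iteration conditions the GP on the observations collected so far, samples one plausible function from the posterior, and evaluates the objective where that sample is smallest. A practical implementation would also estimate the kernel hyperparameters from data rather than fix them as done here.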



Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, Systems and Communications, University of Milano-Bicocca, Milan, Italy
