# Exploration Enhanced Expected Improvement for Bayesian Optimization

## Abstract

Bayesian optimization (BO) is a sample-efficient method for the global optimization of expensive, noisy, black-box functions that models the objective with a probabilistic surrogate. The performance of a BO method depends on its sampling strategy, which is driven by an acquisition function. This function must balance improving our understanding of the objective in unexplored regions (exploration) against locally refining known promising samples (exploitation). Expected improvement (EI) is one of the most widely used acquisition functions for BO. Unfortunately, it tends to over-exploit, so it can be slow to find new peaks. We propose a modification to EI that allows for increased early exploration while providing similar exploitation once the function has been suitably explored. We also prove that our method has a sub-linear convergence rate and evaluate it on a range of benchmark functions against standard EI and other competing methods. Code related to this paper is available at: https://github.com/jmaberk/BO_with_E3I.
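As context for the exploration-exploitation trade-off the abstract describes, standard EI at a candidate point has a closed form in terms of the GP posterior mean \(\mu(x)\) and standard deviation \(\sigma(x)\). The sketch below implements that textbook formula for maximization (not the paper's E3I modification); the optional `xi` margin, shown here as one simpler way to bias EI toward exploration, is an assumption for illustration.

```python
import math

def expected_improvement(mu, sigma, f_best, xi=0.0):
    """Closed-form EI for maximization, given the GP posterior mean `mu`
    and standard deviation `sigma` at a point, and the incumbent `f_best`.

    A margin xi >= 0 raises the improvement threshold; larger values
    encourage exploration (a common, simpler alternative to E3I).
    """
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mu - f_best - xi, 0.0)
    z = (mu - f_best - xi) / sigma
    # Standard normal CDF and PDF, built from math.erf to stay stdlib-only.
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - f_best - xi) * Phi + sigma * phi
```

The first term rewards points whose predicted mean already beats the incumbent (exploitation); the second term grows with the posterior standard deviation (exploration), which is exactly the balance the acquisition function must strike.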

## Notes

### Acknowledgements

This research was supported by an Australian Government Research Training Program (RTP) Scholarship awarded to Mr Berk, and was partially funded by the Australian Government through the Australian Research Council (ARC). Prof Venkatesh is the recipient of an ARC Australian Laureate Fellowship (FL170100006).
