Bayesian Optimization for Recommender System

  • Bruno Giovanni Galuzzi
  • Ilaria Giordani
  • A. Candelieri
  • Riccardo Perego
  • Francesco Archetti
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 991)


Many web services use a Recommender System to help users choose among items such as movies to watch or products to buy. The aim is to accurately predict a user's preferences from his/her past choices. Matrix factorization is one of the most widely adopted methods for building a Recommender System. Like many Machine Learning algorithms, matrix factorization has a set of hyper-parameters to tune, leading to a complex and expensive black-box optimization problem. The objective function maps any possible hyper-parameter configuration to a numeric score quantifying the quality of the predictions. In this work, we show how Bayesian Optimization can efficiently optimize three hyper-parameters of a Recommender System: the number of latent factors, the regularization term, and the learning rate. A widely adopted acquisition function, namely Expected Improvement, is compared with a variant of Thompson Sampling. Numerical results for both a benchmark 2-dimensional test function and a Recommender System evaluated on a benchmark dataset show that Bayesian Optimization is an efficient tool for improving the predictions of a Recommender System, but neither of the two acquisition functions emerges as clearly preferable.
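To make the black-box objective concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the quality score is the RMSE on a held-out split and that the factorization is trained with plain stochastic gradient descent; neither choice is stated in the abstract. The names `mf_rmse` and `expected_improvement` are hypothetical. The second function is the standard closed form of Expected Improvement for minimization under a Gaussian posterior with a given mean and standard deviation.

```python
import math
import numpy as np


def mf_rmse(n_factors, reg, lr, ratings, n_users, n_items, n_epochs=20, seed=0):
    """Hypothetical objective: train matrix factorization, return held-out RMSE.

    ratings is an array of (user index, item index, rating) rows.
    The 80/20 split, the SGD updates and the RMSE metric are assumptions.
    """
    rng = np.random.default_rng(seed)
    ratings = np.asarray(ratings, dtype=float)
    order = rng.permutation(len(ratings))
    n_val = max(1, len(ratings) // 5)
    val, train = ratings[order[:n_val]], ratings[order[n_val:]]

    # Small random initial latent factors for users (P) and items (Q).
    P = 0.1 * rng.standard_normal((n_users, n_factors))
    Q = 0.1 * rng.standard_normal((n_items, n_factors))
    mu = train[:, 2].mean()  # global mean rating used as a baseline

    for _ in range(n_epochs):
        for u, i, r in train[rng.permutation(len(train))]:
            u, i = int(u), int(i)
            err = r - (mu + P[u] @ Q[i])
            p_old = P[u].copy()
            # Gradient step on the L2-regularized squared error.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])

    users, items = val[:, 0].astype(int), val[:, 1].astype(int)
    preds = mu + np.einsum("ij,ij->i", P[users], Q[items])
    return float(np.sqrt(np.mean((val[:, 2] - preds) ** 2)))


def expected_improvement(mean, std, best):
    """Expected Improvement for minimization, given a Gaussian posterior."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf
```

In a Bayesian Optimization loop, a surrogate model (typically a Gaussian process) would supply `mean` and `std` at each candidate configuration. The configuration maximizing `expected_improvement` would then be passed to `mf_rmse`, and the surrogate refitted with the new observation.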


Keywords: Recommender System, Bayesian Optimization, Hyper-parameter optimization



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of Milano-Bicocca, Milan, Italy
  2. Consorzio Milano-Ricerche, Milan, Italy
