Self Hyper-parameter Tuning for Stream Recommendation Algorithms

  • Bruno Veloso
  • João Gama
  • Benedita Malheiro
  • João Vinagre
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 967)


E-commerce platforms exploit the interaction between users and digital content (user-generated streams of events) to build and maintain dynamic user preference models, which are used to make meaningful recommendations. However, the accuracy of these incremental models is critically affected by the choice of hyper-parameters. So far, the incremental recommendation algorithms used to process data streams have relied on human expertise for hyper-parameter tuning. In this work we apply our Self Hyper-Parameter Tuning (SPT) algorithm to incremental recommendation algorithms. SPT adapts the Nelder-Mead optimisation algorithm to perform hyper-parameter tuning. First, it creates three models with random hyper-parameter values; then, at intervals of dynamic size, it assesses the models and applies the Nelder-Mead operators to update their hyper-parameters until the models converge. The main contribution of this work is the adaptation of the SPT method to incremental matrix factorisation recommendation algorithms. The proposed method was evaluated with well-known recommendation data sets. The results show that SPT systematically improves data stream recommendations.
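To make the Nelder-Mead operators mentioned in the abstract concrete, the following is a minimal sketch of a single textbook simplex update. It is not the authors' SPT implementation. It assumes two hyper-parameters (so the simplex has three vertices, matching the three models), the standard coefficients for reflection, expansion, contraction and shrinking, and a hypothetical `toy_loss` standing in for the error that SPT would measure on the stream. The names `nelder_mead_step` and `toy_loss` are illustrative only.

```python
import random


def nelder_mead_step(simplex, loss, alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    """Apply one textbook Nelder-Mead update to a list of n+1 points."""
    simplex = sorted(simplex, key=loss)          # best first, worst last
    best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
    dims = range(len(best))

    # Centroid of every vertex except the worst one.
    others = simplex[:-1]
    centroid = [sum(p[i] for p in others) / len(others) for i in dims]

    # Reflection: mirror the worst vertex through the centroid.
    reflected = [centroid[i] + alpha * (centroid[i] - worst[i]) for i in dims]
    if loss(best) <= loss(reflected) < loss(second_worst):
        return others + [reflected]

    # Expansion: the reflected point is the new best, so try going further.
    if loss(reflected) < loss(best):
        expanded = [centroid[i] + gamma * (reflected[i] - centroid[i]) for i in dims]
        return others + [expanded if loss(expanded) < loss(reflected) else reflected]

    # Contraction (inside variant only): pull the worst vertex towards the centroid.
    contracted = [centroid[i] + rho * (worst[i] - centroid[i]) for i in dims]
    if loss(contracted) < loss(worst):
        return others + [contracted]

    # Shrink: move every vertex except the best towards the best.
    return [best] + [[best[i] + sigma * (p[i] - best[i]) for i in dims]
                     for p in simplex[1:]]


def toy_loss(point):
    """Hypothetical stand-in for a model's error; minimum at (0.05, 0.1)."""
    learning_rate, regularisation = point
    return (learning_rate - 0.05) ** 2 + (regularisation - 0.1) ** 2


random.seed(0)
# Three vertices with random hyper-parameter values, as in the abstract.
simplex = [[random.uniform(0, 1), random.uniform(0, 1)] for _ in range(3)]
initial_best = min(toy_loss(p) for p in simplex)

for _ in range(100):
    simplex = nelder_mead_step(simplex, toy_loss)

final_best = min(toy_loss(p) for p in simplex)
print(initial_best, final_best)
```

In this sketch the loss is a fixed function. In a streaming setting the error of each model would instead be re-estimated on each interval of incoming data, so the comparison between vertices would change over time.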


Parameter tuning, Hyper-parameters, Optimisation, Nelder-Mead, Recommendation



This research was carried out in the framework of the project TEC4Growth - RL SMILES - Smart, mobile, Intelligent and Large Scale Sensing and analytics NORTE-01-0145-FEDER-000020 which is financed by the north Portugal regional operational program (NORTE 2020), under the Portugal 2020 partnership agreement, and through the European regional development fund.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. LIAAD, INESC TEC, Porto, Portugal
  2. Research on Economics, Management and Information Technologies - REMIT, Univ Portucalense, Porto, Portugal
  3. FEP, University of Porto, Porto, Portugal
  4. ISEP, Polytechnic of Porto, Porto, Portugal
  5. CRAS, INESC TEC, Porto, Portugal
  6. FCUP, University of Porto, Porto, Portugal
