User Modeling and User-Adapted Interaction, Volume 28, Issue 4–5, pp 331–390

Evaluation of session-based recommendation algorithms

  • Malte Ludewig
  • Dietmar Jannach


Recommender systems help users find relevant items of interest, for example on e-commerce or media streaming sites. Most academic research is concerned with approaches that personalize the recommendations according to long-term user profiles. In many real-world applications, however, such long-term profiles do not exist, and recommendations therefore have to be made solely based on the observed behavior of a user during an ongoing session. Given the high practical relevance of this problem, interest in it has grown in recent years, leading to a number of proposals for session-based recommendation algorithms that typically aim to predict the user’s immediate next actions. In this work, we present the results of an in-depth performance comparison of a number of such algorithms, using a variety of datasets and evaluation measures. Our comparison includes the most recent approaches based on recurrent neural networks like gru4rec, factorized Markov model approaches such as fism or fossil, as well as simpler methods based, e.g., on nearest neighbor schemes. Our experiments reveal that algorithms of this latter class, despite their sometimes almost trivial nature, often perform as well as or significantly better than today’s more complex approaches based on deep neural networks. Our results therefore suggest that there is substantial room for improvement regarding the development of more sophisticated session-based recommendation algorithms.
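
To illustrate how simple a nearest neighbor scheme for session-based recommendation can be, the following is a minimal sketch and not the implementation evaluated in the article. It assumes Jaccard similarity between item sets and a plain similarity-sum scoring rule; the function name `recommend_next_items`, the parameters `k` and `top_n`, and the toy sessions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def recommend_next_items(current_session, past_sessions, k=3, top_n=3):
    """Rank unseen items using the k past sessions most similar to the current one.

    Assumption: similarity is the Jaccard index of the item sets. Other
    similarity functions and scoring rules are equally plausible choices.
    """
    current = set(current_session)
    if not current:
        return []

    # Compute the similarity of every past session that shares at least one item.
    scored = []
    for session in past_sessions:
        items = set(session)
        overlap = len(current & items)
        if overlap:
            scored.append((overlap / len(current | items), items))

    # Keep the k most similar past sessions (the "neighbors").
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    # An item's score is the summed similarity of the neighbors that contain it.
    scores = defaultdict(float)
    for similarity, items in neighbors:
        for item in items - current:
            scores[item] += similarity

    # Sort by descending score; ties are broken by item id for determinism.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Toy data (hypothetical): each inner list is one anonymous session.
past_sessions = [
    ["shoes", "socks", "laces"],
    ["shoes", "socks", "insoles"],
    ["phone", "case"],
    ["socks", "laces"],
]
print(recommend_next_items(["shoes", "socks"], past_sessions))
```

The sketch needs no training phase and no long-term user profile: only the items of the ongoing session and a log of past sessions are used, which is what makes this class of methods a natural baseline for the more complex models named above.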


Keywords: Session-based recommendation, Sequential recommendation, Deep learning, Factorized Markov models, Nearest neighbors



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. TU Dortmund, Dortmund, Germany
  2. AAU Klagenfurt, Klagenfurt, Austria
