Abstract
Recommender systems have been successfully applied in many application areas to predict users’ preferences. However, these systems face the exploration-exploitation dilemma when making a recommendation: they must exploit items known to interest users while simultaneously exploring new items to improve satisfaction. In this paper, we address this dilemma with Multi-Armed Bandit (MAB) approaches, focusing on large-scale recommender systems that have a vast or infinite number of items. We propose two large-scale bandit approaches for situations in which no prior information is available. The continuous exploration in our approaches can address the cold-start problem in recommender systems. Furthermore, our context-free approaches rely only on users’ click behavior and do not depend on prior information. We theoretically prove that our approaches converge to optimal item recommendations in the long run. Experimental results indicate that our approaches provide more accurate recommendations than some classic bandit approaches in terms of click-through rate, with less computation time.
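To make the exploration-exploitation trade-off concrete, the following is a minimal sketch of UCB1, a classic context-free bandit policy driven only by click feedback. It is not one of the two approaches proposed in this paper, which the abstract does not specify. The class name `UCB1`, the helper `simulate`, and the click-through rates used in the usage note are hypothetical, chosen only for illustration.

```python
import math
import random


class UCB1:
    """Classic UCB1 policy over a finite set of items (arms)."""

    def __init__(self, n_items):
        self.counts = [0] * n_items    # times each item was recommended
        self.clicks = [0.0] * n_items  # total clicks each item received

    def select(self):
        # Exploration: recommend every item once before scoring.
        for item, count in enumerate(self.counts):
            if count == 0:
                return item
        total = sum(self.counts)
        # Exploitation (empirical click rate) plus an exploration bonus
        # that shrinks as an item is recommended more often.
        scores = [
            self.clicks[i] / self.counts[i]
            + math.sqrt(2.0 * math.log(total) / self.counts[i])
            for i in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, item, clicked):
        self.counts[item] += 1
        self.clicks[item] += 1.0 if clicked else 0.0


def simulate(true_ctrs, rounds, seed=0):
    """Run UCB1 against simulated Bernoulli clicks (hypothetical rates)."""
    rng = random.Random(seed)
    policy = UCB1(len(true_ctrs))
    total_clicks = 0
    for _ in range(rounds):
        item = policy.select()
        clicked = rng.random() < true_ctrs[item]
        policy.update(item, clicked)
        total_clicks += clicked
    return policy, total_clicks / rounds
```

Calling `simulate([0.05, 0.1, 0.6], 5000)` returns the fitted policy and the observed click-through rate. Note that this policy recommends every item once before it starts scoring, which is one reason such classic policies are a poor fit when the item set is vast or infinite.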
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhou, Q., Zhang, X., Xu, J., Liang, B. (2017). Large-Scale Bandit Approaches for Recommender Systems. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10634. Springer, Cham. https://doi.org/10.1007/978-3-319-70087-8_83
DOI: https://doi.org/10.1007/978-3-319-70087-8_83
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70086-1
Online ISBN: 978-3-319-70087-8
eBook Packages: Computer Science (R0)