Large-Scale Bandit Approaches for Recommender Systems

Zhou, Qian; Zhang, XiaoFang; Xu, Jin; Liang, Bin

doi:10.1007/978-3-319-70087-8_83

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10634)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Recommender systems have been successfully applied in many areas to predict users’ preferences. However, they face the exploration-exploitation dilemma when making a recommendation: they must exploit items known to interest users while also exploring new items that may improve satisfaction. In this paper, we address this dilemma with Multi-Armed Bandit (MAB) approaches, focusing on large-scale recommender systems with a vast or infinite number of items. We propose two large-scale bandit approaches for settings in which no prior information is available. The continuous exploration in our approaches can address the cold-start problem in recommender systems. Furthermore, our context-free approaches rely only on users’ click behavior and do not depend on prior information. We theoretically prove that our approaches converge to optimal item recommendations in the long run. Experimental results indicate that our approaches provide more accurate recommendations than some classic bandit approaches in terms of click-through rate, with less computation time.
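
To make the exploration-exploitation trade-off concrete, the following is a minimal sketch of a classic context-free bandit policy, UCB1, driven only by simulated click feedback. It is not one of the two approaches proposed in the paper, whose details are not given in this abstract; it only illustrates the kind of classic baseline the abstract refers to. The function names, the Bernoulli click model, and the click-through rates in the example are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, clicks, t):
    """Pick an item index with the UCB1 rule (a classic baseline, not the paper's method)."""
    # Recommend every item once before trusting any estimate.
    for item, n in enumerate(counts):
        if n == 0:
            return item
    # Empirical click-through rate plus an exploration bonus
    # that shrinks as an item is recommended more often.
    scores = [
        clicks[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def simulate(true_ctrs, rounds, seed=0):
    """Run UCB1 against hypothetical Bernoulli click probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(true_ctrs)  # times each item was recommended
    clicks = [0] * len(true_ctrs)  # clicks each item received
    for t in range(1, rounds + 1):
        item = ucb1_select(counts, clicks, t)
        clicked = 1 if rng.random() < true_ctrs[item] else 0
        counts[item] += 1
        clicks[item] += clicked
    return counts, clicks


if __name__ == "__main__":
    # Hypothetical click-through rates for three items.
    counts, clicks = simulate([0.02, 0.05, 0.10], rounds=5000)
    print("recommendations per item:", counts)
    print("overall click-through rate:", sum(clicks) / sum(counts))
```

Because UCB1 must recommend every item at least once, its start-up cost grows with the number of items, which is one reason a very large or infinite item set calls for different approaches.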

Notes

1. https://webscope.sandbox.yahoo.com

Author information

Authors and Affiliations

Department of Computer Science and Technology, Soochow University, Suzhou, 215006, China
Qian Zhou, XiaoFang Zhang, Jin Xu & Bin Liang
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210033, China
XiaoFang Zhang

Corresponding author

Correspondence to XiaoFang Zhang.

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

About this paper

Cite this paper

Zhou, Q., Zhang, X., Xu, J., Liang, B. (2017). Large-Scale Bandit Approaches for Recommender Systems. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10634. Springer, Cham. https://doi.org/10.1007/978-3-319-70087-8_83

DOI: https://doi.org/10.1007/978-3-319-70087-8_83
Published: 24 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70086-1
Online ISBN: 978-3-319-70087-8
eBook Packages: Computer Science, Computer Science (R0)
