Abstract
The much-publicized Netflix competition has put the spotlight on the application domain of collaborative filtering and has sparked interest in machine learning algorithms that can be applied to this sort of problem. The demanding nature of the Netflix data has led to some interesting and ingenious modifications to standard learning methods in the name of efficiency and speed. Three basic methods have been applied in most approaches to the Netflix problem so far: stand-alone neighborhood-based methods, latent factor models based on singular-value decomposition, and ensembles consisting of variations of these techniques. In this paper we investigate the application of forward stage-wise additive modeling to the Netflix problem, using two regression schemes as base learners: ensembles of weighted simple linear regressors and k-means clustering, the latter being interpreted as a tool for multivariate regression in this context. Experiments show that our methods produce competitive results.
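The abstract names forward stage-wise additive modeling without spelling out the procedure. As a rough illustration only, and not the authors' implementation, the general idea under squared-error loss can be sketched as follows: start from the mean target, then repeatedly fit a base learner to the current residuals and add a shrunken copy of it to the model. The base learner here is a single-attribute simple linear regressor chosen as a stand-in; the names `fit_simple_linear`, `fit_additive`, `predict`, and the parameters `stages` and `shrinkage` are hypothetical, and the toy data is made up for the example.

```python
from statistics import fmean


def fit_simple_linear(xs, ys):
    """Least-squares fit of y ~ a + b*x for one attribute; returns (a, b)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0  # constant attribute: fall back to predicting the mean
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_additive(rows, targets, stages=50, shrinkage=0.5):
    """Forward stage-wise additive regression under squared error (sketch)."""
    base = fmean(targets)                      # stage 0: constant predictor
    residuals = [t - base for t in targets]
    model = []                                 # (attribute index, intercept, slope)
    n_attrs = len(rows[0])
    for _ in range(stages):
        best = None
        # Assumed base learner: the one-attribute regressor with lowest error.
        for j in range(n_attrs):
            col = [r[j] for r in rows]
            a, b = fit_simple_linear(col, residuals)
            sse = sum((res - (a + b * x)) ** 2 for x, res in zip(col, residuals))
            if best is None or sse < best[0]:
                best = (sse, j, a, b)
        _, j, a, b = best
        a, b = shrinkage * a, shrinkage * b    # damp each stage's contribution
        model.append((j, a, b))
        # The next stage sees only what the current model fails to explain.
        residuals = [res - (a + b * r[j]) for r, res in zip(rows, residuals)]
    return base, model


def predict(fitted, row):
    """Sum the constant term and every stage's contribution for one row."""
    base, model = fitted
    return base + sum(a + b * row[j] for j, a, b in model)


# Toy data (made up): y = 1 + 2*x0 - 3*x1 on a small grid.
rows = [(x0, x1) for x0 in range(4) for x1 in range(3)]
targets = [1 + 2 * x0 - 3 * x1 for x0, x1 in rows]
fitted = fit_additive(rows, targets)
print(predict(fitted, (1.5, 0.5)))
```

Earlier stages are never refitted; each new stage only corrects the residual left by the sum of the previous ones, which is what distinguishes the stage-wise scheme from fitting all components jointly.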
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Frank, E., Hall, M. (2008). Additive Regression Applied to a Large-Scale Collaborative Filtering Problem. In: Wobcke, W., Zhang, M. (eds) AI 2008: Advances in Artificial Intelligence. AI 2008. Lecture Notes in Computer Science, vol 5360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89378-3_44
DOI: https://doi.org/10.1007/978-3-540-89378-3_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89377-6
Online ISBN: 978-3-540-89378-3
eBook Packages: Computer Science (R0)