Model-Based Collaborative Filtering

Abstract

The neighborhood-based methods of the previous chapter can be viewed as generalizations of k-nearest neighbor classifiers, which are commonly used in machine learning.


Notes

  1. From a practical point of view, preprocessing is essential for efficiency. However, one could implement the neighborhood method without a preprocessing phase, albeit with larger latencies at query time.

  2. Parameter-tuning methods, such as hold-out and cross-validation, are discussed in Chapter 7.

  3. In the case of user-based associations, the consequents might contain any user.

  4. It is also possible to use more sophisticated ways of removing bias for better performance. For example, the bias \(B_{ij}\), which is specific to user i and item j, can be computed using the approach discussed in section 3.7.1. This bias is subtracted from the observed entries, and all missing entries are initialized to 0 during preprocessing. After computing the predictions, the biases \(B_{ij}\) are added back to the predicted values during postprocessing.
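    A minimal sketch of this pre/postprocessing pipeline; the names R, B, and factorize are illustrative assumptions, and any matrix completion routine can stand in for factorize:

    ```python
    import numpy as np

    def predict_with_bias_removal(R, B, factorize):
        """Subtract per-entry biases during preprocessing and restore
        them during postprocessing.

        R: ratings matrix with np.nan at unobserved entries.
        B: bias matrix of the same shape (e.g., as in section 3.7.1).
        factorize: any routine returning a fully specified matrix.
        """
        observed = ~np.isnan(R)
        centered = np.zeros_like(B)
        centered[observed] = R[observed] - B[observed]  # missing entries stay at 0
        predicted = factorize(centered)                 # complete the centered matrix
        return predicted + B                            # postprocessing: add biases back
    ```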

  5. The method used for performing this estimation in various scenarios is described in detail in section 3.6.5.3.

  6. The row space of a matrix is the set of all possible linear combinations of its rows; the column space is the set of all possible linear combinations of its columns.

  7. In SVD [568], the basis vectors are also referred to as singular vectors, which, by definition, must be mutually orthonormal.

  8. Refer to Chapter 6 for a discussion of the bias-variance trade-off.

  9. A more precise update is \(\overline{u_{i}} \Leftarrow \overline{u_{i}} +\alpha (e_{ij}\overline{v_{j}} -\lambda \overline{u_{i}}/n_{i}^{user})\) and \(\overline{v_{j}} \Leftarrow \overline{v_{j}} +\alpha (e_{ij}\overline{u_{i}} -\lambda \overline{v_{j}}/n_{j}^{item})\), where \(n_{i}^{user}\) is the number of observed ratings for user i and \(n_{j}^{item}\) is the number of observed ratings for item j. In these updates, the regularization term of each user/item factor is divided equally among the corresponding observed entries of that user/item. In practice, the simpler heuristic update rules discussed in this chapter are often used; we adopt them throughout to be consistent with the research literature on recommender systems. With proper parameter tuning, \(\lambda\) automatically adjusts to a smaller value under the simpler rules.
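    As a concrete illustration, a sketch of one epoch of these more precise updates; the hyperparameter values and variable names are assumptions, not from the book:

    ```python
    import numpy as np

    def sgd_epoch_precise(ratings, U, V, alpha=0.01, lam=0.1):
        """One epoch in which the regularization of each factor is divided
        by the number of observed ratings of that user/item.

        ratings: list of (i, j, r_ij) triples of observed entries.
        U, V: factor matrices of shape (m, k) and (n, k), updated in place.
        """
        n_user = np.zeros(U.shape[0])  # n_i^{user}: observed ratings per user
        n_item = np.zeros(V.shape[0])  # n_j^{item}: observed ratings per item
        for i, j, _ in ratings:
            n_user[i] += 1
            n_item[j] += 1
        for i, j, r in ratings:
            e = r - U[i].dot(V[j])     # error e_ij on the observed entry
            u_old = U[i].copy()        # v_j's update must use the old u_i
            U[i] += alpha * (e * V[j] - lam * U[i] / n_user[i])
            V[j] += alpha * (e * u_old - lam * V[j] / n_item[j])
    ```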

  10. The inner-product of two column-vectors \(\overline{x}\) and \(\overline{y}\) is given by the scalar \(\overline{x}^{T}\overline{y}\), whereas the outer-product is given by the rank-1 matrix \(\overline{x}\,\overline{y}^{T}\). Furthermore, \(\overline{x}\) and \(\overline{y}\) need not be of the same size in order to compute an outer-product.
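    For concreteness, a short numpy illustration of the two products (the values are arbitrary):

    ```python
    import numpy as np

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0])             # different length: fine for an outer-product

    outer = np.outer(x, y)               # the 3 x 2 matrix x y^T
    print(np.linalg.matrix_rank(outer))  # 1, since every outer-product has rank 1

    z = np.array([4.0, 5.0, 6.0])        # the inner-product requires equal lengths
    print(x.dot(z))                      # the scalar x^T z = 32.0
    ```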

  11. In many cases, this approach can outperform SVD++, especially when the number of observed ratings is small.

  12. For matrices that are not mean-centered, the global mean can be subtracted during preprocessing and then added back at prediction time.

  13. We use a slightly different notation than the original paper [309], although the approach described here is equivalent. This presentation simplifies the notation by introducing fewer variables and viewing bias variables as constraints on the factorization process.

  14. The literature often describes these updates in vectorized form. They may be applied to the rows of U, V, and Y as follows:

    $$\begin{aligned}
    \overline{u_{i}} &\Leftarrow \overline{u_{i}} +\alpha\left(e_{ij}\overline{v_{j}} -\lambda\overline{u_{i}}\right)\\
    \overline{v_{j}} &\Leftarrow \overline{v_{j}} +\alpha\left(e_{ij}\cdot\left[\overline{u_{i}} +\sum_{h\in I_{i}}\frac{\overline{y_{h}}}{\sqrt{|I_{i}|}}\right] -\lambda\cdot\overline{v_{j}}\right)\\
    \overline{y_{h}} &\Leftarrow \overline{y_{h}} +\alpha\left(\frac{e_{ij}\cdot\overline{v_{j}}}{\sqrt{|I_{i}|}} -\lambda\cdot\overline{y_{h}}\right)\quad\forall h\in I_{i}\\
    &\text{Reset perturbed entries in fixed columns of } U,\ V, \text{ and } Y
    \end{aligned}$$
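    The following sketch performs one stochastic gradient step of these vectorized updates; the variable names and learning parameters are illustrative assumptions, and the reset of the fixed (bias) columns of U, V, and Y is omitted for brevity:

    ```python
    import numpy as np

    def svdpp_step(i, j, r_ij, U, V, Y, item_sets, alpha=0.007, lam=0.015):
        """One SGD step of the vectorized updates shown above.

        U, V, Y: user factors, item factors, and implicit item factors.
        item_sets: item_sets[i] is the set I_i of items rated by user i.
        """
        I_i = list(item_sets[i])
        root = np.sqrt(len(I_i))
        # Implicit-feedback-enriched user factor: u_i + sum_{h in I_i} y_h / sqrt(|I_i|)
        p_i = U[i] + Y[I_i].sum(axis=0) / root
        e = r_ij - p_i.dot(V[j])                        # error e_ij
        v_old = V[j].copy()                             # y_h updates use the old v_j
        U[i] += alpha * (e * V[j] - lam * U[i])
        V[j] += alpha * (e * p_i - lam * V[j])
        Y[I_i] += alpha * (e * v_old / root - lam * Y[I_i])
    ```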
  15. These effects are best understood in terms of the bias-variance trade-off in machine learning [22]. Setting the unspecified values to 0 increases bias but reduces variance. When a large number of entries are unspecified and the prior probability that a missing entry is 0 is high, the variance effects can dominate.

  16. Refer to Chapter 6 for a discussion of the bias-variance trade-off in collaborative filtering.

  17. Note that we use the upper-case variable K to represent the size of the neighborhood that defines \(Q_{j}(i)\); this is a deviation from section 2.6.2 of Chapter 2. We use the lower-case variable k to represent the dimensionality of the factor matrices. The values of k and K are generally different.

Bibliography

  1. D. Agarwal and B. Chen. Regression-based latent factor models. ACM KDD Conference, pp. 19–28, 2009.

  2. C. Aggarwal. Data classification: algorithms and applications. CRC Press, 2014.

  3. C. Aggarwal. Data mining: the textbook. Springer, New York, 2015.

  4. C. Aggarwal and J. Han. Frequent pattern mining. Springer, New York, 2014.

  5. C. Aggarwal and S. Parthasarathy. Mining massively incomplete data sets by conceptual reconstruction. ACM KDD Conference, pp. 227–232, 2001.

  6. C. Aggarwal, C. Procopiuc, and P. S. Yu. Finding localized associations in market basket data. IEEE Transactions on Knowledge and Data Engineering, 14(1), pp. 51–62, 2001.

  7. C. Aggarwal, Z. Sun, and P. Yu. Online generation of profile association rules. ACM KDD Conference, pp. 129–133, 1998.

  8. C. Aggarwal, Z. Sun, and P. Yu. Online algorithms for finding profile association rules. CIKM Conference, pp. 86–95, 1998.

  9. R. Battiti. Accelerated backpropagation learning: Two optimization methods. Complex Systems, 3(4), pp. 331–342, 1989.

  10. R. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. IEEE International Conference on Data Mining, pp. 43–52, 2007.

  11. R. Bell and Y. Koren. Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter, 9(2), pp. 75–79, 2007.

  12. D. P. Bertsekas. Nonlinear programming. Athena Scientific Publishers, Belmont, 1999.

  13. D. Billsus and M. Pazzani. Learning collaborative information filters. ICML Conference, pp. 46–54, 1998.

  14. C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, 1995.

  15. M. Brand. Fast online SVD revisions for lightweight recommender systems. SIAM Conference on Data Mining, pp. 37–46, 2003.

  16. J. Cai, E. Candes, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), pp. 1956–1982, 2010.

  17. J. Canny. Collaborative filtering with privacy via factor analysis. ACM SIGIR Conference, pp. 238–245, 2002.

  18. T. Chen, Z. Zheng, Q. Lu, W. Zhang, and Y. Yu. Feature-based matrix factorization. arXiv preprint arXiv:1109.2271, 2011.

  19. A. Cichocki and R. Zdunek. Regularized alternating least squares algorithms for non-negative matrix/tensor factorization. International Symposium on Neural Networks, pp. 793–802, 2007.

  20. D. DeCoste. Collaborative prediction using ensembles of maximum margin matrix factorizations. International Conference on Machine Learning, pp. 249–256, 2006.

  21. R. Devooght, N. Kourtellis, and A. Mantrach. Dynamic matrix factorization with priors on unknown values. ACM KDD Conference, 2015.

  22. R. Gemulla, E. Nijkamp, P. Haas, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. ACM KDD Conference, pp. 69–77, 2011.

  23. L. Getoor and M. Sahami. Using probabilistic relational models for collaborative filtering. Workshop on Web Usage Analysis and User Profiling, 1999.

  24. F. Girosi, M. Jones, and T. Poggio. Regularization theory and neural networks architectures. Neural Computation, 7(2), pp. 219–269, 1995.

  25. T. Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1), pp. 89–114, 2004.

  26. Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. IEEE International Conference on Data Mining, pp. 263–272, 2008.

  27. P. Jain and I. Dhillon. Provable inductive matrix completion. arXiv preprint arXiv:1306.0626, 2013. http://arxiv.org/abs/1306.0626

  28. P. Jain, P. Netrapalli, and S. Sanghavi. Low-rank matrix completion using alternating minimization. ACM Symposium on Theory of Computing, pp. 665–674, 2013.

  29. D. Kim and B. Yum. Collaborative filtering based on iterative principal component analysis. Expert Systems with Applications, 28(4), pp. 823–830, 2005.

  30. H. Kim and H. Park. Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM Journal on Matrix Analysis and Applications, 30(2), pp. 713–730, 2008.

  31. Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. ACM KDD Conference, pp. 426–434, 2008. An extended version appears as: Y. Koren. Factor in the neighbors: Scalable and accurate collaborative filtering. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(1), 1, 2010.

  32. Y. Koren. Collaborative filtering with temporal dynamics. ACM KDD Conference, pp. 447–455, 2009. Another version appears in Communications of the ACM, 53(4), pp. 89–97, 2010.

  33. Y. Koren. The Bellkor solution to the Netflix grand prize. Netflix prize documentation, 81, 2009. http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf

  34. Y. Koren and R. Bell. Advances in collaborative filtering. Recommender Systems Handbook, Springer, pp. 145–186, 2011. (Extended version in the 2015 edition of the handbook.)

  35. Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8), pp. 30–37, 2009.

  36. S. Kabbur, X. Ning, and G. Karypis. FISM: factored item similarity models for top-N recommender systems. ACM KDD Conference, pp. 659–667, 2013.

  37. S. Kabbur and G. Karypis. NLMF: NonLinear Matrix Factorization Methods for Top-N Recommender Systems. IEEE Data Mining Workshop (ICDMW), pp. 167–174, 2014.

  38. A. Langville, C. Meyer, R. Albright, J. Cox, and D. Duling. Initializations for the nonnegative matrix factorization. ACM KDD Conference, pp. 23–26, 2006.

  39. D. Lemire and A. Maclachlan. Slope one predictors for online rating-based collaborative filtering. SIAM Conference on Data Mining, 2005.

  40. M. Li, T. Zhang, Y. Chen, and A. Smola. Efficient mini-batch training for stochastic optimization. ACM KDD Conference, pp. 661–670, 2014.

  41. C.-J. Lin. Projected gradient methods for nonnegative matrix factorization. Neural Computation, 19(10), pp. 2756–2779, 2007.

  42. W. Lin. Association rule mining for collaborative recommender systems. Master's Thesis, Worcester Polytechnic Institute, 2000.

  43. W. Lin, S. Alvarez, and C. Ruiz. Efficient adaptive-support association rule mining for recommender systems. Data Mining and Knowledge Discovery, 6(1), pp. 83–105, 2002.

  44. B. Liu, W. Hsu, and Y. Ma. Mining association rules with multiple minimum supports. ACM KDD Conference, pp. 337–341, 1999.

  45. X. Liu, C. Aggarwal, Y.-F. Lee, X. Kong, X. Sun, and S. Sathe. Kernelized matrix factorization for collaborative filtering. SIAM Conference on Data Mining, 2016.

  46. A. Mild and M. Natter. Collaborative filtering or regression models for Internet recommendation systems? Journal of Targeting, Measurement and Analysis for Marketing, 10(4), pp. 304–313, 2002.

  47. K. Miyahara and M. J. Pazzani. Collaborative filtering with the simple Bayesian classifier. Pacific Rim International Conference on Artificial Intelligence, 2000.

  48. B. Mobasher, H. Dai, T. Luo, and M. Nakagawa. Effective personalization based on association rule discovery from Web usage data. ACM Workshop on Web Information and Data Management, pp. 9–15, 2001.

  49. X. Ning and G. Karypis. SLIM: Sparse linear methods for top-N recommender systems. IEEE International Conference on Data Mining, pp. 497–506, 2011.

  50. D. Oard and J. Kim. Implicit feedback for recommender systems. Proceedings of the AAAI Workshop on Recommender Systems, pp. 81–83, 1998.

  51. P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), pp. 111–126, 1994.

  52. R. Pan, Y. Zhou, B. Cao, N. Liu, R. Lukose, M. Scholz, and Q. Yang. One-class collaborative filtering. IEEE International Conference on Data Mining, pp. 502–511, 2008.

  53. R. Pan and M. Scholz. Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. ACM KDD Conference, pp. 667–676, 2009.

  54. S. Parthasarathy and C. Aggarwal. On the use of conceptual reconstruction for mining massively incomplete data sets. IEEE Transactions on Knowledge and Data Engineering, 15(6), pp. 1512–1521, 2003.

  55. A. Paterek. Improving regularized singular value decomposition for collaborative filtering. Proceedings of KDD Cup and Workshop, 2007.

  56. V. Pauca, J. Piper, and R. Plemmons. Nonnegative matrix factorization for spectral data analysis. Linear Algebra and its Applications, 416(1), pp. 29–47, 2006.

  57. S. Rendle. Factorization machines. IEEE International Conference on Data Mining, pp. 995–1000, 2010.

  58. J. Rennie and N. Srebro. Fast maximum margin matrix factorization for collaborative prediction. ICML Conference, pp. 713–718, 2005.

  59. R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. Advances in Neural Information Processing Systems, pp. 1257–1264, 2007.

  60. R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. International Conference on Machine Learning, pp. 880–887, 2008.

  61. R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted Boltzmann machines for collaborative filtering. International Conference on Machine Learning, pp. 791–798, 2007.

  62. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. World Wide Web Conference, pp. 285–295, 2001.

  63. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Application of dimensionality reduction in recommender system – a case study. WebKDD Workshop at ACM SIGKDD Conference, 2000. Also available as Technical Report TR-00-043, University of Minnesota, Minneapolis, 2000. https://wwws.cs.umn.edu/tech_reports_upload/tr2000/00-043.pdf

  64. D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems, 13, pp. 556–562, 2001.

  65. H. Shen and J. Z. Huang. Sparse principal component analysis via regularized low rank matrix approximation. Journal of Multivariate Analysis, 99(6), pp. 1015–1034, 2008.

  66. M.-L. Shyu, C. Haruechaiyasak, S.-C. Chen, and N. Zhao. Collaborative filtering by mining association rules from user access sequences. Workshop on Challenges in Web Information Retrieval and Integration, pp. 128–135, 2005.

  67. G. Strang. An introduction to linear algebra. Wellesley Cambridge Press, 2009.

  68. N. Srebro, J. Rennie, and T. Jaakkola. Maximum-margin matrix factorization. Advances in Neural Information Processing Systems, pp. 1329–1336, 2004.

  69. X. Su, T. Khoshgoftaar, X. Zhu, and R. Greiner. Imputation-boosted collaborative filtering using machine learning classifiers. ACM Symposium on Applied Computing, pp. 949–950, 2008.

  70. G. Takacs, I. Pilaszy, B. Nemeth, and D. Tikk. Matrix factorization and neighbor based algorithms for the Netflix prize problem. ACM Conference on Recommender Systems, pp. 267–274, 2008.

  71. S. Vucetic and Z. Obradovic. Collaborative filtering using a regression-based approach. Knowledge and Information Systems, 7(1), pp. 1–22, 2005.

  72. M. Weimer, A. Karatzoglou, Q. Le, and A. Smola. CoFiRank: Maximum margin matrix factorization for collaborative ranking. Advances in Neural Information Processing Systems, 2007.

  73. S. Wild, J. Curry, and A. Dougherty. Improving non-negative matrix factorizations through structured initialization. Pattern Recognition, 37(11), pp. 2217–2232, 2004.

  74. Z. Xia, Y. Dong, and G. Xing. Support vector machines for collaborative filtering. Proceedings of the 44th Annual Southeast Regional Conference, pp. 169–174, 2006.

  75. H. F. Yu, C. Hsieh, S. Si, and I. S. Dhillon. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. IEEE International Conference on Data Mining, pp. 765–774, 2012.

  76. K. Yu, S. Zhu, J. Lafferty, and Y. Gong. Fast nonparametric matrix factorization for large-scale collaborative filtering. ACM SIGIR Conference, pp. 211–218, 2009.

  77. S. Zhang, W. Wang, J. Ford, and F. Makedon. Learning from incomplete ratings using nonnegative matrix factorization. SIAM Conference on Data Mining, pp. 549–553, 2006.

  78. T. Zhang and V. Iyengar. Recommender systems using linear classifiers. Journal of Machine Learning Research, 2, pp. 313–334, 2002.

  79. K. Zhou, S. Yang, and H. Zha. Functional matrix factorizations for cold-start recommendation. ACM SIGIR Conference, pp. 315–324, 2011.

  80. Y. Zhou, D. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the Netflix prize. Algorithmic Aspects in Information and Management, pp. 337–348, 2008.

  81. C. Ziegler. Applying feed-forward neural networks to collaborative filtering. Master's Thesis, Universität Freiburg, 2006.

  82. http://www.the-ensemble.com/


Copyright information

© 2016 Springer International Publishing Switzerland

Cite this chapter

Aggarwal, C.C. (2016). Model-Based Collaborative Filtering. In: Recommender Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-29659-3_3

  • DOI: https://doi.org/10.1007/978-3-319-29659-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29657-9

  • Online ISBN: 978-3-319-29659-3
