Model-Based Collaborative Filtering

Abstract

The neighborhood-based methods of the previous chapter can be viewed as generalizations of k-nearest neighbor classifiers, which are commonly used in machine learning.


Notes

  1. From a practical point of view, preprocessing is essential for efficiency. However, one could implement the neighborhood method without a preprocessing phase, albeit with larger latencies at query time.

  2. Parameter-tuning methods, such as hold-out and cross-validation, are discussed in Chapter 7.

  3. In the case of user-based associations, the consequents might contain any user.

  4. It is also possible to use more sophisticated ways of removing bias for better performance. For example, the bias \(B_{ij}\), which is specific to user i and item j, can be computed using the approach discussed in section 3.7.1. This bias is subtracted from the observed entries, and all missing entries are initialized to 0 during preprocessing. After computing the predictions, the biases \(B_{ij}\) are added back to the predicted values during postprocessing.
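    A minimal sketch of this pre/postprocessing pipeline; the names R, B, and factorize are illustrative assumptions, and any matrix completion routine can stand in for factorize:

    ```python
    import numpy as np

    def predict_with_bias_removal(R, B, factorize):
        """Subtract per-entry biases during preprocessing and restore
        them during postprocessing.

        R: ratings matrix with np.nan at unobserved entries.
        B: bias matrix of the same shape (e.g., as in section 3.7.1).
        factorize: any routine returning a fully specified matrix.
        """
        observed = ~np.isnan(R)
        centered = np.zeros_like(B)
        centered[observed] = R[observed] - B[observed]  # missing entries stay at 0
        predicted = factorize(centered)                 # complete the centered matrix
        return predicted + B                            # postprocessing: add biases back
    ```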

  5. The method used for performing this estimation in various scenarios is described in detail in section 3.6.5.3.

  6. The row space of a matrix is the set of all possible linear combinations of its rows; the column space is the set of all possible linear combinations of its columns.

  7. In SVD [568], the basis vectors are also referred to as singular vectors, which, by definition, must be mutually orthonormal.

  8. Refer to Chapter 6 for a discussion of the bias-variance trade-off.

  9. A more precise update is \(\overline{u_{i}} \Leftarrow \overline{u_{i}} +\alpha (e_{ij}\overline{v_{j}} -\lambda \overline{u_{i}}/n_{i}^{user})\) and \(\overline{v_{j}} \Leftarrow \overline{v_{j}} +\alpha (e_{ij}\overline{u_{i}} -\lambda \overline{v_{j}}/n_{j}^{item})\), where \(n_{i}^{user}\) is the number of observed ratings for user i and \(n_{j}^{item}\) is the number of observed ratings for item j. In these updates, the regularization term of each user/item factor is divided equally among the corresponding observed entries of that user/item. In practice, the simpler heuristic update rules discussed in this chapter are often used; we adopt them throughout to be consistent with the research literature on recommender systems. With proper parameter tuning, \(\lambda\) automatically adjusts to a smaller value under the simpler rules.
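    As a concrete illustration, a sketch of one epoch of these more precise updates; the hyperparameter values and variable names are assumptions, not from the book:

    ```python
    import numpy as np

    def sgd_epoch_precise(ratings, U, V, alpha=0.01, lam=0.1):
        """One epoch in which the regularization of each factor is divided
        by the number of observed ratings of that user/item.

        ratings: list of (i, j, r_ij) triples of observed entries.
        U, V: factor matrices of shape (m, k) and (n, k), updated in place.
        """
        n_user = np.zeros(U.shape[0])  # n_i^{user}: observed ratings per user
        n_item = np.zeros(V.shape[0])  # n_j^{item}: observed ratings per item
        for i, j, _ in ratings:
            n_user[i] += 1
            n_item[j] += 1
        for i, j, r in ratings:
            e = r - U[i].dot(V[j])     # error e_ij on the observed entry
            u_old = U[i].copy()        # v_j's update must use the old u_i
            U[i] += alpha * (e * V[j] - lam * U[i] / n_user[i])
            V[j] += alpha * (e * u_old - lam * V[j] / n_item[j])
    ```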

  10. The inner-product of two column-vectors \(\overline{x}\) and \(\overline{y}\) is given by the scalar \(\overline{x}^{T}\overline{y}\), whereas the outer-product is given by the rank-1 matrix \(\overline{x}\,\overline{y}^{T}\). Furthermore, \(\overline{x}\) and \(\overline{y}\) need not be of the same size in order to compute an outer-product.
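    For concreteness, a short numpy illustration of the two products (the values are arbitrary):

    ```python
    import numpy as np

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0])             # different length: fine for an outer-product

    outer = np.outer(x, y)               # the 3 x 2 matrix x y^T
    print(np.linalg.matrix_rank(outer))  # 1, since every outer-product has rank 1

    z = np.array([4.0, 5.0, 6.0])        # the inner-product requires equal lengths
    print(x.dot(z))                      # the scalar x^T z = 32.0
    ```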

  11. In many cases, this approach can outperform SVD++, especially when the number of observed ratings is small.

  12. For matrices that are not mean-centered, the global mean can be subtracted during preprocessing and then added back at prediction time.

  13. We use a slightly different notation than the original paper [309], although the approach described here is equivalent. This presentation simplifies the notation by introducing fewer variables and viewing bias variables as constraints on the factorization process.

  14. The literature often describes these updates in vectorized form. They may be applied to the rows of U, V, and Y as follows:

    $$\begin{aligned}
    \overline{u_{i}} &\Leftarrow \overline{u_{i}} +\alpha\left(e_{ij}\overline{v_{j}} -\lambda\overline{u_{i}}\right)\\
    \overline{v_{j}} &\Leftarrow \overline{v_{j}} +\alpha\left(e_{ij}\cdot\left[\overline{u_{i}} +\sum_{h\in I_{i}}\frac{\overline{y_{h}}}{\sqrt{|I_{i}|}}\right] -\lambda\cdot\overline{v_{j}}\right)\\
    \overline{y_{h}} &\Leftarrow \overline{y_{h}} +\alpha\left(\frac{e_{ij}\cdot\overline{v_{j}}}{\sqrt{|I_{i}|}} -\lambda\cdot\overline{y_{h}}\right)\quad\forall h\in I_{i}\\
    &\text{Reset perturbed entries in fixed columns of } U,\ V, \text{ and } Y
    \end{aligned}$$
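    The following sketch performs one stochastic gradient step of these vectorized updates; the variable names and learning parameters are illustrative assumptions, and the reset of the fixed (bias) columns of U, V, and Y is omitted for brevity:

    ```python
    import numpy as np

    def svdpp_step(i, j, r_ij, U, V, Y, item_sets, alpha=0.007, lam=0.015):
        """One SGD step of the vectorized updates shown above.

        U, V, Y: user factors, item factors, and implicit item factors.
        item_sets: item_sets[i] is the set I_i of items rated by user i.
        """
        I_i = list(item_sets[i])
        root = np.sqrt(len(I_i))
        # Implicit-feedback-enriched user factor: u_i + sum_{h in I_i} y_h / sqrt(|I_i|)
        p_i = U[i] + Y[I_i].sum(axis=0) / root
        e = r_ij - p_i.dot(V[j])                        # error e_ij
        v_old = V[j].copy()                             # y_h updates use the old v_j
        U[i] += alpha * (e * V[j] - lam * U[i])
        V[j] += alpha * (e * p_i - lam * V[j])
        Y[I_i] += alpha * (e * v_old / root - lam * Y[I_i])
    ```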
  15. These effects are best understood in terms of the bias-variance trade-off in machine learning [22]. Setting the unspecified values to 0 increases bias but reduces variance. When a large number of entries are unspecified and the prior probability that a missing entry is 0 is high, the variance effects can dominate.

  16. Refer to Chapter 6 for a discussion of the bias-variance trade-off in collaborative filtering.

  17. Note that we use the upper-case variable K to represent the size of the neighborhood that defines \(Q_{j}(i)\); this is a deviation from section 2.6.2 of Chapter 2. We use the lower-case variable k to represent the dimensionality of the factor matrices. The values of k and K are generally different.

Bibliography

  1. D. Agarwal and B. Chen. Regression-based latent factor models. ACM KDD Conference, pp. 19–28, 2009.

  2. C. Aggarwal. Data classification: algorithms and applications. CRC Press, 2014.

  3. C. Aggarwal. Data mining: the textbook. Springer, New York, 2015.

  4. C. Aggarwal and J. Han. Frequent pattern mining. Springer, New York, 2014.

  5. C. Aggarwal and S. Parthasarathy. Mining massively incomplete data sets by conceptual reconstruction. ACM KDD Conference, pp. 227–232, 2001.

  6. C. Aggarwal, C. Procopiuc, and P. S. Yu. Finding localized associations in market basket data. IEEE Transactions on Knowledge and Data Engineering, 14(1), pp. 51–62, 2001.

  7. C. Aggarwal, Z. Sun, and P. Yu. Online generation of profile association rules. ACM KDD Conference, pp. 129–133, 1998.

  8. C. Aggarwal, Z. Sun, and P. Yu. Online algorithms for finding profile association rules. CIKM Conference, pp. 86–95, 1998.

  9. R. Battiti. Accelerated backpropagation learning: Two optimization methods. Complex Systems, 3(4), pp. 331–342, 1989.

  10. R. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. IEEE International Conference on Data Mining, pp. 43–52, 2007.

  11. R. Bell and Y. Koren. Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter, 9(2), pp. 75–79, 2007.

  12. D. P. Bertsekas. Nonlinear programming. Athena Scientific Publishers, Belmont, 1999.

  13. D. Billsus and M. Pazzani. Learning collaborative information filters. ICML Conference, pp. 46–54, 1998.

  14. C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, 1995.

  15. M. Brand. Fast online SVD revisions for lightweight recommender systems. SIAM Conference on Data Mining, pp. 37–46, 2003.

  16. J. Cai, E. Candes, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), pp. 1956–1982, 2010.

  17. J. Canny. Collaborative filtering with privacy via factor analysis. ACM SIGIR Conference, pp. 238–245, 2002.

  18. T. Chen, Z. Zheng, Q. Lu, W. Zhang, and Y. Yu. Feature-based matrix factorization. arXiv preprint arXiv:1109.2271, 2011.

  19. A. Cichocki and R. Zdunek. Regularized alternating least squares algorithms for non-negative matrix/tensor factorization. International Symposium on Neural Networks, pp. 793–802, 2007.

  20. D. DeCoste. Collaborative prediction using ensembles of maximum margin matrix factorizations. International Conference on Machine Learning, pp. 249–256, 2006.

  21. R. Devooght, N. Kourtellis, and A. Mantrach. Dynamic matrix factorization with priors on unknown values. ACM KDD Conference, 2015.

  22. R. Gemulla, E. Nijkamp, P. Haas, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. ACM KDD Conference, pp. 69–77, 2011.

  23. L. Getoor and M. Sahami. Using probabilistic relational models for collaborative filtering. Workshop on Web Usage Analysis and User Profiling, 1999.

  24. F. Girosi, M. Jones, and T. Poggio. Regularization theory and neural networks architectures. Neural Computation, 7(2), pp. 219–269, 1995.

  25. T. Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1), pp. 89–114, 2004.

  26. Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. IEEE International Conference on Data Mining, pp. 263–272, 2008.

  27. P. Jain and I. Dhillon. Provable inductive matrix completion. arXiv preprint arXiv:1306.0626, 2013. http://arxiv.org/abs/1306.0626

  28. P. Jain, P. Netrapalli, and S. Sanghavi. Low-rank matrix completion using alternating minimization. ACM Symposium on Theory of Computing, pp. 665–674, 2013.

  29. D. Kim and B. Yum. Collaborative filtering based on iterative principal component analysis. Expert Systems with Applications, 28(4), pp. 823–830, 2005.

  30. H. Kim and H. Park. Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM Journal on Matrix Analysis and Applications, 30(2), pp. 713–730, 2008.

  31. Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. ACM KDD Conference, pp. 426–434, 2008. An extended version appears as: Y. Koren. Factor in the neighbors: Scalable and accurate collaborative filtering. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(1), 1, 2010.

  32. Y. Koren. Collaborative filtering with temporal dynamics. ACM KDD Conference, pp. 447–455, 2009. Another version appears in Communications of the ACM, 53(4), pp. 89–97, 2010.

  33. Y. Koren. The Bellkor solution to the Netflix grand prize. Netflix prize documentation, 81, 2009. http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf

  34. Y. Koren and R. Bell. Advances in collaborative filtering. Recommender Systems Handbook, Springer, pp. 145–186, 2011. (Extended version in the 2015 edition of the handbook.)

  35. Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8), pp. 30–37, 2009.

  36. S. Kabbur, X. Ning, and G. Karypis. FISM: factored item similarity models for top-N recommender systems. ACM KDD Conference, pp. 659–667, 2013.

  37. S. Kabbur and G. Karypis. NLMF: NonLinear Matrix Factorization Methods for Top-N Recommender Systems. IEEE Data Mining Workshop (ICDMW), pp. 167–174, 2014.

  38. A. Langville, C. Meyer, R. Albright, J. Cox, and D. Duling. Initializations for the nonnegative matrix factorization. ACM KDD Conference, pp. 23–26, 2006.

  39. D. Lemire and A. Maclachlan. Slope one predictors for online rating-based collaborative filtering. SIAM Conference on Data Mining, 2005.

  40. M. Li, T. Zhang, Y. Chen, and A. Smola. Efficient mini-batch training for stochastic optimization. ACM KDD Conference, pp. 661–670, 2014.

  41. C.-J. Lin. Projected gradient methods for nonnegative matrix factorization. Neural Computation, 19(10), pp. 2756–2779, 2007.

  42. W. Lin. Association rule mining for collaborative recommender systems. Master's Thesis, Worcester Polytechnic Institute, 2000.

  43. W. Lin, S. Alvarez, and C. Ruiz. Efficient adaptive-support association rule mining for recommender systems. Data Mining and Knowledge Discovery, 6(1), pp. 83–105, 2002.

  44. B. Liu, W. Hsu, and Y. Ma. Mining association rules with multiple minimum supports. ACM KDD Conference, pp. 337–341, 1999.

  45. X. Liu, C. Aggarwal, Y.-F. Lee, X. Kong, X. Sun, and S. Sathe. Kernelized matrix factorization for collaborative filtering. SIAM Conference on Data Mining, 2016.

  46. A. Mild and M. Natter. Collaborative filtering or regression models for Internet recommendation systems? Journal of Targeting, Measurement and Analysis for Marketing, 10(4), pp. 304–313, 2002.

  47. K. Miyahara and M. J. Pazzani. Collaborative filtering with the simple Bayesian classifier. Pacific Rim International Conference on Artificial Intelligence, 2000.

  48. B. Mobasher, H. Dai, T. Luo, and M. Nakagawa. Effective personalization based on association rule discovery from Web usage data. ACM Workshop on Web Information and Data Management, pp. 9–15, 2001.

  49. X. Ning and G. Karypis. SLIM: Sparse linear methods for top-N recommender systems. IEEE International Conference on Data Mining, pp. 497–506, 2011.

  50. D. Oard and J. Kim. Implicit feedback for recommender systems. Proceedings of the AAAI Workshop on Recommender Systems, pp. 81–83, 1998.

  51. P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), pp. 111–126, 1994.

  52. R. Pan, Y. Zhou, B. Cao, N. Liu, R. Lukose, M. Scholz, and Q. Yang. One-class collaborative filtering. IEEE International Conference on Data Mining, pp. 502–511, 2008.

  53. R. Pan and M. Scholz. Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. ACM KDD Conference, pp. 667–676, 2009.

  54. S. Parthasarathy and C. Aggarwal. On the use of conceptual reconstruction for mining massively incomplete data sets. IEEE Transactions on Knowledge and Data Engineering, 15(6), pp. 1512–1521, 2003.

  55. A. Paterek. Improving regularized singular value decomposition for collaborative filtering. Proceedings of KDD Cup and Workshop, 2007.

  56. V. Pauca, J. Piper, and R. Plemmons. Nonnegative matrix factorization for spectral data analysis. Linear Algebra and its Applications, 416(1), pp. 29–47, 2006.

  57. S. Rendle. Factorization machines. IEEE International Conference on Data Mining, pp. 995–1000, 2010.

  58. J. Rennie and N. Srebro. Fast maximum margin matrix factorization for collaborative prediction. ICML Conference, pp. 713–718, 2005.

  59. R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. Advances in Neural Information Processing Systems, pp. 1257–1264, 2007.

  60. R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. International Conference on Machine Learning, pp. 880–887, 2008.

  61. R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted Boltzmann machines for collaborative filtering. International Conference on Machine Learning, pp. 791–798, 2007.

  62. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. World Wide Web Conference, pp. 285–295, 2001.

  63. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Application of dimensionality reduction in recommender system – a case study. WebKDD Workshop at ACM SIGKDD Conference, 2000. Also available as Technical Report TR-00-043, University of Minnesota, Minneapolis, 2000. https://wwws.cs.umn.edu/tech_reports_upload/tr2000/00-043.pdf

  64. D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems, 13, pp. 556–562, 2001.

  65. H. Shen and J. Z. Huang. Sparse principal component analysis via regularized low rank matrix approximation. Journal of Multivariate Analysis, 99(6), pp. 1015–1034, 2008.

  66. M.-L. Shyu, C. Haruechaiyasak, S.-C. Chen, and N. Zhao. Collaborative filtering by mining association rules from user access sequences. Workshop on Challenges in Web Information Retrieval and Integration, pp. 128–135, 2005.

  67. G. Strang. An introduction to linear algebra. Wellesley Cambridge Press, 2009.

  68. N. Srebro, J. Rennie, and T. Jaakkola. Maximum-margin matrix factorization. Advances in Neural Information Processing Systems, pp. 1329–1336, 2004.

  69. X. Su, T. Khoshgoftaar, X. Zhu, and R. Greiner. Imputation-boosted collaborative filtering using machine learning classifiers. ACM Symposium on Applied Computing, pp. 949–950, 2008.

  70. G. Takacs, I. Pilaszy, B. Nemeth, and D. Tikk. Matrix factorization and neighbor based algorithms for the Netflix prize problem. ACM Conference on Recommender Systems, pp. 267–274, 2008.

  71. S. Vucetic and Z. Obradovic. Collaborative filtering using a regression-based approach. Knowledge and Information Systems, 7(1), pp. 1–22, 2005.

  72. M. Weimer, A. Karatzoglou, Q. Le, and A. Smola. CoFiRank: Maximum margin matrix factorization for collaborative ranking. Advances in Neural Information Processing Systems, 2007.

  73. S. Wild, J. Curry, and A. Dougherty. Improving non-negative matrix factorizations through structured initialization. Pattern Recognition, 37(11), pp. 2217–2232, 2004.

  74. Z. Xia, Y. Dong, and G. Xing. Support vector machines for collaborative filtering. Proceedings of the 44th Annual Southeast Regional Conference, pp. 169–174, 2006.

  75. H. F. Yu, C. Hsieh, S. Si, and I. S. Dhillon. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. IEEE International Conference on Data Mining, pp. 765–774, 2012.

  76. K. Yu, S. Zhu, J. Lafferty, and Y. Gong. Fast nonparametric matrix factorization for large-scale collaborative filtering. ACM SIGIR Conference, pp. 211–218, 2009.

  77. S. Zhang, W. Wang, J. Ford, and F. Makedon. Learning from incomplete ratings using nonnegative matrix factorization. SIAM Conference on Data Mining, pp. 549–553, 2006.

  78. T. Zhang and V. Iyengar. Recommender systems using linear classifiers. Journal of Machine Learning Research, 2, pp. 313–334, 2002.

  79. K. Zhou, S. Yang, and H. Zha. Functional matrix factorizations for cold-start recommendation. ACM SIGIR Conference, pp. 315–324, 2011.

  80. Y. Zhou, D. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the Netflix prize. Algorithmic Aspects in Information and Management, pp. 337–348, 2008.

  81. C. Ziegler. Applying feed-forward neural networks to collaborative filtering. Master's Thesis, Universität Freiburg, 2006.

  82. http://www.the-ensemble.com/


Copyright information

© 2016 Springer International Publishing Switzerland

Cite this chapter

Aggarwal, C.C. (2016). Model-Based Collaborative Filtering. In: Recommender Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-29659-3_3

  • DOI: https://doi.org/10.1007/978-3-319-29659-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29657-9

  • Online ISBN: 978-3-319-29659-3
