Abstract
Neighborhood-based collaborative filtering algorithms, also referred to as memory-based algorithms, were among the earliest algorithms developed for collaborative filtering. These algorithms rest on the observation that similar users display similar patterns of rating behavior and that similar items receive similar ratings. There are two primary types of neighborhood-based algorithms:
Notes
- 1.
In many cases, k valid peers of target user u with observed ratings for item j might not exist. This scenario is particularly common in sparse ratings matrices, such as the case where user u has fewer than k observed ratings. In such cases, the set P_u(j) will have cardinality less than k.
- 2.
The precise method used by Netflix is proprietary and therefore not known. However, item-based methods do provide a viable methodology to achieve similar goals.
- 3.
There can be some minor differences depending on how the mean is computed for each row within the Pearson coefficient. If the mean for each row is computed using all the observed entries of that row (rather than only the mutually specified entries), then the Pearson correlation coefficient is identical to the cosine coefficient for row-wise mean-centered matrices.
- 4.
Diagonal matrices are usually square. Although this matrix is not square, only entries with equal indices are nonzero. This is a generalized definition of a diagonal matrix.
- 5.
- 6.
The approach can be adapted to arbitrary rating matrices. However, the main advantages of the approach are realized for non-negative ratings matrices.
- 7.
It is noteworthy that imposing an additional constraint, such as non-negativity, can never improve (and typically reduces) the quality of the optimal solution on the observed entries. On the other hand, imposing constraints increases the model bias and reduces model variance, which might reduce overfitting on the unobserved entries. In fact, when two closely related models have contradicting relative performances on the observed and unobserved entries, respectively, it is almost always a result of differential levels of overfitting in the two cases. You will learn more about the bias-variance trade-off in Chapter 6. In general, it is more reliable to predict item ratings with positive item-item relationships rather than negative relationships. The non-negativity constraint is based on this observation. The incorporation of model biases in the form of such natural constraints is particularly useful for smaller data sets.
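As an illustration of note 1, the following minimal sketch selects the peer set of a target user for one item. The function name `peer_set`, the dictionary layout of the ratings, and the precomputed `similarity` table are all assumptions made for this example, not definitions from the chapter. It shows how the set can end up with fewer than k members when too few similar users have rated the item.

```python
def peer_set(ratings, similarity, u, j, k):
    """Return up to k users most similar to u who have an observed rating for item j.

    ratings:    dict user -> dict item -> rating (observed entries only).
    similarity: dict (u, v) -> similarity score (hypothetical, assumed precomputed).
    """
    # Only users other than u with an observed rating for j are valid peers.
    candidates = [v for v in ratings if v != u and j in ratings[v]]
    # Rank the valid peers by decreasing similarity to the target user.
    candidates.sort(key=lambda v: similarity[(u, v)], reverse=True)
    return candidates[:k]


# Toy sparse ratings matrix (hypothetical data).
ratings = {
    "u": {"a": 5, "b": 3},
    "v1": {"a": 4, "j": 2},
    "v2": {"b": 3, "j": 5},
    "v3": {"a": 1, "b": 2},  # has not rated j, so it is never a valid peer
}
similarity = {("u", "v1"): 0.9, ("u", "v2"): 0.4, ("u", "v3"): 0.7}

peers = peer_set(ratings, similarity, "u", "j", k=3)
print(peers)  # only two valid peers exist although k = 3
```

Any prediction step built on such a peer set therefore has to cope with neighborhoods smaller than k, including the empty one.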
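The equivalence described in note 3 can be checked numerically. The sketch below is a pure-Python illustration under stated assumptions: missing ratings are encoded as `None`, both coefficients are summed over the mutually specified entries only, and the helper names and the two toy rows are hypothetical. It compares the Pearson coefficient under both ways of computing the row mean with the cosine coefficient on the row-wise mean-centered data.

```python
import math


def observed_mean(row):
    """Mean over all observed (non-None) entries of a row."""
    values = [r for r in row if r is not None]
    return sum(values) / len(values)


def cosine_common(x, y):
    """Cosine coefficient restricted to the mutually specified entries."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return num / den


def pearson(x, y, mean_over_all_observed):
    """Pearson coefficient over mutually specified entries.

    The row means come either from all observed entries of each row or
    only from the entries that both rows specify.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if mean_over_all_observed:
        mx, my = observed_mean(x), observed_mean(y)
    else:
        mx = sum(a for a, _ in pairs) / len(pairs)
        my = sum(b for _, b in pairs) / len(pairs)
    centered_x = [a - mx for a, _ in pairs]
    centered_y = [b - my for _, b in pairs]
    return cosine_common(centered_x, centered_y)


def mean_center(row):
    """Subtract the mean of the observed entries; keep missing entries missing."""
    m = observed_mean(row)
    return [None if r is None else r - m for r in row]


x = [5, 3, None, 4, 1]
y = [4, None, 2, 5, 2]

cos_centered = cosine_common(mean_center(x), mean_center(y))
pearson_all = pearson(x, y, mean_over_all_observed=True)
pearson_common = pearson(x, y, mean_over_all_observed=False)

print(pearson_all, cos_centered)  # equal up to floating-point rounding
print(pearson_common)             # differs slightly from the two values above
```

The small gap between the two Pearson variants is the "minor difference" the note refers to; it vanishes when both rows specify exactly the same entries.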
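The first claim of note 7 can be seen in a deliberately tiny setting. The sketch below assumes a one-weight linear model that predicts the mean-centered ratings `y` of one item from the mean-centered ratings `x` of another; the data and function names are hypothetical, and this is not the chapter's regression model. For a single weight, the non-negative least-squares solution is simply the unconstrained solution clipped at zero, so the two fits are easy to compare on the observed entries.

```python
def fit_weight(x, y, nonnegative=False):
    """Least-squares weight w for the model y ~ w * x (single weight, no intercept)."""
    w = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    # With one weight the objective is a convex parabola in w, so the
    # minimizer over w >= 0 is the unconstrained minimizer clipped at zero.
    return max(0.0, w) if nonnegative else w


def squared_error(x, y, w):
    """Sum of squared residuals on the observed entries."""
    return sum((b - w * a) ** 2 for a, b in zip(x, y))


# Hypothetical mean-centered ratings of two negatively related items.
x = [1.0, -1.0, 2.0]
y = [-0.5, 1.0, -1.5]

w_free = fit_weight(x, y)
w_nonneg = fit_weight(x, y, nonnegative=True)
err_free = squared_error(x, y, w_free)
err_nonneg = squared_error(x, y, w_nonneg)

print(w_free, err_free)      # negative weight, small error on the observed data
print(w_nonneg, err_nonneg)  # weight clipped to zero, larger error on the observed data
```

When the unconstrained weight is already non-negative, both fits coincide, so the constraint never lowers the error on the observed entries. Whether it helps on unobserved entries depends on the bias-variance trade-off the note describes, which this sketch does not measure.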
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Aggarwal, C.C. (2016). Neighborhood-Based Collaborative Filtering. In: Recommender Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-29659-3_2
DOI: https://doi.org/10.1007/978-3-319-29659-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29657-9
Online ISBN: 978-3-319-29659-3
eBook Packages: Computer Science, Computer Science (R0)