Neighborhood-Based Collaborative Filtering

Aggarwal, Charu C.

doi:10.1007/978-3-319-29659-3_2


Abstract

Neighborhood-based collaborative filtering algorithms, also referred to as memory-based algorithms, were among the earliest algorithms developed for collaborative filtering. These algorithms are based on the fact that similar users display similar patterns of rating behavior and similar items receive similar ratings. There are two primary types of neighborhood-based algorithms:


Notes

1. In many cases, k valid peers of target user u with observed ratings for item j might not exist. This scenario is particularly common in sparse ratings matrices, such as the case where user u has fewer than k observed ratings. In such cases, the set P_u(j) will have cardinality less than k.
2. The precise method used by Netflix is proprietary and therefore not known. However, item-based methods do provide a viable methodology to achieve similar goals.
3. There can be some minor differences depending on how the mean is computed for each row within the Pearson coefficient. If the mean for each row is computed using all the observed entries of that row (rather than only the mutually specified entries), then the Pearson correlation coefficient is identical to the cosine coefficient for row-wise mean-centered matrices.
4. Diagonal matrices are usually square. Although this matrix is not square, only entries with equal indices are nonzero. This is a generalized definition of a diagonal matrix.
5. A discussion of linear regression is provided in Section 4.4.5 of Chapter 4, but in the context of content-based systems.
6. The approach can be adapted to arbitrary ratings matrices. However, the main advantages of the approach are realized for non-negative ratings matrices.
7. It is noteworthy that imposing an additional constraint, such as non-negativity, can never improve, and typically reduces, the quality of the optimal solution on the observed entries. On the other hand, imposing constraints increases the model bias and reduces model variance, which might reduce overfitting on the unobserved entries. In fact, when two closely related models have contradicting relative performances on the observed and unobserved entries, respectively, it is almost always a result of differential levels of overfitting in the two cases. You will learn more about the bias-variance trade-off in Chapter 6. In general, it is more reliable to predict item ratings with positive item-item relationships rather than negative relationships. The non-negativity constraint is based on this observation. The incorporation of model biases in the form of such natural constraints is particularly useful for smaller data sets.
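Note 1 can be illustrated with a small sketch. The function name peer_set, the use of NaN for unobserved ratings, and the precomputed similarity matrix are all assumptions made for this example; "valid" is taken here to mean only that the peer has rated item j, which may be looser than the definition intended in the chapter.

```python
import numpy as np

def peer_set(ratings, sims, u, j, k):
    """Return up to k users most similar to u who have rated item j.

    ratings: 2-D float array; np.nan marks an unobserved rating (assumption).
    sims:    2-D array; sims[u, v] is a precomputed user-user similarity.
    """
    # Candidates: every user other than u with an observed rating for item j.
    candidates = [v for v in range(ratings.shape[0])
                  if v != u and not np.isnan(ratings[v, j])]
    # Order candidates by decreasing similarity to u and keep at most k.
    candidates.sort(key=lambda v: sims[u, v], reverse=True)
    return candidates[:k]

nan = np.nan
ratings = np.array([
    [5.0, nan, 3.0],
    [4.0, 2.0, nan],
    [nan, 1.0, 4.0],
    [5.0, nan, nan],
])
sims = np.array([
    [1.0, 0.9, 0.2, 0.8],
    [0.9, 1.0, 0.1, 0.7],
    [0.2, 0.1, 1.0, 0.3],
    [0.8, 0.7, 0.3, 1.0],
])

# Only users 1 and 2 have rated item 1, so asking for k = 3 peers
# yields a set of cardinality 2.
peers = peer_set(ratings, sims, u=0, j=1, k=3)
print(peers, len(peers))
```

Because the slice simply truncates the candidate list, a prediction routine built on top of it has to cope with peer sets smaller than k, including an empty one.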
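The equivalence stated in Note 3 can be checked numerically. The sketch below assumes that unobserved ratings are stored as NaN and that both coefficients are evaluated over the mutually specified entries of the two rows; the function names and the toy matrix are invented for illustration.

```python
import numpy as np

def _cos(a, b):
    # Cosine of the angle between two equal-length vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_all_observed_mean(R, u, v):
    # Pearson coefficient in which each row mean uses ALL observed entries.
    mu = np.nanmean(R, axis=1)
    both = ~np.isnan(R[u]) & ~np.isnan(R[v])
    return _cos(R[u, both] - mu[u], R[v, both] - mu[v])

def pearson_mutual_mean(R, u, v):
    # Pearson coefficient in which each row mean uses only mutual entries.
    both = ~np.isnan(R[u]) & ~np.isnan(R[v])
    a, b = R[u, both], R[v, both]
    return _cos(a - a.mean(), b - b.mean())

def cosine_on_centered(R, u, v):
    # Cosine coefficient on the row-wise mean-centered matrix.
    C = R - np.nanmean(R, axis=1)[:, None]   # NaNs remain NaN
    both = ~np.isnan(C[u]) & ~np.isnan(C[v])
    return _cos(C[u, both], C[v, both])

nan = np.nan
R = np.array([
    [5.0, 3.0, nan, 1.0, 4.0],
    [4.0, nan, 2.0, 1.0, 5.0],
])

p_all = pearson_all_observed_mean(R, 0, 1)
p_mutual = pearson_mutual_mean(R, 0, 1)
c_centered = cosine_on_centered(R, 0, 1)
print(p_all, p_mutual, c_centered)
```

On this toy matrix the first and third values coincide, while the second differs slightly, which is the "minor difference" the note refers to.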
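The generalized diagonal matrix of Note 4 is easy to construct explicitly. As one illustration, not necessarily the setting the note was written for, the middle factor of a full singular value decomposition of a non-square matrix has this form. The helper name rectangular_diagonal and the example matrix are assumptions.

```python
import numpy as np

def rectangular_diagonal(values, shape):
    # Build a shape[0] x shape[1] matrix whose (i, i) entries hold `values`
    # and whose remaining entries are zero.
    D = np.zeros(shape)
    idx = np.arange(len(values))
    D[idx, idx] = values
    return D

A = np.array([[3.0, 1.0, 0.0, 2.0],
              [1.0, 4.0, 1.0, 0.0]])

# Full SVD: U is 2 x 2, Vt is 4 x 4, so the middle factor must be 2 x 4.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = rectangular_diagonal(s, A.shape)
reconstructed = U @ Sigma @ Vt
print(Sigma.shape, np.allclose(reconstructed, A))
```

Only the entries with equal row and column indices of Sigma are nonzero, even though Sigma is not square.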
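The first claim of Note 7 can be seen on a generic least-squares problem, used here as a stand-in for the regression models the note has in mind. The sketch compares an unconstrained fit with a non-negative fit obtained by projected gradient descent; the synthetic data, the function names, and the choice of solver are all assumptions.

```python
import numpy as np

def least_squares(A, b):
    # Unconstrained minimizer of ||A x - b||_2.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

def nonneg_least_squares(A, b, iters=5000):
    # Projected gradient descent for min ||A x - b||_2 subject to x >= 0.
    # The step size is 1 / (largest eigenvalue of A^T A).
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # Gradient step followed by projection onto the non-negative orthant.
        x = np.maximum(0.0, x - step * A.T @ (A @ x - b))
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 4))
# Synthetic targets generated with some negative coefficients.
b = A @ np.array([1.5, -2.0, 0.5, -0.3]) + 0.1 * rng.normal(size=30)

x_ls = least_squares(A, b)
x_nn = nonneg_least_squares(A, b)
err_ls = np.linalg.norm(A @ x_ls - b)   # fit on the "observed" data
err_nn = np.linalg.norm(A @ x_nn - b)
print(err_ls, err_nn)
```

The constrained error cannot fall below the unconstrained one, because every non-negative coefficient vector is also a feasible point of the unconstrained problem. Whether the constraint helps on unobserved entries is a separate, empirical question that this sketch does not address.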


Author information

Authors and Affiliations

IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Charu C. Aggarwal


About this chapter

Cite this chapter

Aggarwal, C.C. (2016). Neighborhood-Based Collaborative Filtering. In: Recommender Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-29659-3_2

DOI: https://doi.org/10.1007/978-3-319-29659-3_2
Published: 29 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29657-9
Online ISBN: 978-3-319-29659-3
eBook Packages: Computer Science, Computer Science (R0)
