Unsupervised strategies for shilling detection and robust collaborative filtering

Mehta, Bhaskar; Nejdl, Wolfgang

doi:10.1007/s11257-008-9050-4

Original Paper
Published: 18 July 2008

Volume 19, pages 65–97, (2009)
User Modeling and User-Adapted Interaction

Bhaskar Mehta¹ &
Wolfgang Nejdl²

Abstract

Collaborative filtering systems are essentially social systems that base their recommendations on the judgment of a large number of people. Like other social systems, however, they are vulnerable to manipulation by malicious users. A malicious user with an interest in promoting one item, or in downplaying the popularity of another, may spread lies and propaganda. By doing this systematically, either under multiple identities or by involving more people, an attacker can inject malicious votes and profiles into a collaborative recommender system. As previous work has shown, this can significantly affect the robustness of a system or algorithm. While current detection algorithms can use certain characteristics of shilling profiles to detect them, they suffer from low precision and require a large amount of training data. In this work, we provide an in-depth analysis of shilling profiles and describe new approaches to detecting malicious collaborative filtering profiles. In particular, we exploit the similarity structure in shilling user profiles to separate them from normal user profiles using unsupervised dimensionality reduction. We present two detection algorithms: one based on principal component analysis (PCA), the other on probabilistic latent semantic analysis (PLSA). Experimental results show much improved detection precision over existing methods, without the additional training time required by supervised approaches. Finally, we present a novel and highly effective robust collaborative filtering algorithm that builds on the PCA-based ideas presented in the detection algorithms.
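The abstract does not spell out the PCA-based detector, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that users are treated as the PCA variables and that the users with the smallest loadings on the leading principal components are flagged. The function name `flag_low_loading_users`, the parameters `n_components` and `n_flag`, and the synthetic data are all hypothetical. The data is deliberately simplified: every user rates every item, genuine users follow two shared taste factors, and injected profiles are random ratings around the scale midpoint.

```python
import numpy as np


def flag_low_loading_users(ratings, n_components=2, n_flag=10):
    """Return indices of the n_flag users with the smallest PCA loadings.

    ratings has shape (n_items, n_users): users are the PCA variables.
    This is an illustrative assumption, not the paper's stated algorithm.
    """
    # z-score each user's column: mean 0, variance 1 per user.
    z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
    # Rows of vt are the principal directions expressed over users.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    # Each user's squared loading mass on the leading components.
    score = (vt[:n_components] ** 2).sum(axis=0)
    return np.argsort(score)[:n_flag]


# Hypothetical synthetic data, dense for simplicity.
rng = np.random.default_rng(0)
n_items, n_genuine, n_shill = 200, 80, 10

# Genuine users mix two shared taste factors, plus a little noise.
factors = rng.normal(size=(n_items, 2))
theta = rng.uniform(0.0, 2.0 * np.pi, size=n_genuine)
weights = np.vstack([np.cos(theta), np.sin(theta)])  # shape (2, n_genuine)
genuine = factors @ weights + 0.3 * rng.normal(size=(n_items, n_genuine))

# Injected profiles: random ratings unrelated to the taste factors.
shill = rng.normal(size=(n_items, n_shill))

# Map to a 1..5 rating scale; the last n_shill columns are the injected users.
ratings = np.clip(np.rint(3.0 + np.hstack([genuine, shill])), 1, 5)

flagged = flag_low_loading_users(ratings, n_components=2, n_flag=n_shill)
print(sorted(flagged.tolist()))
```

In this toy setup the injected columns share almost no variance with the taste factors, so their loadings on the first two components are close to zero. Real rating data is sparse and attack profiles are built differently, so the number of components and the number of users to flag would need to be chosen with care.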

Author information

Authors and Affiliations

Google Inc., Brandschenkestrasse 110, Zurich, 8004, Switzerland
Bhaskar Mehta
Forschungszentrum L3S, University of Hannover, Hannover, 30167, Germany
Wolfgang Nejdl

Corresponding author

Correspondence to Bhaskar Mehta.

About this article

Cite this article

Mehta, B., Nejdl, W. Unsupervised strategies for shilling detection and robust collaborative filtering. User Model User-Adap Inter 19, 65–97 (2009). https://doi.org/10.1007/s11257-008-9050-4

Received: 19 February 2007
Revised: 10 September 2007
Accepted: 21 May 2008
Published: 18 July 2008
Issue Date: February 2009
DOI: https://doi.org/10.1007/s11257-008-9050-4
