Abstract
Collaborative filtering systems are essentially social systems that base their recommendations on the judgment of a large number of people. Like other social systems, however, they are vulnerable to manipulation by malicious users. A malicious user with an interest in promoting one item, or in downplaying the popularity of another, may spread lies and propaganda. By doing so systematically, either under multiple identities or by enlisting more people, an attacker can inject malicious votes and profiles into a collaborative recommender system. Previous work has shown that this can significantly affect the robustness of a system or algorithm. While current detection algorithms can use certain characteristics of shilling profiles to detect them, they suffer from low precision and require a large amount of training data. In this work, we provide an in-depth analysis of shilling profiles and describe new approaches to detecting malicious collaborative filtering profiles. In particular, we exploit the similarity structure in shilling user profiles to separate them from normal user profiles using unsupervised dimensionality reduction. We present two detection algorithms: one based on principal component analysis (PCA), the other on probabilistic latent semantic analysis (PLSA). Experimental results show much higher detection precision than existing methods, without the additional training time required by supervised approaches. Finally, we present a novel and highly effective robust collaborative filtering algorithm that builds on the PCA-based ideas used in the detection algorithms.
Cite this article
Mehta, B., Nejdl, W. Unsupervised strategies for shilling detection and robust collaborative filtering. User Model User-Adap Inter 19, 65–97 (2009). https://doi.org/10.1007/s11257-008-9050-4
DOI: https://doi.org/10.1007/s11257-008-9050-4