Abstract
Collaborative filtering (CF) is a widely used technique for guiding the users of web applications towards items that might interest them. CF approaches are severely challenged by the characteristics of user-item preference matrices, which are often high dimensional and extremely sparse. Recently, several works have shown that incorporating information from social networks (such as friendship and trust relationships) into traditional CF alleviates sparsity-related issues and, in most cases, yields better recommendation quality. More interestingly, even when the two perform comparably, social-based CF is more beneficial than traditional CF, because it makes it possible to provide recommendations for cold start users. In this paper, we propose a novel model that leverages information from social networks to improve recommendations. While existing social CF models rely on popular modelling assumptions such as the Gaussian or the Multinomial, our model builds on the von Mises–Fisher assumption, which turns out to be better suited to high dimensional sparse data. Estimating the model parameters by maximum likelihood, we derive a scalable learning algorithm for analyzing data with our model. Empirical results on several real-world datasets provide strong support for the advantages of the proposed model.
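For reference, the von Mises–Fisher density named in the abstract is usually written as follows. This is the standard form found in the directional statistics literature (for example Banerjee et al. 2005); the symbols \(x\), \(\mu\), \(\kappa\), \(d\) and \(c_d\) are notation assumed here, not taken from the abstract.

\[
f(x \mid \mu, \kappa) = c_d(\kappa)\, \exp\left(\kappa\, \mu^\top x\right),
\qquad
c_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)},
\]

where \(x\) and \(\mu\) are \(d\)-dimensional unit vectors (\(\Vert x \Vert = \Vert \mu \Vert = 1\)), \(\kappa > 0\) is the concentration parameter, and \(I_r\) denotes the modified Bessel function of the first kind of order \(r\). Because both vectors have unit length, \(\mu^\top x\) is their cosine similarity, so the density depends on \(x\) only through its direction and not through its magnitude.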
Notes
In the rest of this paper we treat “direction data” and “\(L_2\) normalized data” as synonyms.
Several variants of nDCG exist; here we adopt the same one as LibRec, for the sake of fairness.
Cold start users are users who have expressed only a few ratings or social interactions. Following previous works (Jamali and Ester 2010; Guo et al. 2015), we consider users who have expressed fewer than five ratings as cold start users in the preference matrix. Similarly, users who have fewer than five social relations are considered cold start users in the social network.
We observed the same behaviour on the Flixster dataset; these results are not reported here, to keep the presentation concise.
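Two of the notes above can be made concrete with a minimal Python sketch: turning a user's rating row into \(L_2\) normalized data, and flagging cold start users with the fewer-than-five rule. The function names, the dict-based sparse representation and the toy ratings are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import math

def l2_normalize(row):
    """Scale a sparse row (dict: item -> rating) to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in row.values()))
    if norm == 0.0:
        return dict(row)  # empty or all-zero row: nothing to normalize
    return {item: v / norm for item, v in row.items()}

def cold_start_users(counts, threshold=5):
    """Return the users with fewer than `threshold` interactions."""
    return {user for user, n in counts.items() if n < threshold}

# Hypothetical toy preference data: user -> {item: rating}.
ratings = {
    "u1": {"i1": 3.0, "i2": 4.0},
    "u2": {"i1": 5.0, "i2": 1.0, "i3": 2.0, "i4": 4.0, "i5": 3.0},
}

# Each normalized row keeps the direction of the ratings, not their scale.
directions = {user: l2_normalize(row) for user, row in ratings.items()}

# Number of ratings per user, then the cold start set in the preference matrix.
rating_counts = {user: len(row) for user, row in ratings.items()}
cold = cold_start_users(rating_counts)
```

The same `cold_start_users` helper could be applied to per-user counts of social relations to obtain the cold start users of the social network.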
References
Amatriain X, Castells P, de Vries A, Posse C (2012) Workshop on recommendation utility evaluation: beyond RMSE–RUE 2012. In: ACM conference on recommender systems (RecSys), pp 351–352
Banerjee A, Dhillon IS, Ghosh J, Sra S (2005) Clustering on the unit hypersphere using von Mises–Fisher distributions. J Mach Learn Res 6:1345–1382
Barbieri N, Manco G, Ritacco E (2014) Probabilistic approaches to recommendations. Synth Lect Data Min Knowl Discov 5(2):1–197
Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434
Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl Based Syst 46:109–132
Cai D, Mei Q, Han J, Zhai C (2008) Modeling hidden topics on document manifold. In: Proceedings of the ACM conference on information and knowledge management, pp 911–920
Chaney AJ, Blei DM, Eliassi-Rad T (2015) A probabilistic model for using social networks in personalized item recommendation. In: ACM conference on recommender systems (RecSys), pp 43–50
Cremonesi P, Koren Y, Turrin R (2010) Performance of recommender algorithms on top-n recommendation tasks. In: ACM conference on recommender systems (RecSys), pp 39–46
Delporte J, Karatzoglou A, Matuszczyk T, Canu S (2013) Socially enabled preference learning from implicit feedback data. In: Joint European conference on machine learning and knowledge discovery in databases (ECML PKDD), Springer, Berlin, pp 145–160
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39:1–38
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42(1–2):143–175
Dhillon IS, Mallela S, Modha DS (2003) Information-theoretic co-clustering. In: Proceedings of the ACM SIGKDD international conference on Knowledge discovery and data mining, pp 89–98
Gopal S, Yang Y (2014) Von Mises–Fisher clustering models. In: Proceedings of the international conference on machine learning (ICML), pp 154–162
Govaert G, Nadif M (2013) Co-Clustering. Wiley, New York
Govaert G, Nadif M (2016) Mutual information, phi-squared and model-based co-clustering for contingency tables. Adv Data Anal Classif. doi:10.1007/s11634-016-0274-6
Guo G, Zhang J, Yorke-Smith N (2013) A novel Bayesian similarity measure for recommender systems. In: Proceedings of the international joint conference on artificial intelligence (IJCAI), pp 2619–2625
Guo G, Zhang J, Thalmann D, Yorke-Smith N (2014) ETAF: an extended trust antecedents framework for trust prediction. In: IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 540–547
Guo G, Zhang J, Yorke-Smith N (2015) TrustSVD: collaborative filtering with both the explicit and implicit influence of user trust and of item ratings. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), pp 123–129
He X, Cai D, Shao Y, Bao H, Han J (2011) Laplacian regularized Gaussian mixture model for data clustering. IEEE Trans Knowl Data Eng (TKDE) 23(9):1406–1418
Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: ACM conference on recommender systems (RecSys), pp 135–142
Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 426–434
Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37
Le T, Lauw HW (2014) Semantic visualization for spherical representation. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 1007–1016
Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
Liu H, Hu Z, Mian A, Tian H, Zhu X (2014) A new user similarity model to improve the accuracy of collaborative filtering. Knowl Based Syst 56:156–166
Loiacono D, Lommatzsch A, Turrin R (2014) An analysis of the 2014 RecSys challenge. In: ACM conference on recommender systems (RecSys), p 1
Ma H, Yang H, Lyu MR, King I (2008) SoRec: social recommendation using probabilistic matrix factorization. In: Proceedings of the ACM international conference on information and knowledge management (CIKM), pp 931–940
Ma H, King I, Lyu MR (2009) Learning to recommend with social trust ensemble. In: Proceedings of the international ACM SIGIR conference on research and development in information retrieval, ACM, pp 203–210
Ma H, Zhou D, Liu C, Lyu MR, King I (2011) Recommender systems with social regularization. In: Proceedings of the ACM WSDM international conference on web search and data mining, pp 287–296
Mardia K, Jupp P (2009) Directional statistics. Wiley Series in Probability and Statistics. Wiley, New York
McLachlan G, Krishnan T (2007) The EM algorithm and extensions, vol 382. Wiley, New York
McLachlan G, Peel D (2004) Finite mixture models. Wiley, New York
Mei Q, Cai D, Zhang D, Zhai C (2008) Topic modeling with network regularization. In: Proceedings of the international conference on world wide web (WWW), pp 101–110
Nadif M, Govaert G (2010) Model-based co-clustering for continuous data. In: Proceedings of international conference on machine learning and applications (ICMLA), pp 175–180
Reisinger J, Waters A, Silverthorn B, Mooney RJ (2010) Spherical topic models. In: Proceedings of the international conference on machine learning (ICML), pp 903–910
Salah A, Rogovschi N, Nadif M (2016a) A dynamic collaborative filtering system via a weighted clustering approach. Neurocomputing 175:206–215
Salah A, Rogovschi N, Nadif M (2016b) Model-based co-clustering for high dimensional sparse data. In: Proceedings of the 19th international conference on artificial intelligence and statistics (AISTATS), pp 866–874
Salah A, Rogovschi N, Nadif M (2016c) Stochastic co-clustering for document-term data. In: Proceedings of the SIAM SDM international conference on data mining, pp 306–314
Salakhutdinov R, Mnih A (2008) Probabilistic matrix factorization. Adv Neural Inf Process Syst (NIPS) 20:1257–1264
Sarwar B, Karypis G, Konstan J, Riedl J (2000) Application of dimensionality reduction in recommender system-a case study. Technical Report, DTIC Document
Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proceedings of the international conference on world wide web (WWW), ACM, pp 285–295
Sra S (2012) A short note on parameter approximation for von Mises–Fisher distributions: and a fast implementation of \(I_s(x)\). Comput Stat 27(1):177–190
Tanabe A, Fukumizu K, Oba S, Takenouchi T, Ishii S (2007) Parameter estimation for von Mises–Fisher distributions. Comput Stat 22(1):145–157
Tang J, Gao H, Liu H (2012) mTrust: discerning multi-faceted trust in a connected world. In: Proceedings of the ACM WSDM international conference on web search and data mining, pp 93–102
Ungar LH, Foster DP (1998) Clustering methods for collaborative filtering. AAAI workshop on recommendation systems, vol 1, pp 114–129
Yang B, Lei Y, Liu D, Liu J (2013) Social collaborative filtering by trust. In: Proceedings of the international joint conference on artificial intelligence (IJCAI), pp 2747–2753
Zhu X, Lafferty J (2005) Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning. In: Proceedings of the international conference on machine learning (ICML), pp 1052–1059
Additional information
Responsible editors: Kurt Driessens, Dragi Kocev, Marko Robnik Šikonja and Myra Spiliopoulou.
About this article
Cite this article
Salah, A., Nadif, M. Social regularized von Mises–Fisher mixture model for item recommendation. Data Min Knowl Disc 31, 1218–1241 (2017). https://doi.org/10.1007/s10618-017-0499-9
DOI: https://doi.org/10.1007/s10618-017-0499-9