Information Systems Frontiers, Volume 20, Issue 1, pp 111–124

Leveraging clustering to improve collaborative filtering

  • Nima Mirbakhsh
  • Charles X. Ling


Extensive work on matrix factorization (MF) techniques has been done recently, as they provide accurate rating prediction models in recommendation systems. Extensions such as neighbour-aware models have been shown to improve rating prediction further. However, these models often suffer from long computation times. In this paper, we propose a novel method that applies clustering algorithms to the latent vectors of users and items. Our method can capture the common interests between clusters of users and clusters of items in a latent space. These cluster-to-cluster interests form a cluster-level rating matrix, to which a matrix factorization technique is then applied to predict future cluster-level interests. We then aggregate the traditional user-item rating predictions with our cluster-level rating predictions to improve rating prediction accuracy. Our method is a general “wrapper” that can be applied to all collaborative filtering methods. In our experiments, we show that our approach, when applied to a variety of existing matrix factorization techniques, improves their rating predictions and also yields better rating predictions for cold-start users. Above all, we show that higher-quality clusters, and a larger number of them, yield better rating prediction accuracy.
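The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm is used, how the cluster-level rating matrix is built, or how the two predictions are aggregated. The sketch therefore assumes k-means on the latent vectors, a cluster-level rating taken as the mean of the observed ratings between a user cluster and an item cluster, and a fixed blending weight `alpha`. All function names, the toy ratings, and the hyperparameters are hypothetical.

```python
import numpy as np

def factorize(triples, n_rows, n_cols, k=4, epochs=200, lr=0.02, reg=0.05, seed=0):
    """Plain SGD matrix factorization on (row, col, rating) triples."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_rows, k))   # row (user) latent vectors
    Q = 0.1 * rng.standard_normal((n_cols, k))   # column (item) latent vectors
    mu = float(np.mean([r for _, _, r in triples]))  # global mean rating
    for _ in range(epochs):
        for u, i, r in triples:
            pu = P[u].copy()
            err = r - (mu + pu @ Q[i])
            P[u] += lr * (err * Q[i] - reg * pu)
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, P, Q

def kmeans(X, n_clusters, iters=20, seed=0):
    """Minimal Lloyd's k-means (an assumed choice); one label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):              # leave empty clusters unchanged
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def cluster_level_triples(triples, user_labels, item_labels):
    """Assumption: a cluster-level rating is the mean observed rating
    between a user cluster and an item cluster."""
    sums, counts = {}, {}
    for u, i, r in triples:
        key = (int(user_labels[u]), int(item_labels[i]))
        sums[key] = sums.get(key, 0.0) + r
        counts[key] = counts.get(key, 0) + 1
    return [(a, b, sums[(a, b)] / counts[(a, b)]) for (a, b) in sums]

def blended_prediction(u, i, user_model, cluster_model,
                       user_labels, item_labels, alpha=0.7):
    """Assumption: aggregate the two predictions with a fixed weight alpha."""
    mu, P, Q = user_model
    mu_c, Pc, Qc = cluster_model
    user_item = mu + P[u] @ Q[i]
    cluster = mu_c + Pc[user_labels[u]] @ Qc[item_labels[i]]
    return alpha * user_item + (1 - alpha) * cluster

# Toy data: (user, item, rating) triples for 6 users and 5 items.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 5), (2, 1, 5), (2, 2, 4),
           (3, 3, 2), (3, 4, 1), (4, 3, 1), (4, 4, 2), (5, 3, 2), (5, 0, 3)]

user_model = factorize(ratings, n_rows=6, n_cols=5)        # base user-item MF
user_labels = kmeans(user_model[1], n_clusters=2)          # cluster user vectors
item_labels = kmeans(user_model[2], n_clusters=2)          # cluster item vectors
cluster_ratings = cluster_level_triples(ratings, user_labels, item_labels)
cluster_model = factorize(cluster_ratings, n_rows=2, n_cols=2, k=2)

pred = blended_prediction(0, 2, user_model, cluster_model, user_labels, item_labels)
print(f"blended prediction for user 0, item 2: {pred:.3f}")
```

Because the cluster-level model only consumes the latent vectors and the observed ratings, the base `factorize` step could in principle be swapped for any other factorization model, which is the sense in which the abstract calls the method a "wrapper". With `alpha=1.0` the sketch reduces to the base user-item prediction.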


Collaborative filtering; Recommendation system; Matrix factorization



This work was made possible by the facilities of the Shared Hierarchical Academic Research Computing Network (SHARCNET) and Compute/Calcul Canada. The authors would like to thank the reviewers of the 2013 ACM Conference on Recommender Systems (RecSys’13) for their valuable comments.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Western University, London, Canada
