Data Mining Methods for Recommender Systems

Amatriain, Xavier; Jaimes*, Alejandro; Oliver, Nuria; Pujol, Josep M.

doi:10.1007/978-0-387-85820-3_2

Abstract

In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.

*Work on the chapter was performed while the author was at Telefonica Research

Acknowledgments

This chapter has been written with partial support of an ICREA grant from the Generalitat de Catalunya.

Author information

Authors and Affiliations

Telefonica Research, Via Augusta, 122, Barcelona, 08021, Spain
Xavier Amatriain, Nuria Oliver & Josep M. Pujol
Yahoo! Research, Av. Diagonal, 177, Barcelona, 08018, Spain
Alejandro Jaimes*

Corresponding author

Correspondence to Xavier Amatriain.

Editor information

Editors and Affiliations

Faculty of Computer Science, Free University of Bozen-Bolzano, Piazza Domenicani 3, Bolzano, 39100, Italy
Francesco Ricci
Dept. Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
Lior Rokach
Dept. Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Bracha Shapira
School of Communication, Information & Library Studies, Rutgers University, Huntington Street 4, New Brunswick, 08901-1071, New Jersey, USA
Paul B. Kantor

About this chapter

Cite this chapter

Amatriain, X., Jaimes*, A., Oliver, N., Pujol, J.M. (2011). Data Mining Methods for Recommender Systems. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P. (eds) Recommender Systems Handbook. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-85820-3_2

DOI: https://doi.org/10.1007/978-0-387-85820-3_2
Published: 05 October 2010
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-85819-7
Online ISBN: 978-0-387-85820-3
eBook Packages: Computer Science, Computer Science (R0)