Evaluating Strategies for Selecting Test Datasets in Recommender Systems

Pajuelo-Holguera, Francisco; Gómez-Pulido, Juan A.; Ortega, Fernando

doi:10.1007/978-3-030-29859-3_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11734)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

Recommender systems based on collaborative filtering are widely used to predict users’ behaviour in large databases where users rate items. The prediction model is built from a training dataset using a matrix factorization method and validated against a test dataset in order to measure the prediction error. Random selection is the simplest and most intuitive way to build test datasets. Nevertheless, deterministic methods could instead be used to select test ratings uniformly across the database, in order to obtain a balanced contribution from all users and items. In this paper, we perform several experiments validating recommender systems with random and deterministic strategies for selecting test datasets. We consider a zigzag deterministic strategy that selects ratings uniformly across the rows and columns of the ratings matrix, following a diagonal path. After analysing the statistical results, we conclude that the deterministic strategy offers no particular advantage.
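
The abstract does not spell out the exact zigzag traversal, so the following Python sketch is only one possible reading of the two strategies, not the authors' implementation. It assumes ratings are stored as (user, item, rating) tuples with integer indices; the function names `random_split` and `diagonal_split`, and the choice to walk anti-diagonals in alternating directions and take evenly spaced ratings along that path, are assumptions made for illustration.

```python
import random


def random_split(ratings, test_fraction, seed=0):
    """Random strategy: draw the test ratings uniformly at random."""
    rng = random.Random(seed)
    n_test = round(len(ratings) * test_fraction)
    test_idx = set(rng.sample(range(len(ratings)), n_test))
    train = [r for k, r in enumerate(ratings) if k not in test_idx]
    test = [r for k, r in enumerate(ratings) if k in test_idx]
    return train, test


def diagonal_split(ratings, test_fraction):
    """Deterministic strategy (assumed reading of the zigzag idea).

    Known ratings are ordered along the anti-diagonals of the ratings
    matrix, alternating direction on each diagonal, and evenly spaced
    positions on that path are taken as the test set.
    """
    def zigzag_key(r):
        user, item, _ = r
        d = user + item  # index of the anti-diagonal
        # Even diagonals are walked by increasing user, odd ones by decreasing user.
        return (d, user if d % 2 == 0 else -user)

    ordered = sorted(ratings, key=zigzag_key)
    n_test = round(len(ordered) * test_fraction)
    if n_test == 0:
        return ordered, []
    step = len(ordered) / n_test  # spacing between selected positions
    test_idx = {int(k * step) for k in range(n_test)}
    train = [r for k, r in enumerate(ordered) if k not in test_idx]
    test = [r for k, r in enumerate(ordered) if k in test_idx]
    return train, test


if __name__ == "__main__":
    # Toy dense matrix: 4 users x 5 items, made-up rating values.
    toy = [(u, i, 1.0 + (u + i) % 5) for u in range(4) for i in range(5)]
    train, test = diagonal_split(toy, test_fraction=0.25)
    print(len(train), len(test))
```

Under this reading, the deterministic split always returns the same test set for a given matrix, whereas the random split varies with the seed, which is why the random strategy is typically repeated over several seeds before comparing prediction errors.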

Acknowledgments

This work was partially funded by the Government of Extremadura under the project IB16002, and by the AEI (State Research Agency, Spain) and the ERDF (European Regional Development Fund, EU) under the contract TIN2016-76259-P.

Author information

Authors and Affiliations

Department Technologies of Computers and Communications, Universidad de Extremadura, 10003, Caceres, Spain
Francisco Pajuelo-Holguera & Juan A. Gómez-Pulido
Department Sistemas Informáticos, ETSI Sistemas Informáticos, Universidad Politécnica de Madrid, Madrid, Spain
Fernando Ortega

Corresponding author

Correspondence to Juan A. Gómez-Pulido.

Editor information

Editors and Affiliations

University of León, León, Spain
Hilde Pérez García
University of León, León, Spain
Lidia Sánchez González
University of León, León, Spain
Manuel Castejón Limas
University of A Coruña, Ferrol, Spain
Héctor Quintián Pardo
University of Salamanca, Salamanca, Spain
Emilio Corchado Rodríguez

About this paper

Cite this paper

Pajuelo-Holguera, F., Gómez-Pulido, J.A., Ortega, F. (2019). Evaluating Strategies for Selecting Test Datasets in Recommender Systems. In: Pérez García, H., Sánchez González, L., Castejón Limas, M., Quintián Pardo, H., Corchado Rodríguez, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2019. Lecture Notes in Computer Science, vol 11734. Springer, Cham. https://doi.org/10.1007/978-3-030-29859-3_21

DOI: https://doi.org/10.1007/978-3-030-29859-3_21
Published: 26 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29858-6
Online ISBN: 978-3-030-29859-3
eBook Packages: Computer Science, Computer Science (R0)
