Abstract
Recommender systems based on collaborative filtering are widely used to predict users’ behaviour in large databases where users rate items. The prediction model is built from a training dataset using a matrix factorization method and validated against a test dataset in order to measure the prediction error. Random selection is the simplest and most intuitive way to build test datasets. Nevertheless, deterministic methods could instead select test ratings uniformly across the database, in order to obtain a balanced contribution from all users and items. In this paper, we perform several experiments that validate recommender systems using random and deterministic strategies to select test datasets. We consider a zigzag deterministic strategy that selects ratings uniformly across the rows and columns of the ratings matrix, following a diagonal path. After analysing the statistical results, we conclude that the deterministic strategy offers no particular advantage.
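The two selection strategies contrasted in the abstract can be sketched as follows. This is only an illustrative reading, not the authors' implementation: the abstract does not specify the exact diagonal path, so the wrapped-diagonal traversal with alternating direction, the "every k-th rating" rule, and the names `random_split` and `zigzag_split` are all assumptions made for this example.

```python
import random


def random_split(ratings, test_fraction, seed=0):
    """Random strategy: shuffle the known ratings and cut off a test share."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def zigzag_split(ratings, n_users, n_items, test_fraction):
    """Deterministic strategy (assumed form): walk the ratings matrix along
    wrapped diagonals, reversing direction on every other diagonal, and send
    every k-th known rating to the test set (requires test_fraction > 0)."""
    known = {(u, i): r for u, i, r in ratings}
    step = max(1, round(1 / test_fraction))  # k: one test rating per k known ones
    train, test = [], []
    seen = 0
    for d in range(n_items):  # d is the diagonal offset
        # Alternate the walking direction to obtain a zigzag path.
        users = range(n_users) if d % 2 == 0 else range(n_users - 1, -1, -1)
        for u in users:
            i = (u + d) % n_items  # column of the cell on diagonal d in row u
            if (u, i) not in known:
                continue  # unrated cell: nothing to select
            triple = (u, i, known[(u, i)])
            (test if seen % step == 0 else train).append(triple)
            seen += 1
    return train, test


# Toy data: 4 users x 5 items, every cell rated (values are made up).
ratings = [(u, i, 1 + (u + i) % 5) for u in range(4) for i in range(5)]
rand_train, rand_test = random_split(ratings, test_fraction=0.2)
zz_train, zz_test = zigzag_split(ratings, n_users=4, n_items=5, test_fraction=0.2)
```

Both functions return a `(train, test)` pair of `(user, item, rating)` triples, so either split can be passed to the same matrix factorization training and error measurement code when comparing the strategies.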
Acknowledgments
This work was partially funded by the Government of Extremadura under the project IB16002, and by the AEI (State Research Agency, Spain) and the ERDF (European Regional Development Fund, EU) under the contract TIN2016-76259-P.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Pajuelo-Holguera, F., Gómez-Pulido, J.A., Ortega, F. (2019). Evaluating Strategies for Selecting Test Datasets in Recommender Systems. In: Pérez García, H., Sánchez González, L., Castejón Limas, M., Quintián Pardo, H., Corchado Rodríguez, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2019. Lecture Notes in Computer Science, vol 11734. Springer, Cham. https://doi.org/10.1007/978-3-030-29859-3_21
DOI: https://doi.org/10.1007/978-3-030-29859-3_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29858-6
Online ISBN: 978-3-030-29859-3
eBook Packages: Computer Science, Computer Science (R0)