Abstract
Data sparsity is traditionally assumed to be a major problem for the user-based collaborative filtering algorithm. However, that analysis considers only data quantity and ignores data quality, which is an important characteristic of data. Because sparse but high-quality data may benefit the algorithm, the analysis is one-sided. In this paper, the effects of training ratings with different levels of sparsity on recommendation quality are first investigated on a real-world dataset. Preliminary experimental results show that data sparsity can have positive effects on both recommendation accuracy and coverage. Next, a measurement of data noise is introduced. Then, taking data noise into consideration, the effects of data sparsity on the recommendation quality of the algorithm are re-evaluated. Experimental results show that if sparsity implies high data quality (low noise), then sparsity is good for both recommendation accuracy and coverage. This result shows that the traditional analysis of the effect of data sparsity is one-sided, and implies that recommendation quality can be improved substantially by choosing high-quality data.
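The quantities named in the abstract (sparsity level, recommendation accuracy, and coverage) can be illustrated with a minimal sketch of user-based collaborative filtering. Everything specific here is an assumption made for illustration and is not taken from the paper: missing ratings are encoded as NaN, similarity is Pearson correlation over co-rated items, predictions are mean-centred over the top-k positively correlated neighbours, accuracy is reported as mean absolute error (MAE), and the toy matrix `R` is invented. The paper's noise measurement is not described in the abstract, so it is not sketched.

```python
import numpy as np

def sparsity(R):
    """Fraction of user-item cells that hold no rating (NaN marks missing)."""
    return float(np.isnan(R).mean())

def pearson(u, v):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    both = ~np.isnan(u) & ~np.isnan(v)
    if both.sum() < 2:
        return 0.0
    a = u[both] - u[both].mean()
    b = v[both] - v[both].mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def predict(R, user, item, k=2):
    """Mean-centred user-based prediction; None if no neighbour can be used."""
    candidates = []
    for other in range(R.shape[0]):
        if other == user or np.isnan(R[other, item]):
            continue
        s = pearson(R[user], R[other])
        if s > 0:  # assumption: keep only positively correlated neighbours
            candidates.append((s, other))
    if not candidates:
        return None
    candidates.sort(reverse=True)
    top = candidates[:k]
    num = sum(s * (R[o, item] - np.nanmean(R[o])) for s, o in top)
    den = sum(s for s, _ in top)
    return float(np.nanmean(R[user]) + num / den)

def evaluate(R_train, test):
    """Return (MAE, coverage) over held-out (user, item, rating) triples."""
    errors = []
    for u, i, r in test:
        p = predict(R_train, u, i)
        if p is not None:
            errors.append(abs(p - r))
    coverage = len(errors) / len(test)
    mae = float(np.mean(errors)) if errors else float("nan")
    return mae, coverage

# Invented toy data: 4 users x 4 items, NaN = unrated.
nan = np.nan
R = np.array([
    [5.0, 4.0, nan, 1.0],
    [5.0, 4.0, 2.0, nan],
    [1.0, 2.0, 5.0, 4.0],
    [nan, nan, nan, 3.0],
])
held_out = [(0, 2, 2.0), (3, 0, 4.0)]
mae, coverage = evaluate(R, held_out)
```

In this sketch coverage is the fraction of held-out ratings for which a prediction can be produced at all, so a user with too few ratings to correlate with anyone lowers coverage without affecting MAE. Repeating the evaluation on training matrices of different sparsity is one way to reproduce the kind of comparison the abstract describes.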
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, B., Li, Z., Chao, W. (2012). When Sparsity Meets Noise in Collaborative Filtering. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds) Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7235. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29253-8_54
DOI: https://doi.org/10.1007/978-3-642-29253-8_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29252-1
Online ISBN: 978-3-642-29253-8
eBook Packages: Computer Science, Computer Science (R0)