When Sparsity Meets Noise in Collaborative Filtering

  • Conference paper
Web Technologies and Applications (APWeb 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7235)


Abstract

Data sparsity is traditionally assumed to be a major problem for the user-based collaborative filtering algorithm. However, that analysis considers only data quantity and ignores data quality, which is an important characteristic of data. Because sparse but high-quality data may suit the algorithm well, the analysis is one-sided. In this paper, the effects of training ratings with different levels of sparsity on recommendation quality are first investigated on a real-world dataset. Preliminary experimental results show that data sparsity can have positive effects on both recommendation accuracy and coverage. Next, a measurement of data noise is introduced. Then, taking data noise into consideration, the effects of data sparsity on the recommendation quality of the algorithm are re-evaluated. Experimental results show that if sparsity implies high data quality (low noise), then sparsity is good for both recommendation accuracy and coverage. This result shows that the traditional analysis of the effect of data sparsity is one-sided, and it implies that recommendation quality can be improved substantially by choosing high-quality data.
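To make the quantities in the abstract concrete, the following is a minimal sketch of how sparsity, accuracy, and coverage could be measured for a user-based collaborative filtering predictor. It is not the authors' implementation: the Pearson similarity, the positive-neighbour rule, the use of mean absolute error as the accuracy metric, the function names, and the toy ratings are all assumptions made for illustration. The paper's noise measurement is not described in the abstract and is deliberately left out.

```python
from math import sqrt

def sparsity(ratings, n_users, n_items):
    """Fraction of the user-item matrix that has no rating."""
    observed = sum(len(items) for items in ratings.values())
    return 1.0 - observed / (n_users * n_items)

def pearson(ra, rb):
    """Pearson correlation over co-rated items; 0.0 when undefined."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common)) * \
          sqrt(sum((rb[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0

def predict(train, user, item):
    """Mean-centred weighted prediction; None if no usable neighbour."""
    if user not in train or not train[user]:
        return None
    mean_u = sum(train[user].values()) / len(train[user])
    num = den = 0.0
    for other, r in train.items():
        if other == user or item not in r:
            continue
        w = pearson(train[user], r)
        if w <= 0:  # assumption: only positively correlated neighbours count
            continue
        mean_o = sum(r.values()) / len(r)
        num += w * (r[item] - mean_o)
        den += w
    return mean_u + num / den if den else None

def evaluate(train, test):
    """Return (MAE, coverage) over (user, item, rating) test triples."""
    errors = []
    for user, item, rating in test:
        p = predict(train, user, item)
        if p is not None:
            errors.append(abs(p - rating))
    coverage = len(errors) / len(test) if test else 0.0
    mae = sum(errors) / len(errors) if errors else None
    return mae, coverage

# Hypothetical toy data: 3 users, 4 items, 9 observed ratings.
train = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "b": {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "c": {"i1": 1, "i2": 5},
}
test = [("a", "i4", 5), ("c", "i3", 4)]

print(sparsity(train, n_users=3, n_items=4))
print(evaluate(train, test))
```

In this toy run, coverage falls below 1 because user "c" has no positively correlated neighbour who rated the target item. Removing ratings from the training set changes both the neighbour sets and the similarity weights, which is why accuracy and coverage need to be measured together when sparsity varies.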

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hu, B., Li, Z., Chao, W. (2012). When Sparsity Meets Noise in Collaborative Filtering. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds) Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7235. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29253-8_54

  • DOI: https://doi.org/10.1007/978-3-642-29253-8_54

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29252-1

  • Online ISBN: 978-3-642-29253-8

  • eBook Packages: Computer Science, Computer Science (R0)
