Differential Trust Propagation with Community Discovery for Link-Based Web Spam Demotion

  • Xianchao Zhang
  • Yafei Feng
  • Hua Shen
  • Wenxin Liang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9098)

Abstract

In this paper, we propose a novel differential trust propagation scheme with community discovery, which can be applied to all kinds of trust propagation algorithms. We first use a random walk-based community discovery algorithm to preselect suspicious communities whose members are almost all spam pages. We then use these suspicious communities to limit trust propagation across community boundaries. Experimental results on WEBSPAM-UK2007 and ClueWeb09 demonstrate that the proposed penalizing scheme significantly improves the performance of trust propagation algorithms such as TrustRank, LCRank, and CPV.
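
The idea of limiting trust that crosses into a suspicious community can be illustrated with a small sketch. This is not the authors' algorithm: it is a plain TrustRank-style iteration in which any edge entering a suspicious community from outside has its trust share scaled down. The function name `propagate_trust`, the `community_of` mapping, and the `penalty` factor are all hypothetical, and the suspicious communities are assumed to be given rather than discovered by a random walk.

```python
def propagate_trust(out_links, seeds, community_of, penalty=0.1,
                    damping=0.85, iterations=50):
    """TrustRank-style propagation with a penalty on boundary-crossing edges.

    out_links:    dict mapping each page to a list of pages it links to
    seeds:        set of trusted seed pages
    community_of: dict mapping pages in suspicious communities to a
                  community id (assumed input; pages not listed are treated
                  as outside every suspicious community)
    penalty:      hypothetical factor applied to trust entering a suspicious
                  community from outside it
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    # Trust is injected only at the seed pages, as in TrustRank.
    seed_score = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    trust = dict(seed_score)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * seed_score[n] for n in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * trust[src] / len(targets)
            for dst in targets:
                dst_comm = community_of.get(dst)
                # An edge crosses a boundary when it ends in a suspicious
                # community that the source page does not belong to.
                crosses = (dst_comm is not None
                           and community_of.get(src) != dst_comm)
                nxt[dst] += share * (penalty if crosses else 1.0)
        trust = nxt
    return trust


# Toy graph: seed "g" links to a normal page "a" and into the
# suspicious community {"s1", "s2"}, whose members link to each other.
graph = {"g": ["a", "s1"], "s1": ["s2"], "s2": ["s1"]}
suspicious = {"s1": 0, "s2": 0}
scores = propagate_trust(graph, {"g"}, suspicious)
```

In this toy graph the penalty lowers the trust reaching `s1` and `s2` while leaving the score of `a` unchanged, because only the edge from `g` into the suspicious community is scaled. Trust removed by the penalty is simply dropped in this sketch rather than redistributed.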

Keywords

Web spam; Community discovery; Differential trust propagation

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Xianchao Zhang (1)
  • Yafei Feng (1)
  • Hua Shen (1)
  • Wenxin Liang (1)
  1. School of Software, Dalian University of Technology, Dalian, China