Antispam Topic Crawler Algorithm Based on Anti Spoofing

  • Conference paper
Informatics and Management Science I

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 204)

Abstract

Current crawler systems lack Web spam detection capability, which is the primary obstacle to further improving their performance. To fill this gap, a topic crawler algorithm based on anti-spoofing is proposed. The design goal of a topic crawler is to gather as many subject-relevant pages as possible with limited resources while minimizing the likelihood of downloading irrelevant pages. The algorithm gives topic crawlers an anti-spam function, improves the relevance of the pages they download, and enhances the crawlers' adaptability. The algorithm's effectiveness has been verified by experiments.
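The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the stated design goal, not the chapter's method. It assumes a best-first crawl over an in-memory link graph with a fixed download budget. The function name `crawl`, the `relevance` and `spam_score` lookups, and the `spam_threshold` parameter are all hypothetical stand-ins for whatever relevance estimator and spam detector a real topic crawler would use.

```python
import heapq


def crawl(seeds, links, relevance, spam_score, budget, spam_threshold=0.5):
    """Best-first topic crawl with a spam filter (illustrative sketch only).

    links:       dict url -> list of outgoing urls (stands in for fetch + parse)
    relevance:   dict url -> estimated topical relevance in [0, 1] (assumed given)
    spam_score:  dict url -> estimated spam likelihood in [0, 1] (assumed given)
    budget:      maximum number of pages to download (the "limited resources")
    """
    # heapq is a min-heap, so negate relevance to pop the most relevant URL first.
    frontier = [(-relevance.get(url, 0.0), url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    downloaded = []

    while frontier and len(downloaded) < budget:
        _, url = heapq.heappop(frontier)
        # Suspected spam is neither downloaded nor expanded, so pages that are
        # reachable only through it never enter the frontier.
        if spam_score.get(url, 0.0) >= spam_threshold:
            continue
        downloaded.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-relevance.get(target, 0.0), target))

    return downloaded
```

In this sketch the budget caps the number of downloads, the priority queue spends that budget on the most relevant candidates first, and the threshold check keeps a page with a deceptively high relevance estimate from consuming the budget or steering the crawl.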

Author information

Corresponding author

Correspondence to Xiaoqiang Jia.

Copyright information

© 2013 Springer-Verlag London

About this paper

Cite this paper

Jia, X. (2013). Antispam Topic Crawler Algorithm Based on Anti Spoofing. In: Du, W. (eds) Informatics and Management Science I. Lecture Notes in Electrical Engineering, vol 204. Springer, London. https://doi.org/10.1007/978-1-4471-4802-9_20

  • DOI: https://doi.org/10.1007/978-1-4471-4802-9_20

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4801-2

  • Online ISBN: 978-1-4471-4802-9

  • eBook Packages: Engineering, Engineering (R0)
