Web Crawler Architecture

  • Reference work entry
Encyclopedia of Database Systems

Recommended Reading

  1. Boldi P, Codenotti B, Santini M, Vigna S. UbiCrawler: a scalable fully distributed web crawler. Software Pract Exper. 2004;34(8):711–26.

  2. Brin S, Page L. The anatomy of a large-scale hypertextual search engine. In: Proceedings of the 7th International World Wide Web Conference; 1998. p. 107–17.

  3. Burner M. Crawling towards eternity: building an archive of the world wide web. Web Tech Mag. 1997;2(5):37–40.

  4. Cho J, Garcia-Molina H. Parallel crawlers. In: Proceedings of the 11th International World Wide Web Conference; 2002. p. 124–35.

  5. Eichmann D. The RBSE spider – balancing effective search against web load. In: Proceedings of the 3rd International World Wide Web Conference; 1994.

  6. Gray M. Internet growth and statistics: credits and background. http://www.mit.edu/people/mkgray/net/background.html

  7. Hafri Y, Djeraba C. High performance crawling system. In: Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval; 2004. p. 299–306.

  8. Heydon A, Najork M. Mercator: a scalable, extensible web crawler. World Wide Web. 1999;2(4):219–29.

  9. Najork M, Heydon A. High-performance web crawling. Compaq SRC Research Report 173, Sept 2001.

  10. Raghavan S, Garcia-Molina H. Crawling the hidden web. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 129–38.

  11. Shkapenyuk V, Suel T. Design and implementation of a high-performance distributed web crawler. In: Proceedings of the 18th International Conference on Data Engineering; 2002. p. 357–68.

Author information

Correspondence to Marc Najork.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Najork, M. (2018). Web Crawler Architecture. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_457