Crawlets: Agents for High Performance Web Search Engines

  • Conference paper
Mobile Agents (MA 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2240)

Abstract

Some of the reasons for unsatisfactory performance of today’s search engines are their centralized approach to web crawling and lack of explicit support from web servers. We propose a modification to conventional crawling in which a search engine uploads simple agents, called crawlets, to web sites. A crawlet crawls pages at a site locally and sends a compact summary back to the search engine. This not only reduces bandwidth requirements and network latencies, but also parallelizes crawling. Crawlets also provide an effective means for achieving the performance gains of personalized web servers, and can make up for the lack of cooperation from conventional web servers. The specialized nature of crawlets allows simple solutions to security and resource control problems, and reduces software requirements at participating web sites. In fact, we propose an implementation that requires no changes to web servers, but only the installation of a few (active) web pages at host sites.
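The crawl-locally, summarize, and report idea can be illustrated with a minimal sketch. It makes no claim to reflect the authors' implementation: the site is modeled as an in-memory dictionary of paths to HTML, and the names `run_crawlet`, `site`, `top_terms`, and the summary fields `digest` and `terms` are all hypothetical choices made for illustration only.

```python
import hashlib
import json
from collections import Counter, deque
from html.parser import HTMLParser


class _PageParser(HTMLParser):
    """Collects site-relative links and alphabetic words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Only site-relative links are followed, so the crawl stays local.
            if href and href.startswith("/"):
                self.links.append(href)

    def handle_data(self, data):
        self.words.extend(w.lower() for w in data.split() if w.isalpha())


def run_crawlet(site, start="/", top_terms=3):
    """Crawl `site` (assumed: a dict of path -> HTML) breadth-first and
    return a compact per-page summary instead of the raw pages."""
    summary, seen, queue = {}, {start}, deque([start])
    while queue:
        path = queue.popleft()
        html = site.get(path)
        if html is None:  # dangling link: nothing to summarize
            continue
        parser = _PageParser()
        parser.feed(html)
        summary[path] = {
            # Short content digest, usable for detecting changed pages.
            "digest": hashlib.sha256(html.encode("utf-8")).hexdigest()[:16],
            # Most frequent words on the page.
            "terms": [t for t, _ in Counter(parser.words).most_common(top_terms)],
        }
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return summary


# Toy two-page site standing in for the host's local document tree.
site = {
    "/": '<html><body><a href="/a">agents</a> mobile agents crawl</body></html>',
    "/a": '<html><body><a href="/">home</a> <a href="/missing">x</a> '
          'search search engines</body></html>',
}
summary = run_crawlet(site)
payload = json.dumps(summary)  # the only data that would leave the site
```

In this sketch only `payload` would cross the network. A real crawlet would read pages from the host's file system or local web server, and would run under the security and resource controls the abstract mentions, which are not modeled here.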

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thati, P., Chang, P.-H., Agha, G. (2001). Crawlets: Agents for High Performance Web Search Engines. In: Picco, G.P. (ed.) Mobile Agents. MA 2001. Lecture Notes in Computer Science, vol 2240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45647-3_9

  • DOI: https://doi.org/10.1007/3-540-45647-3_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42952-4

  • Online ISBN: 978-3-540-45647-6

  • eBook Packages: Springer Book Archive
