Abstract
Some of the reasons for unsatisfactory performance of today’s search engines are their centralized approach to web crawling and lack of explicit support from web servers. We propose a modification to conventional crawling in which a search engine uploads simple agents, called crawlets, to web sites. A crawlet crawls pages at a site locally and sends a compact summary back to the search engine. This not only reduces bandwidth requirements and network latencies, but also parallelizes crawling. Crawlets also provide an effective means for achieving the performance gains of personalized web servers, and can make up for the lack of cooperation from conventional web servers. The specialized nature of crawlets allows simple solutions to security and resource control problems, and reduces software requirements at participating web sites. In fact, we propose an implementation that requires no changes to web servers, but only the installation of a few (active) web pages at host sites.
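The crawl-locally-and-summarize idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the record fields (path, size, SHA-1 digest), and the JSON-plus-zlib encoding are all assumptions made for illustration, and the upload of the crawlet and the transfer of the summary back to the search engine are not shown.

```python
import hashlib
import json
import os
import zlib


def summarize_site(doc_root):
    """Walk a local document root and describe each HTML page.

    Hypothetical summary format: one record per page holding its
    site-relative path, its size in bytes, and a SHA-1 content digest.
    """
    records = []
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            if not name.endswith((".html", ".htm")):
                continue  # this sketch only summarizes HTML pages
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                body = f.read()
            rel = os.path.relpath(path, doc_root).replace(os.sep, "/")
            records.append({
                "path": rel,
                "size": len(body),
                "sha1": hashlib.sha1(body).hexdigest(),
            })
    return records


def pack_summary(records):
    """Serialize and compress the records for the trip back."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return zlib.compress(payload)


def unpack_summary(blob):
    """Inverse of pack_summary, as the search engine side might run it."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

In this sketch only the packed summary crosses the network, so the search engine could compare digests against its index and fetch just the pages that changed. Whether the crawlets in the paper work this way is not stated in the abstract.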
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thati, P., Chang, P.-H., Agha, G. (2001). Crawlets: Agents for High Performance Web Search Engines. In: Picco, G.P. (ed.) Mobile Agents. MA 2001. Lecture Notes in Computer Science, vol 2240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45647-3_9
DOI: https://doi.org/10.1007/3-540-45647-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42952-4
Online ISBN: 978-3-540-45647-6
eBook Packages: Springer Book Archive