Efficient Peer-to-Peer Keyword Searching
The recent file storage applications built on top of peer-to-peer distributed hash tables lack search capabilities. We believe that search is an important part of any document publication system. To that end, we have designed and analyzed a distributed search engine based on a distributed hash table. Our simulation results predict that our search engine can answer an average query in under one second, using under one kilobyte of bandwidth.
Keywords: search, distributed hash table, peer-to-peer, Bloom filter, caching