Content Location in Peer-to-Peer Systems: Exploiting Locality

Chapter in: Web Content Delivery

Abstract

Efficient content location is a fundamental problem for decentralized peer-to-peer systems. Gnutella, a popular file-sharing application, relies on flooding queries to all peers. Although flooding is simple and robust, it is not scalable. In this chapter, we explore how to retain the simplicity of Gnutella while addressing its inherent weakness: scalability. We propose two complementary content location solutions that exploit locality to improve scalability. First, we look at temporal locality and find that the popularity of search strings follows a Zipf-like distribution. Caching query results to exploit temporal locality can reduce the amount of traffic seen on the network by a factor of 3 while using only a few megabytes of memory. As our second solution, we exploit a simple yet powerful principle called interest-based locality, which posits that if a peer has a particular piece of content that one is interested in, it is very likely to have other items that one is interested in as well. We propose that peers loosely organize themselves into an interest-based structure on top of the existing Gnutella network. With our algorithm, called interest-based shortcuts, a significant amount of flooding can be avoided, reducing the total load in the system by a factor of 3 to 7 and reducing the time to locate content to only one peer-to-peer hop. We demonstrate the existence of both types of locality and evaluate our solutions using traces of several different content distribution systems, such as the Web and popular peer-to-peer file-sharing applications.
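
The interest-based shortcut idea described in the abstract can be illustrated with a minimal Python sketch. This is not the chapter's implementation. The class and function names (Peer, flood, locate), the shortcut list size of 10, the most-recent-first ordering, and the model of flooding as "contact every other peer" are all assumptions made for illustration. The sketch only shows the general shape: try shortcuts first, fall back to flooding, and remember whichever peer supplied the content.

```python
from collections import OrderedDict


class Peer:
    """A hypothetical peer holding some content and a bounded shortcut list."""

    def __init__(self, name, files, max_shortcuts=10):
        self.name = name
        self.files = set(files)
        self.max_shortcuts = max_shortcuts  # assumed bound, not from the source
        self.shortcuts = OrderedDict()      # name -> Peer, most recent last

    def has(self, item):
        return item in self.files

    def add_shortcut(self, peer):
        # Move the peer to the most-recent position, evicting the oldest
        # entry when the list is full (an assumed replacement policy).
        self.shortcuts.pop(peer.name, None)
        self.shortcuts[peer.name] = peer
        while len(self.shortcuts) > self.max_shortcuts:
            self.shortcuts.popitem(last=False)


def flood(requester, item, all_peers):
    """Simplified stand-in for Gnutella flooding: every other peer is asked."""
    others = [p for p in all_peers if p is not requester]
    providers = [p for p in others if p.has(item)]
    return (providers[0] if providers else None), len(others)


def locate(requester, item, all_peers):
    """Return (provider, peers_contacted, method) for one content lookup."""
    contacted = 0
    # Step 1: ask shortcut peers, most recently useful first.
    for peer in reversed(list(requester.shortcuts.values())):
        contacted += 1
        if peer.has(item):
            requester.add_shortcut(peer)
            return peer, contacted, "shortcut"
    # Step 2: fall back to flooding and remember the peer that answered.
    provider, flooded = flood(requester, item, all_peers)
    contacted += flooded
    if provider is not None:
        requester.add_shortcut(provider)
    return provider, contacted, "flood"


if __name__ == "__main__":
    a = Peer("a", [])
    b = Peer("b", ["song-x", "song-y"])
    c = Peer("c", ["song-z"])
    peers = [a, c, b]
    print(locate(a, "song-x", peers)[1:])  # first lookup has to flood
    print(locate(a, "song-y", peers)[1:])  # second lookup can use the shortcut to b
```

In this toy setup, the first lookup floods and learns that peer b holds relevant content. The second lookup for a different item is resolved by b directly, which is the behavior interest-based locality predicts.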

Copyright information

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Sripanidkulchai, K., Zhang, H. (2005). Content Location in Peer-to-Peer Systems: Exploiting Locality. In: Tang, X., Xu, J., Chanson, S.T. (eds) Web Content Delivery. Web Information Systems Engineering and Internet Technologies Book Series, vol 2. Springer, Boston, MA. https://doi.org/10.1007/0-387-27727-7_4

  • DOI: https://doi.org/10.1007/0-387-27727-7_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24356-6

  • Online ISBN: 978-0-387-27727-1

  • eBook Packages: Computer Science, Computer Science (R0)
