
Reducing the Bandwidth Requirements of P2P Keyword Indexing

  • Conference paper
Distributed and Parallel Computing (ICA3PP 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3719)

Abstract

This paper describes the design and evaluation of a federated, peer-to-peer indexing system that integrates the resources of local systems into a globally addressable index using a distributed hash table. The salient feature of the indexing system's design is the efficient dissemination of term-document indices, achieved by combining duplicate elimination and leaf set forwarding with conventional techniques such as aggressive index pruning, index compression, and batching. Together, these strategies reduce the number of RPC operations required to locate the nodes responsible for a section of the index, as well as the bandwidth utilization and latency of the indexing service. Through empirical observation, we evaluate the performance benefits of these cumulative optimizations and show that these design trade-offs can significantly improve indexing performance when using a distributed hash table.
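As a rough illustration of why duplicate elimination and batching cut RPC traffic, the following minimal sketch groups term-document postings by the node responsible for each term, so that one message per node replaces one message per posting. It is not the paper's implementation: the successor-style ring lookup, the 32-bit key space, and the names `key_for`, `responsible_node`, and `batch_postings` are all assumptions made for this example, and leaf set forwarding, pruning, and compression are not modelled.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


def key_for(term: str) -> int:
    """Map a term to a 32-bit key (hypothetical key space for this sketch)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def responsible_node(node_ids, key):
    """Return the first node id >= key, wrapping around the ring.

    Assumes node_ids is sorted ascending. This is a simplified
    successor rule, not the routing used in the paper.
    """
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


def batch_postings(postings, node_ids):
    """Group (term, doc_id) pairs by responsible node, dropping duplicates."""
    seen = set()
    batches = defaultdict(list)
    for term, doc_id in postings:
        if (term, doc_id) in seen:
            continue  # duplicate elimination: never send the same posting twice
        seen.add((term, doc_id))
        node = responsible_node(node_ids, key_for(term))
        batches[node].append((term, doc_id))
    return dict(batches)


# Hypothetical four-node ring and a small posting list with one duplicate.
node_ids = sorted(key_for(f"node-{i}") for i in range(4))
postings = [
    ("peer", "d1"),
    ("index", "d1"),
    ("peer", "d1"),  # duplicate of the first posting
    ("peer", "d2"),
    ("hash", "d2"),
]
batches = batch_postings(postings, node_ids)
```

With this grouping, the number of messages is bounded by the number of distinct responsible nodes rather than by the number of postings, and all postings for a given term travel in the same batch.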


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Casey, J., Zhou, W. (2005). Reducing the Bandwidth Requirements of P2P Keyword Indexing. In: Hobbs, M., Goscinski, A.M., Zhou, W. (eds) Distributed and Parallel Computing. ICA3PP 2005. Lecture Notes in Computer Science, vol 3719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564621_6

  • DOI: https://doi.org/10.1007/11564621_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29235-7

  • Online ISBN: 978-3-540-32071-5

  • eBook Packages: Computer Science, Computer Science (R0)
