Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Abstract
This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming.
Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node whose nodeId is numerically closest to the key among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes network locality into account; it seeks to minimize the distance messages travel, according to a scalar proximity metric such as the number of IP routing hops.
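The delivery rule described above (a message ends up at the live node whose nodeId is numerically closest to the key) can be illustrated with a minimal Python sketch. It is not Pastry's routing algorithm: it scans a global list of live nodes, whereas Pastry reaches the same node hop by hop using only local state. The 16-bit identifier width, the wrap-around (circular) distance, the tie-breaking rule, and the names `ID_BITS`, `circular_distance` and `closest_live_node` are all assumptions made for illustration; none of them is given in the abstract.

```python
# Illustrative only: shows which node a key maps to, not how Pastry routes.
# ID_BITS = 16 and the circular id space are assumptions for this sketch.
ID_BITS = 16
ID_SPACE = 1 << ID_BITS


def circular_distance(a: int, b: int) -> int:
    """Numerical distance between two ids, assuming the id space wraps around."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_live_node(key: int, live_node_ids: list[int]) -> int:
    """Return the live nodeId numerically closest to key.

    Ties are broken in favor of the smaller nodeId (an arbitrary choice).
    """
    if not live_node_ids:
        raise ValueError("no live nodes")
    return min(live_node_ids, key=lambda n: (circular_distance(n, key), n))


# Hypothetical set of live nodes.
live = [0x1A2B, 0x7F00, 0xC3D4, 0xFFF0]

target = closest_live_node(0x7F10, live)

# If that node fails, the same key maps to the next-closest live node.
survivors = [n for n in live if n != target]
fallback = closest_live_node(0x7F10, survivors)
```

The second lookup shows why the mapping is defined over currently live nodes: when a node leaves, its keys fall to whichever remaining nodeId is now closest, with no central reassignment step.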
Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry’s scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.