On Disconnection Node Failure and Stochastic Static Resilience of P2P Communication Networks
Many of the graph optimization problems in the literature arise from network design and analysis. Our objective in this paper is to highlight the disconnection probability that can arise in the interconnection networks of large-scale parallel processor systems. Although traditional measures of fault tolerance such as reliability and availability are applicable to such systems, these measures were designed mostly for mission-oriented applications or repairable systems. They fail to account for the high redundancy levels typical of peer-to-peer (P2P) communication and distributed systems. For these systems, new measures have been introduced that can evaluate a system's capability for graceful degradation. In the design of such systems, one of the most fundamental considerations is the reliability of their interconnection networks, which can usually be characterized by the connectivity of the network's topological structure. In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and examine how static patterns of node failure affect the resilience of such networks. Simulation results based on the network topology confirm the validity of the analytical approximation and demonstrate the localizer efficiency.
Keywords: Topological Structure, Random Graph, Mesh Network, Node Failure, Distributed Hash Table
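The abstract describes simulations of static node failure only at a high level. As a rough illustration of the quantity involved, the following minimal Python sketch estimates a disconnection probability by Monte Carlo sampling. It is not the authors' method: it assumes that each node fails independently with the same probability `p`, that "disconnected" means the surviving nodes no longer form a single connected component, and that a simple ring-with-chords graph can stand in for the topology. The helper names (`ring_with_chords`, `is_connected`, `disconnection_probability`) are hypothetical.

```python
import random
from collections import deque


def ring_with_chords(n, k):
    """Hypothetical test topology: node i is linked to i+1, ..., i+k (mod n)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def is_connected(adj, alive):
    """Breadth-first search restricted to surviving nodes.

    Assumption: zero or one surviving node counts as connected.
    """
    if len(alive) <= 1:
        return True
    start = next(iter(alive))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(alive)


def disconnection_probability(adj, p, trials, rng):
    """Fraction of trials in which the surviving subgraph is disconnected.

    Assumption: every node fails independently with probability p.
    """
    disconnected = 0
    for _ in range(trials):
        alive = {v for v in adj if rng.random() >= p}
        if not is_connected(adj, alive):
            disconnected += 1
    return disconnected / trials


if __name__ == "__main__":
    rng = random.Random(0)
    graph = ring_with_chords(n=200, k=3)
    for p in (0.1, 0.3, 0.5):
        estimate = disconnection_probability(graph, p, trials=500, rng=rng)
        print(f"p={p:.1f}  estimated disconnection probability={estimate:.3f}")
```

An estimate of this kind is what an analytical approximation would be compared against. The graph size, degree parameter `k`, failure probabilities, and trial count above are arbitrary illustrative values, not figures from the paper.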