Synchronous Do-All with Crashes: Using Perfect Knowledge and Reliable Multicast
We begin the study of the Do-All problem in a synchronous distributed environment, under an adversary that can cause processor crashes, the more benign type of adversity. To better understand the inherent limitations and difficulties of solving the Do-All and iterative Do-All problems in the presence of crashes, we first abstract away communication issues by assuming an oracle that provides load-balancing and computational-progress information to the processors. Such an oracle provides what we call perfect knowledge to the algorithms solving the problem. We present matching upper and lower bounds on total-work for models with perfect knowledge. These bounds are failure-sensitive: they explicitly incorporate the (maximum) number of processor crashes. We then present an algorithm that efficiently solves the Do-All and iterative Do-All problems in a message-passing environment where reliable multicast is available: if a processor crashes after starting a multicast of a message, then this message is received either by all non-faulty targeted processors or by none. In this setting, the availability of reliable multicast effectively approximates the availability of perfect knowledge, making it possible to use the complexity results for perfect knowledge in the analysis of the algorithm.
Keywords: Perfect Knowledge, Message Complexity, Local View, Report Message, Multicast Message
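To make the perfect-knowledge setting described in the abstract more concrete, the following is a minimal Python sketch of a synchronous Do-All execution in which an oracle tells every live processor which tasks remain and balances the load among them. It is an illustration only, not the chapter's algorithm. The function name `simulate_do_all`, the `crash_schedule` format, the rule that crashes take effect at the start of a step, and the work measure (one unit per live processor per step) are all assumptions made for this sketch.

```python
def simulate_do_all(num_tasks, num_procs, crash_schedule):
    """Simulate synchronous Do-All with an idealized load-balancing oracle.

    crash_schedule (hypothetical format): dict mapping a step number to the
    set of processor ids that crash at the start of that step, before doing
    any work in it. Returns (total_work, steps_taken).
    """
    alive = set(range(num_procs))
    undone = list(range(num_tasks))
    total_work = 0
    step = 0
    while undone:
        # The adversary crashes the processors scheduled for this step.
        alive -= crash_schedule.get(step, set())
        if not alive:
            raise RuntimeError("adversary crashed every processor")
        # Oracle (assumed): every live processor knows the undone tasks and
        # its rank among live processors, so tasks are spread evenly.
        done_now = set()
        for rank, _pid in enumerate(sorted(alive)):
            done_now.add(undone[rank % len(undone)])
            total_work += 1  # one unit of work per live processor per step
        undone = [t for t in undone if t not in done_now]
        step += 1
    return total_work, step


if __name__ == "__main__":
    # No crashes: 4 processors finish 8 tasks in 2 steps.
    print(simulate_do_all(8, 4, {}))
    # Processors 2 and 3 crash at step 1: the survivors need one more step.
    print(simulate_do_all(8, 4, {1: {2, 3}}))
```

In this simplified model, crashes at step boundaries only slow the computation down; work is wasted only when there are more live processors than remaining tasks, so that several processors perform the same task. The failure-sensitive bounds mentioned in the abstract concern a more general adversary than the one sketched here.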