In this chapter we discuss the simulation of fault-free PRAMs on fault-prone PRAMs. The simulation is based on a technique for executing arbitrary PRAM steps on a PRAM whose processors are subject to fail-stop failures. In each of the specific simulations, executing a single N-processor PRAM step on a fail-stop P-processor PRAM has the same asymptotic complexity as solving an N-size instance of the Write-All problem using P fail-stop processors. We also show that in some cases it is possible to develop fault-tolerant algorithms that improve on the efficiency of the oblivious simulations. Finally, we discuss parallel efficiency classes and closures with respect to fault-tolerant simulations.
Keywords: Shared Memory, Overhead Ratio, Instruction Counter, Private Memory, Parallel Prefix
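To make the Write-All problem named in the abstract concrete, here is a minimal Python sketch: P processors must set every cell of an N-element shared array to 1 while some of them stop permanently. The function `write_all`, the `fail_at` failure schedule, and the cyclic-scan strategy are assumptions chosen for illustration; this is a deliberately naive strategy, not one of the chapter's algorithms, and the simulator checks termination with global knowledge that a real fail-stop PRAM algorithm would have to compute itself.

```python
def write_all(n, p, fail_at=None):
    """Simulate a naive synchronous Write-All run and return (array, work).

    n       -- size of the shared array (initially all zeros)
    p       -- number of processors
    fail_at -- hypothetical failure schedule: processor id -> step at which
               it stops for good (fail-stop, no restart)

    Work is counted as the number of completed processor steps.
    """
    fail_at = fail_at or {}
    if all(pid in fail_at for pid in range(p)):
        raise ValueError("at least one processor must never fail")

    x = [0] * n          # shared memory
    work = 0
    step = 0
    # Assumption: the simulator detects completion globally; a real
    # algorithm has to detect it through shared memory.
    while not all(x):
        for pid in range(p):
            if pid in fail_at and step >= fail_at[pid]:
                continue                      # this processor has stopped
            # Each processor scans cyclically from its own starting offset.
            cell = (pid * n // p + step) % n
            x[cell] = 1
            work += 1
        step += 1
    return x, work


if __name__ == "__main__":
    # No failures: the processors split the array evenly.
    print(write_all(8, 4))
    # Three of four processors stop after their first step.
    print(write_all(8, 4, fail_at={1: 1, 2: 1, 3: 1}))
```

In the failure-free run the work equals N, while in the second run the surviving processor rewrites cells that were already set, so the work exceeds N. That wasted effort is what efficient Write-All algorithms aim to bound.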