Fault-Tolerant Simulations

  • Paris Christos Kanellakis
  • Alex Allister Shvartsman
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 401)


In this chapter we discuss the simulation of fault-free PRAMs on fault-prone PRAMs. The simulation is based on a technique for executing arbitrary PRAM steps on a PRAM whose processors are subject to fail-stop failures. In each of the specific simulations, the execution of a single N-processor PRAM step on a fail-stop P-processor PRAM has the same asymptotic complexity as solving an instance of the Write-All problem of size N using P fail-stop processors. We also show that in some cases it is possible to develop fault-tolerant algorithms that improve on the efficiency of the oblivious simulations. Finally, we discuss parallel efficiency classes and closures with respect to fault-tolerant simulations.
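To make the Write-All problem concrete, here is a minimal Python sketch, not the chapter's algorithm. Write-All asks P processors to set every entry of an N-element array to 1. The sketch assumes synchronous rounds, an adversary described by a hypothetical `crashes` schedule (round number mapped to the processor ids that stop before that round), and an idealized load-balancing step that assigns the k-th surviving processor to the k-th unwritten cell at no cost. Real fail-stop algorithms must pay for that coordination through shared memory, which is where their overhead comes from. The function name `write_all` and the work accounting (one unit per live processor per round) are likewise assumptions made for illustration.

```python
def write_all(n, p, crashes=None):
    """Toy synchronous Write-All with fail-stop processors.

    n       -- size of the shared array (all entries start at 0)
    p       -- number of processors, with ids 0 .. p-1
    crashes -- hypothetical adversary schedule: {round: set of pids that
               stop before that round}; stopped processors never restart

    Returns (x, rounds, work), where work charges one step to every
    processor that is still alive in a round.
    """
    crashes = crashes or {}
    x = [0] * n
    live = list(range(p))
    rounds = 0
    work = 0
    while 0 in x:
        # Fail-stop: processors named by the adversary halt for good.
        stopped = crashes.get(rounds, set())
        live = [pid for pid in live if pid not in stopped]
        if not live:
            raise RuntimeError("all processors failed before completion")
        # Idealized balancing (an assumption of this sketch): the k-th
        # live processor writes the k-th cell that is still 0.
        todo = [i for i, value in enumerate(x) if value == 0]
        for _pid, cell in zip(live, todo):
            x[cell] = 1
        work += len(live)
        rounds += 1
    return x, rounds, work


if __name__ == "__main__":
    # Failure-free run, then a run where three of four processors stop
    # before round 1; the lone survivor finishes the remaining cells.
    print(write_all(8, 4))
    print(write_all(8, 4, crashes={1: {0, 1, 2}}))
```

Under these assumptions the surviving processors always finish as long as at least one of them never fails, and the number of rounds grows as processors stop. This mirrors, in a very simplified way, why the cost of simulating one N-processor step is tied to the cost of Write-All on N cells.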


Keywords: Shared Memory, Overhead Ratio, Instruction Counter, Private Memory, Parallel Prefix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Paris Christos Kanellakis (1)
  • Alex Allister Shvartsman (2)
  1. Brown University, Providence, USA
  2. Massachusetts Institute of Technology, Cambridge, USA
