Models for Robust Computation

  • Paris Christos Kanellakis
  • Alex Allister Shvartsman
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 401)


Formulating suitable models of parallel computation and of processor failures goes hand in hand with the study of algorithms and their complexity. In this chapter we revisit and formally define the models of computation that are the subject of our presentation, the failure models we address, and the major variations of the fail-stop parallel random access machine. We define and discuss the complexity measures used to characterize the efficiency of algorithms for the selected models in the context of particular failure models. We also introduce the high-level programming notation used to specify algorithms, and we discuss implementation and architectural issues related to the abstract models we study.
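To make the notion of a complexity measure for failure-prone parallel computation concrete, the following is a minimal Python sketch of a work measure that charges each surviving processor one unit per parallel step, for a fail-stop machine solving the Write-All task (set every cell of an N-element array to 1). The scheduling discipline (idealized perfect load balancing) and the failure-pattern encoding are illustrative assumptions, not the algorithms or definitions of this chapter:

```python
# Sketch: "work" of a fail-stop parallel machine on Write-All,
# counting one unit per surviving processor per parallel step.
# Assumptions (not from the chapter): an idealized scheduler that
# perfectly load-balances unwritten cells, and a failure pattern
# given as {step: set of processor ids that crash before that step}.

def write_all_work(n, p, failures):
    memory = [0] * n              # the shared array to be set to all 1s
    alive = set(range(p))         # fail-stop processors never restart
    work = 0
    step = 0
    while 0 in memory:
        alive -= failures.get(step, set())
        if not alive:
            raise RuntimeError("all processors failed before Write-All finished")
        work += len(alive)        # each surviving processor is charged one unit
        # idealized step: survivors write the next |alive| unwritten cells
        for cell in [i for i, v in enumerate(memory) if v == 0][:len(alive)]:
            memory[cell] = 1
        step += 1
    return work

# Failure-free run: 4 processors clear 8 cells in 2 steps, work = 8.
print(write_all_work(8, 4, {}))          # -> 8
# One processor crashes before step 1: later steps charge only 3 units.
print(write_all_work(8, 4, {1: {0}}))    # -> 10
```

The point of the sketch is that the measure degrades gracefully: a crashed processor stops contributing to the charge, so the cost of an execution depends on the failure pattern, which is exactly the kind of dependence the complexity measures of this chapter are designed to capture.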


Keywords: Memory Access, Shared Memory, Failure Model, Failure Pattern, Robust Computation



Bibliographic Notes

  1. [42] S. Fortune and J. Wyllie, "Parallelism in Random Access Machines", in Proc. of the 10th ACM Symposium on Theory of Computing, pp. 114–118, 1978.
  2. [119] J. C. Wyllie, The Complexity of Parallel Computation, Ph.D. Thesis, Cornell University, TR 79-387, 1979.
  3. [60] R. M. Karp and V. Ramachandran, "A Survey of Parallel Algorithms for Shared-Memory Machines", in Handbook of Theoretical Computer Science (ed. J. van Leeuwen), vol. 1, North-Holland, 1990.
  4. [37] D. Eppstein and Z. Galil, "Parallel Techniques for Combinatorial Computation", Annual Review of Computer Science, 3 (1988), pp. 233–283.
  5. [45] A. Gibbons and P. Spirakis, Eds., Lectures on Parallel Computation, Cambridge International Series on Parallel Computation: 4, Cambridge University Press, 1993.
  6. [15] P. Beame and J. Håstad, "Optimal Bounds for Decision Problems on the CRCW PRAM", Journal of the ACM, vol. 36, no. 3, pp. 643–670, 1989.
  7. [76] M. Li and Y. Yesha, "New Lower Bounds for Parallel Computation", Journal of the ACM, vol. 36, no. 3, pp. 671–680, 1989.
  8. [116] L. Valiant, "General Purpose Parallel Architectures", in Handbook of Theoretical Computer Science (ed. J. van Leeuwen), vol. 1, North-Holland, 1990.
  9. [94] S. Owicki and D. Gries, "An Axiomatic Proof Technique for Parallel Programs I", Acta Informatica, vol. 6, pp. 319–340, 1976.
  10. [109] J. T. Schwartz, "Ultracomputers", ACM Transactions on Programming Languages and Systems, vol. 2, no. 4, pp. 484–521, 1980.
  11. [56] P. C. Kanellakis and A. A. Shvartsman, "Efficient Parallel Algorithms Can Be Made Robust", Distributed Computing, vol. 5, no. 4, pp. 201–217, 1992; prelim. vers. in Proc. of the 8th ACM PODC, pp. 211–222, 1989.
  12. [87] C. Martel, R. Subramonian, and A. Park, "Asynchronous PRAMs are (Almost) as Good as Synchronous PRAMs", in Proc. of the 31st IEEE Symposium on Foundations of Computer Science, pp. 590–599, 1990.
  13. [7] M. Ajtai, J. Aspnes, C. Dwork, and O. Waarts, "A Theory of Competitive Analysis for Distributed Algorithms", manuscript, 1996 (prelim. vers. appears as "The Competitive Analysis of Wait-Free Algorithms and its Application to the Cooperative Collect Problem", in Proc. of the 35th IEEE Symp. on Foundations of Computer Science, 1994).
  14. [54] P. C. Kanellakis, D. Michailidis, and A. A. Shvartsman, "Controlling Memory Access Concurrency in Efficient Fault-Tolerant Parallel Algorithms", Nordic J. of Computing, vol. 2, pp. 146–180, 1995 (prelim. vers. in Proc. of the 7th Int'l Workshop on Distributed Algorithms, pp. 99–114, 1993).
  15. [26] R. Cole and O. Zajicek, "The APRAM: Incorporating Asynchrony into the PRAM Model", in Proc. of the 1989 ACM Symp. on Parallel Algorithms and Architectures, pp. 170–178, 1989.
  16. [27] R. Cole and O. Zajicek, "The Expected Advantage of Asynchrony", in Proc. of the 2nd ACM Symp. on Parallel Algorithms and Architectures, pp. 85–94, 1990.
  17. [44] P. Gibbons, "A More Practical PRAM Model", in Proc. of the 1989 ACM Symposium on Parallel Algorithms and Architectures, pp. 158–168, 1989.
  18. [84] C. Martel, A. Park, and R. Subramonian, "Work-Optimal Asynchronous Algorithms for Shared Memory Parallel Computers", SIAM Journal on Computing, vol. 21, pp. 1070–1099, 1992.
  19. [93] N. Nishimura, "Asynchronous Shared Memory Parallel Computation", in Proc. of the 2nd ACM Symp. on Parallel Algorithms and Architectures, pp. 76–84, 1990.
  20. [95] C. H. Papadimitriou and M. Yannakakis, "Towards an Architecture-Independent Analysis of Parallel Algorithms", in Proc. of the 20th Annual ACM Symp. on Theory of Computing, pp. 510–513, 1988.
  21. [6] A. Aggarwal, A. K. Chandra, and M. Snir, "On Communication Latency in PRAM Computations", in Proc. of the 1st ACM Symposium on Parallel Algorithms and Architectures, pp. 11–21, 1989.
  22. [85] C. Martel and A. Raghunathan, "Asynchronous PRAMs with Memory Latency", Tech. Report, U.C. Davis, 1992.
  23. [117] L. Valiant, "A Bridging Model for Parallel Computation", Communications of the ACM, vol. 33, no. 8, pp. 103–111, 1990.
  24. [32] D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken, "LogP: Towards a Realistic Model of Parallel Computation", in Proc. of the 4th ACM PPOPP, pp. 1–12, 1993.
  25. [31] F. Cristian, "Understanding Fault-Tolerant Distributed Systems", Communications of the ACM, vol. 34, no. 2, pp. 56–78, 1991.
  26. [52] IEEE Computer, "Fault-Tolerant Systems", special issue, vol. 23, no. 7, 1990.
  27. [108] R. D. Schlichting and F. B. Schneider, "Fail-Stop Processors: An Approach to Designing Fault-Tolerant Computing Systems", ACM Transactions on Computer Systems, vol. 1, no. 3, pp. 222–238, 1983.
  28. [8] G. Almasi and A. Gottlieb, Highly Parallel Computing, Second Edition, Benjamin/Cummings, 1993.
  29. [101] N. Pippenger, "Communication Networks", in Handbook of Theoretical Computer Science (ed. J. van Leeuwen), vol. 1, North-Holland, 1990.
  30. [3] G. B. Adams III, D. P. Agrawal, and H. J. Siegel, "A Survey and Comparison of Fault-Tolerant Multistage Interconnection Networks", IEEE Computer, vol. 20, no. 6, pp. 14–29, 1987.
  31. [90] K. Mehlhorn and U. Vishkin, "Randomized and Deterministic Simulations of PRAMs by Parallel Machines with Restricted Granularity of Parallel Memories", Acta Informatica, vol. 21, no. 4, pp. 339–374, 1984.
  32. [115] E. Upfal and A. Wigderson, "How to Share Memory in a Distributed System", J. of the ACM, vol. 34, no. 1, pp. 116–127, 1987.
  33. [98] A. Pietracaprina and F. P. Preparata, "A Practical Constructive Scheme for Deterministic Shared-Memory Access", Tech. Report CS-93-14, Brown University, 1993.
  34. [104] M. O. Rabin, "Efficient Dispersal of Information for Security, Load Balancing and Fault Tolerance", J. of the ACM, vol. 36, no. 2, pp. 335–348, 1989.
  35. [103] F. P. Preparata, "Holographic Dispersal and Recovery of Information", IEEE Trans. on Info. Theory, vol. 35, no. 5, pp. 1123–1124, 1989.
  36. [92] R. Negrini, M. G. Sami, and R. Stefanelli, Fault-Tolerance through Reconfiguration of VLSI and WSI Arrays, The MIT Press, 1989.
  37. [88] R. McEliece, The Theory of Information and Coding, Addison-Wesley, 1977.
  38. [107] D. B. Sarrazin and M. Malek, "Fault-Tolerant Semiconductor Memories", IEEE Computer, vol. 17, no. 8, pp. 49–56, 1984.
  39. [120] I-L. Yen, E. L. Leiss, and F. B. Bastani, "Exploiting Redundancy to Speed Up Parallel Systems", IEEE Parallel and Distributed Technology, vol. 1, no. 3, 1993.
  40. [2] J. A. Abraham, P. Banerjee, C.-Y. Chen, W. K. Fuchs, S.-Y. Kuo, and A. L. Narasimha Reddy, "Fault Tolerance Techniques for Systolic Arrays", IEEE Computer, vol. 20, no. 7, pp. 65–76, 1987.
  41. [21] M. Chean and J. A. B. Fortes, "A Taxonomy of Reconfiguration Techniques for Fault-Tolerant Processor Arrays", IEEE Computer, vol. 23, no. 1, pp. 55–69, 1990.
  42. [53] C. Kaklamanis, A. Karlin, F. Leighton, V. Milenkovic, P. Raghavan, S. Rao, C. Thomborson, and A. Tsantilas, "Asymptotically Tight Bounds for Computing with Faulty Arrays of Processors", in Proc. of the 31st IEEE Symposium on Foundations of Computer Science, pp. 285–296, 1990.

Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Paris Christos Kanellakis, Brown University, Providence, USA
  • Alex Allister Shvartsman, Massachusetts Institute of Technology, Cambridge, USA
