A framework for analyzing locality and portability issues in parallel computing

Extended abstract
  • Abhiram Ranade
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 678)


This work potentially affects two areas: interconnection network design, and parallel programming methodology.

A key issue in designing parallel computers is the balance between computing power and communication capacity. As we have observed, there exist several problems that are inherently nonlocal, and therefore require high communication capacity for efficient implementation. We also listed several problems for which fast network implementations can be designed. Some of these problems, however, possess only limited locality, and thus still require relatively powerful communication networks (e.g. butterflies). To summarize, we cannot give a clear answer to the question of how powerful the communication networks we build must be; but as more results become known about the locality of different problems, and as we develop locality-exploiting algorithms for more problems, we will have a more complete answer.
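The tradeoff above can be made concrete with a toy cost model (ours, not the paper's): parallel time is bounded below both by the per-processor work and by the traffic that must cross the network's bisection. Nonlocal problems generate bisection traffic proportional to the input size, so they are communication-bound on low-bandwidth networks.

```python
def lower_bound_time(total_work, cross_traffic, p, bisection_bw):
    """Toy lower bound on parallel time (illustrative, not from the paper).

    total_work    -- total operations performed by the algorithm
    cross_traffic -- words that must cross the machine's bisection
                     (large for nonlocal problems, small for local ones)
    p             -- number of processors
    bisection_bw  -- words the network moves across its bisection per
                     step (high for butterflies, low for meshes)
    """
    compute_time = total_work / p
    comm_time = cross_traffic / bisection_bw
    return max(compute_time, comm_time)

# A nonlocal problem on n = 10^6 inputs, where ~n words must cross the
# bisection: on a mesh-like network communication dominates, while a
# butterfly-like network brings compute and communication into balance.
n, p = 1_000_000, 1024
print(lower_bound_time(n, n, p, bisection_bw=32))      # mesh-like
print(lower_bound_time(n, n, p, bisection_bw=p // 2))  # butterfly-like
```

The model ignores constants and latency, but it captures why "how much network to build" has no single answer: the right bisection bandwidth depends on how much cross traffic the problem mix actually generates.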

Our ideas provide a methodology for developing portable parallel programs. Given a problem, the first step is to determine its gross locality. This determines a native architecture for the problem. The next step is to design an algorithm on the native model that fully exploits this locality. The algorithm can then be simulated on different architectures, and is guaranteed to retain good efficiency.
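The workflow above can be sketched as follows. All names, architectures, and slowdown constants here are hypothetical illustrations of the idea (a work-preserving emulation keeps the slowdown constant, so efficiency is preserved); they are not taken from the paper.

```python
# Map a problem's gross locality to an assumed native architecture.
NATIVE_ARCH = {
    "local": "mesh",         # strong locality: a weak network suffices
    "limited": "butterfly",  # limited locality: needs more communication
    "nonlocal": "butterfly",
}

# Hypothetical constant-factor slowdowns for simulating the native
# network on a target network (constants chosen for illustration only).
SIMULATION_SLOWDOWN = {
    ("mesh", "mesh"): 1.0,
    ("mesh", "butterfly"): 1.0,   # a richer network emulates a poorer one
    ("butterfly", "mesh"): 4.0,   # assumed constant; real bounds vary
    ("butterfly", "butterfly"): 1.0,
}

def ported_time(native_time, gross_locality, target_arch):
    """Running time of the native-model algorithm after simulation
    on `target_arch`, under the constant-slowdown assumption."""
    native = NATIVE_ARCH[gross_locality]
    return native_time * SIMULATION_SLOWDOWN[(native, target_arch)]
```

The point of the design is that the algorithm is written once, against the native model; portability then reduces to a simulation step whose overhead is a known constant per (native, target) pair.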


Keywords: Convex Hull · Communication Complexity · Native Model · Nonlocal Problem · Processor Network





Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Abhiram Ranade
    Computer Science Division, University of California, Berkeley
