Abstract
This paper presents an overview of a parallelizing compiler that automatically generates efficient code for large-scale parallel architectures from sequential input programs. This research focuses on loop-level parallelism in dense matrix computations. We illustrate the basic techniques the compiler uses by describing the entire compilation process for a simple example.
Our compiler is organized into three major phases: analyzing array references, allocating the computation and data to the processors to optimize parallelism and locality, and generating code.
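The three-phase organization above can be sketched as a simple pipeline. This is an illustrative sketch only: every function name and data shape below is hypothetical, invented for exposition, and is not the actual interface of the compiler described in the paper.

```python
# Hypothetical sketch of the three-phase compiler pipeline described above.

def analyze_array_references(program):
    """Phase 1: array reference analysis -- for each read access instance,
    determine which write instance produced the value it reads."""
    return {"producers": {}}  # placeholder analysis result


def decompose(program, analysis):
    """Phase 2: assign loop iterations (computation) and array sections
    (data) to processors so that parallelism is exploited and
    communication overhead is minimized."""
    return {"computation_map": {}, "data_map": {}}


def generate_spmd_code(program, decomposition):
    """Phase 3: emit per-processor code that manages the multiple
    address spaces and communicates data across processors."""
    return f"/* SPMD code for {program} */"


def compile_for_parallel_machine(program):
    analysis = analyze_array_references(program)
    decomposition = decompose(program, analysis)
    return generate_spmd_code(program, decomposition)
```

The point of the sketch is only the phase ordering: reference analysis feeds the decomposition decision, which in turn drives code generation.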
An optimizing compiler for scalable parallel machines requires more sophisticated program analysis than traditional data dependence analysis. Our compiler uses a precise data-flow analysis technique to identify the producer of the value read by each instance of a read access. To allocate the computation and data to the processors, the compiler first transforms the program to expose loop-level parallelism in the computation. It then finds a decomposition of the computation and data such that parallelism is exploited and the communication overhead is minimized. The compiler will trade off extra degrees of parallelism to reduce or eliminate communication. Finally, the compiler generates code to manage the multiple address spaces and to communicate data across processors.
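The instance-wise precision of this analysis can be illustrated on a toy case. The sketch below is a brute-force simplification under stated assumptions (a single pair of affine accesses over one-dimensional loops); the actual technique handles general affine loop nests symbolically. The function name `producer_of_read` is hypothetical.

```python
# Toy instance-wise data-flow analysis: for each dynamic instance of a
# read access, find the write instance that produced the value it reads.
# Simplified to one write loop followed by one read loop over array A.

def producer_of_read(read_iter, read_index_fn, write_index_fn, write_range):
    """Return the last write iteration whose array index equals the index
    touched by this read instance, or None if the value is live-in."""
    target = read_index_fn(read_iter)
    candidates = [w for w in write_range if write_index_fn(w) == target]
    return max(candidates) if candidates else None


# Example program being analyzed:
#   for i in range(5):     A[i]   = ...     (write instances i = 0..4)
#   for j in range(1, 5):  ...    = A[j-1]  (read instances j = 1..4)
N = 5
for j in range(1, N):
    p = producer_of_read(j, lambda r: r - 1, lambda w: w, range(N))
    # The read in iteration j consumes the value written in iteration j-1.
    assert p == j - 1
```

Note that classical data dependence analysis would only report that the second loop depends on the first; the exact analysis identifies, per read instance, the specific producing iteration, which is what the decomposition and communication-generation phases need.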
This research was supported in part by DARPA contracts N00039-91-C-0138 and DABT63-91-K-0003, an NSF Young Investigator Award and fellowships from Digital Equipment Corporation's Western Research Laboratory and Intel Corporation.
© 1994 Springer-Verlag Berlin Heidelberg
Cite this paper
Amarasinghe, S.P., Anderson, J.M., Lam, M.S., Lim, A.W. (1994). An overview of a compiler for scalable parallel machines. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1993. Lecture Notes in Computer Science, vol 768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57659-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57659-4
Online ISBN: 978-3-540-48308-3