Optimal reordering and mapping of a class of nested-loops for parallel execution

Lam, Chi-Chung; Sadayappan, P.; Wenger, Rephael

doi:10.1007/BFb0017261

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1239)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

This paper addresses the compile-time optimization of a class of nested-loop computations that arise in some computational physics applications. The computations involve summations over products of array terms in order to compute multi-dimensional surface and volume integrals. Reordering additions and multiplications and applying the distributive law can significantly reduce the number of operations required in evaluating these summations. In a multiprocessor environment, proper distribution of the arrays among processors will reduce the inter-processor communication time. We present a formal description of the operation minimization problem, a proof of its NP-completeness, and a pruning strategy for finding the optimal solution in small cases. We also give an algorithm for determining the optimal distribution of the arrays among processors in a multiprocessor environment.
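The effect of applying the distributive law can be seen on a toy double summation. The sketch below is illustrative only and is not taken from the paper: the arrays `a` and `b`, the function names, and the operation counters are all assumptions chosen for the example. It evaluates the sum over i and j of a[i] * b[j] in two ways, first as a straightforward nested loop and then after factoring the sum into (sum of a) * (sum of b), and reports how many multiplications each form performs.

```python
def nested_loop_sum(a, b):
    """Evaluate sum_i sum_j a[i] * b[j] directly; returns (value, multiplications)."""
    total = 0
    mults = 0
    for x in a:
        for y in b:
            total += x * y  # one multiplication per (i, j) pair
            mults += 1
    return total, mults


def factored_sum(a, b):
    """Evaluate the same sum after applying the distributive law:
    sum_i sum_j a[i] * b[j] == (sum_i a[i]) * (sum_j b[j]).
    Returns (value, multiplications)."""
    sum_a = sum(a)  # len(a) - 1 additions
    sum_b = sum(b)  # len(b) - 1 additions
    return sum_a * sum_b, 1  # a single multiplication


if __name__ == "__main__":
    a = [1, 2, 3]
    b = [4, 5]
    print(nested_loop_sum(a, b))
    print(factored_sum(a, b))
```

For arrays of lengths n and m, the nested form uses n * m multiplications while the factored form uses one, at the cost of n + m - 2 additions. The summations the paper considers involve more array terms and indices, where many such reorderings are possible and choosing the cheapest one is the optimization problem the abstract describes.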

Supported in part by NSF grant DMR-9520319.

Author information

Authors and Affiliations

Department of Computer and Information Science, The Ohio State University, Columbus, OH 43210
Chi-Chung Lam, P. Sadayappan & Rephael Wenger

Editor information

David Sehr, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

About this paper

Cite this paper

Lam, CC., Sadayappan, P., Wenger, R. (1997). Optimal reordering and mapping of a class of nested-loops for parallel execution. In: Sehr, D., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1996. Lecture Notes in Computer Science, vol 1239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0017261

DOI: https://doi.org/10.1007/BFb0017261
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63091-3
Online ISBN: 978-3-540-69128-0
eBook Packages: Springer Book Archive
