Abstract
This paper generalizes the framework of linear loop transformations by adding loop alignment as a new component of the transformation process. The aim is to match the structure of loop nests to the data distribution and alignment so that, when a sequential program is compiled for a distributed memory machine, non-local references are eliminated whenever possible. The alignment and distribution functions are assumed to be either user specified or automatically generated by the compiler. The transformation process is modelled with non-singular matrices, and we use ideas recently proposed in this field to find part of the transformation matrix and to generate efficient transformed code. However, taking the alignment and distribution functions into account raises additional issues, both in deriving the transformation matrix and in generating code.
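As a rough illustration of the idea, and not the authors' algorithm, the following Python sketch maps each iteration (i, j) of a two-deep loop nest to a new point T·(i, j) + offset, where T is a non-singular matrix and the offset plays the role of the alignment component. The reference A[2*i + j + 1, j], the matrix T, the offset, and all function names are assumptions chosen for the example. The array is assumed to be distributed by rows, so the sketch checks that the new outer index equals the row subscript being accessed.

```python
# Hypothetical example: T, OFFSET and the reference A[2*i + j + 1, j] are
# illustrative assumptions, not taken from the paper.

N = 4                      # original loop bounds: 0 <= i, j < N
T = [[2, 1],
     [0, 1]]               # non-singular (determinant 2), not unimodular
OFFSET = (1, 0)            # alignment shift added after the linear part


def det2(m):
    """Determinant of a 2x2 integer matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply_transform(matrix, offset, point):
    """Map an iteration point p to matrix * p + offset."""
    return tuple(
        sum(matrix[r][c] * point[c] for c in range(len(point))) + offset[r]
        for r in range(len(matrix))
    )


def row_subscript(i, j):
    """Row of A touched by the assumed reference A[2*i + j + 1, j]."""
    return 2 * i + j + 1


original = [(i, j) for i in range(N) for j in range(N)]
transformed = [apply_transform(T, OFFSET, p) for p in original]

# The matrix is non-singular, so no two iterations collapse onto one point.
assert det2(T) != 0
assert len(set(transformed)) == len(original)

# The new outer index u is exactly the row accessed, so iterations can be
# assigned to the processor that owns row u.
assert all(u == row_subscript(i, j)
           for (i, j), (u, _) in zip(original, transformed))
```

Because the determinant of T is 2 rather than ±1, the transformed iteration space contains integer points that correspond to no original iteration, which is one reason code generation needs extra care when the matrix is non-singular but not unimodular.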
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Torres, J., Ayguadé, E., Labarta, J., Valero, M. (1994). Align and distribute-based linear loop transformations. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1993. Lecture Notes in Computer Science, vol 768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57659-2_19
DOI: https://doi.org/10.1007/3-540-57659-2_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57659-4
Online ISBN: 978-3-540-48308-3
eBook Packages: Springer Book Archive