Summary
This chapter is devoted to a comparative survey of loop parallelization algorithms. Several algorithms have been proposed in the literature, notably those of Allen and Kennedy, Wolf and Lam, Darte and Vivien, and Feautrier. These algorithms rely on different mathematical tools, and they do not use the same representation of data dependences. In this chapter, we survey each of these algorithms and assess their power and limitations, both through examples and by stating "optimality" results. An important contribution of this chapter is to characterize which algorithm is the most suitable for a given representation of dependences. This result is of practical interest: it guides a parallelizing compiler to select, given the dependence analysis that is available, the simplest and cheapest parallelization algorithm that remains optimal.
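The role of the dependence representation can be made concrete with a small sketch. The following Python fragment (illustrative only; the function names and the level-based test are our own simplification, in the spirit of the dependence-level reasoning used by Allen and Kennedy's algorithm) assumes uniform dependences given as distance vectors, and marks a loop as parallel when no dependence is carried at its level:

```python
# Illustrative sketch: deciding which loops of a nest may run in parallel,
# given uniform dependences represented as distance vectors.
# A dependence is "carried" at the level of the first nonzero component of
# its distance vector; a loop carrying no dependence is parallel.

def carrying_level(distance):
    """Return the 1-based level of the first nonzero component,
    or None for the zero vector (a loop-independent dependence)."""
    for level, d in enumerate(distance, start=1):
        if d != 0:
            return level
    return None

def parallel_loops(distances, depth):
    """Return the 1-based levels of the loops that carry no dependence."""
    carried = {carrying_level(d) for d in distances} - {None}
    return [k for k in range(1, depth + 1) if k not in carried]

# A 2-deep nest with one dependence of distance (1, 0):
# the outer loop carries it, so the inner loop is parallel.
print(parallel_loops([(1, 0)], 2))          # -> [2]

# Adding a dependence of distance (0, 1) also serializes the inner loop.
print(parallel_loops([(1, 0), (0, 1)], 2))  # -> []
```

A richer dependence representation (direction vectors, dependence polyhedra, or exact affine dependences) refines this test, and the chapter's point is precisely that each algorithm is matched to one such representation.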
References
J. R. Allen and K. Kennedy. PFC: a program to convert programs to parallel form. Technical report, Dept. of Math. Sciences, Rice University, TX, March 1982.
J. R. Allen and K. Kennedy. Automatic translations of Fortran programs to vector form. ACM Transactions on Programming Languages and Systems, 9:491–542, 1987.
Utpal Banerjee. A theory of loop permutations. In Gelernter, Nicolau, and Padua, editors, Languages and Compilers for Parallel Computing. MIT Press, 1990.
A. J. Bernstein. Analysis of programs for parallel processing. IEEE Transactions on Electronic Computers, EC-15, 1966.
Pierre Boulet, Alain Darte, Tanguy Risset, and Yves Robert. (pen)-ultimate tiling? Integration, the VLSI Journal, 17:33–51, 1994.
D. Callahan. A Global Approach to Detection of Parallelism. PhD thesis, Dept. of Computer Science, Rice University, Houston, TX, 1987.
J.-F. Collard, D. Barthou, and P. Feautrier. Fuzzy Array Dataflow Analysis. In Proceedings of 5th ACM SIGPLAN Symp. on Principles and practice of Parallel Programming, Santa Barbara, CA, July 1995.
Jean-François Collard. Code generation in automatic parallelizers. In Claude Girault, editor, Proc. Int. Conf. on Application in Parallel and Distributed Computing. IFIP WG 10.3, pages 185–194. North Holland, April 1994.
Jean-François Collard, Paul Feautrier, and Tanguy Risset. Construction of DO loops from systems of affine constraints. Parallel Processing Letters, 5(3):421–436, September 1995.
Alain Darte, Leonid Khachiyan, and Yves Robert. Linear scheduling is nearly optimal. Parallel Processing Letters, 1(2):73–81, 1991.
Alain Darte and Yves Robert. Mapping uniform loop nests onto distributed memory architectures. Parallel Computing, 20:679–710, 1994.
Alain Darte and Yves Robert. Affine-by-statement scheduling of uniform and affine loop nests over parametric domains. J. Parallel and Distributed Computing, 29:43–59, 1995.
Alain Darte, Georges-André Silber, and Frédéric Vivien. Combining retiming and scheduling techniques for loop parallelization and loop tiling. Technical Report 96-34, LIP, ENS-Lyon, France, November 1996.
Alain Darte and Frédéric Vivien. Automatic parallelization based on multidimensional scheduling. Technical Report 94-24, LIP, ENS-Lyon, France, September 1994.
Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. In Proceedings of PACT’96, Boston, MA, October 1996. IEEE Computer Society Press.
Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Technical Report 96-06, LIP, ENS-Lyon, France, April 1996.
Alain Darte and Frédéric Vivien. On the optimality of Allen and Kennedy’s algorithm for parallelism extraction in nested loops. Journal of Parallel Algorithms and Applications, 1996. Special issue on Optimizing Compilers for Parallel Languages.
Paul Feautrier. Dataflow analysis of array and scalar references. Int. J. Parallel Programming, 20(1):23–51, 1991.
Paul Feautrier. Some efficient solutions to the affine scheduling problem, part I, one-dimensional time. Int. J. Parallel Programming, 21(5):313–348, October 1992.
Paul Feautrier. Some efficient solutions to the affine scheduling problem, part II, multi-dimensional time. Int. J. Parallel Programming, 21(6):389–420, December 1992.
F. Irigoin, P. Jouvelot, and R. Triolet. Semantical interprocedural parallelization: an overview of the PIPS project. In Proceedings of the 1991 ACM International Conference on Supercomputing, Cologne, Germany, June 1991.
F. Irigoin and R. Triolet. Computing dependence direction vectors and dependence cones with linear systems. Technical Report ENSMP-CAI-87-E94, Ecole des Mines de Paris, Fontainebleau (France), 1987.
F. Irigoin and R. Triolet. Supernode partitioning. In Proc. 15th Annual ACM Symp. Principles of Programming Languages, pages 319–329, San Diego, CA, January 1988.
R.M. Karp, R.E. Miller, and S. Winograd. The organization of computations for uniform recurrence equations. Journal of the ACM, 14(3):563–590, July 1967.
W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott. New user interface for Petit and other interfaces: user guide. University of Maryland, June 1995.
Leslie Lamport. The parallel execution of DO loops. Communications of the ACM, 17(2):83–93, February 1974.
Amy W. Lim and Monica S. Lam. Maximizing parallelism and minimizing synchronization with affine transforms. In Proceedings of the 24th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1997.
Wolfgang Meisl. Practical methods for scheduling and allocation in the polytope model. World Wide Web document, URL: http://brahms.fmi.uni-passau.de/cl/loopo/doc.
R. Schreiber and Jack J. Dongarra. Automatic blocking of nested loops. Technical Report 90-38, The University of Tennessee, Knoxville, TN, August 1990.
Alexander Schrijver. Theory of Linear and Integer Programming. John Wiley and Sons, New York, 1986.
Michael E. Wolf and Monica S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Trans. Parallel Distributed Systems, 2(4):452–471, October 1991.
M. Wolfe. Optimizing Supercompilers for Supercomputers. PhD thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign, October 1982.
Michael Wolfe. Optimizing Supercompilers for Supercomputers. MIT Press, Cambridge MA, 1989.
Michael Wolfe. TINY, a loop restructuring research tool. Oregon Graduate Institute of Science and Technology, December 1990.
Michael Wolfe. High Performance Compilers For Parallel Computing. Addison-Wesley Publishing Company, 1996.
Jingling Xue. Automatic non-unimodular transformations of loop nests. Parallel Computing, 20(5):711–728, May 1994.
Hans Zima and Barbara Chapman. Supercompilers for Parallel and Vector Computers. ACM Press, 1990.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this chapter
Darte, A., Robert, Y., Vivien, F. (2001). Loop Parallelization Algorithms. In: Pande, S., Agrawal, D.P. (eds) Compiler Optimizations for Scalable Parallel Systems. Lecture Notes in Computer Science, vol 1808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45403-9_5
Print ISBN: 978-3-540-41945-7
Online ISBN: 978-3-540-45403-8