Abstract
In this paper, we extend software pipelining techniques to the scheduling of nested loops. Under this framework, a periodic scheduling function, called an r-periodic schedule, is associated with each operation of the loop body over the entire iteration space. We present a simple problem formulation as well as efficient solutions that give provably asymptotically time-optimal schedules for nested loops under our program model. The framework also provides some useful insights into the interplay between unimodular loop transformations and fine-grain scheduling within a unified setting.
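The abstract does not define the r-periodic schedule, so the following is only a minimal sketch of the general idea of a periodic scheduling function over a nested iteration space, not the paper's formulation. It assumes an affine form in which operation `op` of iteration vector `it` starts at `period · it + offset[op]`, and it checks the usual dependence constraint for such schedules. All names (`start_time`, `is_valid`, `period`, `offset`, the edge tuple layout) are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical types for illustration; none of these come from the paper.
Vector = Tuple[int, ...]
# A dependence edge: (source op, sink op, iteration distance vector, latency of source)
Edge = Tuple[str, str, Vector, int]


def start_time(period: Vector, offset: Dict[str, int], op: str, it: Vector) -> int:
    """Start time of `op` in iteration `it` under the assumed affine periodic form."""
    return sum(p * i for p, i in zip(period, it)) + offset[op]


def is_valid(period: Vector, offset: Dict[str, int], edges: List[Edge]) -> bool:
    """Return True if every dependence is respected.

    For an edge src -> dst with distance d and latency l, the sink instance in
    iteration I + d must start at least l cycles after the source in iteration I.
    Under the affine form this reduces to: period . d + offset[dst] - offset[src] >= l.
    """
    for src, dst, dist, latency in edges:
        slack = sum(p * d for p, d in zip(period, dist)) + offset[dst] - offset[src]
        if slack < latency:
            return False
    return True


# Toy doubly nested loop body with two operations, "a" and "b" (made-up example):
# "a" feeds "b" in the same iteration; "b" feeds "a" one inner-loop iteration later.
edges = [("a", "b", (0, 0), 1), ("b", "a", (0, 1), 1)]
offset = {"a": 0, "b": 1}
```

With this toy graph, the cycle a -> b -> a has total latency 2 and total distance (0, 1), so the inner-loop component of `period` must be at least 2: `is_valid((10, 2), offset, edges)` holds while `is_valid((10, 1), offset, edges)` does not.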
This work was supported by the EPPP project, financed by Industry Science Canada, Alex Parallel Computers, Digital Equipment Canada, IBM Canada and CRIM (Centre de recherche informatique de Montréal).
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gao, G.R., Ning, Q., van Dongen, V. (1994). Extending software pipelining techniques for scheduling nested loops. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1993. Lecture Notes in Computer Science, vol 768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57659-2_20
DOI: https://doi.org/10.1007/3-540-57659-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57659-4
Online ISBN: 978-3-540-48308-3
eBook Packages: Springer Book Archive