Abstract
Loop unwinding is a known technique for reducing loop overhead, exposing parallelism, and increasing the efficiency of pipelining. Traditional loop unwinding is limited to the innermost loop in a group of nested loops, and the amount of unwinding is either fixed or has to be specified by the user on a case-by-case basis. In this paper we present a general technique, Loop Quantization, for automatically unwinding multiply nested loops, explain its advantages over other transformation techniques, and illustrate its practical effectiveness. Loop Quantization can be beneficial by itself or when coupled with other loop transformations (e.g., Do-across).
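To make the contrast concrete, the following is a minimal sketch in C of unwinding only the innermost loop versus unwinding both loops of a nest. It is an illustration only and is not the quantization algorithm of the paper: the matrix-vector kernel, the function names, the size `N`, and the fixed unwinding factor of 2 are all assumptions chosen for brevity, and the loop nest carries no cross-iteration dependences that would constrain the unwinding.

```c
#include <assert.h>
#include <string.h>

#define N 6 /* hypothetical problem size; even, so no cleanup loops are needed */

/* Reference: plain doubly nested loop computing y = A * x. */
void matvec_rolled(int a[N][N], const int x[N], int y[N]) {
    for (int i = 0; i < N; i++) {
        y[i] = 0;
        for (int j = 0; j < N; j++)
            y[i] += a[i][j] * x[j];
    }
}

/* Traditional unwinding: only the innermost loop, by a fixed factor of 2. */
void matvec_inner_unwound(int a[N][N], const int x[N], int y[N]) {
    assert(N % 2 == 0); /* assumption: trip count divisible by the factor */
    for (int i = 0; i < N; i++) {
        y[i] = 0;
        for (int j = 0; j < N; j += 2) {
            y[i] += a[i][j]     * x[j];
            y[i] += a[i][j + 1] * x[j + 1];
        }
    }
}

/* Both loops unwound by 2: each body covers a 2x2 block of iterations,
 * exposing four independent multiply-adds instead of two. */
void matvec_both_unwound(int a[N][N], const int x[N], int y[N]) {
    assert(N % 2 == 0); /* assumption: trip counts divisible by the factors */
    for (int i = 0; i < N; i += 2) {
        y[i] = 0;
        y[i + 1] = 0;
        for (int j = 0; j < N; j += 2) {
            y[i]     += a[i][j]         * x[j];
            y[i]     += a[i][j + 1]     * x[j + 1];
            y[i + 1] += a[i + 1][j]     * x[j];
            y[i + 1] += a[i + 1][j + 1] * x[j + 1];
        }
    }
}

/* Returns 1 if all three variants produce the same result on a small input. */
int variants_agree(void) {
    int a[N][N], x[N], y0[N], y1[N], y2[N];
    for (int i = 0; i < N; i++) {
        x[i] = i + 1;
        for (int j = 0; j < N; j++)
            a[i][j] = i * N + j;
    }
    matvec_rolled(a, x, y0);
    matvec_inner_unwound(a, x, y1);
    matvec_both_unwound(a, x, y2);
    return memcmp(y0, y1, sizeof y0) == 0 && memcmp(y0, y2, sizeof y0) == 0;
}
```

In this sketch the unwinding factors are hard-coded, which is exactly the limitation the abstract attributes to traditional unwinding; an automatic technique would have to choose the shape of the unwound block from the dependence structure of the loop nest.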
This work is supported in part by NSF grant DCR 8502884, ONR grant N00014-86-K-0215, and the Cornell NSF Supercomputing Center.
Copyright information
© 1988 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nicolau, A. (1988). Loop quantization or unwinding done right. In: Houstis, E.N., Papatheodorou, T.S., Polychronopoulos, C.D. (eds) Supercomputing. ICS 1987. Lecture Notes in Computer Science, vol 297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-18991-2_17
DOI: https://doi.org/10.1007/3-540-18991-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-18991-6
Online ISBN: 978-3-540-38888-3
eBook Packages: Springer Book Archive