Abstract
Naive code generation from high-level languages that encourage modularity can produce large numbers of simple loops in array-based programs. Collective loop fusion and array contraction can be applied to such code to improve temporal locality and performance. The problem is typically formalised using a loop dependence graph (LDG), with solutions expressed as fusion partitions. Much previous work has concentrated on this abstract formulation. We present a technique called iterative collective loop fusion, which is based on empirically evaluating different transformations, and show that it can provide speedups of up to 1.38 over existing approaches. We also give results showing that applying such techniques to high-level languages can provide speedups of up to 2.45 over the original code, and that the resulting code outperforms an equivalent code in Fortran.
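As a rough illustration of the transformations named in the abstract, the following sketch shows two simple loops that communicate through a temporary array, and a hand-fused version in which that array is contracted to a scalar. It also times both variants and picks the faster one, in the spirit of evaluating transformations empirically. This is not the authors' implementation: the function names (`unfused`, `fused_contracted`, `pick_fastest`) and the toy computation are assumptions made for illustration, and the real technique searches over fusion partitions of an LDG rather than over two fixed variants.

```python
import timeit


def unfused(a, b):
    """Two simple loops that communicate through a temporary array t."""
    n = len(a)
    t = [0.0] * n
    for i in range(n):  # loop 1: produce t
        t[i] = a[i] + b[i]
    c = [0.0] * n
    for i in range(n):  # loop 2: consume t
        c[i] = 2.0 * t[i]
    return c


def fused_contracted(a, b):
    """The same computation with both loops fused into one.

    Each t[i] is produced and consumed in the same iteration, so the
    temporary array can be contracted to a scalar.
    """
    n = len(a)
    c = [0.0] * n
    for i in range(n):
        t = a[i] + b[i]  # scalar replaces the array t
        c[i] = 2.0 * t
    return c


def pick_fastest(variants, a, b, repeats=3):
    """Time each candidate (hypothetical helper) and return the fastest name."""
    timings = {}
    for name, fn in variants.items():
        # Bind fn as a default argument so each lambda calls its own variant.
        runs = timeit.repeat(lambda fn=fn: fn(a, b), number=1, repeat=repeats)
        timings[name] = min(runs)
    best = min(timings, key=timings.get)
    return best, timings
```

Both variants must return the same result for the transformation to be legal; which one is faster depends on the machine and the problem size, which is why the choice is measured rather than predicted here.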
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ashby, T.J., O’Boyle, M.F.P. (2006). Iterative Collective Loop Fusion. In: Mycroft, A., Zeller, A. (eds) Compiler Construction. CC 2006. Lecture Notes in Computer Science, vol 3923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11688839_17
DOI: https://doi.org/10.1007/11688839_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33050-9
Online ISBN: 978-3-540-33051-6
eBook Packages: Computer Science (R0)