Iteration Space Slicing for Locality
Improving data locality in programs that manipulate arrays has been the subject of a great deal of research. Much of this work examines individual loop nests; other work includes transformations such as loop fusion, which combines loops so that multiple loop nests can be transformed as a single loop nest. We propose a data-driven method to optimize locality across multiple loop nests. Our technique achieves loop-fusion-like results even when ordinary loop fusion would be illegal without enabling transformations. Given an array whose locality should be optimized, it also finds other calculations that can profitably be executed together with the computation of that array.
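To make the fusion problem concrete, the following minimal sketch shows two loops for which naive fusion is illegal, and a hand-aligned version that still interleaves the two computations. It is an illustration written for this summary, not the output of the technique proposed here; the function names and the arrays `a`, `b`, and `c` are hypothetical.

```python
def separate_loops(b):
    """Reference version: two loops run one after the other."""
    n = len(b)
    a = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):          # loop 1 produces every element of a
        a[i] = 2 * b[i]
    for i in range(n - 1):      # loop 2 reads a[i] and a[i + 1]
        c[i] = a[i] + a[i + 1]
    return a, c


def naive_fusion(b):
    """Illegal fusion: c[i] reads a[i + 1] before loop 1 has written it."""
    n = len(b)
    a = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):
        a[i] = 2 * b[i]
        if i < n - 1:
            c[i] = a[i] + a[i + 1]   # a[i + 1] still holds its initial 0
    return a, c


def aligned_fusion(b):
    """Fusion-like schedule: each c element runs right after the last
    a element it depends on has been computed (assumed alignment by one)."""
    n = len(b)
    a = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):
        a[i] = 2 * b[i]
        if i >= 1:
            c[i - 1] = a[i - 1] + a[i]   # both operands were just written
    return a, c
```

In the aligned version each element of `a` is consumed within one iteration of being produced, which is the kind of reuse that fusion aims for, while the dependence from loop 1 to loop 2 is still respected.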
Keywords: Transitive Closure, Array Element, Iteration Space, Loop Nest, Innermost Loop