Array Unification: A Locality Optimization Technique

Kandemir, Mahmut Taylan

doi:10.1007/3-540-45306-7_18


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2027)

Included in the following conference series:

International Conference on Compiler Construction

Abstract

One of the key challenges facing computer architects and compiler writers is the increasing discrepancy between processor cycle times and main memory access times. To alleviate this problem for a class of array-dominated codes, compilers may employ either control-centric transformations that change data access patterns of nested loops or data-centric transformations that modify the memory layouts of multi-dimensional arrays. Most of the layout optimizations proposed so far either modify the layout of each array independently or are based on explicit data reorganizations at runtime.

This paper describes a compiler technique, called array unification, that automatically maps multiple arrays into a single data (array) space to improve data locality. We present a mathematical framework that enables us to systematically derive suitable mappings for a given program. The framework divides the arrays accessed by the program into several groups, and each group is transformed to improve spatial locality and reduce the number of conflict misses. Compared to previous approaches, the proposed technique operates on a larger scope and also applies independent layout transformations where necessary. Preliminary results on two benchmark codes show significant improvements in cache miss rates and execution time.
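To make the idea of mapping several arrays into one array space concrete, the following is a minimal sketch, not the paper's framework. It assumes the simplest possible mapping: two equally sized one-dimensional arrays are interleaved so that elements accessed together become neighbours. The names `unify`, `dot_separate`, and `dot_unified` are hypothetical, and Python lists are used only to show the index mapping; any cache benefit would appear in a language with contiguous array storage.

```python
# Illustrative sketch only: interleaving is an assumed example mapping,
# not the mapping derived by the paper's mathematical framework.

def unify(a, b):
    """Map a[i] -> u[2*i] and b[i] -> u[2*i + 1] in one unified array."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes arrays of equal length")
    u = [0] * (2 * len(a))
    u[0::2] = a  # even slots hold the elements of a
    u[1::2] = b  # odd slots hold the elements of b
    return u


def dot_separate(a, b):
    """Original access pattern: each iteration touches two separate arrays."""
    return sum(a[i] * b[i] for i in range(len(a)))


def dot_unified(u):
    """Rewritten access pattern: each iteration touches two adjacent slots."""
    return sum(u[2 * i] * u[2 * i + 1] for i in range(len(u) // 2))


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
u = unify(a, b)
```

Both loops compute the same result; the difference is that the unified version reads `a[i]` and `b[i]` from adjacent locations, which is the kind of spatial locality the abstract refers to. With a single array there are also no separate base addresses that could map to the same cache set.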

Author information

Authors and Affiliations

Computer Science and Engineering Department, The Pennsylvania State University, University Park, PA, 16802-6106, USA
Mahmut Taylan Kandemir

Editor information

Editors and Affiliations

Fachrichtung Informatik, Universität des Saarlandes, Postfach 15 11 50, 66041, Saarbrücken, Germany
Reinhard Wilhelm

About this paper

Cite this paper

Kandemir, M.T. (2001). Array Unification: A Locality Optimization Technique. In: Wilhelm, R. (eds) Compiler Construction. CC 2001. Lecture Notes in Computer Science, vol 2027. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45306-7_18

DOI: https://doi.org/10.1007/3-540-45306-7_18
Published: 23 March 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41861-0
Online ISBN: 978-3-540-45306-2
eBook Packages: Springer Book Archive
