Compiler and Run-Time Support for Improving Locality in Scientific Codes
Modern microprocessors provide high performance by exploiting data locality with carefully designed multi-level caches. However, advanced scientific computations have features, such as adaptive irregular memory accesses and large data sets, that make it difficult to use caches effectively. Traditional program transformations are frequently inapplicable or insufficient. Exploiting locality for these applications requires compile-time analyses and run-time systems to perform data layout and computation transformations. Run-time systems are needed because many programs cannot be analyzed statically, but compiler support is still crucial both for inserting interfaces to the run-time system and for directly applying program transformations where possible. Cooperation between the compiler and the run-time system will be critical for advanced scientific codes. We investigate software support for improving locality in advanced scientific applications on both sequential and parallel machines, and we examine issues for both irregular adaptive and dense-matrix codes.
Keywords: Cache Line, Cache Performance, Memory Access Pattern, Outer Loop Iteration, Modern Microprocessor
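As a rough illustration of the kind of run-time data layout transformation the abstract refers to, the sketch below reorders a data array so that elements sit in memory in the order an irregular computation first touches them. This first-touch ordering is one common heuristic and is an assumption here, not necessarily the scheme studied in this work. All function and parameter names (`first_touch_order`, `apply_layout`, `remap_accesses`, `access`, `new_pos`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute a first-touch permutation.
 * access[] holds the data indices in the order the computation visits them.
 * On return, new_pos[old] is the position of element `old` in the new layout. */
void first_touch_order(const int *access, size_t n_access,
                       size_t n_data, int *new_pos)
{
    size_t next = 0;

    for (size_t i = 0; i < n_data; i++)
        new_pos[i] = -1;                      /* -1 marks "not placed yet" */

    for (size_t k = 0; k < n_access; k++) {
        int old = access[k];
        assert(old >= 0 && (size_t)old < n_data);
        if (new_pos[old] < 0)
            new_pos[old] = (int)next++;       /* place on first touch */
    }

    /* Elements that are never accessed keep their relative order at the end. */
    for (size_t i = 0; i < n_data; i++)
        if (new_pos[i] < 0)
            new_pos[i] = (int)next++;
}

/* Hypothetical helper: copy the data into the new layout. */
void apply_layout(const double *src, double *dst,
                  const int *new_pos, size_t n_data)
{
    for (size_t i = 0; i < n_data; i++)
        dst[new_pos[i]] = src[i];
}

/* Hypothetical helper: rewrite the index array so it refers to new positions. */
void remap_accesses(int *access, size_t n_access, const int *new_pos)
{
    for (size_t k = 0; k < n_access; k++)
        access[k] = new_pos[access[k]];
}
```

In a cooperating compiler and run-time system, calls like these would presumably be inserted by the compiler ahead of the irregular loop, and the reordering would be repeated only when the access pattern changes enough to pay back its cost.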