Compiler and Run-Time Support for Improving Locality in Scientific Codes

Extended Abstract
  • Hwansoo Han
  • Gabriel Rivera
  • Chau-Wen Tseng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1863)


Modern microprocessors provide high performance by exploiting data locality with carefully designed multi-level caches. However, advanced scientific computations have features, such as adaptive irregular memory accesses and large data sets, that make it difficult to use caches effectively. Traditional program transformations are frequently inapplicable or insufficient. Exploiting locality for these applications requires both compile-time analyses and run-time systems that perform data layout and computation transformations. Run-time systems are needed because many programs cannot be analyzed statically, but compiler support is still crucial, both for inserting interfaces to the run-time system and for directly applying program transformations where possible. Cooperation between the compiler and the run-time system will be critical for advanced scientific codes. We investigate software support for improving locality in advanced scientific applications on both sequential and parallel machines, and we examine issues for both irregular adaptive codes and dense-matrix codes.
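As an illustration of what a run-time data layout transformation can look like, the following minimal sketch renumbers the nodes of an irregular edge list in the order in which a traversal first touches them, so that data accessed close together in time also sits close together in memory. This is a generic first-touch reordering chosen for illustration, not necessarily the method used by the authors; the function names (`first_touch_permutation`, `apply_reordering`, `edge_sums`) and the sample data are hypothetical.

```python
def first_touch_permutation(edges, num_nodes):
    """Return new_index, where new_index[old] is the new position of node `old`.

    Nodes are numbered in the order the edge traversal first touches them.
    """
    new_index = [-1] * num_nodes
    next_slot = 0
    for u, v in edges:
        for node in (u, v):
            if new_index[node] == -1:
                new_index[node] = next_slot
                next_slot += 1
    # Nodes never referenced by any edge are placed at the end.
    for node in range(num_nodes):
        if new_index[node] == -1:
            new_index[node] = next_slot
            next_slot += 1
    return new_index


def apply_reordering(values, edges, new_index):
    """Permute the node data and remap the edge list to the new numbering."""
    new_values = [None] * len(values)
    for old, new in enumerate(new_index):
        new_values[new] = values[old]
    new_edges = [(new_index[u], new_index[v]) for u, v in edges]
    return new_values, new_edges


def edge_sums(values, edges):
    """A stand-in irregular computation: one indirect access pair per edge."""
    return [values[u] + values[v] for u, v in edges]


# Hypothetical sample data: six nodes, four edges with scattered indices.
values = [10, 11, 12, 13, 14, 15]
edges = [(5, 0), (0, 3), (3, 5), (2, 4)]

new_index = first_touch_permutation(edges, len(values))
new_values, new_edges = apply_reordering(values, edges, new_index)

# The computation is unchanged; only the layout and the indices differ.
assert edge_sums(values, edges) == edge_sums(new_values, new_edges)
```

After reordering, the edge list refers to small, mostly consecutive indices, which is the property that tends to help cache behaviour in a compiled implementation. In an adaptive code the access pattern changes over time, so such a reordering would have to be repeated at run time, which is one reason a purely static compiler transformation is not enough.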


Keywords: Cache Line, Cache Performance, Memory Access Pattern, Outer Loop Iteration, Modern Microprocessor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Hwansoo Han (1)
  • Gabriel Rivera (1)
  • Chau-Wen Tseng (1)
  1. Department of Computer Science, University of Maryland, College Park, USA
