A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1656)

Abstract

We present a cache locality optimization technique that can optimize a loop nest even if the arrays referenced have different layouts in memory. Such a capability is required for a global locality optimization framework that applies both loop and data transformations to a sequence of loop nests for optimizing locality. Our method finds a non-singular iteration-space transformation matrix such that in a given loop nest spatial locality is exploited in the innermost loops, where it is most useful. The method builds the inverse of a non-singular transformation matrix column by column, starting from the rightmost column. In addition, our approach can work in those cases where the data layouts of a subset of the referenced arrays are unknown. Experimental results on an 8-processor SGI Origin 2000 show that our technique reduces execution times by up to 72%.
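
The idea of choosing the rightmost column of the inverse transformation matrix first can be illustrated with a small sketch. It is not the authors' algorithm: it assumes only row-major and column-major layouts, treats the rightmost column as the iteration-space step `q` taken by the innermost loop, and uses a brute-force search over small integer steps. The function names and the `None` marker for an unknown layout are hypothetical.

```python
from itertools import product

def step_in_data_space(access, q):
    """Multiply the access matrix by the iteration-space step q."""
    return [sum(a * x for a, x in zip(row, q)) for row in access]

def has_spatial_locality(access, layout, q):
    """True if stepping by q moves only along the fastest-varying dimension
    (last dimension for row-major, first for column-major; an assumption)."""
    d = step_in_data_space(access, q)
    fastest = len(d) - 1 if layout == "row-major" else 0
    return all(v == 0 for k, v in enumerate(d) if k != fastest)

def find_innermost_direction(refs, depth):
    """Brute-force search for a nonzero step q with entries in {-1, 0, 1}
    that suits every reference whose layout is known (layout is not None)."""
    for q in product((0, 1, -1), repeat=depth):
        if not any(q):
            continue  # the zero vector is not a valid column
        if all(has_spatial_locality(a, lay, q)
               for a, lay in refs if lay is not None):
            return list(q)
    return None  # no small step satisfies all known layouts

# Nest over (i, j) referencing U[j][i] (row-major) and V[i][j] (column-major).
refs = [([[0, 1], [1, 0]], "row-major"),
        ([[1, 0], [0, 1]], "column-major")]
q = find_innermost_direction(refs, 2)
```

Here `q` selects a step along `i`, which suggests making `i` the innermost loop even though the two arrays have different layouts. A full method would still have to complete the remaining columns so that the matrix is non-singular and respects data dependences, which this sketch does not attempt.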

M. Kandemir and A. Choudhary were supported by NSF Young Investigator Award CCR-9357840, NSF grant CCR-9509143 and Air Force contract F30602-97-C-0026. J. Ramanujam was supported by NSF Young Investigator Award CCR-9457768. P. Banerjee was supported by NSF grant CCR-9526325 and by DARPA contract DABT-63-97-C-0035.


Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kandemir, M., Ramanujam, J., Choudhary, A., Banerjee, P. (1999). A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality. In: Chatterjee, S., et al. Languages and Compilers for Parallel Computing. LCPC 1998. Lecture Notes in Computer Science, vol 1656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48319-5_3

  • DOI: https://doi.org/10.1007/3-540-48319-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66426-0

  • Online ISBN: 978-3-540-48319-9

  • eBook Packages: Springer Book Archive
