Abstract
Linear algebra codes contain data locality that can be exploited by tiling multiple loop nests. Several tiling approaches have been suggested for avoiding conflict misses in low-associativity caches. We propose a new technique based on intra-variable padding and compare its performance with existing techniques. Results show that padding improves the performance of matrix multiply by over 100% in some cases across a range of matrix sizes. Comparing the efficacy of different tiling algorithms, we find that rectangular tiles are slightly more efficient than square tiles. Overall, tiling improves performance by 0% to 250%. Copying tiles at run time also proves quite effective.
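To make the terms in the abstract concrete, the following is a minimal sketch of a tiled matrix multiply over arrays whose rows carry intra-variable padding. It is an illustration under stated assumptions, not the algorithm evaluated in the paper: the function names (`make_padded`, `matmul_naive`, `matmul_tiled`), the row-major layout, and the tile and pad sizes are all hypothetical. Python is used only to show the index arithmetic; interpreter overhead hides any cache effect, so the sketch says nothing about performance.

```python
def make_padded(rows, n, pad):
    """Store an n x n matrix row-major with `pad` unused elements per row.

    Intra-variable padding (as assumed here) enlarges the leading dimension
    so that rows of the same array map to different cache sets.
    """
    ld = n + pad  # leading dimension after padding
    flat = [0.0] * (n * ld)
    for i in range(n):
        for j in range(n):
            flat[i * ld + j] = rows[i][j]
    return flat, ld


def matmul_naive(a, b, n, ld):
    """Untiled reference: C = A * B on padded row-major storage."""
    c = [0.0] * (n * ld)
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i * ld + k] * b[k * ld + j]
            c[i * ld + j] = s
    return c


def matmul_tiled(a, b, n, ld, tile_h, tile_w):
    """Tile the k and j loops so a tile_h x tile_w block of B is reused for every i.

    tile_h != tile_w gives a rectangular tile; equal values give a square one.
    """
    c = [0.0] * (n * ld)
    for kk in range(0, n, tile_h):
        k_end = min(kk + tile_h, n)  # clip the last partial tile
        for jj in range(0, n, tile_w):
            j_end = min(jj + tile_w, n)
            for i in range(n):
                for k in range(kk, k_end):
                    a_ik = a[i * ld + k]
                    for j in range(jj, j_end):
                        c[i * ld + j] += a_ik * b[k * ld + j]
    return c
```

In a compiled language, the pad size and tile dimensions would be chosen from the cache size, line size, and associativity; here they are plain parameters so that square and rectangular tiles can be compared against the untiled reference for correctness.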
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rivera, G., Tseng, CW. (1999). A Comparison of Compiler Tiling Algorithms. In: Jähnichen, S. (eds) Compiler Construction. CC 1999. Lecture Notes in Computer Science, vol 1575. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49051-7_12
DOI: https://doi.org/10.1007/978-3-540-49051-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65717-0
Online ISBN: 978-3-540-49051-7
eBook Packages: Springer Book Archive