Simultaneous Minimization of Capacity and Conflict Misses

  • Short Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Loop tiling (or loop blocking) is a well-known loop transformation to improve temporal locality in nested loops which perform matrix computations. When targeting caches that have low associativities, one of the key challenges for loop tiling is to simultaneously minimize capacity misses and conflict misses. This paper analyzes the effect of the tile size and the array-dimension size on capacity misses and conflict misses. The analysis supports the approach of combining tile-size selection (to minimize capacity misses) with array padding (to minimize conflict misses).
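As a minimal sketch of the two ideas named in the abstract (not the paper's own algorithm), the Python below tiles a matrix multiplication and then counts how many cache lines of a single tile collide in a direct-mapped cache, with and without padding of the array's leading dimension. The function names, the row-major flat layout, and the cache parameters (8 KB direct-mapped, 32-byte lines, 8-byte elements) are all assumptions chosen for illustration.

```python
def matmul_naive(a, b, n):
    """Reference n x n multiply on row-major flat lists."""
    c = [0.0] * (n * n)
    for i in range(n):
        for k in range(n):
            aik = a[i * n + k]
            for j in range(n):
                c[i * n + j] += aik * b[k * n + j]
    return c


def matmul_tiled(a, b, n, tile):
    """Same computation, but iterating over tile x tile blocks so that
    each block of b is reused while it is still cache-resident."""
    c = [0.0] * (n * n)
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i * n + k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i * n + j] += aik * b[k * n + j]
    return c


def self_conflicts(ld, tile, elem_bytes, line_bytes, num_sets):
    """Hypothetical helper: for a tile x tile block at the array origin,
    return (distinct cache lines) - (distinct cache sets) in a
    direct-mapped cache. `ld` is the leading dimension in elements.
    A positive result means some lines of the tile evict each other."""
    lines = set()
    sets = set()
    for i in range(tile):
        for j in range(tile):
            line = ((i * ld + j) * elem_bytes) // line_bytes
            lines.add(line)
            sets.add(line % num_sets)
    return len(lines) - len(sets)


if __name__ == "__main__":
    # Assumed cache: 8 KB direct-mapped, 32-byte lines -> 256 sets.
    print(self_conflicts(1024, 16, 8, 32, 256))  # unpadded leading dimension
    print(self_conflicts(1040, 16, 8, 32, 256))  # padded by 16 elements
```

Under these assumed parameters, a row of 1024 eight-byte elements spans exactly the cache size, so every row of the tile maps to the same four sets; padding the leading dimension to 1040 shifts each row by four lines and the 16 x 16 tile no longer interferes with itself. The tile size is chosen so the working set fits in the cache (capacity), while the padding changes where the rows land (conflict).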

Author information

Corresponding author

Correspondence to Zhiyuan Li.

Additional information

This work is sponsored in part by the National Science Foundation of the USA under Grant Nos. ST-HEC-0444285, CCR-950254, ACI/ITR-0082834 and CCR-9975309, by the Indiana 21st Century Fund, and by a donation from Sun Microsystems, Inc.

About this article

Cite this article

Li, Z. Simultaneous Minimization of Capacity and Conflict Misses. J Comput Sci Technol 22, 497–504 (2007). https://doi.org/10.1007/s11390-007-9069-8

  • DOI: https://doi.org/10.1007/s11390-007-9069-8
