Modeling and improving locality for irregular problems: sparse matrix-vector product on cache memories as a case study
In this paper we introduce a model for representing and improving the locality of sparse matrices in irregular problems. We focus on the behavior of iterative methods for the solution of sparse linear systems with irregular patterns. In particular, the product of a sparse matrix by a dense vector (SpM×V) is closely examined, as it is one of the basic kernels in such codes. As a representative level of the memory hierarchy, we consider the cache memory. In our model, locality is measured over pairs of rows or columns of sparse matrices. To evaluate this locality, four functions based on two parameters, called entry matches and cache line matches, are introduced. Exploiting an analogy between this problem and the Traveling Salesman Problem, we apply two algorithms to solve it: one based on the construction of minimum spanning trees and the other on the nearest-neighbor heuristic. These techniques were tested on a set of sparse matrices, and the results were assessed by measuring cache misses on a standard cache memory.
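The SpM×V kernel studied in the abstract is usually written over a compressed sparse row (CSR) representation; the inner loop's indirect accesses to the dense vector are what make the problem's locality irregular. A minimal sketch, assuming CSR storage (the paper does not prescribe a specific format):

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix A stored in CSR format.

    Accesses to x follow the column indices of each row, so rows with
    similar column patterns reuse the same cache lines of x; this is
    the kind of locality the paper's model tries to improve.
    """
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        # Nonzeros of row i occupy values[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y
```

For example, the matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] is stored as `values = [1, 2, 3, 4, 5]`, `col_idx = [0, 2, 1, 0, 2]`, `row_ptr = [0, 2, 3, 5]`.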
Keywords: sparse matrix, temporal locality, spatial locality, memory hierarchy, cache memory, iterative methods, optimization problem
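The nearest-neighbor heuristic mentioned in the abstract can be sketched as a greedy row permutation: starting from an arbitrary row, repeatedly append the unvisited row that is "closest" to the last one chosen. Here the number of shared column indices stands in for the paper's entry-match parameter; this similarity measure and the function below are illustrative assumptions, not the paper's exact definitions:

```python
def nearest_neighbor_order(rows):
    """Greedy row permutation in the spirit of a nearest-neighbor TSP tour.

    `rows` is a list of sets of column indices, one set per matrix row.
    The count of shared column indices between two rows is used as the
    similarity (a stand-in for the paper's 'entry matches' parameter).
    """
    n = len(rows)
    order = [0]                    # start the tour at row 0 (arbitrary)
    unvisited = set(range(1, n))
    while unvisited:
        last = rows[order[-1]]
        # Pick the unvisited row sharing the most column indices with
        # the previously placed row.
        nxt = max(unvisited, key=lambda j: len(last & rows[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order
```

Consecutive rows in the returned order tend to touch the same cache lines of the dense vector, which is the effect the paper's locality functions are designed to reward.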