CRSD: Application Specific Auto-tuning of SpMV for Diagonal Sparse Matrices

Sun, Xiangzheng; Zhang, Yunquan; Wang, Ting; Long, Guoping; Zhang, Xianyi; Li, Yan

doi:10.1007/978-3-642-23397-5_32

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6853)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Sparse matrix-vector multiplication (SpMV) is an important computational kernel in scientific applications. Its performance depends strongly on the nonzero distribution of the sparse matrix. In this paper, we propose a new storage format for diagonal sparse matrices, called Compressed Row Segment with Diagonal-pattern (CRSD). We design diagonal patterns to represent the diagonal distribution. Because the diagonal distributions are similar across matrices from one application, some diagonal patterns remain unchanged. First, we sample one matrix to obtain the unchanged diagonal patterns. Next, the optimal SpMV codelets are generated automatically for those diagonal patterns. Finally, we combine the generated codelets into the optimal SpMV implementation. In addition, the information collected during the auto-tuning process is used in the parallel implementation to achieve load balance. Experimental results show that, with the same number of threads on two mainstream multi-core platforms, the speedup reaches up to 2.37 (1.70 on average) over DIA and up to 4.60 (2.10 on average) over CSR.
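For readers unfamiliar with the two baseline formats named in the abstract, the following is a minimal sketch of SpMV in CSR and in DIA, written in plain Python. It is not the CRSD format or the authors' generated codelets, whose details are not given here. The function names, the row-indexed DIA layout (`diags[d][i]` holds `A[i][i + offsets[d]]`), and the helper `diagonal_offsets_per_segment` are all assumptions made for illustration; the helper only lists which diagonal offsets are occupied in each block of rows, which is one plausible way to look at a "diagonal distribution".

```python
# Baseline SpMV kernels (CSR and DIA) plus a hypothetical helper that
# summarises which diagonals are occupied in each block of rows.
# None of this is the paper's CRSD implementation.

def spmv_csr(row_ptr, col_idx, values, x):
    """y = A @ x with A in CSR: row i owns values[row_ptr[i]:row_ptr[i+1]]."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_dia(offsets, diags, n_cols, x):
    """y = A @ x with A in DIA (assumed layout: diags[d][i] == A[i][i + offsets[d]]).

    Slots that fall outside the matrix are padding and are skipped.
    """
    n_rows = len(diags[0]) if diags else 0
    y = [0.0] * n_rows
    for off, diag in zip(offsets, diags):
        lo = max(0, -off)               # first row where column i + off >= 0
        hi = min(n_rows, n_cols - off)  # one past the last row where i + off < n_cols
        for i in range(lo, hi):
            y[i] += diag[i] * x[i + off]
    return y


def diagonal_offsets_per_segment(row_ptr, col_idx, segment_rows):
    """Hypothetical helper: sorted diagonal offsets (j - i) seen in each row segment."""
    n_rows = len(row_ptr) - 1
    patterns = []
    for start in range(0, n_rows, segment_rows):
        seen = set()
        for i in range(start, min(start + segment_rows, n_rows)):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                seen.add(col_idx[k] - i)
        patterns.append(sorted(seen))
    return patterns


# 4x4 tridiagonal example: 2 on the main diagonal, -1 on both neighbours.
row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
offsets = [-1, 0, 1]
diags = [
    [0.0, -1.0, -1.0, -1.0],  # sub-diagonal, row 0 is padding
    [2.0, 2.0, 2.0, 2.0],     # main diagonal
    [-1.0, -1.0, -1.0, 0.0],  # super-diagonal, row 3 is padding
]
x = [1.0, 2.0, 3.0, 4.0]

y_csr = spmv_csr(row_ptr, col_idx, values, x)
y_dia = spmv_dia(offsets, diags, 4, x)
patterns = diagonal_offsets_per_segment(row_ptr, col_idx, 2)
```

In this sketch CSR pays for one column-index load per nonzero, while DIA avoids index loads but stores and skips padding on partially filled diagonals. That trade-off is the usual reason a format tailored to diagonal structure can outperform both.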

This paper is supported by the National 863 Plan of China (No. 2006AA01A125, No. 2009AA01A129, No. 2009AA01A134), the China HGJ Significant Project (No. 2009ZX01036-001-002), the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KGCX1-YW-13), and the Ministry of Finance (No. ZDYZ2008-2).

Author information

Authors and Affiliations

Lab. of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, China
Xiangzheng Sun, Yunquan Zhang, Ting Wang, Guoping Long, Xianyi Zhang & Yan Li

Editor information

Editors and Affiliations

Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France
Emmanuel Jeannot & Raymond Namyst
Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France
Jean Roman

About this paper

Cite this paper

Sun, X., Zhang, Y., Wang, T., Long, G., Zhang, X., Li, Y. (2011). CRSD: Application Specific Auto-tuning of SpMV for Diagonal Sparse Matrices. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6853. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23397-5_32

DOI: https://doi.org/10.1007/978-3-642-23397-5_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23396-8
Online ISBN: 978-3-642-23397-5
eBook Packages: Computer Science, Computer Science (R0)
