Parallel Matrix Transposition and Vector Multiplication Using OpenMP

Weng, Tien-Hsiung; Batjargal, Delgerdalai; Pham, Hoa; Hsieh, Meng-Yen; Li, Kuan-Ching

doi:10.1007/978-1-4614-6747-2_30

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 234)

Abstract

In this chapter, we propose two parallel algorithms for sparse matrix transposition and vector multiplication using the CSR (compressed sparse row) format: one with and one without actual matrix transposition. Both algorithms are parallelized using OpenMP. Experiments are run on a quad-core Intel Xeon64 CPU E5507. We measure the performance of our algorithms and compare it with that of the CSB (compressed sparse blocks) scheme. Our experimental results show that the algorithm with actual matrix transposition is comparable to the CSB-based algorithm, whereas direct sparse matrix-transpose-vector multiplication using CSR significantly outperforms the CSB-based algorithm.
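
To make the two approaches named in the abstract concrete, the following is a minimal serial sketch in Python, not the authors' implementation. It computes y = Aᵀx from the CSR arrays of A in two ways: by first building the CSR arrays of Aᵀ and then doing an ordinary row-wise product, and by scattering directly from the rows of A without transposing. All function and variable names (`csr_transpose`, `csr_matvec`, `csr_transpose_matvec`, `row_ptr`, `col_idx`, `vals`) are assumptions chosen for illustration, and the OpenMP parallelization described in the chapter is not shown.

```python
def csr_transpose(n_rows, n_cols, row_ptr, col_idx, vals):
    """Return the CSR arrays of A^T given the CSR arrays of A (hypothetical helper)."""
    nnz = row_ptr[n_rows]
    # Count the nonzeros in each column of A; these become row lengths of A^T.
    t_row_ptr = [0] * (n_cols + 1)
    for k in range(nnz):
        t_row_ptr[col_idx[k] + 1] += 1
    # Prefix sum turns the counts into row start offsets.
    for c in range(n_cols):
        t_row_ptr[c + 1] += t_row_ptr[c]
    next_slot = t_row_ptr[:-1]  # slicing copies: next free position per row of A^T
    t_col_idx = [0] * nnz
    t_vals = [0.0] * nnz
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            dst = next_slot[col_idx[k]]
            t_col_idx[dst] = i
            t_vals[dst] = vals[k]
            next_slot[col_idx[k]] += 1
    return t_row_ptr, t_col_idx, t_vals


def csr_matvec(n_rows, row_ptr, col_idx, vals, x):
    """Ordinary row-wise CSR product y = A x."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_transpose_matvec(n_rows, n_cols, row_ptr, col_idx, vals, x):
    """Direct y = A^T x: each nonzero A[i][j] adds A[i][j] * x[i] to y[j]."""
    y = [0.0] * n_cols
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += vals[k] * x[i]
    return y


# Example: A = [[1, 0, 2],
#               [0, 3, 0]]  (2 x 3), x has one entry per row of A.
row_ptr, col_idx, vals = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
x = [10.0, 100.0]

t_row_ptr, t_col_idx, t_vals = csr_transpose(2, 3, row_ptr, col_idx, vals)
y_via_transpose = csr_matvec(3, t_row_ptr, t_col_idx, t_vals, x)
y_direct = csr_transpose_matvec(2, 3, row_ptr, col_idx, vals, x)
```

In the direct version, different rows of A can update the same entry of y, so a loop parallelized over rows would need atomic updates or per-thread accumulators; the row-wise product on the transposed arrays writes each y[i] from a single iteration and has no such conflict. How the chapter itself resolves this is not stated in the abstract.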

Acknowledgements

This chapter is based upon work supported in part by the Taiwan National Science Council (NSC) under grant no. NSC101-2221-E-126-002– and by Delta Electronics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSC or Delta Electronics.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, Providence University, Taichung, 43301, Taiwan
Tien-Hsiung Weng, Delgerdalai Batjargal, Hoa Pham, Meng-Yen Hsieh & Kuan-Ching Li

Corresponding author

Correspondence to Tien-Hsiung Weng.

Editor information

Editors and Affiliations

School of Engineering, Mercer University, 151 Brookefield Drive, Macon, 31210, Georgia, USA
Jengnan Juang
National Changhua University of Education, No. 1, Jin De Road, Changhua City, 500, Taiwan R.O.C.
Yi-Cheng Huang

About this paper

Cite this paper

Weng, TH., Batjargal, D., Pham, H., Hsieh, MY., Li, KC. (2013). Parallel Matrix Transposition and Vector Multiplication Using OpenMP. In: Juang, J., Huang, YC. (eds) Intelligent Technologies and Engineering Systems. Lecture Notes in Electrical Engineering, vol 234. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6747-2_30

DOI: https://doi.org/10.1007/978-1-4614-6747-2_30
Published: 28 February 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6746-5
Online ISBN: 978-1-4614-6747-2
eBook Packages: Engineering, Engineering (R0)
