Parallel Matrix Transposition and Vector Multiplication Using OpenMP

  • Conference paper
Intelligent Technologies and Engineering Systems

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 234))

Abstract

In this chapter, we propose two parallel algorithms for sparse matrix-transpose-vector multiplication using the compressed sparse row (CSR) format: one that actually transposes the matrix and one that does not. Both algorithms are parallelized using OpenMP. Experiments are run on a quad-core Intel Xeon64 CPU E5507. We measure the performance of our algorithms and compare it with that of the compressed sparse blocks (CSB) scheme. Our experimental results show that the algorithm with actual matrix transposition is comparable to the CSB-based algorithm, whereas direct sparse matrix-transpose-vector multiplication using CSR significantly outperforms the CSB-based algorithm.
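The two approaches named in the abstract can be illustrated with a minimal C/OpenMP sketch. This is not the authors' implementation, which is not shown in this preview. The `csr_t` type, its field names, and the three function names are hypothetical. Using `#pragma omp atomic` to resolve write conflicts in the direct variant, and a serial counting pass for the explicit transpose, are assumptions made for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical CSR container: row_ptr has n_rows + 1 entries;
   col_idx and val each have row_ptr[n_rows] entries. */
typedef struct {
    int n_rows, n_cols;
    int *row_ptr;
    int *col_idx;
    double *val;
} csr_t;

/* Variant without transposition: y = A^T x read straight from A's CSR arrays. */
void csr_spmv_t_direct(const csr_t *a, const double *x, double *y)
{
    assert(a->row_ptr[0] == 0);
    for (int j = 0; j < a->n_cols; j++) y[j] = 0.0;

    #pragma omp parallel for schedule(static)
    for (int i = 0; i < a->n_rows; i++) {
        for (int k = a->row_ptr[i]; k < a->row_ptr[i + 1]; k++) {
            /* Different rows can touch the same column, so the update
               is made atomic (an assumption of this sketch). */
            #pragma omp atomic
            y[a->col_idx[k]] += a->val[k] * x[i];
        }
    }
}

/* Variant with transposition, step 1: build T = A^T in CSR (serial here).
   The caller allocates t->row_ptr (a->n_cols + 1 entries) and
   t->col_idx / t->val (nnz entries each). Returns 0 on success. */
int csr_transpose(const csr_t *a, csr_t *t)
{
    int nnz = a->row_ptr[a->n_rows];
    t->n_rows = a->n_cols;
    t->n_cols = a->n_rows;

    /* Count the entries of each column of A, shifted by one position. */
    for (int j = 0; j <= t->n_rows; j++) t->row_ptr[j] = 0;
    for (int k = 0; k < nnz; k++) t->row_ptr[a->col_idx[k] + 1]++;
    /* A prefix sum turns the counts into row offsets of T. */
    for (int j = 0; j < t->n_rows; j++) t->row_ptr[j + 1] += t->row_ptr[j];

    /* next[j] is the next free slot in row j of T. */
    int *next = malloc(sizeof(int) * (size_t)(t->n_rows > 0 ? t->n_rows : 1));
    if (next == NULL) return -1;
    for (int j = 0; j < t->n_rows; j++) next[j] = t->row_ptr[j];

    for (int i = 0; i < a->n_rows; i++) {
        for (int k = a->row_ptr[i]; k < a->row_ptr[i + 1]; k++) {
            int p = next[a->col_idx[k]]++;
            t->col_idx[p] = i;
            t->val[p] = a->val[k];
        }
    }
    free(next);
    return 0;
}

/* Variant with transposition, step 2: ordinary y = M x. Each thread owns
   whole rows of y, so no synchronization is needed. */
void csr_spmv(const csr_t *m, const double *x, double *y)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < m->n_rows; i++) {
        double sum = 0.0;
        for (int k = m->row_ptr[i]; k < m->row_ptr[i + 1]; k++)
            sum += m->val[k] * x[m->col_idx[k]];
        y[i] = sum;
    }
}
```

With GCC or Clang, compiling with `-fopenmp` enables the pragmas; without that flag they are ignored and the same code runs serially. The direct variant avoids building a second matrix but pays for synchronized writes to `y`, while the transposing variant pays for the transposition once and then runs a conflict-free row-parallel product.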

Acknowledgements

This chapter is based upon work supported in part by the Taiwan National Science Council (NSC) under grant no. NSC101-2221-E-126-002– and by Delta Electronics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSC or Delta Electronics.

Author information

Corresponding author

Correspondence to Tien-Hsiung Weng.

Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Weng, TH., Batjargal, D., Pham, H., Hsieh, MY., Li, KC. (2013). Parallel Matrix Transposition and Vector Multiplication Using OpenMP. In: Juang, J., Huang, YC. (eds) Intelligent Technologies and Engineering Systems. Lecture Notes in Electrical Engineering, vol 234. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6747-2_30

  • DOI: https://doi.org/10.1007/978-1-4614-6747-2_30

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6746-5

  • Online ISBN: 978-1-4614-6747-2

  • eBook Packages: Engineering, Engineering (R0)
