Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7905)

Abstract

The increasing number of processing elements and the decreasing memory-to-core ratio in modern high-performance platforms make efficient strong scaling a key requirement for numerical algorithms. To scale efficiently on massively parallel systems, scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed by modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open-source CFD application code that uses PETSc as its linear solver engine, we evaluate the effect of explicit communication overlap using task-based parallelism and show how to further improve performance by explicitly load balancing threads within MPI processes. We demonstrate a significant speedup over the pure-MPI mode and efficient strong scaling of sparse matrix-vector multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems.
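
The idea of load balancing threads within a process can be illustrated with a small sketch. It is not taken from the paper or from PETSc: it assumes a matrix stored in CSR format and uses hypothetical helper names (`balance_rows_by_nnz`, `spmv_balanced`). Rows are split into contiguous chunks with roughly equal numbers of nonzeros rather than equal numbers of rows, which is one common way to even out per-thread work; the chunks are processed sequentially here, whereas in a hybrid code each chunk would belong to one OpenMP thread.

```python
from bisect import bisect_left


def balance_rows_by_nnz(row_ptr, n_threads):
    """Return n_threads + 1 row boundaries giving chunks of similar nonzero count.

    row_ptr is the CSR row-pointer array (length n_rows + 1), so
    row_ptr[i] is the number of nonzeros stored before row i.
    """
    n_rows = len(row_ptr) - 1
    total_nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, n_threads):
        target = total_nnz * t / n_threads
        # First row boundary whose cumulative nonzero count reaches the target.
        cut = bisect_left(row_ptr, target, lo=bounds[-1], hi=n_rows)
        bounds.append(cut)
    bounds.append(n_rows)
    return bounds


def spmv_balanced(row_ptr, col_idx, vals, x, n_threads):
    """Compute y = A @ x for a CSR matrix, one nonzero-balanced chunk at a time."""
    y = [0.0] * (len(row_ptr) - 1)
    bounds = balance_rows_by_nnz(row_ptr, n_threads)
    for t in range(n_threads):  # stands in for the per-thread loop
        for i in range(bounds[t], bounds[t + 1]):
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += vals[k] * x[col_idx[k]]
            y[i] = acc
    return y


# One dense row followed by five sparse rows: 6, 1, 1, 1, 1 and 2 nonzeros.
print(balance_rows_by_nnz([0, 6, 7, 8, 9, 10, 12], 2))  # -> [0, 1, 6]
```

With two threads, an equal split by rows (three rows each) would assign 8 and 4 nonzeros, whereas the nonzero-based split assigns 6 to each.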

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lange, M., Gorman, G., Weiland, M., Mitchell, L., Southern, J. (2013). Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2013. Lecture Notes in Computer Science, vol 7905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38750-0_8

  • DOI: https://doi.org/10.1007/978-3-642-38750-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38749-4

  • Online ISBN: 978-3-642-38750-0

  • eBook Packages: Computer Science, Computer Science (R0)
