MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives

Graham, Richard L.; Shipman, Galen

doi:10.1007/978-3-540-87475-1_21


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5205)

Abstract

With local core counts on the rise, taking advantage of shared memory to optimize collective operations can improve performance. We study several on-host, shared-memory-optimized algorithms for MPI_Bcast, MPI_Reduce, and MPI_Allreduce, using tree-based and reduce-scatter algorithms. For small-data operations, where synchronization costs are relatively large, fan-in/fan-out algorithms generally perform best. For large messages, data manipulation constitutes the largest cost, and reduce-scatter algorithms are best for reductions. These optimizations improve performance by up to a factor of three. Memory and cache sharing effects require deliberate process layout and careful radix selection for tree-based methods.
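
The two algorithm families named in the abstract can be illustrated with a small single-process simulation. This is a minimal sketch, not the authors' implementation: plain Python lists stand in for per-process shared-memory buffers, there is no real MPI or synchronization, and the function names `fan_in_reduce` and `reduce_scatter_allreduce` are hypothetical. It only shows the data movement pattern of a radix-k fan-in tree and of a reduce-scatter followed by a gather.

```python
from typing import List, Sequence


def fan_in_reduce(values: Sequence[Sequence[int]], radix: int = 2) -> List[int]:
    """Simulated radix-k fan-in: parents sum their children's vectors level by level."""
    n = len(values)
    # Hypothetical stand-in for one shared-memory buffer per process.
    bufs = [list(v) for v in values]
    stride = 1
    while stride < n:
        # Ranks that are multiples of stride * radix act as parents at this level.
        for parent in range(0, n, stride * radix):
            for k in range(1, radix):
                child = parent + k * stride
                if child < n:
                    bufs[parent] = [a + b for a, b in zip(bufs[parent], bufs[child])]
        stride *= radix
    # Rank 0 is the root and now holds the full reduction.
    return bufs[0]


def reduce_scatter_allreduce(values: Sequence[Sequence[int]]) -> List[List[int]]:
    """Simulated reduce-scatter allreduce: each rank reduces one slice, then all gather."""
    n = len(values)
    length = len(values[0])
    # Rank r owns the contiguous slice [bounds[r], bounds[r + 1]).
    bounds = [r * length // n for r in range(n + 1)]
    result = [0] * length
    for r in range(n):
        # Reduce-scatter phase: rank r sums its slice across every rank's buffer.
        for i in range(bounds[r], bounds[r + 1]):
            result[i] = sum(v[i] for v in values)
    # Gather phase: every rank ends up with a copy of the complete vector.
    return [list(result) for _ in range(n)]


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    print(fan_in_reduce(data, radix=2))
    print(reduce_scatter_allreduce(data)[0])
```

In the fan-in version the root touches the whole vector at every tree level, so the number of levels (set by `radix`) governs synchronization steps. In the reduce-scatter version each rank reduces only about 1/n of the data, which is consistent with the abstract's observation that it suits large messages.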

Research sponsored by the Mathematical, Information, and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

Author information

Authors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, TN, USA
Richard L. Graham & Galen Shipman

Editor information

Alexey Lastovetsky, Tahar Kechadi, Jack Dongarra

Cite this paper

Graham, R.L., Shipman, G. (2008). MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives. In: Lastovetsky, A., Kechadi, T., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2008. Lecture Notes in Computer Science, vol 5205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87475-1_21

DOI: https://doi.org/10.1007/978-3-540-87475-1_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87474-4
Online ISBN: 978-3-540-87475-1
eBook Packages: Computer Science, Computer Science (R0)
