An Evaluation of OpenSHMEM Interfaces for the Variable-Length Alltoallv() Collective Operation

Lopez, M. Graham; Shamis, Pavel; Gorentla Venkata, Manjunath

doi:10.1007/978-3-319-26428-8_6


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9397)

Included in the following conference series:

Workshop on OpenSHMEM and Related Technologies


Abstract

Alltoallv() is a collective operation that allows every process to exchange variable amounts of data with every other process in the communication group. As a result, Alltoallv() requires not only \(O(N^2)\) communications, but typically also an additional exchange of the data lengths that will be transmitted in the eventual Alltoallv() call. This pre-exchange is used to calculate the proper offsets into the receive buffers on the target processes. We propose two new candidate interfaces for Alltoallv() that relieve the user of setting up this extra exchange of information, at the possible cost of memory efficiency. We explain the new interface variants and show how a single call can be used in place of the traditional Alltoall()/Alltoallv() pair. We then discuss the performance tradeoffs in overall communication and memory costs, as well as software- and hardware-based optimizations and their applicability to the proposed interfaces.
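The pre-exchange described in the abstract can be illustrated with a small single-process sketch. It is not taken from the paper and uses no OpenSHMEM or MPI calls: `exchange_counts` is a hypothetical in-process stand-in for the fixed-size Alltoall() of lengths, and `recv_displs_from_counts` is a hypothetical helper that turns the received lengths into receive-buffer offsets with an exclusive prefix sum. The row-major count matrix layout is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Alltoall() of lengths. send_counts is a
 * row-major npes x npes matrix whose entry [src][dst] is the number of
 * elements src will send to dst. recv_counts receives the transpose, so
 * row dst lists what dst will receive from each src. */
void exchange_counts(const size_t *send_counts, size_t *recv_counts, size_t npes)
{
    assert(send_counts != NULL && recv_counts != NULL);
    for (size_t src = 0; src < npes; ++src) {
        for (size_t dst = 0; dst < npes; ++dst) {
            recv_counts[dst * npes + src] = send_counts[src * npes + dst];
        }
    }
}

/* Hypothetical helper: given the counts one PE will receive from each
 * sender, fill displs with the starting offset of each sender's block in
 * a contiguous receive buffer (exclusive prefix sum). Returns the total
 * number of elements that the receive buffer must hold. */
size_t recv_displs_from_counts(const size_t *recv_counts, size_t *displs, size_t npes)
{
    assert(recv_counts != NULL && displs != NULL);
    size_t total = 0;
    for (size_t pe = 0; pe < npes; ++pe) {
        displs[pe] = total;
        total += recv_counts[pe];
    }
    return total;
}
```

In a real program each PE would hold only its own row of counts, and the transpose would be carried out by a collective call; only after that step can the receive offsets and buffer size be computed, which is the extra round the proposed interfaces aim to avoid.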

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).


Notes

1. If the null character is unsuitable, the flag parameter can typically be chosen to be the maximum value of the corresponding datatype, e.g. LONG_MAX for long int, since there will typically be overflow issues with a data set if its domain includes this value.
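The note above does not spell out how the flag parameter is used, so the following is only one plausible reading, sketched under stated assumptions: each sender writes its payload into a fixed-capacity slot and appends the flag value, and the receiver recovers the length by scanning for the flag. The names `slot_fill` and `slot_length`, the slot layout, and the return conventions are all hypothetical and are not the interface proposed in the paper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical: copy n payload values into a slot of the given capacity
 * and terminate them with the flag value. Returns 0 on success, or -1 if
 * the payload plus terminator does not fit or if the payload itself
 * contains the flag value (which would make the length ambiguous). */
int slot_fill(long *slot, size_t capacity, const long *data, size_t n, long flag)
{
    assert(slot != NULL && data != NULL);
    if (n >= capacity) {
        return -1;
    }
    for (size_t i = 0; i < n; ++i) {
        if (data[i] == flag) {
            return -1;
        }
    }
    for (size_t i = 0; i < n; ++i) {
        slot[i] = data[i];
    }
    slot[n] = flag;
    return 0;
}

/* Hypothetical: count the payload values that precede the flag. If no
 * flag is found, the full capacity is returned. */
size_t slot_length(const long *slot, size_t capacity, long flag)
{
    assert(slot != NULL);
    size_t n = 0;
    while (n < capacity && slot[n] != flag) {
        ++n;
    }
    return n;
}
```

With LONG_MAX as the flag for long int data, a slot must be sized for the largest possible message plus one terminator, which illustrates the kind of memory cost the abstract mentions as the price of skipping the length pre-exchange.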

Acknowledgements

The work at Oak Ridge National Laboratory (ORNL) is supported by the United States Department of Defense and used the resources of the Extreme Scale Systems Center located at ORNL.

Author information

Authors and Affiliations

Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37921, USA
M. Graham Lopez, Pavel Shamis & Manjunath Gorentla Venkata

Corresponding author

Correspondence to M. Graham Lopez.

Editor information

Editors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Manjunath Gorentla Venkata
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Pavel Shamis
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Neena Imam
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
M. Graham Lopez

About this paper

Cite this paper

Lopez, M.G., Shamis, P., Gorentla Venkata, M. (2015). An Evaluation of OpenSHMEM Interfaces for the Variable-Length Alltoallv() Collective Operation. In: Gorentla Venkata, M., Shamis, P., Imam, N., Lopez, M. (eds) OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies. OpenSHMEM 2014. Lecture Notes in Computer Science, vol 9397. Springer, Cham. https://doi.org/10.1007/978-3-319-26428-8_6

DOI: https://doi.org/10.1007/978-3-319-26428-8_6
Published: 09 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26427-1
Online ISBN: 978-3-319-26428-8
eBook Packages: Computer Science, Computer Science (R0)
