Workstealing and Nested Parallelism in SMP Systems

Meadows, Larry; Pennycook, Simon J.; Duran, Alex; Wilmarth, Terry; Cownie, Jim

doi:10.1007/978-3-319-45550-1_4

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9903)

Included in the following conference series:

International Workshop on OpenMP

Abstract

We present a workstealing scheduler and show its use in two separate areas: (1) to enable hierarchical parallelism and per-core load balancing in stencil codes, and (2) to reduce overhead in per-thread load balancing in particle codes.
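
The abstract does not describe the scheduler's internals. As a rough illustration of the general workstealing idea only (not the authors' implementation), the sketch below simulates workers that each own a double-ended queue: a worker takes tasks from the tail of its own queue and, when idle, steals from the head of the fullest other queue. The function name `run_work_stealing`, the round-robin stepping, and the "fullest victim" policy are all assumptions made for this example.

```python
from collections import deque


def run_work_stealing(initial_tasks, num_workers):
    """Simulate work stealing in deterministic round-robin steps.

    initial_tasks[w] is the list of task ids initially queued on worker w.
    Returns (executed, steals): the task ids each worker ran, in order,
    and the total number of steals. This is an illustrative model only.
    """
    queues = [deque(tasks) for tasks in initial_tasks]
    executed = [[] for _ in range(num_workers)]
    steals = 0
    while any(queues):  # an empty deque is falsy
        for w in range(num_workers):
            if queues[w]:
                # The owner pops from the tail of its own deque.
                executed[w].append(queues[w].pop())
            else:
                # An idle worker steals from the head of the fullest deque
                # (assumed victim-selection policy).
                victim = max(range(num_workers), key=lambda v: len(queues[v]))
                if queues[victim]:
                    executed[w].append(queues[victim].popleft())
                    steals += 1
    return executed, steals


if __name__ == "__main__":
    # All six tasks start on worker 0; worker 1 must steal to stay busy.
    executed, steals = run_work_stealing([[0, 1, 2, 3, 4, 5], []], 2)
    print(executed, steals)
```

Taking from opposite ends of the queue keeps the owner and the thief from contending for the same task in a real concurrent implementation; this single-threaded simulation only shows the load-balancing effect.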

Author information

Authors and Affiliations

Intel Corporation, Hillsboro, OR, USA
Larry Meadows, Simon J. Pennycook, Alex Duran, Terry Wilmarth & Jim Cownie

Corresponding author

Correspondence to Larry Meadows.

Editor information

Editors and Affiliations

RIKEN AICS, Kobe, Japan
Naoya Maruyama
Lawrence Livermore National Laboratory, Livermore, California, USA
Bronis R. de Supinski
RIKEN AICS, Kobe, Japan
Mohamed Wahib

About this paper

Cite this paper

Meadows, L., Pennycook, S.J., Duran, A., Wilmarth, T., Cownie, J. (2016). Workstealing and Nested Parallelism in SMP Systems. In: Maruyama, N., de Supinski, B., Wahib, M. (eds) OpenMP: Memory, Devices, and Tasks. IWOMP 2016. Lecture Notes in Computer Science, vol 9903. Springer, Cham. https://doi.org/10.1007/978-3-319-45550-1_4

DOI: https://doi.org/10.1007/978-3-319-45550-1_4
Published: 21 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45549-5
Online ISBN: 978-3-319-45550-1
eBook Packages: Computer Science, Computer Science (R0)
