Abstract
Many application developers need code that runs efficiently on multiple architectures, but cannot afford to maintain architecture-specific versions. With the addition of target directives to support offload to accelerators, OpenMP now has the machinery to support performance-portable code development. In this paper, we describe ports of Kripke, Cardioid, and LULESH to OpenMP 4.5 and discuss our successes and failures. Challenges encountered include how OpenMP interacts with C++ features such as classes with virtual methods and lambda functions. In addition, the lack of deep-copy support in OpenMP increased code complexity, and the inability of GPUs to handle virtual function calls required code restructuring. Despite these challenges, we demonstrate that OpenMP obtains performance within 10% of hand-written CUDA for memory-bandwidth-bound kernels in LULESH. We also show that a minor change to the OpenMP standard can reduce register usage of OpenMP code by up to 10%.
The rights of this work are transferred to the extent transferable according to title 17 U.S.C. 105.
Notes
- firstprivate: specifies that each thread should have its own instance of a variable, initialized with the value the variable had before the parallel construct.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Karlin, I. et al. (2016). Early Experiences Porting Three Applications to OpenMP 4.5. In: Maruyama, N., de Supinski, B., Wahib, M. (eds) OpenMP: Memory, Devices, and Tasks. IWOMP 2016. Lecture Notes in Computer Science, vol 9903. Springer, Cham. https://doi.org/10.1007/978-3-319-45550-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45549-5
Online ISBN: 978-3-319-45550-1