OpenMP® Runtime Instrumentation for Optimization

Doodi, Taru; Peyton, Jonathan; Cownie, Jim; Garzaran, Maria; Kalidas, Rubasri; Kim, Jeongnim; Mathuriya, Amrita; Wilmarth, Terry; Zheng, Gengbin

doi:10.1007/978-3-319-65578-9_19

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10468)

Included in the following conference series:

International Workshop on OpenMP

Abstract

The OpenMP® application programming interface provides a simple way for programmers to write parallel programs that are portable between machines and vendors. (The OpenMP name is a registered trademark of the OpenMP Architecture Review Board.) Programmers parallelize their programs to obtain higher performance, but, as the number of cores per processor increases, taking advantage of parallelism efficiently becomes more difficult. To facilitate efficient parallelization and avoid poor utilization of machine resources, programmers need to know where an application is spending time and what factors hinder scalability.

In this paper, we present a Tool for Runtime Instrumentation of OpenMP programs (TRIO) that automatically collects statistics about an application’s use of the OpenMP runtime. TRIO provides statistics such as the total number of times an OpenMP construct is called, the time spent in each OpenMP construct, and the total time spent within the OpenMP runtime. TRIO helps to identify the runtime calls where a program spends most of the time and which constructs are called the most at runtime.
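
The kind of summary described above (call counts per construct, time per construct, and total time in the runtime) can be illustrated with a minimal sketch. This is not TRIO's implementation: the function `summarize`, the construct labels, and the sample timings are all hypothetical, chosen only to show how such statistics might be aggregated.

```python
from collections import defaultdict


def summarize(events):
    """Aggregate (construct, seconds) samples into a count and total time per construct."""
    stats = defaultdict(lambda: {"count": 0, "total": 0.0})
    for construct, seconds in events:
        stats[construct]["count"] += 1
        stats[construct]["total"] += seconds
    return dict(stats)


# Hypothetical samples (assumption): real data would come from an instrumented runtime.
events = [
    ("parallel", 0.5),
    ("barrier", 0.25),
    ("barrier", 0.25),
    ("critical", 0.125),
    ("barrier", 0.5),
]

stats = summarize(events)
most_called = max(stats, key=lambda c: stats[c]["count"])   # construct called most often
most_time = max(stats, key=lambda c: stats[c]["total"])     # construct with the most time
runtime_total = sum(s["total"] for s in stats.values())     # total time across constructs

print(most_called, most_time, runtime_total)
```

With these sample values, the barrier entry dominates both the call count and the accumulated time, which is the sort of hotspot such a summary is meant to surface.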

Notes

1. http://openmp.llvm.org.
2. These are application teams and crews and do not refer to OpenMP constructs.

Acknowledgement

This material is based upon work supported by Subcontract No. B609815 with Argonne National Laboratory and Intel Federal LLC. We thank Professor John Mellor-Crummey for his feedback on OMPT and its comparison with TRIO.

Author information

Authors and Affiliations

Intel Corporation, Austin, USA
Taru Doodi, Jonathan Peyton, Jim Cownie, Maria Garzaran, Rubasri Kalidas, Jeongnim Kim, Amrita Mathuriya, Terry Wilmarth & Gengbin Zheng

Corresponding author

Correspondence to Taru Doodi.

Editor information

Editors and Affiliations

Lawrence Livermore National Laboratory, Livermore, California, USA
Bronis R. de Supinski
Sandia National Laboratories, Albuquerque, New Mexico, USA
Stephen L. Olivier
RWTH Aachen University, Aachen, Germany
Christian Terboven
Stony Brook University, Stony Brook, New York, USA
Barbara M. Chapman
RWTH Aachen University, Aachen, Germany
Matthias S. Müller

Appendix

The TRIO output included here in Fig. 4 is from the CLOMP run mentioned in Sect. 3.2. In the interest of space, we have included only the non-zero fields. The scripts use the “Total” column to process the results. Although the fork and join barrier times are measured separately, we sum them for the plots. The raw output shows the relationship between OMP_idle and OMP_serial more clearly: Total_OMP_idle can be computed from Total_OMP_serial as $\mathit{Total\_OMP\_idle} = (\mathit{num\_threads} - 1) \times \mathit{Total\_OMP\_serial}$.
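
The relation between the idle and serial totals can be checked with a small worked example. The sketch below is illustrative only: the function name and the sample values (4 threads, 2.5 s of serial time) are assumptions and are not taken from the CLOMP run or from Fig. 4.

```python
def total_omp_idle(num_threads: int, total_omp_serial: float) -> float:
    """Apply the appendix relation: idle = (num_threads - 1) * serial."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return (num_threads - 1) * total_omp_serial


# Hypothetical inputs (assumption): 4 threads and 2.5 s of Total_OMP_serial.
print(total_omp_idle(4, 2.5))  # (4 - 1) * 2.5 = 7.5
```

With a single thread the factor is zero, so the computed idle total is zero as well.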

About this paper

Cite this paper

Doodi, T. et al. (2017). OpenMP® Runtime Instrumentation for Optimization. In: de Supinski, B., Olivier, S., Terboven, C., Chapman, B., Müller, M. (eds) Scaling OpenMP for Exascale Performance and Portability. IWOMP 2017. Lecture Notes in Computer Science, vol 10468. Springer, Cham. https://doi.org/10.1007/978-3-319-65578-9_19

DOI: https://doi.org/10.1007/978-3-319-65578-9_19
Published: 17 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65577-2
Online ISBN: 978-3-319-65578-9
eBook Packages: Computer Science, Computer Science (R0)
