OpenMP® Runtime Instrumentation for Optimization

Scaling OpenMP for Exascale Performance and Portability (IWOMP 2017)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10468)

Abstract

The OpenMP application programming interface (the OpenMP name is a registered trademark of the OpenMP Architecture Review Board) provides a simple way for programmers to write parallel programs that are portable between machines and vendors. Programmers parallelize their programs to obtain higher performance, but as the number of cores per processor increases, exploiting parallelism efficiently becomes more difficult. To parallelize efficiently and avoid poor utilization of machine resources, programmers need to know where an application spends its time and which factors hinder scalability.

In this paper, we present a Tool for Runtime Instrumentation of OpenMP programs (TRIO) that automatically collects statistics about an application’s use of the OpenMP runtime. TRIO provides statistics such as the total number of times an OpenMP construct is called, the time spent in each OpenMP construct, and the total time spent within the OpenMP runtime. TRIO helps to identify the runtime calls in which a program spends most of its time and the constructs that are called most often at runtime.
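The kind of statistics described above (call counts per construct, time per construct, and total time) can be illustrated with a small sketch. This is not TRIO's implementation: the class `ConstructStats`, its methods, and the construct labels are hypothetical names chosen for illustration, and the sketch assumes that measured regions are not nested, so that summing per-construct times does not double count.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ConstructStats:
    """Hypothetical per-construct call counter and timer (not TRIO's actual design)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the bookkeeping can be checked deterministically.
        self._clock = clock
        self.calls = defaultdict(int)      # construct name -> number of calls
        self.seconds = defaultdict(float)  # construct name -> accumulated time

    @contextmanager
    def measure(self, construct):
        """Count one call of `construct` and add its elapsed time."""
        start = self._clock()
        try:
            yield
        finally:
            self.calls[construct] += 1
            self.seconds[construct] += self._clock() - start

    def most_called(self):
        """Name of the construct with the highest call count."""
        return max(self.calls, key=self.calls.get)

    def most_time(self):
        """Name of the construct with the largest accumulated time."""
        return max(self.seconds, key=self.seconds.get)

    def total_seconds(self):
        """Total measured time, assuming regions were not nested."""
        return sum(self.seconds.values())


# Illustrative usage with made-up construct labels.
stats = ConstructStats()
for _ in range(3):
    with stats.measure("barrier"):
        pass
with stats.measure("parallel"):
    time.sleep(0.01)

print("most called:", stats.most_called())
print("most time:  ", stats.most_time())
print("total (s):  ", stats.total_seconds())
```

A real runtime tool would keep such counters per thread and inside the runtime itself; the sketch only shows how call counts and accumulated times answer the two questions of which construct is called most and where the time goes.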

Notes

  1. http://openmp.llvm.org.

  2. These are application teams and crews and do not refer to OpenMP constructs.

Acknowledgement

This material is based upon work supported by Subcontract No. B609815 with Argonne National Laboratory and Intel Federal LLC. We thank Professor John Mellor-Crummey for his feedback on OMPT and its comparison with TRIO.

Author information

Corresponding author

Correspondence to Taru Doodi.

Appendix

The TRIO output shown in Fig. 4 is from the CLOMP run mentioned in Sect. 3.2. In the interest of space, we have included only the non-zero fields. The scripts use the “Total” column to process the results. Although the fork and join barrier times are measured separately, we sum them for the plots. The raw output makes the relationship between OMP_idle and OMP_serial clearer: Total_OMP_idle can be computed from Total_OMP_serial as \((num\_threads - 1) \times Total\_OMP\_serial\).
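The relationship between the two totals can be checked with a short worked sketch. The function name `total_omp_idle` and the sample numbers are hypothetical and are not taken from the CLOMP run; the comment on why the factor is `num_threads - 1` is an interpretation of the formula (one thread executes the serial code while the others wait), not a statement from the TRIO output.

```python
def total_omp_idle(num_threads, total_omp_serial):
    """Compute Total_OMP_idle = (num_threads - 1) * Total_OMP_serial.

    Interpretation (an assumption): while one thread runs serial code,
    the remaining num_threads - 1 threads are idle for the same duration.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return (num_threads - 1) * total_omp_serial


# Hypothetical values: 4 threads and 2.5 time units of serial execution.
print(total_omp_idle(4, 2.5))
```

With a single thread the factor is zero, so no idle time is attributed, which matches the formula's form.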

Fig. 4. Raw output from TRIO

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Doodi, T. et al. (2017). OpenMP® Runtime Instrumentation for Optimization. In: de Supinski, B., Olivier, S., Terboven, C., Chapman, B., Müller, M. (eds) Scaling OpenMP for Exascale Performance and Portability. IWOMP 2017. Lecture Notes in Computer Science, vol 10468. Springer, Cham. https://doi.org/10.1007/978-3-319-65578-9_19

  • DOI: https://doi.org/10.1007/978-3-319-65578-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65577-2

  • Online ISBN: 978-3-319-65578-9

  • eBook Packages: Computer Science (R0)
