On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications

  • Karl Fuerlinger
  • Michael Gerndt
  • Jack Dongarra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)


Profiling is often the method of choice for performance analysis of parallel applications due to its low overhead and easily comprehensible results. However, a disadvantage of profiling is the loss of temporal information, which makes it impossible to causally relate performance phenomena to events that occurred earlier or later during execution. We investigate techniques to add a temporal dimension to profiling data by incrementally capturing profiles during the runtime of the application, and we discuss the insights that can be gained from this type of performance data. The context in which we explore these ideas is an existing profiling tool for OpenMP applications.
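As a rough illustration of the idea of incrementally capturing profiles, the following minimal Python sketch keeps cumulative per-region times, as a conventional profiler would, and additionally stores a snapshot at each capture point. It is not the authors' tool or its interface: the class `IncrementalProfiler`, its methods `record`, `capture`, and `increments`, and the region names are all hypothetical and chosen only for illustration. Differencing consecutive snapshots recovers how much time each region accumulated in each interval, which is the kind of temporal information a single end-of-run profile loses.

```python
from collections import defaultdict


class IncrementalProfiler:
    """Hypothetical sketch: a cumulative profile plus periodic snapshots."""

    def __init__(self):
        self.totals = defaultdict(float)  # region name -> cumulative seconds
        self.snapshots = []               # (capture timestamp, copy of totals)

    def record(self, region, seconds):
        # Accumulate time for a region, as an ordinary profiler would.
        self.totals[region] += seconds

    def capture(self, timestamp):
        # Store a copy of the cumulative profile at this point in the run.
        self.snapshots.append((timestamp, dict(self.totals)))

    def increments(self):
        # Difference consecutive snapshots to get per-interval profiles.
        previous = {}
        result = []
        for timestamp, snap in self.snapshots:
            delta = {r: snap[r] - previous.get(r, 0.0) for r in snap}
            result.append((timestamp, delta))
            previous = snap
        return result


def demo():
    # Made-up measurements for two capture intervals (assumed values).
    prof = IncrementalProfiler()
    prof.record("parallel_loop", 2.0)
    prof.record("critical", 0.5)
    prof.capture(timestamp=1)
    prof.record("parallel_loop", 2.0)
    prof.record("critical", 3.0)
    prof.capture(timestamp=2)
    return prof.increments()
```

In this made-up run, the final profile alone would report 3.5 seconds in the `critical` region, whereas the increments show that 3.0 of those seconds accrued in the second interval.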


Keywords (added by machine, not by the authors): Critical Section, Parallel Application, Benchmark Suite, Parallel Region, Performance Counter



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Karl Fuerlinger (1)
  • Michael Gerndt (2)
  • Jack Dongarra (1)

  1. Innovative Computing Laboratory, Department of Computer Science, University of Tennessee
  2. Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München
