Performance Profiling Overhead Compensation for MPI Programs

  • Sameer Shende
  • Allen D. Malony
  • Alan Morris
  • Felix Wolf
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3666)


Performance profiling of MPI programs generates overhead during execution that introduces error into profile measurements. It is possible to track and remove overhead online, but execution delay must be communicated between processes to correctly adjust their interdependent timing. We demonstrate the first implementation of an online measurement overhead compensation system for profiling MPI programs. The system is implemented in the TAU performance system and requires novel techniques for communicating delay through MPI. The ability to reduce measurement error is demonstrated for problematic test cases and real applications.
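The idea of communicating delay between processes can be illustrated with a minimal sketch. It is not the TAU implementation and not necessarily the model used in the paper. It assumes a simplified setting in which removing the measurement overhead shifts the message arrival earlier by exactly the sender's accumulated delay, and in which the receiver knows the measured arrival time. The function name and all parameters are hypothetical.

```python
def receiver_delay_after_recv(recv_start, arrival, sender_delay, receiver_delay):
    """Estimate the receiver's accumulated delay after a blocking receive.

    Simplifying assumptions (hypothetical model, for illustration only):
      * recv_start and arrival are timestamps from the measured
        (instrumented) run, on a common time base.
      * sender_delay is the delay value the sender attached to the message.
      * Without instrumentation, the message would arrive sender_delay
        earlier and the receive would be posted receiver_delay earlier.

    Returns (new_receiver_delay, compensated_wait).
    """
    # Completion time of the receive in the measured run.
    measured_done = max(recv_start, arrival)

    # The same events with the overhead removed.
    comp_start = recv_start - receiver_delay
    comp_arrival = arrival - sender_delay
    comp_done = max(comp_start, comp_arrival)

    # Waiting time the receiver would see in the compensated run.
    compensated_wait = comp_done - comp_start

    # Delay carried forward: how much later the measured run completed.
    new_receiver_delay = measured_done - comp_done
    return new_receiver_delay, compensated_wait


if __name__ == "__main__":
    # Receiver posts the receive at t=10 and waits for a message arriving at t=15.
    print(receiver_delay_after_recv(10, 15, sender_delay=2, receiver_delay=1))
```

In this sketch the receiver's own delay can shrink when it waits on a message, because part of its overhead is absorbed by the waiting time. In practice the arrival time of a message that is already buffered when the receive is posted is generally not observable by the receiver, so a real system would have to approximate that case.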


Performance measurement analysis; parallel computing; profiling; message passing; overhead compensation



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Sameer Shende (1)
  • Allen D. Malony (1)
  • Alan Morris (1)
  • Felix Wolf (2)
  1. Department of Computer and Information Science, University of Oregon
  2. Innovative Computing Laboratory, University of Tennessee
