Automatic Profiling of MPI Applications with Hardware Performance Counters
This paper presents an automatic counter instrumentation and profiling module that has been added to the MPI library on Cray T3E and SGI Origin2000 systems. During execution, the module gathers a detailed summary of the hardware performance counters and the MPI calls of any MPI production program and writes it to a special syslog file in MPI_Finalize. Users can obtain the same information in a separate file. Statistical summaries are computed weekly and monthly. The paper describes experiences with this library on the Cray T3E systems at HLRS Stuttgart and TU Dresden. It focuses on the problems of integrating the hardware performance counters into MPI counter profiling and presents first results obtained with these counters. In addition, a second software design is described that allows the profiling layer to be integrated into a dynamic shared object MPI library without consuming the user’s PMPI profiling interface.
Keywords: MPI, Counter Profiling, Instrumentation, Hardware Performance Counters, Trace-based Profiling, PerfAPI, PCL
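The counter-profiling idea summarized in the abstract can be illustrated with a minimal sketch: a wrapper layer intercepts each library call, accumulates call counts, byte counts, and elapsed time, and writes a summary when the program finalizes. Everything in this sketch is an assumption made for illustration. `StubMPI` is a hypothetical stand-in rather than a real MPI binding, `ProfilingLayer` and its log format are invented names, and an in-memory buffer replaces the syslog file. Hardware performance counters are not modeled. In an actual MPI library the interception would typically rely on the standard's name-shifted PMPI entry points, which is not what is shown here.

```python
import io
import time
from collections import defaultdict


class StubMPI:
    """Hypothetical stand-in for a message-passing library (not real MPI)."""

    def send(self, payload: bytes, dest: int) -> None:
        pass  # a real library would transmit the payload here

    def barrier(self) -> None:
        pass  # a real library would synchronize all processes here

    def finalize(self) -> None:
        pass  # a real library would shut down communication here


class ProfilingLayer:
    """Wraps each call and accumulates call counts, bytes, and elapsed time."""

    def __init__(self, inner, log):
        self._inner = inner
        self._log = log  # file-like object standing in for the syslog file
        self.stats = defaultdict(
            lambda: {"calls": 0, "bytes": 0, "seconds": 0.0}
        )

    def _timed(self, name, nbytes, func, *args):
        # Time the wrapped call and update the per-routine counters,
        # even if the wrapped call raises.
        start = time.perf_counter()
        try:
            return func(*args)
        finally:
            entry = self.stats[name]
            entry["calls"] += 1
            entry["bytes"] += nbytes
            entry["seconds"] += time.perf_counter() - start

    def send(self, payload, dest):
        return self._timed("send", len(payload), self._inner.send, payload, dest)

    def barrier(self):
        return self._timed("barrier", 0, self._inner.barrier)

    def finalize(self):
        # Write one summary line per routine, then finalize the library.
        for name in sorted(self.stats):
            e = self.stats[name]
            self._log.write(
                f"{name} calls={e['calls']} bytes={e['bytes']} "
                f"seconds={e['seconds']:.6f}\n"
            )
        self._inner.finalize()


# Usage: the application calls the wrapper exactly as it would the library.
log = io.StringIO()
mpi = ProfilingLayer(StubMPI(), log)
mpi.send(b"abcd", dest=1)
mpi.send(b"efghij", dest=2)
mpi.barrier()
mpi.finalize()
summary = log.getvalue()
```

Because the application code is unchanged, a layer of this kind can be enabled for all users of a library, which is the property the abstract relies on for gathering weekly and monthly statistics.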