Automatic Structure Extraction from MPI Applications Tracefiles

  • Marc Casas
  • Rosa M. Badia
  • Jesús Labarta
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)


Obtaining useful tracefiles of message-passing applications for performance analysis on supercomputers is a long and tedious task. When an application runs on hundreds or thousands of processors, the tracefile can grow to 10 or 20 GB, so analyzing or even storing such traces becomes a problem. The methodology we have developed and implemented performs an automatic analysis that can be applied to huge tracefiles: it extracts the internal structure of the trace and selects its meaningful parts. The paper presents the methodology and the results we have obtained with real applications.
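
The abstract does not describe the analysis itself, so the following is only an illustrative sketch and not the authors' implementation. It assumes (this is not stated in the abstract) that a trace has already been reduced to a one-dimensional signal, for example the fraction of processes inside MPI calls at each time sample. It then estimates the iteration period with a direct autocorrelation peak search and keeps a few periods as a representative cut. The function names `autocorrelation`, `dominant_period` and `select_periods` are hypothetical.

```python
def autocorrelation(signal):
    """Normalised (biased) autocorrelation of a 1-D signal for lags 0..n-1."""
    n = len(signal)
    if n == 0:
        return []
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return [0.0] * n  # constant signal: no periodic structure to find
    # Direct O(n^2) evaluation; an FFT-based version would scale better.
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def dominant_period(signal, min_lag=2):
    """Hypothetical helper: lag of the highest positive autocorrelation peak."""
    acf = autocorrelation(signal)
    max_lag = len(signal) // 2
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > best_value:
            best_lag, best_value = lag, acf[lag]
    return best_lag  # None when no periodic behaviour is detected


def select_periods(signal, period, count=2, start=0):
    """Return `count` consecutive periods as a representative cut of the signal."""
    return signal[start:start + count * period]


if __name__ == "__main__":
    # Synthetic stand-in for a trace-derived signal (an assumption): 8 iterations
    # of 25 samples, the first 10 samples of each iteration spent communicating.
    signal = [1.0 if (i % 25) < 10 else 0.0 for i in range(200)]
    period = dominant_period(signal)
    print("estimated period:", period)
    print("samples kept:", len(select_periods(signal, period)))
```

Keeping only a couple of detected periods is one simple way to shrink the data that has to be stored and inspected; a real tool would also need to map the selected samples back to trace records.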


Keywords: Discrete Fourier Transform, Message Passing Interface, Mathematical Morphology, Periodic Region, Original Trace



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Marc Casas (1)
  • Rosa M. Badia (1)
  • Jesús Labarta (1)
  1. Barcelona Supercomputing Center (BSC), Technical University of Catalonia (UPC), Campus Nord, Modul C6, Jordi Girona, 1-3, 08034 Barcelona
