Online Performance Analysis with the Vampir Tool Set

  • Matthias Weber
  • Johannes Ziegenbalg
  • Bert Wesarg
Conference paper


Today, performance analysis of parallel applications is mandatory to fully exploit the capabilities of modern HPC systems. Many performance analysis tools are available to support users in this challenging task. These tools typically follow one of two analysis methodologies. The majority of analysis tools, such as HPCToolkit or Vampir, follow a post-mortem analysis approach: a measurement infrastructure records performance data during the application execution and flushes its data to the file system, and the tools perform subsequent analysis steps after the application execution by using the stored performance data. Post-mortem analysis has the disadvantage that potentially large data volumes must be handled by the I/O subsystem of the machine. Tools following an online analysis approach mitigate this disadvantage by avoiding the I/O subsystem: their measurement infrastructure uses the network to transfer the recorded performance data directly to the analysis components of the tool. This approach, however, comes with the limitation that the complete analysis must occur at application runtime. In this work, we present a prototype implementation of Vampir capable of performing online analysis. We discuss advantages and disadvantages of both approaches and draw conclusions for designing an online performance analysis tool.
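The contrast between the two data paths can be reduced to a small sketch. This is purely illustrative, not Vampir or Score-P code: real tools use a binary trace format (e.g. OTF2) and a tool-specific network transport, whereas here toy JSON events stand in for trace records, a temporary file for the file-system path, and a local socket pair for the network path.

```python
# Illustrative sketch (not actual tool code): the two trace-delivery
# strategies, reduced to their data paths.
import json
import os
import socket
import tempfile

# Toy trace events standing in for real measurement records.
events = [{"t": i, "region": "compute"} for i in range(3)]

# Post-mortem: the measurement layer flushes events to the file system;
# analysis happens later, after the run, by reading the stored trace.
with tempfile.NamedTemporaryFile("w", suffix=".trace", delete=False) as f:
    for ev in events:
        f.write(json.dumps(ev) + "\n")
    trace_path = f.name
with open(trace_path) as f:
    postmortem = [json.loads(line) for line in f]
os.unlink(trace_path)

# Online: events bypass the I/O subsystem and travel over a network
# transport directly to the analysis component at application runtime.
producer, consumer = socket.socketpair()
for ev in events:
    producer.sendall((json.dumps(ev) + "\n").encode())
producer.close()  # signals end-of-stream to the consumer
data = b""
while chunk := consumer.recv(4096):
    data += chunk
consumer.close()
online = [json.loads(line) for line in data.decode().splitlines()]

# Both paths deliver identical data; only the transport differs.
assert postmortem == online == events
```

The sketch also makes the trade-off from the text concrete: the post-mortem path leaves a persistent artifact that can be analyzed repeatedly, while the online path produces no file but requires the consumer to be alive while the application runs.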



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Matthias Weber (1), corresponding author
  • Johannes Ziegenbalg (1)
  • Bert Wesarg (1)

  1. TU Dresden, ZIH, Dresden, Germany
