MPI Correctness Checking with Marmot

  • Conference paper
Tools for High Performance Computing

Abstract

Parallel programming is a complex task and, since the dawn of the multi-core era, an increasingly common one; tools that support application development and porting can ease it considerably. The Message Passing Interface (MPI) is widely used to write message-passing parallel programs, but it does not guarantee portability between different MPI implementations. When an application runs without problems on one platform but crashes or gives wrong results on another, developers tend to blame the compiler, the architecture, or the MPI implementation. In many cases, however, the cause is a subtle programming error in the application that went undetected on the platforms used previously, and finding such a bug can be strenuous and difficult. In this paper we present Marmot, a tool that automatically checks the correctness of MPI applications at runtime. Examples of violations of the MPI standard include the introduction of irreproducibility, deadlocks, incorrect management of resources such as communicators, groups, and datatypes, and the use of non-portable constructs. To cover different aspects of correctness debugging in a user-friendly environment, including hybrid applications that use both MPI and OpenMP, we are also working on coupling Marmot with a parallel debugger (DDT) or a threading tool (Intel® Thread Checker). We report some experiences with real-world applications.
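
The portability problem described above can be illustrated with a toy model. The sketch below is not Marmot and does not call MPI; it is a small Python simulation, under the assumption that a blocking send either completes immediately (the implementation buffers the message) or completes only when the matching receive is posted (rendezvous). The MPI standard permits both behaviours for a standard-mode send, which is why a head-to-head send/send exchange can run on one platform and deadlock on another. The function name `run` and the program encoding are hypothetical, chosen only for this illustration.

```python
from collections import deque


def run(programs, eager_buffering):
    """Toy model of blocking point-to-point messaging (not real MPI).

    programs: {rank: [("send", peer) or ("recv", peer), ...]}
    eager_buffering: True  -> a send completes at once (message is buffered)
                     False -> a send completes only when the peer is waiting
                              in the matching recv (rendezvous)
    Returns "completed" or "deadlock".
    """
    pc = {rank: 0 for rank in programs}             # next operation per rank
    mailbox = {rank: deque() for rank in programs}  # buffered sender ranks
    progressed = True
    while progressed:
        progressed = False
        for rank, ops in programs.items():
            if pc[rank] == len(ops):
                continue  # this rank has finished
            kind, peer = ops[pc[rank]]
            if kind == "send" and eager_buffering:
                mailbox[peer].append(rank)
                pc[rank] += 1
                progressed = True
            elif kind == "send":
                peer_ops = programs[peer]
                # Rendezvous: succeed only if the peer is blocked in recv from us.
                if pc[peer] < len(peer_ops) and peer_ops[pc[peer]] == ("recv", rank):
                    pc[rank] += 1
                    pc[peer] += 1
                    progressed = True
            elif peer in mailbox[rank]:  # recv with a buffered message available
                mailbox[rank].remove(peer)
                pc[rank] += 1
                progressed = True
    done = all(pc[rank] == len(ops) for rank, ops in programs.items())
    return "completed" if done else "deadlock"


# Both ranks send first, then receive: correct only if sends are buffered.
unsafe = {0: [("send", 1), ("recv", 1)], 1: [("send", 0), ("recv", 0)]}
# Rank 1 receives first: correct under either send behaviour.
safe = {0: [("send", 1), ("recv", 1)], 1: [("recv", 0), ("send", 0)]}

print(run(unsafe, eager_buffering=True))   # completed
print(run(unsafe, eager_buffering=False))  # deadlock
print(run(safe, eager_buffering=False))    # completed
```

The `unsafe` exchange is the kind of non-portable construct the abstract refers to: it appears correct as long as the platform buffers the messages, and the error only surfaces when the program is moved to a platform that does not.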

Author information

Corresponding author

Correspondence to Bettina Krammer.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Krammer, B., Hilbrich, T., Himmler, V., Czink, B., Dichev, K., Müller, M.S. (2008). MPI Correctness Checking with Marmot. In: Resch, M., Keller, R., Himmler, V., Krammer, B., Schulz, A. (eds) Tools for High Performance Computing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68564-7_5

  • DOI: https://doi.org/10.1007/978-3-540-68564-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68561-6

  • Online ISBN: 978-3-540-68564-7

  • eBook Packages: Computer Science, Computer Science (R0)
