Open Issues in MPI Implementation

  • Conference paper
Advances in Computer Systems Architecture (ACSAC 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4697)

Abstract

MPI (the Message Passing Interface) continues to be the dominant programming model for parallel machines of all sizes, from small Linux clusters to the largest parallel supercomputers such as IBM Blue Gene/L and Cray XT3. Although the MPI standard was released more than 10 years ago and a number of implementations of MPI are available from both vendors and research groups, MPI implementations still need improvement in many areas. In this paper, we discuss several such areas, including performance, scalability, fault tolerance, support for debugging and verification, topology awareness, collective communication, derived datatypes, and parallel I/O. We also present results from experiments with several MPI implementations (MPICH2, Open MPI, Sun, IBM) on a number of platforms (Linux clusters, Sun and IBM SMPs) that demonstrate the need for performance improvement in one-sided communication and support for multithreaded programs.
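
The experiments mentioned in the abstract concern, among other things, the performance of MPI one-sided communication and of multithreaded (MPI_THREAD_MULTIPLE) support. The C program below is not the authors' test suite; it is a minimal sketch, with assumed message size and iteration count, of the kind of microbenchmark involved: it requests MPI_THREAD_MULTIPLE at initialization and times MPI_Put operations delimited by MPI_Win_fence between two processes.

/* Minimal sketch (not the paper's actual test suite) of a microbenchmark in
 * the spirit of the experiments described above: request MPI_THREAD_MULTIPLE
 * and time MPI_Put operations delimited by MPI_Win_fence.  COUNT and ITERS
 * are illustrative assumptions, not values from the paper. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT 1024   /* doubles transferred per MPI_Put (assumed) */
#define ITERS 1000   /* timed iterations (assumed) */

int main(int argc, char **argv)
{
    int rank, size, provided, i;
    double *winbuf, *putbuf, t0, t1;
    MPI_Win win;

    /* Ask for full thread support; 'provided' reports what the library grants. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0 && provided < MPI_THREAD_MULTIPLE)
        printf("MPI_THREAD_MULTIPLE not provided (level %d)\n", provided);

    winbuf = malloc(COUNT * sizeof(double));
    putbuf = malloc(COUNT * sizeof(double));
    for (i = 0; i < COUNT; i++)
        putbuf[i] = (double) i;

    /* Expose winbuf on every process for one-sided access. */
    MPI_Win_create(winbuf, COUNT * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);            /* open the first access epoch */
    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0 && size > 1)
            MPI_Put(putbuf, COUNT, MPI_DOUBLE, 1, 0, COUNT, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);        /* complete the epoch on all processes */
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average time per fence-delimited MPI_Put epoch: %g us\n",
               1.0e6 * (t1 - t0) / ITERS);

    MPI_Win_free(&win);
    free(winbuf);
    free(putbuf);
    MPI_Finalize();
    return 0;
}

Such a sketch would typically be built with mpicc and run on two processes (for example, mpiexec -n 2 ./put_fence).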

Editor information

Lynn Choi, Yunheung Paek, Sangyeun Cho

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thakur, R., Gropp, W. (2007). Open Issues in MPI Implementation. In: Choi, L., Paek, Y., Cho, S. (eds) Advances in Computer Systems Architecture. ACSAC 2007. Lecture Notes in Computer Science, vol 4697. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74309-5_31

  • DOI: https://doi.org/10.1007/978-3-540-74309-5_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74308-8

  • Online ISBN: 978-3-540-74309-5

  • eBook Packages: Computer Science, Computer Science (R0)
