
The EPiGRAM Project: Preparing Parallel Programming Models for Exascale

  • Stefano Markidis (corresponding author)
  • Ivy Bo Peng
  • Jesper Larsson Träff
  • Antoine Rougier
  • Valeria Bartsch
  • Rui Machado
  • Mirko Rahn
  • Alistair Hart
  • Daniel Holmes
  • Mark Bull
  • Erwin Laure
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9945)

Abstract

EPiGRAM is a European Commission-funded project to improve existing parallel programming models so that large-scale applications can run efficiently on exascale supercomputers. The project focuses on the two dominant petascale programming models, message passing and PGAS, and on the improvement of two of their associated programming systems, MPI and GASPI. In EPiGRAM, we work on two major aspects of programming systems. First, we improve the performance of communication operations by decreasing memory consumption, improving collective operations, and introducing emerging computing models. Second, we enhance the interoperability of message passing and PGAS by integrating them in a single PGAS-based MPI implementation, called EMPI4Re, by implementing MPI endpoints, and by improving GASPI interoperability with MPI. The new EPiGRAM concepts are tested in two large-scale applications: iPIC3D, a Particle-in-Cell code for space physics simulations, and Nek5000, a Computational Fluid Dynamics code.
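As a minimal illustration of the interoperability goal described in the abstract (a sketch, not code from the project), the following C program runs MPI and GASPI side by side in the same set of processes. It assumes a GPI-2 installation built with MPI interoperability support and uses only basic calls from the public MPI and GASPI C APIs; segment-based one-sided GASPI communication is only indicated in a comment.

    /* Sketch: MPI and GASPI in one program.
     * Assumes GPI-2 built with MPI interoperability (MPI is initialized first,
     * GASPI then attaches to the same processes). */
    #include <mpi.h>
    #include <GASPI.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        if (gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS) {
            fprintf(stderr, "gaspi_proc_init failed\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        int mpi_rank;
        gaspi_rank_t gaspi_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);
        gaspi_proc_rank(&gaspi_rank);

        /* Message passing goes through MPI, e.g. a collective reduction ... */
        int local = mpi_rank, sum = 0;
        MPI_Allreduce(&local, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

        /* ... while one-sided, PGAS-style communication would use GASPI
         * segments (gaspi_segment_create, gaspi_write_notify, ...). */
        printf("MPI rank %d / GASPI rank %u: sum of ranks = %d\n",
               mpi_rank, (unsigned)gaspi_rank, sum);

        gaspi_proc_term(GASPI_BLOCK);
        MPI_Finalize();
        return 0;
    }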

Acknowledgments

This work was funded by the European Commission through the EPiGRAM (grant agreement no. 610598, www.epigram-project.eu) project.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Stefano Markidis (1) (corresponding author)
  • Ivy Bo Peng (1)
  • Jesper Larsson Träff (2)
  • Antoine Rougier (2)
  • Valeria Bartsch (3)
  • Rui Machado (3)
  • Mirko Rahn (3)
  • Alistair Hart (4)
  • Daniel Holmes (5)
  • Mark Bull (5)
  • Erwin Laure (1)

  1. KTH Royal Institute of Technology, Stockholm, Sweden
  2. Vienna University of Technology (TU Wien), Vienna, Austria
  3. Fraunhofer ITWM, Kaiserslautern, Germany
  4. Cray UK, Edinburgh, UK
  5. Edinburgh Parallel Computing Center, Edinburgh, UK
