Porting DMRG++ Scientific Application to OpenPOWER

  • Arghya Chatterjee (email author)
  • Gonzalo Alvarez
  • Eduardo D’Azevedo
  • Wael Elwasif
  • Oscar Hernandez
  • Vivek Sarkar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11203)

Abstract

With rapidly changing microprocessor designs and growing architectural diversity (multi-core, many-core, accelerators) in next-generation HPC systems, scientific applications must adapt to the hardware to exploit the different types of parallelism and resources available in the architecture. To benefit from all of the in-node hardware threads, it is important to use a single programming model that maps and coordinates the available work across the heterogeneous execution units in the node (e.g., latency-optimized multi-core hardware threads and bandwidth-optimized accelerators).
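
As a minimal sketch of this idea (illustrative only, not code from DMRG++; the saxpy-style loop, array sizes, and variable names are assumptions made for the example), the code below uses a single OpenMP 4.5 source to drive both the latency-optimized CPU threads and a bandwidth-optimized accelerator on the same node:

    // Illustrative sketch: the same OpenMP 4.5 source addresses both the
    // latency-optimized CPU threads and a bandwidth-optimized accelerator.
    #include <cstdio>
    #include <vector>

    int main() {
      const int n = 1 << 20;
      std::vector<double> x(n, 1.0), y(n, 2.0);
      const double a = 3.0;

      // CPU path: worksharing across the node's hardware threads.
      #pragma omp parallel for
      for (int i = 0; i < n; ++i)
        y[i] += a * x[i];

      // Accelerator path: the same loop offloaded with target directives;
      // it falls back to the host if no device is available.
      double *xp = x.data(), *yp = y.data();
      #pragma omp target teams distribute parallel for \
              map(to: xp[0:n]) map(tofrom: yp[0:n])
      for (int i = 0; i < n; ++i)
        yp[i] += a * xp[i];

      std::printf("y[0] = %g\n", y[0]);
      return 0;
    }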

Our goal is to show that we can manage the node-level complexity of these systems by using OpenMP for in-node parallelization, exploiting the different "programming styles" supported by OpenMP 4.5 to program CPU cores and accelerators. Finding the most suitable programming style (e.g., SPMD, multi-level tasks, accelerator programming, nested parallelism, or a combination of these) with the latest OpenMP features to maximize performance and achieve performance portability across heterogeneous and homogeneous systems is still an open research problem.
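
To make two of these styles concrete, the following self-contained sketch contrasts nested parallelism with multi-level tasking; the work() routine, thread counts, and loop bounds are placeholders chosen for illustration and are not taken from the paper:

    // Hedged illustration of two OpenMP 4.5 "programming styles".
    #include <omp.h>
    #include <cstdio>

    void work(int block) { /* placeholder for per-block computation */ (void)block; }

    int main() {
      omp_set_nested(1);              // enable nested parallel regions (OpenMP 4.5 API)
      omp_set_max_active_levels(2);

      // Style 1: nested parallelism -- an outer team over blocks,
      // with an inner team working on each block.
      #pragma omp parallel for num_threads(4)
      for (int b = 0; b < 8; ++b) {
        #pragma omp parallel num_threads(2)
        work(b);
      }

      // Style 2: multi-level tasks -- one thread creates tasks,
      // the rest of the team executes them.
      #pragma omp parallel
      #pragma omp single
      for (int b = 0; b < 8; ++b) {
        #pragma omp task firstprivate(b)
        work(b);
      }

      std::printf("done\n");
      return 0;
    }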

We developed a mini-application, Kronecker Product (KP), that captures the sparse matrix algebra computational motif of the original DMRG++ application, in order to experiment with different OpenMP programming styles on an OpenPOWER architecture; we present the results of these experiments in this paper.
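
As a rough illustration of the KP motif (a sketch only, not the paper's kernel: the dense row-major storage and the tiny sizes in main are assumptions for the example), the code below computes y = (A ⊗ B)·x using the index decomposition given by the Kronecker block structure, parallelized with a single OpenMP worksharing loop:

    // Sketch of a Kronecker-product matrix-vector product y = (A (x) B) x.
    // Block (i,j) of A (x) B equals A[i][j] * B, so indices decompose as
    // row = i*p + r and column = j*q + s.
    #include <cstdio>
    #include <vector>

    void kron_matvec(int m, int n, int p, int q,
                     const std::vector<double>& A,   // m x n, row-major
                     const std::vector<double>& B,   // p x q, row-major
                     const std::vector<double>& x,   // length n*q
                     std::vector<double>& y)         // length m*p
    {
      #pragma omp parallel for          // independent row-blocks of y
      for (int i = 0; i < m; ++i)
        for (int r = 0; r < p; ++r) {
          double sum = 0.0;
          for (int j = 0; j < n; ++j)
            for (int s = 0; s < q; ++s)
              sum += A[i*n + j] * B[r*q + s] * x[j*q + s];
          y[i*p + r] = sum;
        }
    }

    int main() {
      // Tiny example: A = [[1,2],[3,4]], B = [[0,1],[1,0]], x = ones(4).
      std::vector<double> A{1, 2, 3, 4}, B{0, 1, 1, 0}, x(4, 1.0), y(4, 0.0);
      kron_matvec(2, 2, 2, 2, A, B, x, y);
      for (double v : y) std::printf("%g ", v);     // expected: 3 3 7 7
      std::printf("\n");
      return 0;
    }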

Keywords

Power8 · OpenMP · OpenMP 4.5 · Nested parallelism · Task parallelism · Data parallelism

Acknowledgment

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Arghya Chatterjee (1, 3), email author
  • Gonzalo Alvarez (2)
  • Eduardo D’Azevedo (1)
  • Wael Elwasif (1)
  • Oscar Hernandez (1)
  • Vivek Sarkar (3)

  1. Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, USA
  2. Computational Chemical and Material Sciences, Oak Ridge National Laboratory, Oak Ridge, USA
  3. School of Computer Science, Georgia Institute of Technology, Atlanta, USA
