GPU Optimization of Large-Scale Eigenvalue Solver
We present a GPU implementation of a large-scale eigenvalue solver as part of the ELPA library. We describe the methodology for utilizing GPU accelerators within an already well-optimized MPI-based code. We present numerical results obtained on two different HPC systems equipped with modern GPU accelerators and show the performance benefits of the GPU version.
Part of this work is co-funded by BMBF grant 01IH15001 of the German Government.
- 3. ELPA Library, http://elpa.mpcdf.mpg.de
- 5. ScaLAPACK - Scalable Linear Algebra PACKage, http://netlib.org/scalapack
- 6. Matrix Algebra on GPU and Multicore Architectures, http://icl.utk.edu/magma
- 7. cuBLAS Library, https://developer.nvidia.com/cublas
- 8. Multi-Process Service, https://docs.nvidia.com/deploy/pdf/CUDA_Multi_Process_Service_Overview.pdf