The Journal of Supercomputing, Volume 75, Issue 3, pp 1455–1469

An efficient GPU version of the preconditioned GMRES method

  • José I. Aliaga
  • Ernesto Dufrechou
  • Pablo Ezzatti
  • Enrique S. Quintana-Ortí


In many scientific applications, solving sparse linear systems is the stage that concentrates most of the computational effort. This has motivated the study and development of several iterative solvers, among which preconditioned Krylov subspace methods hold a prominent place. In previous work, we developed a GPU-aware version of the GMRES method included in ILUPACK, a package of solvers distinguished by its inverse-based multilevel ILU preconditioner. In this work, we study the performance of that earlier proposal and integrate several enhancements to mitigate its principal bottlenecks. The numerical evaluation shows that the new proposal can achieve significant run-time reductions.


GPUs, GMRES, Sparse triangular solver, MGSO



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • José I. Aliaga (1)
  • Ernesto Dufrechou (2)
  • Pablo Ezzatti (2)
  • Enrique S. Quintana-Ortí (1)

  1. Dep. de Ingeniería y Ciencia de la Computación, Universidad Jaime I, Castellón, Spain
  2. Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
