A Stable and Efficient Parallel Block Gram-Schmidt Algorithm

  • Denis Vanderstraeten
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1685)


The Modified Gram-Schmidt (MGS) orthogonalization process, used for example in the Arnoldi algorithm, is often the bottleneck that limits parallel efficiency. Indeed, computing the dot products requires a number of communications proportional to the square of the problem size. A block formulation is attractive, but it suffers from potential numerical instability.
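To make the communication argument concrete, the following is a minimal sketch of MGS in plain Python, not code from the paper. The function names `dot` and `mgs` and the reduction counter are illustrative assumptions: each call to `dot` stands for one global reduction in a distributed-memory run.

```python
import math

def dot(x, y, counter):
    """Dot product; each call models one global reduction (assumption)."""
    counter[0] += 1
    return sum(a * b for a, b in zip(x, y))

def mgs(vectors):
    """Modified Gram-Schmidt on a list of equal-length lists.

    Returns (orthonormal basis, number of modelled reductions).
    Assumes the input vectors are linearly independent.
    """
    counter = [0]
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # MGS projects against the *updated* w, so each coefficient
            # depends on the previous update and needs its own reduction.
            h = dot(q, w, counter)
            w = [wi - h * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w, counter))
        basis.append([wi / norm for wi in w])
    return basis, counter[0]
```

For k input vectors this sketch performs k(k+1)/2 modelled reductions (j projections plus one norm for the vector with j predecessors), which is the quadratic growth the abstract refers to.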

In this paper, we address this issue and propose a simple procedure that allows the use of a Block Gram-Schmidt algorithm while guaranteeing a numerical accuracy similar to that of MGS. The main idea is to determine the size of the blocks dynamically. The advantages of this dynamic procedure are twofold. First, high-performance matrix-vector multiplications can be used to decrease the execution time. Second, in a parallel environment, the number of communications is reduced. Performance comparisons with the alternative, Iterated CGS, also show an improvement for a moderate number of processors.
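The trade-off between block size and communication can be illustrated with a small sketch. It is an assumption about what a block formulation may look like, not the paper's algorithm: the previous basis vectors are split into consecutive blocks, each block is applied with one batched projection (a matrix-vector product), and the vector is updated between blocks as in MGS. The fixed parameter `block_size` is a hypothetical stand-in; the paper chooses block sizes dynamically by a procedure the abstract does not describe.

```python
import math

def blocked_gs(vectors, block_size):
    """Orthonormalize `vectors`, projecting against previous basis vectors
    in blocks of at most `block_size` (hypothetical fixed size).

    Returns (basis, number of modelled reductions).
    Assumes linearly independent input and block_size >= 1.
    """
    reductions = 0
    basis = []
    for v in vectors:
        w = list(v)
        for start in range(0, len(basis), block_size):
            block = basis[start:start + block_size]
            # Inside a block every coefficient is computed from the same w,
            # so all of them can travel in one batched reduction
            # (the matrix-vector product Q_block^T w).
            coeffs = [sum(qi * wi for qi, wi in zip(q, w)) for q in block]
            reductions += 1
            for h, q in zip(coeffs, block):
                w = [wi - h * qi for wi, qi in zip(w, q)]
            # w has now been updated before the next block, as in MGS.
        norm = math.sqrt(sum(wi * wi for wi in w))
        reductions += 1
        basis.append([wi / norm for wi in w])
    return basis, reductions
```

With `block_size=1` the sketch reduces to MGS, and with a block size at least as large as the number of vectors it reduces to classical Gram-Schmidt, the regime in which orthogonality can be lost. Intermediate sizes need fewer reductions than MGS, which is why a safe way of choosing the size matters.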



Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Denis Vanderstraeten (1)
  1. Computer Science Dept., Katholieke Universiteit Leuven, Heverlee, Belgium
