A Stable and Efficient Parallel Block Gram-Schmidt Algorithm
The Modified Gram-Schmidt (MGS) orthogonalization process, used for example in the Arnoldi algorithm, is often the bottleneck that limits parallel efficiency. Indeed, computing the dot products requires a number of communications proportional to the square of the problem size. A block formulation is attractive, but it suffers from potential numerical instability.
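To make the communication count concrete, here is a minimal serial sketch of MGS in plain Python. It is an illustration, not the paper's implementation: the helper names `dot` and `mgs` and the counter `n_dots` are hypothetical, and "problem size" is assumed here to mean the number of vectors being orthogonalized. Each call to `dot` stands in for what would be one global reduction on a distributed-memory machine.

```python
import math

def dot(u, v):
    """Dot product; in a parallel setting this would need one global reduction."""
    return sum(a * b for a, b in zip(u, v))

def mgs(vectors):
    """Orthonormalize `vectors` with Modified Gram-Schmidt.

    Returns the orthonormal vectors and the number of dot products computed.
    Assumes the input vectors are linearly independent.
    """
    q = []
    n_dots = 0
    for v in vectors:
        w = list(v)
        # MGS: each coefficient is computed from the already-updated w,
        # so the dot products must be done one after another.
        for qi in q:
            r = dot(qi, w)
            n_dots += 1
            w = [wj - r * qj for wj, qj in zip(w, qi)]
        norm = math.sqrt(dot(w, w))
        n_dots += 1
        q.append([wj / norm for wj in w])
    return q, n_dots
```

For k vectors this sketch performs k(k+1)/2 dot products, which is the quadratic growth referred to above.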
In this paper, we address this issue and propose a simple procedure that allows the use of a Block Gram-Schmidt algorithm while guaranteeing a numerical accuracy similar to that of MGS. The main idea is to determine the size of the blocks dynamically. This dynamic procedure has two main advantages: first, high-performance matrix-vector multiplications can be used to decrease the execution time; second, in a parallel environment, the number of communications is reduced. Performance comparisons with the alternative, Iterated CGS, also show an improvement for a moderate number of processors.
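The effect of blocking on the communication count can be sketched as follows. This is only an assumption-laden illustration: the abstract does not state the rule used to choose block sizes dynamically, so the hypothetical function `block_gs` below takes a fixed `block_size` parameter instead. Within a block, all coefficients are computed from the same vector (as one matrix-vector product would do) and are counted as a single reduction; blocks are then processed one after another.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def block_gs(vectors, block_size):
    """Gram-Schmidt with the existing basis processed in blocks.

    block_size is a fixed, hypothetical parameter; the dynamic choice of
    block size described in the abstract is NOT reproduced here.
    Returns the orthonormal vectors and the number of grouped reductions.
    Assumes the input vectors are linearly independent.
    """
    q = []
    n_reductions = 0
    for v in vectors:
        w = list(v)
        for start in range(0, len(q), block_size):
            block = q[start:start + block_size]
            # All coefficients of a block use the same w: this is one
            # matrix-vector product and would need one global reduction.
            coeffs = [dot(qi, w) for qi in block]
            n_reductions += 1
            for c, qi in zip(coeffs, block):
                w = [wj - c * qj for wj, qj in zip(w, qi)]
        norm = math.sqrt(dot(w, w))
        n_reductions += 1
        q.append([wj / norm for wj in w])
    return q, n_reductions
```

With `block_size=1` the loop reduces to MGS, and with a block covering the whole basis it reduces to classical Gram-Schmidt, so larger blocks trade numerical robustness for fewer reductions. That trade-off is what motivates choosing the block size with care.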