Abstract
Recent work suggests that Anderson acceleration can be used to accelerate fixed-point iteration. To improve the viability of the algorithm, we seek to improve its computational efficiency on parallel machines. The primary kernel of the method is a least-squares minimization within the main loop. We consider two approaches to reducing its cost: the first uses a communication-avoiding QR factorization, and the second employs a GMRES-like restarting procedure. On problems using 1,000 processors or fewer, we find the amount of communication too low to justify communication avoidance. The restarting procedure also proves to be no better than current approaches unless the cost of the function evaluation is very small. To begin taking advantage of current trends in machine architecture, we also study a first-attempt single-node GPU implementation of Anderson acceleration. Performance results show that, for sufficiently large problems, a GPU implementation can provide a significant performance increase over CPU versions because of the GPU’s higher memory bandwidth.
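To make the least-squares kernel mentioned above concrete, the following is a minimal single-process NumPy sketch of Anderson acceleration in its commonly used difference-based form. It is not the distributed or GPU implementation studied in the chapter. The function name `anderson` and the parameters `m`, `tol`, and `max_iter` are illustrative assumptions, and the small least-squares problem is solved with `np.linalg.lstsq` for brevity rather than with an updated or communication-avoiding QR factorization.

```python
import numpy as np


def anderson(g, x0, m=3, tol=1e-10, max_iter=100):
    """Illustrative Anderson acceleration for the fixed-point problem x = g(x).

    m is the assumed history depth (number of stored difference vectors).
    Returns the final iterate and the number of iterations performed.
    """
    x = np.asarray(x0, dtype=float)
    gx = g(x)
    f = gx - x              # fixed-point residual f_k = g(x_k) - x_k
    dF, dG = [], []         # histories of residual and g-value differences

    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k

        if dF:
            F = np.column_stack(dF)
            G = np.column_stack(dG)
            # The kernel inside the main loop: gamma = argmin ||f - F @ gamma||_2.
            gamma, *_ = np.linalg.lstsq(F, f, rcond=None)
            x_new = gx - G @ gamma
        else:
            # No history yet: take a plain fixed-point step.
            x_new = gx

        gx_new = g(x_new)
        f_new = gx_new - x_new

        # Append the newest differences and drop the oldest beyond depth m.
        dF.append(f_new - f)
        dG.append(gx_new - gx)
        if len(dF) > m:
            dF.pop(0)
            dG.pop(0)

        x, gx, f = x_new, gx_new, f_new

    return x, max_iter
```

For example, `anderson(np.cos, [0.0, 0.5, 1.0], m=2)` applies the sketch to the componentwise fixed-point problem x = cos(x). In this form the cost per iteration is one evaluation of `g` plus a least-squares solve with at most `m` columns, which is the part the two cost-reduction approaches in the abstract target.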
Acknowledgments
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-PROC-675918.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Loffeld, J., Woodward, C.S. (2016). Considerations on the Implementation and Use of Anderson Acceleration on Distributed Memory and GPU-based Parallel Computers. In: Letzter, G., et al. Advances in the Mathematical Sciences. Association for Women in Mathematics Series, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-34139-2_21
DOI: https://doi.org/10.1007/978-3-319-34139-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-34137-8
Online ISBN: 978-3-319-34139-2
eBook Packages: Mathematics and Statistics (R0)