Mixed-Precision Preconditioners in Parallel Domain Decomposition Solvers

  • Conference paper
Domain Decomposition Methods in Science and Engineering XVII

Part of the book series: Lecture Notes in Computational Science and Engineering (LNCSE, volume 60)

For accuracy reasons, many large-scale scientific applications and industrial numerical simulation codes are fully implemented in 64-bit floating-point arithmetic. On the other hand, many recent processor architectures exhibit 32-bit computational power that is significantly higher than their 64-bit computational power. One recent and significant example is the IBM CELL multiprocessor, which is projected to have a peak performance near 256 GFlops in 32-bit and “only” 26 GFlops in 64-bit computation. We might legitimately ask whether all the calculations should be performed in 64-bit or whether some pieces could be carried out in 32-bit. This leads to the design of mixed-precision algorithms. However, switching from 64-bit to 32-bit operations increases rounding error. Thus we have to be careful when choosing 32-bit arithmetic, so that neither the introduced rounding error nor its accumulation produces a meaningless solution. For the solution of linear systems, mixed-precision algorithms (single/double, double/quadruple) have been studied in dense and sparse linear algebra, mainly in the framework of direct methods (see [5, 4, 8, 9]). In such approaches, the factorization is performed in low precision, and, for matrices that are not too ill-conditioned, a few steps of iterative refinement in high-precision arithmetic are enough to recover a solution to full 64-bit accuracy (see [4]). For nonlinear systems, though, mixed-precision arithmetic is the essence of algorithms such as inexact Newton.
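The direct-method scheme mentioned above (low-precision solve, high-precision iterative refinement) can be illustrated with a minimal NumPy sketch. It is not the algorithm of this paper or of the cited references: the function name `mixed_precision_solve`, the tolerance, the iteration cap, and the test matrix are all assumptions made for illustration. For brevity the sketch calls `np.linalg.solve` on the 32-bit matrix at every step, which refactorizes each time; a practical code would factor once in 32-bit and reuse the factors.

```python
import numpy as np


def mixed_precision_solve(A, b, tol=1e-12, max_iter=20):
    """Solve A x = b using float32 solves and float64 residuals.

    Illustrative sketch only: names and defaults are hypothetical.
    """
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    A32 = A64.astype(np.float32)  # low-precision copy used for all solves

    # Initial solve entirely in 32-bit arithmetic, then promoted to 64-bit.
    x = np.linalg.solve(A32, b64.astype(np.float32)).astype(np.float64)

    b_norm = np.linalg.norm(b64)
    for it in range(max_iter):
        # Residual computed in 64-bit arithmetic.
        r = b64 - A64 @ x
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        # Correction computed in 32-bit arithmetic (refactorizes here;
        # a real code would reuse a stored 32-bit factorization).
        d = np.linalg.solve(A32, r.astype(np.float32))
        # Update accumulated in 64-bit arithmetic.
        x += d.astype(np.float64)
    return x, max_iter


# Small, well-conditioned test problem (an assumption for illustration).
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)
x_true = rng.standard_normal(n)
b = A @ x_true

x, iters = mixed_precision_solve(A, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(f"refinement steps: {iters}, relative error: {rel_err:.2e}")
```

The residual and the update are kept in 64-bit because they determine the attainable accuracy, while the expensive solve runs in 32-bit. For ill-conditioned matrices the refinement may stall or diverge, which is the caveat the abstract raises.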

References

  1. P.R. Amestoy, I.S. Duff, J. Koster, and J.-Y. L’Excellent. A fully asynchronous multifrontal solver using distributed dynamic scheduling. SIAM J. Matrix Anal. Appl., 23(1):15–41, 2001.

  2. P.R. Amestoy, A. Guermouche, J.-Y. L’Excellent, and S. Pralet. Hybrid scheduling for the parallel solution of linear systems. Parallel Comput., 32(2):136–156, 2006.

  3. L.M. Carvalho, L. Giraud, and G. Meurant. Local preconditioners for two-level non-overlapping domain decomposition methods. Numer. Linear Algebra Appl., 8(4):207–227, 2001.

  4. J. Demmel, Y. Hida, W. Kahan, S.X. Li, S. Mukherjee, and E.J. Riedy. Error bounds from extra precise iterative refinement. Technical Report UCB/CSD-04-1344, LBNL-56965, University of California in Berkeley, 2006. Short version appeared in ACM Trans. Math. Software, 32(2):325–351, June 2006.

  5. J.W. Demmel, S.C. Eisenstat, J.R. Gilbert, S.X. Li, and J.W.H. Liu. A supernodal approach to sparse partial pivoting. SIAM J. Matrix Anal. Appl., 20(3):720–755, 1999.

  6. L. Giraud, A. Haidar, and L.T. Watson. Parallel scalability study of three dimensional additive Schwarz preconditioners in non-overlapping domain decomposition. Technical Report TR/PA/07/05, CERFACS, Toulouse, France, 2007.

  7. L. Giraud, A. Marrocco, and J.-C. Rioual. Iterative versus direct parallel substructuring methods in semiconductor device modelling. Numer. Linear Algebra Appl., 12(1):33–53, 2005.

  8. J. Kurzak and J. Dongarra. Implementation of the mixed-precision high performance LINPACK benchmark on the CELL processor. Technical Report LAPACK Working Note #177 UT-CS-06-580, University of Tennessee Computer Science, September 2006.

  9. J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra. Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy. Technical Report LAPACK Working Note #175 UT-CS-06-574, University of Tennessee Computer Science, April 2006.

  10. S. Lanteri. Private communication, 2006.

  11. G. Meurant. The Lanczos and Conjugate Gradient Algorithms: From Theory to Finite Precision Computations. Software, Environments, and Tools 19. SIAM, 2006.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Giraud, L., Haidar, A., Watson, L.T. (2008). Mixed-Precision Preconditioners in Parallel Domain Decomposition Solvers. In: Langer, U., Discacciati, M., Keyes, D.E., Widlund, O.B., Zulehner, W. (eds) Domain Decomposition Methods in Science and Engineering XVII. Lecture Notes in Computational Science and Engineering, vol 60. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75199-1_44
