
Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 120)

Abstract

With teraflops-scale computational modeling expected to be routine by 2003-04, under the roadmap of the Accelerated Strategic Computing Initiative (ASCI) of the U.S. Department of Energy, and with teraflops-capable platforms already available to a small group of users, attention naturally focuses on the next symbolically important milestone, computing at rates of 10^15 floating point operations per second, or “petaflop/s.” For architectural designs that are in any sense extrapolations of today’s, petaflops-scale computing will require approximately one-million-fold instruction-level concurrency. Given that cost-effective one-thousand-fold concurrency is challenging in practical computational fluid dynamics simulations today, algorithms are among the many possible bottlenecks to CFD on petaflops systems. After a general outline of the problems and prospects of petaflops computing, we examine the issue of algorithms for PDE computations in particular. A back-of-the-envelope parallel complexity analysis focuses on the latency of global synchronization steps in the implicit algorithm. We argue that the latency of synchronization steps is a fundamental, but addressable, challenge for PDE computations with static data structures, which are primarily determined by grids. We provide recent results with encouraging scalability for parallel implicit Euler simulations using the Newton-Krylov-Schwarz solver in the PETSc software library. The prospects for PDE simulations with dynamically evolving data structures are far less clear.




© 2000 Springer Science+Business Media New York

About this paper

Cite this paper

Keyes, D.E., Kaushik, D.K., Smith, B.F. (2000). Prospects for CFD on Petaflops Systems. In: Bjørstad, P., Luskin, M. (eds) Parallel Solution of Partial Differential Equations. The IMA Volumes in Mathematics and its Applications, vol 120. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1176-1_11

  • DOI: https://doi.org/10.1007/978-1-4612-1176-1_11

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7034-8

  • Online ISBN: 978-1-4612-1176-1

  • eBook Packages: Springer Book Archive
