Have the Vectors the Continuing Ability to Parry the Attack of the Killer Micros?

  • Peter Lammers
  • Gerhard Wellein
  • Thomas Zeiser
  • Georg Hager
  • Michael Breuer
Conference paper


Classical vector systems still combine excellent performance with a well-established optimization approach. Clusters based on commodity microprocessors, on the other hand, offer comparable peak performance at very low cost. In the context of the introduction of the NEC SX-8 vector computer series, we compare the single-processor and parallel performance of two computational fluid dynamics (CFD) applications on the SX-8 and on the SGI Altix architecture, demonstrating the potential of the SX-8 for teraflop computing in turbulence research for incompressible fluids. One code uses a finite-volume discretization; the other implements a lattice Boltzmann approach.


Keywords: Parallel Performance, High Performance Computing, Lattice Boltzmann Method, Memory Bandwidth, Subgrid Scale





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Peter Lammers (1)
  • Gerhard Wellein (2)
  • Thomas Zeiser (2)
  • Georg Hager (2)
  • Michael Breuer (3)

  1. High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany
  2. Regionales Rechenzentrum Erlangen (RRZE), Erlangen, Germany
  3. Institute of Fluid Mechanics (LSTM), Erlangen, Germany
