Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units

Obrecht, Christian; Kuznik, Frédéric; Tourancheau, Bernard; Roux, Jean-Jacques

doi:10.1007/978-3-642-19328-6_16

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6449)

Abstract

In this work, we investigate the global memory access mechanism on recent GPUs. For this study, we created dedicated benchmark programs that allowed us to explore the scheduling of global memory transactions. From these observations, we formulate a model capable of estimating the execution time for a large class of applications. Our main goal is to facilitate the optimisation of regular data-parallel applications on GPUs. As an example, we describe our CUDA implementations of lattice Boltzmann method (LBM) flow solvers, for which our model estimated performance with less than 5% relative error.

Author information

Authors and Affiliations

Centre de Thermique de Lyon, (UMR 5008 CNRS, INSA-Lyon, Université de Lyon), Bât. Sadi Carnot, 9 rue de la Physique, 69621, Villeurbanne Cedex, France
Christian Obrecht, Frédéric Kuznik & Jean-Jacques Roux
Laboratoire de l’Informatique du Parallélisme, (UMR 5668 CNRS, ENS de Lyon, INRIA, UCB Lyon 1), École Normale Supérieure de Lyon, 46 allée d’Italie, Cedex 07, 69364, Lyon, France
Bernard Tourancheau

Editor information

Editors and Affiliations

Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias s/n, 4200-465, Porto, Portugal
José M. Laginha M. Palma
INP (ENSEEIHT) IRIT, University of Toulouse, rue Charles-Camichel, CEDEX 7, 31071, Toulouse, France
Michel Daydé
Lawrence Berkeley National Laboratory, Berkeley, USA
Osni Marques
Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
João Correia Lopes

About this paper

Cite this paper

Obrecht, C., Kuznik, F., Tourancheau, B., Roux, JJ. (2011). Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units. In: Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds) High Performance Computing for Computational Science – VECPAR 2010. VECPAR 2010. Lecture Notes in Computer Science, vol 6449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19328-6_16

DOI: https://doi.org/10.1007/978-3-642-19328-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19327-9
Online ISBN: 978-3-642-19328-6
eBook Packages: Computer Science, Computer Science (R0)
