An Introduction to GPU Computing for Numerical Simulation

Part of the SEMA SIMAI Springer Series book series (SEMA SIMAI, volume 9)


Graphics Processing Units (GPUs) have proven to be powerful accelerators for intensive numerical computations. The massive parallelism of these platforms makes it possible to achieve dramatic runtime reductions over a standard CPU in many numerical applications at a very affordable price. Moreover, several programming environments, such as NVIDIA’s Compute Unified Device Architecture (CUDA), have proven highly effective for mapping numerical algorithms to GPUs. These notes provide an introduction to the development of CUDA programs for numerical simulation using CUDA C/C++, the most popular GPU programming toolkit. CUDA programming will be introduced through the CUDA implementation of simple numerical examples for PDEs. These CUDA implementations will be studied and run on modern GPU-based platforms.


Graphics processing units; CUDA; Numerical solution of PDEs; CUDA C programming



This work has been partially supported by FEDER and the Spanish and Andalusian research projects MTM2014-52056-P, MTM2012-38383-C02-01, P11-FQM8179 and P11-RNM7069.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Departamento de Lenguajes y Sistemas informáticos, Universidad de Granada, Granada, Spain
  2. Dpto. Análisis Matemático, Universidad de Málaga, Málaga, Spain
