Abstract
Recently, a paradigm shift in computer architecture has offered computational science the prospect of a vast increase in capability at relatively little cost. The tremendous computational power of graphics processing units (GPUs) provides a great opportunity for those willing to rethink algorithms and rewrite existing simulation codes. In this introduction, we give a brief survey of GPU computing and its potential capabilities, intended for a general scientific and engineering audience. We also review some of the challenges users face in adapting the large toolbox of scientific computing to these changes in computer architecture, and what the community can expect in the near future.
Acknowledgments
We thank Gordon Erlebacher, Paul R. Woodward, and Jonathan Cohen for useful discussions, and Kayvon Fatahalian, Vasily Volkov, and David Sanchez for illustrative graphics. We are grateful for the support given by the Minnesota Supercomputing Institute and the CMG program and CIG collaboration of the National Science Foundation. Dave Yuen expresses thanks to the Chinese Academy of Sciences for a senior Visiting Professorship during this period.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Knepley, M.G., Yuen, D.A. (2013). Why Do Scientists and Engineers Need GPU’s Today?. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_1
DOI: https://doi.org/10.1007/978-3-642-16405-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16404-0
Online ISBN: 978-3-642-16405-7
eBook Packages: Earth and Environmental Science (R0)