A Comparison of Performance Tuning Process for Different Generations of NVIDIA GPUs and an Example Scientific Computing Algorithm

Banaś, Krzysztof; Krużel, Filip; Bielański, Jan; Chłoń, Kazimierz

doi:10.1007/978-3-319-78024-5_21

Krzysztof Banaś ORCID: orcid.org/0000-0002-4045-1530

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10777)

Included in the following conference series: International Conference on Parallel Processing and Applied Mathematics

Abstract

We consider the performance of a selected computational kernel from a scientific code on different generations of NVIDIA GPUs. The code used for the tests is an OpenCL implementation of a finite element numerical integration algorithm. In this contribution we describe the performance tuning of the code, carried out by searching a parameter space associated with it. The tuning results for different generations of NVIDIA GPUs serve as the basis for analyses and conclusions.
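
Tuning by searching a parameter space can be illustrated with a minimal sketch: every combination of tuning parameters is evaluated, and the fastest valid one is kept. Everything named here is an assumption made for illustration and is not taken from the paper: the parameters `workgroup_size` and `use_local_memory`, the functions `search_parameter_space` and `synthetic_measure`, and the cost model itself. In a real setting the `measure` callback would build the OpenCL kernel for the given configuration and time it on the target GPU.

```python
import itertools

# Hypothetical tuning parameters; the paper's actual parameter space is not given here.
SPACE = {
    "workgroup_size": [32, 64, 128, 256, 512, 1024],
    "use_local_memory": [False, True],
}


def synthetic_measure(config):
    """Stand-in for a real kernel timing run (assumed cost model, not real data).

    Returns a pseudo execution time, or None when the configuration is
    treated as invalid for the device.
    """
    wg = config["workgroup_size"]
    if wg > 512:  # pretend the device rejects larger work-groups
        return None
    elapsed = 100.0 / wg + 0.01 * wg  # too small and too large are both slow
    if config["use_local_memory"]:
        elapsed *= 0.8  # pretend local memory gives a fixed speed-up
    return elapsed


def search_parameter_space(space, measure):
    """Exhaustively evaluate all combinations and return the fastest one.

    Returns (best_config, best_time, results), where results is a list of
    (time, config) pairs for every valid configuration.
    """
    names = list(space)
    results = []
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        elapsed = measure(config)
        if elapsed is None:  # skip configurations that cannot run
            continue
        results.append((elapsed, config))
    if not results:
        raise ValueError("no valid configuration in the parameter space")
    best_time, best_config = min(results, key=lambda item: item[0])
    return best_config, best_time, results


if __name__ == "__main__":
    best_config, best_time, _ = search_parameter_space(SPACE, synthetic_measure)
    print(best_config, best_time)
```

Running the same search once per GPU, with `measure` bound to that device, would give one best configuration per hardware generation, which is the kind of result the abstract says is then compared. Exhaustive search is practical only for small spaces; larger ones would call for pruning or sampling.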

Author information

Authors and Affiliations

AGH University of Science and Technology, Mickiewicza 30, 30-059, Kraków, Poland
Krzysztof Banaś, Jan Bielański & Kazimierz Chłoń
Cracow University of Technology, Warszawska 24, 31-155, Kraków, Poland
Filip Krużel

Corresponding author

Correspondence to Krzysztof Banaś.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
University of Tennessee, Knoxville, Tennessee, USA
Jack Dongarra
University of Southern California, Marina Del Rey, California, USA
Ewa Deelman
Czestochowa University of Technology, Czestochowa, Poland
Konrad Karczewski

About this paper

Cite this paper

Banaś, K., Krużel, F., Bielański, J., Chłoń, K. (2018). A Comparison of Performance Tuning Process for Different Generations of NVIDIA GPUs and an Example Scientific Computing Algorithm. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10777. Springer, Cham. https://doi.org/10.1007/978-3-319-78024-5_21

DOI: https://doi.org/10.1007/978-3-319-78024-5_21
Published: 23 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78023-8
Online ISBN: 978-3-319-78024-5
eBook Packages: Computer Science, Computer Science (R0)
