Improving Performance and Energy Efficiency of Geophysics Applications on GPU Architectures

Pavan, Pablo J.; Serpa, Matheus S.; Carreño, Emmanuell Diaz; Martínez, Víctor; Padoin, Edson Luiz; Navaux, Philippe O. A.; Panetta, Jairo; Mehaut, Jean-François

doi:10.1007/978-3-030-16205-4_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 979)

Included in the following conference series:

Latin American High Performance Computing Conference

Abstract

Energy consumption and performance are an increasing concern for new large-scale parallel systems. Research in response to this challenge aims at building more energy-efficient systems. In this context, this paper proposes optimization methods that improve the performance and energy efficiency of geophysics applications by exploiting algorithm and GPU memory characteristics. Applied to Graphics Processing Unit (GPU) algorithms for stencil applications, the optimizations we developed achieve a performance improvement of up to 44.65% compared with the read-only version. The computational results show that combining read-only memory, Z-axis internalization, and the reuse of specific architecture registers increases energy efficiency by up to 54.11% when shared memory is used and by up to 44.53% when read-only memory is used.
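The paper's kernels are not part of this preview, so the following is only a minimal sketch of what moving the Z-axis loop inside each thread and reusing registers could look like, emulated on the CPU in Python. The 7-point stencil, the function names, and the flat array layout are all assumptions made for illustration; GPU read-only and shared memory have no Python counterpart and are not modelled here.

```python
# Hypothetical CPU emulation; not the paper's kernel. All names are illustrative.

def flat_index(x, y, z, nx, ny):
    """Map 3D coordinates to a flat, X-fastest array index."""
    return (z * ny + y) * nx + x


def stencil_naive(grid, nx, ny, nz):
    """Reference 7-point stencil: every neighbour is re-read from the grid."""
    out = [0.0] * (nx * ny * nz)
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[flat_index(x, y, z, nx, ny)] = (
                    grid[flat_index(x, y, z - 1, nx, ny)]
                    + grid[flat_index(x, y, z + 1, nx, ny)]
                    + grid[flat_index(x, y - 1, z, nx, ny)]
                    + grid[flat_index(x, y + 1, z, nx, ny)]
                    + grid[flat_index(x - 1, y, z, nx, ny)]
                    + grid[flat_index(x + 1, y, z, nx, ny)]
                    - 6.0 * grid[flat_index(x, y, z, nx, ny)]
                )
    return out


def stencil_z_marching(grid, nx, ny, nz):
    """Same stencil, but each (x, y) pair walks the Z axis itself.

    One (x, y) pair plays the role of one GPU thread. The locals
    `behind`, `current`, and `front` stand in for registers: each Z step
    loads a single new value along Z and shifts the other two.
    """
    out = [0.0] * (nx * ny * nz)
    if nz < 3:
        return out  # no interior points along Z
    for y in range(1, ny - 1):
        for x in range(1, nx - 1):
            behind = grid[flat_index(x, y, 0, nx, ny)]
            current = grid[flat_index(x, y, 1, nx, ny)]
            for z in range(1, nz - 1):
                # The only new load along Z for this step.
                front = grid[flat_index(x, y, z + 1, nx, ny)]
                out[flat_index(x, y, z, nx, ny)] = (
                    behind
                    + front
                    + grid[flat_index(x, y - 1, z, nx, ny)]
                    + grid[flat_index(x, y + 1, z, nx, ny)]
                    + grid[flat_index(x - 1, y, z, nx, ny)]
                    + grid[flat_index(x + 1, y, z, nx, ny)]
                    - 6.0 * current
                )
                # Shift the window: values already held are reused, not reloaded.
                behind, current = current, front
    return out
```

In this sketch the naive version reads three values along Z per output point, while the marching version reads one and keeps the other two in locals. On a GPU the analogous saving would be in global-memory traffic, which is presumably where the performance and energy gains reported in the abstract come from.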

Acknowledgments

This research has received funding from the EU H2020 Programme and from MCTI/RNP-Brazil under the HPC4E Project, grant agreement No. 689772. It was also supported by Intel under the Modern Code project, and by the PETROBRAS oil company under Ref. 2016/00133-9. We also thank RICAP, partially funded by the Ibero-American Program of Science and Technology for Development (CYTED), Ref. 517RT0529.

Author information

Authors and Affiliations

Informatics Institute, Federal University of Rio Grande do Sul – UFRGS, Porto Alegre, Brazil
Pablo J. Pavan, Matheus S. Serpa, Víctor Martínez, Edson Luiz Padoin & Philippe O. A. Navaux
Department of Informatics, Federal University of Paraná – UFPR, Curitiba, Paraná, Brazil
Emmanuell Diaz Carreño
Department of Exact Sciences and Engineering, Regional University of the Northwest of the State of Rio Grande do Sul – UNIJUI, Ijuí, Brazil
Edson Luiz Padoin
Computer Science Division, Technological Institute of Aeronautics – ITA, São José dos Campos, Brazil
Jairo Panetta
Laboratoire d’Informatique de Grenoble, University of Grenoble – UGA, Grenoble, France
Jean-François Mehaut

Corresponding author

Correspondence to Pablo J. Pavan.

Editor information

Editors and Affiliations

Instituto Tecnológico de Costa Rica, Centro Nacional de Alta Tecnología, Pavas, Costa Rica
Esteban Meneses
Universidad de los Andes, Bogotá, Colombia
Harold Castro
Universidad Industrial de Santander, Bucaramanga, Colombia
Carlos Jaime Barrios Hernández
Universidad de Antioquia, Medellín, Colombia
Raul Ramos-Pollan

About this paper

Cite this paper

Pavan, P.J. et al. (2019). Improving Performance and Energy Efficiency of Geophysics Applications on GPU Architectures. In: Meneses, E., Castro, H., Barrios Hernández, C., Ramos-Pollan, R. (eds) High Performance Computing. CARLA 2018. Communications in Computer and Information Science, vol 979. Springer, Cham. https://doi.org/10.1007/978-3-030-16205-4_9

DOI: https://doi.org/10.1007/978-3-030-16205-4_9
Published: 31 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16204-7
Online ISBN: 978-3-030-16205-4
eBook Packages: Computer Science, Computer Science (R0)
