OpenACC Routine Directive Propagation Using Interprocedural Analysis

Shivam, Aniket; Wolfe, Michael

doi:10.1007/978-3-030-12274-4_5

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11381)

Included in the following conference series:

International Workshop on Accelerator Programming Using Directives

Abstract

Accelerator programming today requires the programmer to specify what data to place in device memory, and what code to run on the accelerator device. When programming with OpenACC, directives and clauses are used to tell the compiler what data to copy to and from the device, and what code to compile for and run on the device. In particular, the programmer inserts directives around code regions, typically loops, to identify compute constructs to be compiled for and run on the device. If the compute construct calls a procedure, that procedure also needs to be marked for device compilation, as does any routine called in that procedure, and so on transitively. In addition, the marking needs to include the kind of parallelism that is exploited within the procedure, or within routines called by the procedure. When using separate compilation, the marking where the procedure is defined must be replicated in any file where it is called. This causes much frustration when first porting existing programs to GPU programming using OpenACC.

This paper presents an approach to partially automate this process. The approach relies on interprocedural analysis (IPA) to analyze OpenACC regions and procedure definitions, and to propagate the necessary information forward and backward across procedure calls spanning all the linked files, generating the required accelerator code through recompilation at link time. This approach can also perform correctness checks to prevent compilation or runtime errors. This method is implemented in the PGI OpenACC compiler.
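The propagation described above can be illustrated with a small call-graph sketch. This is not the PGI implementation: the function name `propagate_routine_info`, the data layout, and the example procedure names are hypothetical, and the rule that a caller needs at least the parallelism level of each callee is a simplifying assumption based on the OpenACC ordering seq < vector < worker < gang.

```python
from collections import deque

# OpenACC routine levels, ordered from least to most parallel.
LEVELS = ["seq", "vector", "worker", "gang"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def propagate_routine_info(call_graph, region_calls, own_level):
    """Return {procedure: level} for every procedure that needs device code.

    call_graph   : dict mapping a procedure to the procedures it calls
    region_calls : procedures called directly from compute constructs
    own_level    : dict mapping a procedure to the level of parallelism
                   used by loops in its own body (defaults to "seq")
    """
    # Forward pass: anything reachable from a compute construct
    # must be compiled for the device.
    needed = set()
    work = deque(region_calls)
    while work:
        proc = work.popleft()
        if proc in needed:
            continue
        needed.add(proc)
        work.extend(call_graph.get(proc, ()))

    # Backward pass: raise each caller to at least the level of its
    # callees; iterate to a fixed point so recursion is handled.
    level = {p: RANK[own_level.get(p, "seq")] for p in needed}
    changed = True
    while changed:
        changed = False
        for proc in needed:
            for callee in call_graph.get(proc, ()):
                if level[callee] > level[proc]:
                    level[proc] = level[callee]
                    changed = True
    return {p: LEVELS[r] for p, r in level.items()}


# Hypothetical program: a compute region calls update(), which calls
# flux(), which calls limiter(); io_dump() is host-only.
call_graph = {"update": ["flux"], "flux": ["limiter"], "limiter": [], "io_dump": []}
marks = propagate_routine_info(call_graph, ["update"], {"limiter": "vector"})
for name in sorted(marks):
    print(f"#pragma acc routine({name}) {marks[name]}")
```

In this sketch the vector loop inside `limiter` forces `flux` and `update` to be marked as vector routines as well, while `io_dump` is never reached from a compute construct and gets no marking.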

A. Shivam—Work done at NVIDIA/PGI.

Notes

1. https://www.olcf.ornl.gov/training-event/2017-gpu-hackathons/


Author information

Authors and Affiliations

University of California, Irvine, CA, USA
Aniket Shivam
NVIDIA/PGI, Hillsboro, OR, USA
Michael Wolfe

Corresponding author

Correspondence to Aniket Shivam.

Editor information

Editors and Affiliations

Department of Computer and Information Science, University of Delaware, Newark, DE, USA
Sunita Chandrasekaran
Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Sachsen, Germany
Guido Juckeland
RWTH Aachen University, Aachen, Nordrhein-Westfalen, Germany
Sandra Wienke

About this paper

Cite this paper

Shivam, A., Wolfe, M. (2019). OpenACC Routine Directive Propagation Using Interprocedural Analysis. In: Chandrasekaran, S., Juckeland, G., Wienke, S. (eds) Accelerator Programming Using Directives. WACCPD 2018. Lecture Notes in Computer Science, vol 11381. Springer, Cham. https://doi.org/10.1007/978-3-030-12274-4_5

DOI: https://doi.org/10.1007/978-3-030-12274-4_5
Published: 24 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12273-7
Online ISBN: 978-3-030-12274-4
eBook Packages: Computer Science, Computer Science (R0)
