Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels

Collingbourne, Peter; Donaldson, Alastair F.; Ketema, Jeroen; Qadeer, Shaz

doi:10.1007/978-3-642-37036-6_16


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7792)

Included in the following conference series:

European Symposium on Programming


Abstract

We study semantics of GPU kernels — the parallel programs that run on Graphics Processing Units (GPUs). We provide a novel lock-step execution semantics for GPU kernels represented by arbitrary reducible control flow graphs and compare this semantics with a traditional interleaving semantics. We show for terminating kernels that either both semantics compute identical results or both behave erroneously.
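To make the comparison between the two semantics concrete, here is a minimal sketch in Python. It is not the paper's formal semantics: the two-statement kernel, the function names, and the memory model (one shared array, one private slot per thread) are all assumptions chosen for illustration. The sketch enumerates every interleaving of a small race-free kernel and checks that each one yields the same final state as a lock-step run, in which all threads execute a statement before any thread moves on to the next.

```python
from itertools import permutations

# Hypothetical straight-line kernel with two statements per thread.
# Each thread touches only its own array element, so the kernel is race-free.
def stmt_load(tid, shared, local):
    local[tid] = shared[tid]        # read own element into a private slot

def stmt_store(tid, shared, local):
    shared[tid] = local[tid] + 1    # write own element back, incremented

KERNEL = [stmt_load, stmt_store]

def run_interleaved(schedule, init):
    """Interleaving semantics: one thread takes one step at a time.

    `schedule` is a sequence of thread ids; each thread executes its own
    statements in program order whenever it is scheduled.
    """
    shared, local = list(init), [0] * len(init)
    pc = [0] * len(init)            # per-thread program counter
    for tid in schedule:
        KERNEL[pc[tid]](tid, shared, local)
        pc[tid] += 1
    return shared

def run_lock_step(init):
    """Lock-step semantics (simplified): every thread executes statement i
    before any thread executes statement i + 1."""
    shared, local = list(init), [0] * len(init)
    for stmt in KERNEL:
        for tid in range(len(init)):
            stmt(tid, shared, local)
    return shared

def all_schedules(num_threads, num_stmts):
    """All distinct orders in which the threads' steps can be interleaved."""
    steps = [tid for tid in range(num_threads) for _ in range(num_stmts)]
    return set(permutations(steps))

init = [10, 20]
lock_step_result = run_lock_step(init)
interleaved_results = {tuple(run_interleaved(s, init)) for s in all_schedules(2, 2)}
print(lock_step_result, interleaved_results)
```

For this race-free kernel every schedule agrees with the lock-step run. If two threads wrote to the same element, different schedules could disagree, which corresponds to the erroneous case mentioned in the abstract.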

The result induces a method that allows GPU kernels with arbitrary reducible control flow graphs to be verified via transformation to a sequential program that employs predicated execution. We implemented this method in the GPUVerify tool and experimentally evaluated it by comparing the tool with the previous version of the tool based on a similar method for structured programs, i.e., where control is organised using if and while statements. The evaluation was based on a set of 163 open source and commercial GPU kernels. Among these kernels, 42 exhibit unstructured control flow which our novel method can handle fully automatically, but the previous method could not. The generality of the new method comes at a modest price: verification across our benchmark set was 2.25 times slower overall; however, the median slowdown across all kernels was 0.77, indicating that our novel technique yielded faster analysis in many cases.
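The idea of predicated execution can be sketched as follows. This is an illustrative Python example under stated assumptions, not GPUVerify's actual transformation: the kernel body, the function names, and the list-based modelling of threads are all hypothetical. A branch is replaced by straight-line code in which every thread steps through both arms, and a per-thread predicate decides whether each assignment takes effect.

```python
def kernel_branching(tid, x):
    """Hypothetical per-thread kernel body written with ordinary branching."""
    if tid % 2 == 0:
        return x * 2
    return x + 1

def kernel_predicated(tids, xs):
    """Sequential, predicated version that handles all threads at once.

    Both arms of the branch are executed for every thread; the predicate
    `enabled` masks out the assignment for threads that did not take that arm.
    """
    xs = list(xs)
    enabled = [tid % 2 == 0 for tid in tids]                  # branch condition
    xs = [x * 2 if en else x for x, en in zip(xs, enabled)]   # "then" arm
    xs = [x + 1 if not en else x for x, en in zip(xs, enabled)]  # "else" arm
    return xs

tids = [0, 1, 2, 3]
xs = [1, 2, 3, 4]
expected = [kernel_branching(t, x) for t, x in zip(tids, xs)]
print(kernel_predicated(tids, xs), expected)
```

Because the predicated form is a single sequential program with no branches over thread-dependent conditions, it is the kind of program a sequential verifier can analyse directly.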

This work was supported by the EU FP7 STREP project CARP (project number 287767).



Author information

Authors and Affiliations

Imperial College London, UK
Peter Collingbourne, Alastair F. Donaldson & Jeroen Ketema
Microsoft Research, USA
Shaz Qadeer

Editor information

Editors and Affiliations

College of Computer Science, Northeastern University, 02115, Boston, MA, USA
Matthias Felleisen
Department of Computing, Imperial College, SW7 2AZ, London, UK
Philippa Gardner


Cite this paper

Collingbourne, P., Donaldson, A.F., Ketema, J., Qadeer, S. (2013). Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels. In: Felleisen, M., Gardner, P. (eds) Programming Languages and Systems. ESOP 2013. Lecture Notes in Computer Science, vol 7792. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37036-6_16

DOI: https://doi.org/10.1007/978-3-642-37036-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37035-9
Online ISBN: 978-3-642-37036-6
eBook Packages: Computer Science (R0)
