MIMD Interpretation on a GPU

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5898)

Abstract

Programming heterogeneous parallel computer systems is notoriously difficult, but MIMD models have proven to be portable across multi-core processors, clusters, and massively parallel systems. It would be highly desirable for GPUs (Graphics Processing Units) also to be able to leverage algorithms and programming tools designed for MIMD targets. Unfortunately, most GPU hardware implements a very restrictive multi-threaded SIMD-based execution model.

This paper presents a compiler, assembler, and interpreter system that allows a GPU to implement a richly featured MIMD execution model that supports shared-memory communication, recursion, etc. Through a variety of careful design choices and optimizations, reasonable efficiency is obtained on NVIDIA CUDA GPUs. The discussion covers both the methods used and the motivation in terms of the relevant aspects of GPU architecture.
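
As a rough illustration of the general idea of interpreting MIMD code on SIMD-style hardware, the sketch below simulates several processing elements (PEs) that each follow their own program counter while sharing a single interpreter loop. It is a minimal sketch under stated assumptions, not the system described in the paper: the four-opcode stack instruction set, the function name `run_mimd_on_simd`, and the sequential Python loop standing in for lockstep hardware lanes are all hypothetical choices made for this example.

```python
# Hypothetical toy instruction set; NOT the instruction set used in the paper.
PUSH, ADD, JNZ, HALT = range(4)


def run_mimd_on_simd(programs, max_steps=1000):
    """Interpret one program per processing element (PE) in lockstep.

    Each program is a list of (opcode, argument) tuples. Every PE has its
    own program counter and stack, so PEs can follow different control flow
    even though a single interpreter loop drives all of them.
    """
    n = len(programs)
    pc = [0] * n
    stack = [[] for _ in range(n)]
    halted = [False] * n

    for _ in range(max_steps):
        if all(halted):
            break
        # Fetch: every active PE reads the instruction at its own pc.
        fetched = [None if halted[i] else programs[i][pc[i]] for i in range(n)]
        # Execute: one pass per opcode. Only PEs whose fetched opcode matches
        # are enabled in that pass; the rest are masked off, as idle SIMD
        # lanes would be.
        for op in (PUSH, ADD, JNZ, HALT):
            for i in range(n):
                if fetched[i] is None or fetched[i][0] != op:
                    continue  # this PE is masked off for the current opcode
                arg = fetched[i][1]
                if op == PUSH:
                    stack[i].append(arg)
                    pc[i] += 1
                elif op == ADD:
                    b = stack[i].pop()
                    a = stack[i].pop()
                    stack[i].append(a + b)
                    pc[i] += 1
                elif op == JNZ:
                    # Jump to arg if the popped value is non-zero.
                    pc[i] = arg if stack[i].pop() != 0 else pc[i] + 1
                else:  # HALT
                    halted[i] = True
    return stack


# Two PEs running different programs under the same interpreter loop.
pe0 = [(PUSH, 2), (PUSH, 3), (ADD, None), (HALT, None)]
pe1 = [(PUSH, 1), (JNZ, 3), (PUSH, 99), (PUSH, 7), (HALT, None)]
print(run_mimd_on_simd([pe0, pe1]))
```

In this sketch each interpreter step costs one pass per opcode regardless of how many PEs need that opcode, which hints at why the choice of instruction set and decoding order matters for efficiency when the underlying hardware executes in SIMD fashion.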

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dietz, H.G., Young, B.D. (2010). MIMD Interpretation on a GPU. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_5

  • DOI: https://doi.org/10.1007/978-3-642-13374-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13373-2

  • Online ISBN: 978-3-642-13374-9

  • eBook Packages: Computer Science (R0)