Offload – Automating Code Migration to Heterogeneous Multicore Systems

Cooper, Pete; Dolinsky, Uwe; Donaldson, Alastair F.; Richards, Andrew; Riley, Colin; Russell, George

doi:10.1007/978-3-642-11515-8_25

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5952)

Abstract

We present Offload, a programming model for offloading parts of a C++ application to run on accelerator cores in a heterogeneous multicore system. Code to be offloaded is enclosed in an offload scope; all functions called indirectly from an offload scope are compiled for the accelerator cores. Data defined inside an offload scope resides in accelerator memory, and data defined outside it resides in host memory; code to move data between the two memory spaces is generated automatically by the compiler. This is achieved by distinguishing between host and accelerator pointers at the type level, and by compiling multiple versions of functions based on pointer parameter configurations using automatic call-graph duplication. We discuss solutions to several challenging issues related to call-graph duplication, and present an implementation of Offload for the Cell BE processor, evaluated using a number of benchmarks.
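The idea of separating host and accelerator pointers at the type level, and of producing one version of a function per pointer parameter configuration, can be approximated in standard C++ with templates. The sketch below is an illustration only and is not Offload syntax: the names `SpacePtr`, `HostSpace`, `AccelSpace`, `load`, `dot`, `transfer_count` and `run_example` are all hypothetical, and `std::memcpy` merely stands in for whatever transfer mechanism a real accelerator would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical tags naming the two memory spaces.
struct HostSpace {};
struct AccelSpace {};

// A pointer whose static type records the memory space it refers to.
template <typename T, typename Space>
struct SpacePtr {
    const T* raw;
};

// Counts emulated host-to-accelerator transfers (illustration only).
static int transfer_count = 0;

// Reading through a host pointer from "accelerator" code needs a transfer.
// memcpy is an assumption standing in for a real data-movement primitive.
template <typename T>
T load(SpacePtr<T, HostSpace> p, std::size_t i) {
    T tmp;
    std::memcpy(&tmp, p.raw + i, sizeof(T));
    ++transfer_count;
    return tmp;
}

// Reading through an accelerator pointer is a plain local access.
template <typename T>
T load(SpacePtr<T, AccelSpace> p, std::size_t i) {
    return p.raw[i];
}

// A single source function. Each distinct combination of pointer spaces
// yields a separate instantiation, loosely mirroring the duplication of
// functions per pointer parameter configuration.
template <typename SpaceA, typename SpaceB>
int dot(SpacePtr<int, SpaceA> a, SpacePtr<int, SpaceB> b, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += load(a, i) * load(b, i);
    }
    return sum;
}

// Usage: the same call site text selects different versions of dot.
int run_example() {
    const int host_a[3] = {1, 2, 3};
    const int local_b[3] = {4, 5, 6};

    SpacePtr<int, HostSpace> a{host_a};
    SpacePtr<int, AccelSpace> b{local_b};
    int mixed = dot(a, b, 3);        // dot<HostSpace, AccelSpace>

    // Pretend the first array has already been copied to local memory.
    SpacePtr<int, AccelSpace> a_local{host_a};
    int local = dot(a_local, b, 3);  // dot<AccelSpace, AccelSpace>

    assert(mixed == local);          // same result, different access code
    return mixed;
}
```

In this emulation the programmer still writes the template by hand and picks the tags explicitly; the abstract describes a compiler that derives the corresponding function versions and data-movement code automatically.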

Author information

Authors and Affiliations

Codeplay Software Ltd., Edinburgh, UK
Pete Cooper, Uwe Dolinsky, Andrew Richards, Colin Riley & George Russell
Oxford University Computing Laboratory, Oxford, UK
Alastair F. Donaldson

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, The University of Texas at Austin, 1 University Station C0803, TX 78712-0240, Austin, USA
Yale N. Patt
Dipartimento di Ingegneria della Informazione, Università di Pisa, Via Diotisalvi 2, 56100, Pisa, Italy
Pierfrancesco Foglia
IBM T.J.Watson Research Center, 19 Skyline Drive, NY 10532, Hawthorne, USA
Evelyn Duesterwald
Hewlett-Packard, Cami de Can Graells 1-21, Sant Cugat del Vallés, 08174, Barcelona, Spain
Paolo Faraboschi
Computer Architecture Department, Technical University of Catalunya (UPC), c/Jordi Girona 1-3, 08034, Barcelona, Spain
Xavier Martorell

Cite this paper

Cooper, P., Dolinsky, U., Donaldson, A.F., Richards, A., Riley, C., Russell, G. (2010). Offload – Automating Code Migration to Heterogeneous Multicore Systems. In: Patt, Y.N., Foglia, P., Duesterwald, E., Faraboschi, P., Martorell, X. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2010. Lecture Notes in Computer Science, vol 5952. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11515-8_25

DOI: https://doi.org/10.1007/978-3-642-11515-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11514-1
Online ISBN: 978-3-642-11515-8
eBook Packages: Computer Science, Computer Science (R0)