Abstract
Intel’s Threading Building Blocks (TBB) provides a high-level abstraction for expressing parallelism in applications without writing explicitly multi-threaded code. However, TBB is only available for shared-memory, homogeneous multicore processors. Codeplay’s Offload C++ provides a single-source, POSIX threads-like approach to programming heterogeneous multicore devices whose cores are equipped with private, local memories; code to move data between memory spaces is generated automatically. In this paper, we show that the strengths of TBB and Offload C++ can be combined by implementing part of the TBB headers in Offload C++. This allows applications parallelised using TBB to run, without source-level modifications, across all the cores of the Cell BE processor. We present experimental results from applying our method to a set of TBB programs. To our knowledge, this work marks the first demonstration of programs parallelised using TBB executing on a heterogeneous multicore architecture.
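To make the idea of re-implementing part of the TBB headers concrete, the following is a minimal sketch, not the implementation described in the paper. It assumes a hypothetical namespace `sketch` whose `blocked_range` and `parallel_for` mimic the shape of the corresponding TBB templates, and it uses `std::thread` purely as a stand-in for the mechanism that would launch work on accelerator cores; no Offload C++ syntax is shown. The `Scale` body and the `scale_all` helper are likewise illustrative names.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

namespace sketch {

// Hypothetical stand-in for tbb::blocked_range: a half-open range [begin, end).
template <typename T>
class blocked_range {
public:
    blocked_range(T b, T e) : begin_(b), end_(e) {}
    T begin() const { return begin_; }
    T end() const { return end_; }
private:
    T begin_;
    T end_;
};

// Hypothetical stand-in for tbb::parallel_for. It splits the range into
// contiguous chunks and runs the body on each chunk in its own std::thread.
// Assumption: a heterogeneous back end would dispatch each chunk to an
// accelerator core here instead of creating a host thread.
template <typename T, typename Body>
void parallel_for(const blocked_range<T>& range, const Body& body,
                  unsigned workers = 4) {
    if (!(range.begin() < range.end()) || workers == 0) return;
    const T total = range.end() - range.begin();
    const T n = static_cast<T>(workers);
    const T chunk = (total + n - 1) / n;  // ceiling division
    std::vector<std::thread> pool;
    for (T lo = range.begin(); lo < range.end(); lo += chunk) {
        const T hi = std::min(lo + chunk, range.end());
        pool.emplace_back([&body, lo, hi] { body(blocked_range<T>(lo, hi)); });
    }
    for (std::thread& t : pool) t.join();  // wait for every chunk to finish
}

}  // namespace sketch

// Application-side body object, written in the style TBB expects:
// a const call operator that processes one sub-range.
struct Scale {
    float* data;
    float factor;
    void operator()(const sketch::blocked_range<std::size_t>& r) const {
        for (std::size_t i = r.begin(); i != r.end(); ++i) data[i] *= factor;
    }
};

// Illustrative helper: scales every element of a copy of v in parallel.
std::vector<float> scale_all(std::vector<float> v, float factor) {
    sketch::parallel_for(sketch::blocked_range<std::size_t>(0, v.size()),
                         Scale{v.data(), factor});
    return v;
}
```

The application code (`Scale` and the call to `parallel_for`) depends only on the interface of the templates, which is why swapping the header implementation can, in principle, retarget a program without changes to its source.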
This work was supported in part by the EU FP7 STREP project PEPPHER, and by EPSRC grant EP/G051100/1.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Russell, G., Keir, P., Donaldson, A.F., Dolinsky, U., Richards, A., Riley, C. (2011). Programming Heterogeneous Multicore Systems Using Threading Building Blocks. In: Guarracino, M.R., et al. Euro-Par 2010 Parallel Processing Workshops. Euro-Par 2010. Lecture Notes in Computer Science, vol 6586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21878-1_15
DOI: https://doi.org/10.1007/978-3-642-21878-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21877-4
Online ISBN: 978-3-642-21878-1
eBook Packages: Computer Science, Computer Science (R0)