Offloading C++17 Parallel STL on System Shared Virtual Memory Platforms

  • Pekka Jääskeläinen
  • John Glossner
  • Martin Jambor
  • Aleksi Tervo
  • Matti Rintala
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11203)


Shared virtual memory simplifies heterogeneous platform programming by enabling sharing of memory address pointers between heterogeneous devices in the platform. The most advanced implementations present a coherent view of memory to the programmer over the whole virtual address space of the process. From the point of view of data accesses, this System SVM (SSVM) enables the same programming paradigm in heterogeneous platforms as found in homogeneous platforms. C++ revision 17 adds its first features for explicit parallelism through its “Parallel Standard Template Library” (PSTL). This paper discusses the technical issues in offloading PSTL on heterogeneous platforms supporting SSVM and presents a working GCC-based proof-of-concept implementation. Initial benchmarking of the implementation on an AMD Carrizo platform shows speedups from 1.28X to 12.78X in comparison to host-only sequential STL execution.
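To illustrate the kind of C++17 PSTL call the abstract refers to, the following is a minimal sketch and not the paper's implementation. The function name `saxpy` and the opt-in macro `DEMO_USE_PSTL` are hypothetical, chosen for this example. When the macro is defined, the standard `std::execution::par` policy is passed to `std::transform`; otherwise the same call runs as ordinary sequential STL. On some toolchains the parallel path may need an extra backend library at link time (with libstdc++, typically Intel TBB), which is why it is opt-in here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

#ifdef DEMO_USE_PSTL        // hypothetical opt-in macro for this sketch
#include <execution>        // C++17 execution policies
#endif

// Computes y[i] = a * x[i] + y[i] for every element.
// The only difference between the parallel and the sequential variant is
// the execution policy passed as the first argument to std::transform.
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    auto op = [a](float xi, float yi) { return a * xi + yi; };
#ifdef DEMO_USE_PSTL
    // The policy permits, but does not require, parallel execution.
    std::transform(std::execution::par,
                   x.begin(), x.end(), y.begin(), y.begin(), op);
#else
    // Host-only sequential STL fallback.
    std::transform(x.begin(), x.end(), y.begin(), y.begin(), op);
#endif
}
```

Because the algorithm receives plain iterators over ordinary process memory, a call of this shape is the sort of candidate an offloading implementation could hand to another device when all devices share the same virtual address space. Whether and how that happens is up to the implementation.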


Keywords: SVM, Offloading, C++17, Parallel STL, HSA, GCC, Heterogeneous platforms



The authors would like to thank the Academy of Finland (decision 297548) and the HSA Foundation for financially supporting the writing of this publication.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Tampere University of Technology, Tampere, Finland
  2. Parmance, Tampere, Finland
  3. SUSE, Prague, Czech Republic
  4. University of Science and Technology, Beijing, China
  5. Optimum Semiconductor Technologies, Tarrytown, USA
