Toward a software transactional memory for heterogeneous CPU–GPU processors

  • Alejandro Villegas
  • Angeles Navarro
  • Rafael Asenjo
  • Oscar Plata

Abstract

Heterogeneous accelerated processing units (APUs) integrate a multi-core CPU and a GPU on the same chip. Modern APUs implement CPU–GPU platform atomics for simple data types, but ensuring atomicity for complex data types is left to programmers. Transactional memory (TM) is an optimistic approach to achieving this goal. With TM, multiple threads access shared data speculatively, and a transaction's changes become visible only if it completes without its memory accesses conflicting with those of other transactions. In this paper we present APUTM, a software TM designed for APU processors that focuses on minimizing accesses to shared metadata. The main goal of APUTM is to understand the trade-offs of implementing a software TM on such a platform. In our experiments, APUTM outperforms sequential execution of the applications. Additionally, we evaluate its adaptability when executing on one of the devices or on both simultaneously.
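The optimistic execution model described in the abstract can be illustrated with a minimal CPU-only sketch. This is not APUTM's design: the names `Pair`, `VersionedCell`, `atomically`, and `run_transfers` are hypothetical, and a mutex guards the short snapshot and commit steps purely for simplicity. Each transaction works on a private copy and publishes its result only if no other transaction committed in the meantime; otherwise it aborts and retries.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical two-field record standing in for a "complex data type".
struct Pair {
    long x;
    long y;
};

// Hypothetical shared cell: a value plus a counter of successful commits.
class VersionedCell {
public:
    explicit VersionedCell(Pair initial) : value_(initial) {}

    // Begin: take a private snapshot of the value and its version.
    std::pair<std::uint64_t, Pair> snapshot() {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::make_pair(version_, value_);
    }

    // Commit: publish only if nobody committed since the snapshot was taken.
    bool try_commit(std::uint64_t seen_version, Pair updated) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (version_ != seen_version) return false;  // conflict: abort
        value_ = updated;
        ++version_;
        return true;
    }

private:
    std::mutex mutex_;
    std::uint64_t version_ = 0;
    Pair value_;
};

// Runs `body` speculatively on a private copy and retries on conflict.
// Returns the number of aborted attempts.
template <typename Body>
int atomically(VersionedCell& cell, Body body) {
    int aborts = 0;
    for (;;) {
        std::pair<std::uint64_t, Pair> snap = cell.snapshot();
        Pair updated = body(snap.second);  // speculative work, not yet visible
        if (cell.try_commit(snap.first, updated)) return aborts;
        ++aborts;
    }
}

// Moves one unit from x to y per transaction, from several threads at once.
Pair run_transfers(int num_threads, int transfers_per_thread) {
    VersionedCell cell(Pair{1000, 0});
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&cell, transfers_per_thread] {
            for (int i = 0; i < transfers_per_thread; ++i) {
                atomically(cell, [](Pair p) { return Pair{p.x - 1, p.y + 1}; });
            }
        });
    }
    for (std::thread& w : workers) w.join();
    Pair result = cell.snapshot().second;
    assert(result.x + result.y == 1000);  // both fields always change together
    return result;
}
```

Calling `run_transfers(4, 100)` applies 400 transfers in total. Because each update to both fields commits as a unit, the sum of `x` and `y` is preserved regardless of how the threads interleave.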

Keywords

Transactional memory, APU processors, Parallel programming, Data sharing

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Architecture, Andalucía Tech, Universidad de Málaga, Málaga, Spain