Bounding Volume Hierarchy Acceleration Through Tightly Coupled Heterogeneous Computing

  • Ernesto Rivera-Alvarado
  • Francisco J. Torres-Rojas
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1087)


The Bounding Volume Hierarchy (BVH) is the main acceleration mechanism used to reduce ray tracing rendering time. Several research efforts have optimized the BVH algorithm for GPU and CPU architectures. Nonetheless, as far as we know, no study has targeted the APU (Accelerated Processing Unit), which has a CPU and an integrated GPU on the same die. The APU has the advantage of being able to share workloads between its internal processors (CPU and GPU) through heterogeneous computing. We crafted an implementation of the ray tracing algorithm with BVH traversal specifically for the APU architecture and compared the performance of this SoC against equivalent CPU and GPU implementations. The APU outperformed the other architectures.
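To make concrete how a BVH cuts down the work per ray, the following is a minimal Python sketch, not the authors' APU implementation. It assumes primitives are already given as axis-aligned boxes, builds the tree with a simple median split (a production builder would more likely use a cost heuristic), and traverses it with an explicit stack. The names `Node`, `hits_box`, `build`, and `traverse` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                         # minimum corner of the bounding box
    hi: Vec3                         # maximum corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prim: Optional[int] = None       # primitive index, set only on leaves


def hits_box(origin, direction, lo, hi):
    """Slab test: does the ray (t >= 0) overlap the box on every axis?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def build(boxes, ids=None):
    """Median-split builder over primitive boxes given as (lo, hi) pairs."""
    if ids is None:
        ids = list(range(len(boxes)))
    lo = tuple(min(boxes[i][0][a] for i in ids) for a in range(3))
    hi = tuple(max(boxes[i][1][a] for i in ids) for a in range(3))
    if len(ids) == 1:
        return Node(lo, hi, prim=ids[0])
    axis = max(range(3), key=lambda a: hi[a] - lo[a])   # widest axis
    ids = sorted(ids, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(ids) // 2
    return Node(lo, hi, build(boxes, ids[:mid]), build(boxes, ids[mid:]))


def traverse(root, origin, direction):
    """Return indices of primitives whose boxes the ray may hit."""
    hits, stack = [], [root]
    while stack:
        node = stack.pop()
        if not hits_box(origin, direction, node.lo, node.hi):
            continue                 # prune this whole subtree
        if node.prim is not None:
            hits.append(node.prim)
        else:
            stack.extend((node.left, node.right))
    return sorted(hits)


boxes = [((0, 0, 0), (1, 1, 1)), ((5, 0, 0), (6, 1, 1)), ((0, 5, 0), (1, 6, 1))]
root = build(boxes)
candidates = traverse(root, (-1.0, 0.5, 0.5), (1.0, 0.0, 0.0))
```

Only the primitives returned by `traverse` would need an exact ray-primitive intersection test; every subtree whose box the ray misses is skipped without visiting its children.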


Bounding Volume Hierarchy; Accelerated Processing Unit; Ray tracing; CPU; GPU; APU; BVH; Heterogeneous computing



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Ernesto Rivera-Alvarado (1)
  • Francisco J. Torres-Rojas (1)
  1. Computer Science, Costa Rica Institute of Technology, Cartago, Costa Rica
