Multi-kernel Ray Traversal for Graphics Processing Units

  • Thomas Schiffer
  • Dieter W. Fellner
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 550)


Ray tracing is a popular family of algorithms used to compute images with high visual quality. One of its core challenges is designing an efficient mapping of ray traversal computations to massively parallel hardware architectures such as modern graphics processing units (GPUs).

In this paper, we investigate the performance of state-of-the-art ray traversal algorithms on GPUs and discuss their potential and limitations. Based on this analysis, we propose a novel ray traversal scheme called batch tracing. It subdivides the traversal task into multiple kernels, each of which is designed for efficient parallel execution. Our algorithm achieves performance comparable to current approaches and represents a promising direction for future research.
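The abstract does not say how the work is divided among kernels, so the following is only a rough sketch of the general idea of splitting traversal into separate passes over batches of work items. It is not the authors' batch tracing algorithm. Everything in it is an assumption made for illustration: the 1D toy scene (primitives are points on a line), the `Node` layout, and the function names `traverse_kernel`, `intersect_kernel`, and `trace`. Each "kernel" is emulated as a plain Python loop over a work list; on a GPU, each loop iteration would correspond to one thread. Ray directions are assumed to be nonzero.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical BVH node over a 1D scene (illustrative layout only)."""
    lo: float
    hi: float
    left: int = -1   # index of left child, -1 for a leaf
    right: int = -1  # index of right child, -1 for a leaf
    prim: int = -1   # primitive index for a leaf, -1 for an inner node


def traverse_kernel(rays, nodes, work):
    """One emulated kernel launch: test every (ray, node) pair in the batch."""
    next_work, leaf_pairs = [], []
    for ray_id, node_id in work:
        origin, direction = rays[ray_id]
        node = nodes[node_id]
        # 1D slab test: ray parameters at which the ray crosses the bounds.
        t0 = (node.lo - origin) / direction
        t1 = (node.hi - origin) / direction
        if max(t0, t1) < 0:
            continue  # the node lies entirely behind the ray
        if node.prim >= 0:
            leaf_pairs.append((ray_id, node.prim))
        else:
            next_work.append((ray_id, node.left))
            next_work.append((ray_id, node.right))
    return next_work, leaf_pairs


def intersect_kernel(rays, prims, pairs, closest):
    """Second emulated kernel: intersect rays with the primitives they reached."""
    for ray_id, prim_id in pairs:
        origin, direction = rays[ray_id]
        t = (prims[prim_id] - origin) / direction
        if 0 <= t < closest[ray_id][0]:
            closest[ray_id] = (t, prim_id)


def trace(rays, nodes, prims):
    """Host loop: alternate the two kernels until no work items remain."""
    closest = [(float("inf"), -1) for _ in rays]
    work = [(ray_id, 0) for ray_id in range(len(rays))]  # all rays start at the root
    while work:
        work, pairs = traverse_kernel(rays, nodes, work)
        intersect_kernel(rays, prims, pairs, closest)
    return closest
```

Calling `trace(rays, nodes, prims)` with `rays` as a list of `(origin, direction)` tuples returns one `(t, primitive_index)` pair per ray, with `(inf, -1)` for a miss. The sketch omits anything a real implementation would need for performance, such as culling nodes beyond the closest hit found so far or compacting and sorting the work queue between launches.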


Ray tracing, SIMT, Parallelism, Graphics processing units



We thank Marko Dabrovic for providing the Sibenik cathedral model and the University of Utah for the Fairy Scene. Many thanks also go to Timo Aila for making his testing and benchmarking framework publicly available. We also want to thank the anonymous reviewers for their valuable comments.


Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Authors and Affiliations

  1. Institut für ComputerGraphik und Wissensvisualisierung, TU Graz, Graz, Austria
  2. TU Darmstadt and Fraunhofer IGD, Darmstadt, Germany
