The Visual Computer, Volume 34, Issue 5, pp 633–643

Exploring hidden coherency of Ray-Tracing for heterogeneous systems using online feedback methodology

  • Chih-Chen Kao
  • Wei-Chung Hsu
Original Article


Although Ray-Tracing naturally follows an embarrassingly parallel paradigm, it is also categorized as an irregular program that is troublesome to run on graphics processing units (GPUs). Conventional designs suffer a performance penalty from the irregular control flow and memory access caused by incoherent rays. This work explores the hidden coherency of rays through a feedback-guided mechanism built on one concept: extracting the hidden regular portions from the irregular execution flow. The method records the correlation between ray attributes and the traversed path, then groups newly generated rays to reduce potential irregularities in the ongoing execution. Because the mechanism captures information from the entire ray space, it can extract hidden coherency from both primary and derived rays. The result is a performance gain and higher resource utilization: performance becomes 2 to 2.5 times higher than the original GPU and CPU versions.
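The feedback idea described above (record which path rays with given attributes took, then group new rays by the path they are expected to take) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the quantization in `ray_key`, the `FeedbackTable` class, the `group_rays` helper, and the string path identifiers such as `"leaf_7"` are all hypothetical names chosen for illustration, and a real system would run on the GPU and CPU rather than in Python.

```python
from collections import Counter, defaultdict


def ray_key(origin, direction, cell=1.0):
    """Quantize ray attributes into a coarse key (hypothetical choice):
    the grid cell containing the origin plus the direction octant."""
    cell_id = tuple(int(c // cell) for c in origin)
    octant = tuple(d >= 0.0 for d in direction)
    return cell_id, octant


class FeedbackTable:
    """Records which traversal path rays with a given key ended up taking."""

    def __init__(self):
        # key -> Counter of observed path identifiers
        self.history = defaultdict(Counter)

    def record(self, origin, direction, path_id):
        """Feedback step: remember the path taken by a finished ray."""
        self.history[ray_key(origin, direction)][path_id] += 1

    def predict(self, origin, direction):
        """Return the most frequently observed path for this key, or None."""
        seen = self.history.get(ray_key(origin, direction))
        return seen.most_common(1)[0][0] if seen else None


def group_rays(table, rays):
    """Group new rays (by index) under their predicted path.
    Rays with no recorded history fall into the None group."""
    groups = defaultdict(list)
    for index, (origin, direction) in enumerate(rays):
        groups[table.predict(origin, direction)].append(index)
    return dict(groups)


# Feedback collected from rays that already finished traversal.
table = FeedbackTable()
table.record((0.2, 0.1, 0.0), (1.0, 0.5, 0.2), "leaf_7")
table.record((0.4, 0.3, 0.1), (0.9, 0.1, 0.3), "leaf_7")
table.record((0.5, 0.5, 0.5), (-1.0, 0.2, 0.1), "leaf_3")

# Newly generated rays, grouped before they are traced.
new_rays = [
    ((0.1, 0.9, 0.2), (0.3, 0.3, 0.3)),
    ((0.7, 0.7, 0.7), (-0.5, 0.1, 0.9)),
    ((5.0, 5.0, 5.0), (1.0, 1.0, 1.0)),
    ((0.9, 0.1, 0.1), (0.2, 0.8, 0.1)),
]
batches = group_rays(table, new_rays)
```

In this sketch, rays that share a predicted path land in the same batch, so a SIMD-style executor could process each batch with less divergence, while rays without history are kept apart in a fallback group.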


Keywords: Ray-Tracing, Heterogeneous systems, Irregular program, HSA, Shared virtual memory



Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. National Taiwan University, Taipei, Taiwan
