Abstract
This chapter demonstrates the usefulness of predictive and adaptive caching methods for traversing both uniform and recursive 3D grid structures. Recursive data structures are used in several image processing kernels, and managing them efficiently is a key challenge in saving silicon area and reducing the power consumed by data transport. The architectures described here greatly reduce bandwidth requirements by exploiting the spatial and temporal locality of memory accesses during ray shooting in uniform and recursive grids. To maximize cache efficiency, the original kernel is transformed into a “phase locked” ray-packet-based propagation algorithm. Our results show that well-suited caching strategies can indeed yield significant performance gains during the traversal of both uniform and hierarchical grids, which underlines the relevance of semi-general-purpose multi-dimensional predictive caches.
Notes
- 1.
Which may be emitted or re-emitted light (rendering), density (PET reconstruction), attenuation (X-ray-based reconstruction), etc.
- 2.
Indeed, if this is not the case along one or more axes, we can reduce the problem to the case where it is by taking, as the absolute cell position, the one’s complement of the actual cell position along those axes. Of course, the “correct” position must still be used for memory accesses. This strategy is suggested in [20], where the reader may find the approach described in extensive detail.
- 3.
A ray phase is the ray coordinate along the phase axis.
- 4.
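The mirroring trick of note 2 can be illustrated with a minimal sketch. It is not the chapter's implementation: the function name `traverse`, the cubic grid of side 2**bits, the unit-sized cells and the simple stepping loop are all assumptions made for illustration. The traversal runs in a mirrored space where every direction component is non-negative, and the true cell index is recovered with a one's complement (an XOR with the all-ones mask) only when the cell is reported, as would be done for a memory access.

```python
import math

def traverse(origin, direction, bits):
    """Yield the true (x, y, z) cells visited by a ray in a cubic grid.

    Hypothetical helper: assumes a grid of side 2**bits with unit cells,
    an origin inside the grid and a non-zero direction vector.
    """
    n = 1 << bits
    mask = n - 1
    # Per-axis XOR mask: all ones where the ray travels backwards, else 0.
    flip = [mask if d < 0 else 0 for d in direction]
    # Mirror the ray so that it only ever moves towards increasing indices.
    o = [n - p if f else p for p, f in zip(origin, flip)]
    d = [abs(c) for c in direction]
    cell = [min(int(p), mask) for p in o]
    # Ray parameter at the next cell boundary, and increment per cell, per axis.
    t_max = [(c + 1 - p) / dc if dc > 0 else math.inf
             for c, p, dc in zip(cell, o, d)]
    t_delta = [1 / dc if dc > 0 else math.inf for dc in d]
    while all(c <= mask for c in cell):
        # One's complement on flipped axes gives the "correct" position.
        yield tuple(c ^ f for c, f in zip(cell, flip))
        axis = t_max.index(min(t_max))  # nearest boundary is crossed first
        cell[axis] += 1
        t_max[axis] += t_delta[axis]

# A ray travelling along -x in a 4x4x4 grid visits x = 3, 2, 1, 0.
cells = list(traverse((3.5, 0.5, 0.5), (-1.0, 0.0, 0.0), bits=2))
```

The stepping loop never needs a sign test on the direction, which is the point of the transformation; only the final XOR depends on the original orientation of the ray.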
References
Akenine-Möller T, Haines E, Hoffman N (2008) Real-time rendering, 3rd edn. AK Peters, Natick
Amanatides J, Woo A (1987) A fast voxel traversal algorithm for ray tracing. In: Eurographics ’87. Elsevier, North-Holland, Amsterdam, pp. 3–10
Ang S-S, Constantinides GA, Luk W, Cheung PYK (2008) Custom parallel caching schemes for hardware-accelerated image compression. J Real-Time Image Process 3(4):289–302
Felzenszwalb PF, Huttenlocher DP (2006) Efficient belief propagation for early vision. Int J Comput Vis 70(1)
Glassner AS (October 1984) Space subdivision for fast ray tracing. IEEE Comput Graph Appl 4(10):15–22
Grimm S, Bruckner S, Kanitsar A, Meister EG (October 2004) A refined data addressing and processing scheme to accelerate volume raycasting. Comput Graph 28(5):719–729
Havran V (November 2000) Heuristic ray shooting algorithms. Ph.D. thesis. Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague
Kanus U, Wetekam G, Hirche J (July 2003) VoxelCache: a cache-based memory architecture for volume graphics. In: Eurographics/SIGGRAPH workshop on graphics hardware, pp. 76–83
Klimaszewski KS, Sederberg TW (January–February 1997) Faster ray tracing using adaptive grids. IEEE Comput Graph Appl 17(1):42–51
Krüger J, Westermann R (2003) Acceleration techniques for GPU-based volume rendering. In: Proceedings IEEE visualization 2003
Köse C, Chalmers A (July 1997) Profiling for efficient parallel volume visualisation. Parallel Comput 23(7)
Larabi Z, Mathieu Y, Mancini S (June 2009) Efficient data access management for FPGA-based image processing SoCs. In: Proceedings of the 2009 IEEE/IFIP international symposium on rapid system prototyping, pp. 159–165
Lorensen WE, Cline HE (1987) Marching cubes: a high resolution 3D surface construction algorithm. SIGGRAPH Comput Graph 21(4):163–169
Mancini S, Desvignes M (2006) Ray casting on a SoPC platform: algorithm and memory tradeoff. In: IEEE conference on computer information technology, Seoul, Korea. IEEE, Los Alamitos
Mancini S, Eveno N (November 2004) An IIR based 2D adaptive and predictive cache for image processing. In: DCIS 2004, p. 85
NVIDIA. CUDA SDK. http://developer.download.nvidia.com/compute/cuda/sdk/website/samples.html
Osborne R, Pfister H, Lauer H, McKenzie N, Gibson S, Hiatt W, Ohkami T (1997) EM-Cube: an architecture for low-cost real-time volume rendering. In: 1997 SIGGRAPH/eurographics workshop on graphics hardware. ACM, New York
Pfister H, Kaufman A, Chiueh T-c (1994) Cube-3: A real-time architecture for high-resolution volume visualization. In: Kaufman A, Krueger W (eds) 1994 symposium on volume visualization, pp. 75–82
Pfister H, Kaufman AE (1996) Cube-4 – a scalable architecture for real-time volume rendering. In: VVS, p. 47
Revelles J, Ureña C, Lastra M (2000) An efficient parametric algorithm for octree traversal
Strengert M et al. (2004) Large volume visualization of compressed time-dependent datasets on GPU clusters. Parallel Comput 31(2)
Wetekam G, Staneker D, Kanus U, Wand M (2005) A hardware architecture for multi-resolution volume rendering. In: HWWS ’05: proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on graphics hardware. ACM, New York, pp. 45–51
Copyright information
© 2011 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Toczek, T., Mancini, S. (2011). Efficient Memory Management for Uniform and Recursive Grid Traversal. In: Gogniat, G., Milojevic, D., Morawiec, A., Erdogan, A. (eds) Algorithm-Architecture Matching for Signal and Image Processing. Lecture Notes in Electrical Engineering, vol 73. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9965-5_2
DOI: https://doi.org/10.1007/978-90-481-9965-5_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9964-8
Online ISBN: 978-90-481-9965-5
eBook Packages: Engineering, Engineering (R0)