Parallelizing the ZSWEEP Algorithm for Distributed-Shared Memory Architectures
In this paper we describe a simple parallelization of the ZSWEEP algorithm for rendering unstructured volumetric grids on distributed-shared memory machines, and study its performance on three generations of SGI multiprocessors, including the new Origin 3000 series.
The main idea of the ZSWEEP algorithm is simple: it sweeps the data with a plane parallel to the viewing plane, in order of increasing z, projecting the faces of cells that are incident to vertices as they are encountered by the sweep plane. Our parallel extension of the basic algorithm uses an image-based task partitioning scheme. Essentially, the screen is divided into more tiles than there are processors, and each processor then performs the sweep independently on the next available tile until no tiles remain to be rendered. Here, we detail the modifications necessary to efficiently extend the sequential algorithm to shared-memory machines. We report on the performance of our implementation and show that the tile-based ZSWEEP is naturally cache friendly and achieves fast rendering times and substantial speedups on all the machines we used for testing. On one processor of our Origin 3000, we measure an L2 data cache hit rate of over 99% for the tile-based ZSWEEP. We also measure a parallel efficiency of 83% on 16 processors and rendering rates of about 300 thousand tetrahedra per second for a 1024 × 1024 image.
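The image-based task partitioning described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `make_tiles`, `render_tile`, and `render_parallel` are hypothetical, the tile size and worker count are arbitrary, and `render_tile` is only a placeholder that counts pixels where the real algorithm would run the ZSWEEP sweep restricted to that tile. The point shown is the scheduling pattern: the screen is cut into more tiles than workers, and each worker repeatedly takes the next available tile from a shared queue until none remain.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue, Empty


def make_tiles(width, height, tile_size):
    """Split a width x height screen into (x0, y0, x1, y1) tiles."""
    return [
        (x, y, min(x + tile_size, width), min(y + tile_size, height))
        for y in range(0, height, tile_size)
        for x in range(0, width, tile_size)
    ]


def render_tile(tile):
    """Hypothetical placeholder for the per-tile sweep.

    A real renderer would sweep the grid and composite into the pixels
    of this tile; here we only return the number of pixels covered.
    """
    x0, y0, x1, y1 = tile
    return (x1 - x0) * (y1 - y0)


def render_parallel(width, height, tile_size, num_workers):
    """Render all tiles with dynamic (next-available-tile) scheduling.

    Returns one list per worker of (tile, result) pairs.
    """
    pending = Queue()
    for tile in make_tiles(width, height, tile_size):
        pending.put(tile)

    def worker():
        done = []
        while True:
            try:
                tile = pending.get_nowait()  # grab the next available tile
            except Empty:
                return done  # no more tiles to render
            done.append((tile, render_tile(tile)))

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker) for _ in range(num_workers)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    per_worker = render_parallel(1024, 1024, 128, 4)
    print([len(tiles) for tiles in per_worker])
```

Because tiles are claimed on demand rather than assigned up front, a worker that draws cheap tiles simply takes more of them, which is the usual motivation for using more tiles than processors. The sketch uses Python threads only to show the control flow; it makes no claim about performance.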
Keywords: Data Cache, Load Imbalance, Cache Coherence, Parallel Efficiency, Irregular Grid