Abstract
Modern Graphics Processing Units (GPUs) consist of several SIMD processors and thus provide a high degree of parallelism at low cost. We introduce a new approach for systematically developing parallel image reconstruction algorithms for GPUs from their parallel equivalents for distributed-memory machines. We use High-Level Petri Nets (HLPN) to describe the parallel implementations for distributed-memory machines intuitively. By annotating the functions of the HLPN with memory requirements and information about data distribution, we are able to identify parallel functions that can be implemented efficiently on the GPU. For an important iterative medical image reconstruction algorithm, the list-mode OSEM algorithm, we demonstrate the limitations of its distributed-memory implementation and show how our HLPN-based approach leads to a fast GPU implementation that is reusable across different medical imaging devices.
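The abstract does not spell out the list-mode OSEM algorithm itself. For orientation only, the following is a minimal sequential sketch of the commonly cited update rule (forward-project each recorded event, back-project the reciprocal, then rescale the image voxel-wise). It is not the implementation described in the paper: the function name `osem_pass`, the sparse-row representation of events, the interleaved subset split, and the `sensitivity` vector are all assumptions made for illustration.

```python
def osem_pass(events, f, sensitivity, num_subsets):
    """Run one pass of a toy list-mode OSEM update over all events.

    events      -- list of sparse system-matrix rows, one per recorded event,
                   each a dict {voxel_index: weight} (hypothetical layout)
    f           -- current image estimate, list of floats
    sensitivity -- per-voxel normalization factors, list of positive floats
    num_subsets -- number of event subsets processed one after another
    """
    f = list(f)  # work on a copy so the caller's estimate is not modified
    for s in range(num_subsets):
        # Interleaved split of the event list (an assumption of this sketch).
        subset = events[s::num_subsets]
        c = [0.0] * len(f)
        for row in subset:
            # Forward projection: expected contribution of f along this event.
            fp = sum(w * f[j] for j, w in row.items())
            if fp <= 0.0:
                continue  # skip events that hit only empty voxels
            # Back projection of the reciprocal onto the voxels of this event.
            for j, w in row.items():
                c[j] += w / fp
        # Multiplicative, voxel-wise update with the accumulated correction.
        f = [f[j] * c[j] / sensitivity[j] for j in range(len(f))]
    return f
```

The loop over the events of one subset is the part with independent iterations, since each event only reads the current estimate before contributing to `c`. The accumulation into `c` and the dependence of each subset on the result of the previous one are the points where a parallel version needs coordination.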
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schellmann, M., Vörding, J., Gorlatch, S. (2008). Systematic Parallelization of Medical Image Reconstruction for Graphics Hardware. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_86
DOI: https://doi.org/10.1007/978-3-540-85451-7_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science (R0)