Memory Locality Exploitation Strategies for FFT on the CUDA Architecture

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5336)

Abstract

Modern graphics processing units (GPUs) are becoming increasingly suitable for general-purpose computing owing to their growing computational power. These commodity processors generally follow a parallel SIMD execution model whose efficiency depends, among other factors, on proper exploitation of the explicit memory hierarchy. In this paper we analyze the implementation of the Fast Fourier Transform (FFT) using the programming model of the Compute Unified Device Architecture (CUDA), recently released by NVIDIA for its new graphics platforms. Within this model we propose an FFT implementation that takes into account memory reference locality, which is crucial for achieving high execution performance. The proposal has been experimentally tested and compared with other well-known approaches, such as the manufacturer’s FFT library.


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gutierrez, E., Romero, S., Trenas, M.A., Zapata, E.L. (2008). Memory Locality Exploitation Strategies for FFT on the CUDA Architecture. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_39

  • DOI: https://doi.org/10.1007/978-3-540-92859-1_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92858-4

  • Online ISBN: 978-3-540-92859-1

  • eBook Packages: Computer Science, Computer Science (R0)