Abstract
In this paper, the authors present a Fast Fourier Transform (FFT) partitioning scheme aimed at improving the efficiency of computing the transform on graphics processing units (GPUs). The radix-2 decimation-in-time (DIT) FFT algorithm is chosen as the base procedure and is then partitioned into subtransform blocks of arbitrary sizes. This allows GPU resources to be distributed differently during the computation and can thus improve the overall FFT execution time on selected consumer-segment GPU models. The conducted experiments show that, for the chosen GPU architectures running in the single instruction, multiple thread (SIMT) mode of operation, partitioning the FFT into 4-point and 8-point subtransforms computed sequentially within an individual thread, instead of computing it with standard 2-point butterfly operations, significantly reduces the FFT computation time. The presented scheme is general and can be used to partition the FFT into subtransform blocks of arbitrary size in order to fine-tune its time efficiency for particular GPU architectures.
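The partitioning idea can be illustrated with a minimal sequential sketch. It is not the authors' GPU implementation: the function names (`partitioned_fft`, `naive_dft`, `bit_reverse`) and the `stages_per_block` parameter are hypothetical, and an ordinary Python loop iteration stands in for one GPU thread. Under these assumptions, each "thread" loads 2^k elements, runs k consecutive radix-2 DIT stages on them locally, and writes the results back, so k = 2 corresponds to 4-point subtransforms and k = 3 to 8-point subtransforms.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def partitioned_fft(x, stages_per_block=2):
    """Radix-2 DIT FFT whose stages are grouped into blocks of k stages.

    Illustrative assumption: one iteration of the `base` loop plays the
    role of one GPU thread computing a 2**k-point subtransform.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    total = n.bit_length() - 1  # number of radix-2 stages

    # Bit-reversal permutation required by the in-place DIT algorithm.
    data = [0j] * n
    for i, v in enumerate(x):
        data[bit_reverse(i, total)] = complex(v)

    s0 = 0  # index of the first stage in the current block
    while s0 < total:
        k = min(stages_per_block, total - s0)  # last block may be smaller
        mask = ((1 << k) - 1) << s0
        for base in range(n):
            if base & mask:
                continue  # not the first element of a subtransform
            # Global indices of the 2**k elements owned by this "thread".
            idx = [base | (j << s0) for j in range(1 << k)]
            local = [data[i] for i in idx]  # stand-in for registers
            for t in range(k):  # k butterfly stages, done sequentially
                half = 1 << (s0 + t)
                for a in range(1 << k):
                    if a & (1 << t):
                        continue
                    b = a | (1 << t)
                    # Twiddle factor of the global stage s0 + t.
                    w = cmath.exp(-2j * cmath.pi * (idx[a] % half) / (2 * half))
                    u, v = local[a], w * local[b]
                    local[a], local[b] = u + v, u - v
            for i, v in zip(idx, local):
                data[i] = v
        s0 += k
    return data


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the sketch."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * q * m / n) for m in range(n))
        for q in range(n)
    ]
```

Calling `partitioned_fft(x, 1)` reduces to the standard sequence of 2-point butterflies, while `partitioned_fft(x, 2)` and `partitioned_fft(x, 3)` group the same arithmetic into 4-point and 8-point subtransforms; all three should agree with `naive_dft(x)` up to rounding error. The sketch shows only how the work is regrouped, not the timing effect reported in the abstract, which depends on the GPU.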
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Stokfiszewski, K., Wieloch, K., Yatsymirskyy, M. (2018). The Fast Fourier Transform Partitioning Scheme for GPU’s Computation Effectiveness Improvement. In: Shakhovska, N., Stepashko, V. (eds) Advances in Intelligent Systems and Computing II. CSIT 2017. Advances in Intelligent Systems and Computing, vol 689. Springer, Cham. https://doi.org/10.1007/978-3-319-70581-1_36
DOI: https://doi.org/10.1007/978-3-319-70581-1_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70580-4
Online ISBN: 978-3-319-70581-1
eBook Packages: Engineering, Engineering (R0)