Abstract
Small prime-sized discrete Fourier transforms appear in applications in quantum mechanics, materials science, and machine learning. For such problem sizes, the discrete Fourier transform is typically implemented as a cyclic convolution using algorithms such as Rader's or Bluestein's. However, these approaches incur extra computation and expensive data movement. In this work, we present an alternative method that casts the Fourier transform as a direct symmetric matrix-vector multiplication. By exploiting the symmetries of the Fourier matrix and applying techniques from dense linear algebra, we obtain an implementation that performs less computation and uses less memory. We show that this approach achieves up to 2x performance gains on Intel and AMD architectures compared to the implementations in Intel MKL and FFTW, which use Rader and Bluestein.
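To illustrate the idea of a direct, symmetry-aware matrix-vector product, the following minimal sketch uses two well-known properties of the Fourier matrix for odd n: column n-j is the complex conjugate of column j, and row n-k is the complex conjugate of row k. It is not the paper's implementation, which the abstract does not detail. The function names `dft_naive` and `dft_symmetric` are hypothetical and chosen only for this example.

```python
import cmath
import math


def dft_naive(x):
    """Reference DFT: full n-by-n matrix-vector product."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def dft_symmetric(x):
    """DFT for odd n (e.g. a prime n > 2) using conjugate symmetry.

    Illustrative sketch only: about a quarter of the Fourier matrix
    entries are evaluated, and each real cos/sin value is reused for
    two outputs.
    """
    n = len(x)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd transform size")
    h = (n - 1) // 2

    # Fold the input: columns j and n-j of the matrix are conjugates,
    # so they share one cosine and one sine coefficient.
    plus = [x[j] + x[n - j] for j in range(1, h + 1)]
    minus = [x[j] - x[n - j] for j in range(1, h + 1)]

    y = [0j] * n
    y[0] = x[0] + sum(plus)
    for k in range(1, h + 1):
        c = x[0]  # cosine-weighted accumulation
        s = 0j    # sine-weighted accumulation
        for j in range(1, h + 1):
            theta = 2.0 * math.pi * ((k * j) % n) / n
            c += math.cos(theta) * plus[j - 1]
            s += math.sin(theta) * minus[j - 1]
        # Rows k and n-k are conjugates, so both outputs come from
        # the same two partial sums.
        y[k] = c - 1j * s
        y[n - k] = c + 1j * s
    return y
```

For a size-7 input, `dft_symmetric(x)` should agree with `dft_naive(x)` up to floating-point rounding. A performance-oriented version would precompute the cosine and sine tables and vectorize the inner loop.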
References
Bluestein, L.: A linear filtering approach to the computation of discrete Fourier transform. IEEE Trans. Audio Electroacoust. 18, 451–455 (1970)
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19, 297–301 (1965)
Frigo, M., Johnson, S.G.: The design and implementation of FFTW3. Proc. IEEE 93(2), 216–231 (2005). Special issue on “Program Generation, Optimization, and Adaptation”
Intel: Math Kernel Library (2018). http://developer.intel.com/software/products/mkl/
Lebensohn, R.A., Kanjarla, A.K., Eisenlohr, P.: An elasto-viscoplastic formulation based on fast Fourier transforms for the prediction of micromechanical fields in polycrystalline materials. Int. J. Plast. 32, 59–69 (2012)
Popovici, D., Franchetti, F., Low, T.M.: Mixed data layout kernels for vectorized complex arithmetic. In: 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 (2017)
Popovici, D.T., Russell, F.P., Wilkinson, K., Skylaris, C.K., Kelly, P.H., Franchetti, F.: Generating optimized Fourier interpolation routines for density functional theory using SPIRAL. In: 2015 IEEE International Parallel and Distributed Processing Symposium, pp. 743–752. IEEE (2015)
Rader, C.M.: Discrete Fourier transforms when the number of data samples is prime. Proc. IEEE 56, 1107–1108 (1968)
Skylaris, C.K., Haynes, P.D., Mostofi, A.A., Payne, M.C.: Introducing ONETEP: linear-scaling density functional simulations on parallel computers. J. Chem. Phys. 122, 084119 (2005)
Vasilache, N., Johnson, J., Mathieu, M., Chintala, S., Piantino, S., LeCun, Y.: Fast convolutional nets with fbfft: A GPU performance evaluation. arXiv preprint arXiv:1412.7580 (2014)
Veras, R., Popovici, D.T., Low, T.M., Franchetti, F.: Compilers, hands-off my hands-on optimizations. In: Proceedings of the 3rd Workshop on Programming Models for SIMD/Vector Processing, WPMVP 2016, pp. 4:1–4:8 (2016). https://doi.org/10.1145/2870650.2870654
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Popovici, D.T., Parikh, D.N., Spampinato, D.G., Low, T.M. (2020). Exploiting Symmetries of Small Prime-Sized DFTs. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12043. Springer, Cham. https://doi.org/10.1007/978-3-030-43229-4_15
DOI: https://doi.org/10.1007/978-3-030-43229-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43228-7
Online ISBN: 978-3-030-43229-4
eBook Packages: Computer Science (R0)