Spherical Harmonic Transform with GPUs

  • Ioan Ovidiu Hupca
  • Joel Falcou
  • Laura Grigori
  • Radek Stompor
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7155)


We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus on the two major sequential steps involved in the transform's computation, retaining the efficient parallel framework of the original code. We detail the optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single processor or 4 processors. In particular, we find that the latest generation of GPUs, such as the NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 times with respect to S2HAT on 4 cores, with the overall performance being limited by the fast Fourier transforms. The work presented here has been performed in the context of Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.
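As a reading aid, the two sequential steps can be written out using the standard separation of the inverse transform. This is a sketch under assumptions: the abstract does not name the steps, so identifying them as a Legendre-function sum followed by a Fourier sum is an inference from the remark that the fast Fourier transforms limit performance. The symbols (band limit \ell_{\max}, coefficients a_{\ell m}, normalized associated Legendre functions \tilde{P}_{\ell m}, intermediate array \Delta_m) are notation chosen here, not taken from the paper.

```latex
% Inverse transform: harmonic coefficients a_{\ell m} -> map s(\theta, \phi),
% with Y_{\ell m}(\theta,\phi) = \tilde{P}_{\ell m}(\cos\theta)\, e^{i m \phi}.
% Step 1 (assumed): for each ring \theta and each m, sum over \ell.
\Delta_m(\theta) = \sum_{\ell=|m|}^{\ell_{\max}} a_{\ell m}\, \tilde{P}_{\ell m}(\cos\theta)
% Step 2 (assumed): for each ring \theta, a Fourier sum over m, computed with an FFT.
s(\theta,\phi) = \sum_{m=-\ell_{\max}}^{\ell_{\max}} \Delta_m(\theta)\, e^{i m \phi}
```

In this form the first step costs on the order of \ell_{\max}^2 operations per ring, while the second is a one-dimensional FFT per ring.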


Keywords: Spherical Harmonic Transform, NVIDIA CUDA GPU, Cosmic Microwave Background





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ioan Ovidiu Hupca (1, 3)
  • Joel Falcou (1, 3)
  • Laura Grigori (1, 3)
  • Radek Stompor (2)

  1. LRI - INRIA Saclay-Ile de France
  2. Astroparticule et Cosmologie, CNRS, Université Paris Diderot, Paris, France
  3. Université Paris-Sud 11, Orsay, France
