A Volumetric FFT for BlueGene/L

  • Maria Eleftheriou
  • José E. Moreira
  • Blake G. Fitch
  • Robert S. Germain
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2913)


BlueGene/L is a massively parallel supercomputer organized as a three-dimensional torus of compute nodes. A fundamental challenge in harnessing its computational capabilities is the design and implementation of numerical algorithms that scale effectively to thousands of nodes. A computational kernel of particular importance is the Fast Fourier Transform (FFT) of three-dimensional data. In this paper, we present our approach to producing a scalable FFT implementation on BlueGene/L. We rely on a volume decomposition of the data to take advantage of the toroidal communication topology. We present experimental results from an MPI-based implementation of our algorithm, which lets us test the basic tenets behind our decomposition and experiment on existing platforms. Our preliminary results indicate that the algorithm scales well on as many as 512 nodes for three-dimensional FFTs of size 128 × 128 × 128.
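As an illustration of what a volume decomposition means for the problem size quoted above, the following minimal sketch maps a 128 × 128 × 128 mesh onto a three-dimensional grid of nodes, so that each node owns one contiguous sub-volume. The 8 × 8 × 8 node arrangement and the helper names (`block_bounds`, `owner_of`, `local_volume`, `points_per_node`) are assumptions made for this example; the abstract states only the mesh size and the node count of 512, and this is not the authors' implementation.

```python
# Sketch of a volume (3D block) decomposition of a mesh over a 3D node grid.
# Assumption: 512 nodes arranged as 8 x 8 x 8; the abstract gives only the count.

MESH = (128, 128, 128)  # mesh size from the abstract
GRID = (8, 8, 8)        # hypothetical node arrangement (8 * 8 * 8 = 512)


def block_bounds(n, p, coord):
    """Half-open index range [lo, hi) owned by position `coord` of `p` along one axis."""
    if n % p != 0:
        raise ValueError("this sketch assumes the mesh divides evenly among nodes")
    size = n // p
    return coord * size, (coord + 1) * size


def owner_of(point, mesh=MESH, grid=GRID):
    """Node coordinates (px, py, pz) that own mesh point (i, j, k)."""
    return tuple(idx // (n // p) for idx, n, p in zip(point, mesh, grid))


def local_volume(node, mesh=MESH, grid=GRID):
    """Per-axis index ranges of the sub-volume held by `node`."""
    return tuple(block_bounds(n, p, c) for n, p, c in zip(mesh, grid, node))


def points_per_node(mesh=MESH, grid=GRID):
    """Number of mesh points stored on each node."""
    count = 1
    for n, p in zip(mesh, grid):
        count *= n // p
    return count


if __name__ == "__main__":
    print(owner_of((127, 16, 15)))
    print(local_volume((7, 1, 0)))
    print(points_per_node())
```

Under these assumptions each node holds a 16 × 16 × 16 block of 4096 points. No node then owns a complete line of the mesh along any axis, so each one-dimensional transform stage requires data exchange among the nodes that share a row of the grid.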


Fast Fourier Transform, Active Message, Task Count, Local Array, Active Packet
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Maria Eleftheriou (1)
  • José E. Moreira (1)
  • Blake G. Fitch (1)
  • Robert S. Germain (1)

  1. IBM Thomas J. Watson Research Center, Yorktown Heights, USA
