1-Dimensional parallel FFT benchmark on SUPRENUM
A vectorised, distributed-memory 1-dimensional FFT benchmark is first presented, and its performance results on Suprenum are given and discussed. A performance analysis of the benchmark is then carried out, and Hockney's performance parameters (r_∞ and n_1/2) are used to derive a performance formula, which is shown to fit the experimental results very well. A generalization of the analysis to uniformly distributed applications is also discussed, together with some important characteristics such as the calculation/communication ratio, the fit of the application to the architecture, the average message length and the average vector length.
Keywords: distributed memory FFT benchmark; performance formulae and parameters; Suprenum performance
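As a rough illustration of the two Hockney parameters named in the abstract, the sketch below uses the standard two-parameter model, in which an operation on a vector of length n takes t = (n + n_1/2) / r_∞, so the achieved rate is r = r_∞ / (1 + n_1/2 / n). The function names and the numerical values are assumptions chosen for illustration only; they are not Suprenum measurements and this is not the performance formula derived in the paper.

```python
def hockney_time(n, r_inf, n_half):
    """Time for a vector operation of length n: t = (n + n_half) / r_inf."""
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Achieved rate r = n / t = r_inf / (1 + n_half / n)."""
    return r_inf / (1.0 + n_half / n)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not measured on Suprenum).
    r_inf = 10.0e6   # asymptotic rate, operations per second
    n_half = 50.0    # vector length at which half of r_inf is achieved
    for n in (10, 50, 500, 5000):
        fraction = hockney_rate(n, r_inf, n_half) / r_inf
        print(f"n = {n:5d}: fraction of r_inf achieved = {fraction:.3f}")
```

By construction, the rate at n = n_1/2 is exactly half of r_∞, which is why short average vector lengths (or short average message lengths, when the same model is applied to communication) can dominate the achieved performance.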
- [Amd67] G.M. Amdahl. The validity of the single processor approach to achieving large scale computing capabilities. AFIPS Conference Proceedings, 30:483–485, 1967. 1967 Spring Joint Computer Conference, Thompson Books and Academic Press.
- [CT65] J.W. Cooley and J.W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965.
- [HJ88] R.W. Hockney and C.R. Jesshope. Parallel Computers 2: Architecture, Programming and Algorithms. Adam Hilger, Bristol, 1988.
- [Hoc91] R.W. Hockney. A framework for benchmark performance analysis. In Proceedings of the 2nd Euroben Workshop, September 1991. To appear in Supercomputer (ASFRA, The Netherlands).
- [Law75] D.H. Lawrie. Access and alignment of data in an array processor. IEEE Trans. on Computers, 24(12):1145–1155, December 1975.
- [Pea77] M.C. Pease. The indirect binary n-cube microprocessor array. IEEE Trans. on Computers, 26(5):458–473, May 1977.