BLAS-3 for the Quadrics parallel computer

Lippert, Th.; Petkov, N.; Schilling, K.

doi:10.1007/BFb0031605

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1225)

Included in the following conference series:

International Conference on High-Performance Computing and Networking

Abstract

A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. The method makes it possible to implement an efficient BLAS library on the Italian APE100/Quadrics SISAMD massively parallel computer, for which scalable parallel BLAS-3 routines were not previously available. The proposed approach is based on a one-dimensional ring connectivity, and the flow of data is hyper-systolic. The communication overhead is competitive with that of established algorithms for SIMD and MIMD machines. The advantages are that (i) the layout of the matrices is preserved during the computation, (ii) BLAS-2 routines fit well into this layout, and (iii) indexed addressing is avoided, which renders the algorithm suitable for SISAMD machines and hence for all other types of parallel computers. On the APE100/Quadrics, nearly 25% of the peak performance is achieved for multiplications of complex matrices.
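
To illustrate what matrix multiplication over a one-dimensional ring looks like, the following is a minimal sketch of an ordinary systolic ring scheme, simulated sequentially in Python. It is not the hyper-systolic algorithm of the paper, whose data flow is not spelled out in the abstract, and unlike that algorithm it does use a node-dependent column offset into the local slab of A. The function names `matmul` and `ring_matmul`, the row-block distribution, and the assumption that A is square with a dimension divisible by the number of nodes `p` are all illustrative choices, not taken from the paper. The sketch does show one property named in the abstract: after `p` shifts every slab of B is back on its home node, so the matrix layout is preserved.

```python
def matmul(X, Y):
    """Plain dense product of two list-of-lists matrices (reference helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def ring_matmul(A, B, p):
    """Simulate C = A @ B on a ring of p nodes (hypothetical helper).

    Assumes A is n x n, B is n x m, and p divides n.
    Node i owns rows [i*b, (i+1)*b) of A, B and C, with b = n // p.
    """
    n = len(A)
    assert n % p == 0, "sketch assumes p divides the matrix dimension"
    b = n // p
    m = len(B[0])

    # Row-block distribution: one slab of A, B and C per node.
    A_loc = [A[i * b:(i + 1) * b] for i in range(p)]
    B_loc = [B[i * b:(i + 1) * b] for i in range(p)]
    C_loc = [[[0] * m for _ in range(b)] for _ in range(p)]

    for step in range(p):
        for i in range(p):
            k = (i + step) % p  # home node of the B slab currently on node i
            # Columns of the local A slab that pair with B slab k.
            A_cols = [row[k * b:(k + 1) * b] for row in A_loc[i]]
            partial = matmul(A_cols, B_loc[i])
            for r in range(b):
                for c in range(m):
                    C_loc[i][r][c] += partial[r][c]
        # Ring shift: node i receives the slab held by node i + 1.
        B_loc = B_loc[1:] + B_loc[:1]

    # After p shifts every B slab is back on its home node.
    assert B_loc == [B[i * b:(i + 1) * b] for i in range(p)]
    return [row for slab in C_loc for row in slab]


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0], [2, 1], [0, 3], [4, 1]]
assert ring_matmul(A, B, 2) == matmul(A, B)
```

In this plain scheme each node needs `p` shifts of a B slab per multiplication; the abstract's claim is that the hyper-systolic variant keeps the communication overhead competitive with established SIMD and MIMD algorithms while avoiding the indexed addressing used here.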

Author information

Authors and Affiliations

HLRZ, c/o KFA-Jülich, D-52425 Jülich, Germany
Th. Lippert & K. Schilling
Centre for High Performance Computing and Institute of Mathematics and Computing Science, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands
N. Petkov
Department of Physics, University of Wuppertal, 42097 Wuppertal, Germany
K. Schilling

Editor information

Bob Hertzberger, Peter Sloot

Cite this paper

Lippert, T., Petkov, N., Schilling, K. (1997). BLAS-3 for the quadrics parallel computer. In: Hertzberger, B., Sloot, P. (eds) High-Performance Computing and Networking. HPCN-Europe 1997. Lecture Notes in Computer Science, vol 1225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0031605

DOI: https://doi.org/10.1007/BFb0031605
Published: 25 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62898-9
Online ISBN: 978-3-540-69041-2
eBook Packages: Springer Book Archive
