Scientific software libraries for scalable architectures

Johnsson, S. Lennart; Mathur, Kapil K.

doi:10.1007/BFb0030160

S. Lennart Johnsson¹,² & Kapil K. Mathur¹

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 879)

Included in the following conference series:

International Workshop on Applied Parallel Computing

Abstract

Massively parallel processors introduce new demands on software systems with respect to performance, scalability, robustness and portability. The increased complexity of the memory systems and the increased range of problem sizes for which a given piece of software is used pose serious challenges to software developers. The Connection Machine Scientific Software Library, CMSSL, uses several novel techniques to meet these challenges. The CMSSL contains routines for managing the data distribution and provides functionality that is independent of the data distribution. High performance is achieved through careful scheduling of arithmetic operations and data motion, and through the automatic selection of algorithms at run-time. We discuss some of the techniques used, and provide evidence that CMSSL has reached the goals of performance and scalability for an important set of applications.
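The idea of selecting an algorithm automatically at run-time can be illustrated with a small sketch. It is not CMSSL code and does not reflect the CMSSL interface: the names `Candidate`, `select` and `multiply`, the two matrix multiplication variants, and the toy cost model with its `OVERHEAD` constant are all assumptions made for illustration. The sketch estimates a cost for each variant from the operand shapes (arithmetic plus a per-step overhead standing in for data motion) and dispatches to the cheaper one.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Matrix = List[List[float]]

# Hypothetical per-step overhead standing in for data motion cost.
OVERHEAD = 10


def matmul_inner(a: Matrix, b: Matrix) -> Matrix:
    """Inner-product formulation: one reduction per output element."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_outer(a: Matrix, b: Matrix) -> Matrix:
    """Outer-product formulation: one rank-1 update per inner index."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for p in range(k):
        for i in range(n):
            aip = a[i][p]
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c


@dataclass
class Candidate:
    name: str
    run: Callable[[Matrix, Matrix], Matrix]
    cost: Callable[[int, int, int], float]  # toy cost model, an assumption


CANDIDATES = [
    # n*k*m multiply-adds plus one overhead unit per reduction.
    Candidate("inner", matmul_inner, lambda n, k, m: n * k * m + OVERHEAD * n * m),
    # n*k*m multiply-adds plus one overhead unit per rank-1 update.
    Candidate("outer", matmul_outer, lambda n, k, m: n * k * m + OVERHEAD * k),
]


def select(n: int, k: int, m: int) -> Candidate:
    """Pick the candidate with the lowest estimated cost for this shape."""
    return min(CANDIDATES, key=lambda c: c.cost(n, k, m))


def multiply(a: Matrix, b: Matrix) -> Tuple[str, Matrix]:
    """Multiply a (n x k) by b (k x m), choosing the variant at run-time."""
    n, k, m = len(a), len(b), len(b[0])
    chosen = select(n, k, m)
    return chosen.name, chosen.run(a, b)
```

With this toy model a long dot product, such as a 1 x 3 row times a 3 x 1 column, is routed to the inner-product variant, while a 2 x 1 column times a 1 x 2 row is routed to the outer-product variant. The caller sees a single `multiply` entry point in both cases.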

Author information

Authors and Affiliations

Thinking Machines Corp., USA
S. Lennart Johnsson & Kapil K. Mathur
Harvard University, USA
S. Lennart Johnsson

Editor information

Jack Dongarra, Jerzy Waśniewski

About this paper

Cite this paper

Johnsson, S.L., Mathur, K.K. (1994). Scientific software libraries for scalable architectures. In: Dongarra, J., Waśniewski, J. (eds) Parallel Scientific Computing. PARA 1994. Lecture Notes in Computer Science, vol 879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030160

DOI: https://doi.org/10.1007/BFb0030160
Published: 17 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58712-5
Online ISBN: 978-3-540-49050-0
eBook Packages: Springer Book Archive
