Scalable parallel sparse factorization with left-right looking strategy on shared memory multiprocessors

  • Olaf Schenk
  • Klaus Gärtner
  • Wolfgang Fichtner
Track C2: Computational Science
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1593)


An efficient sparse LU factorization algorithm for popular shared memory multiprocessors is presented. Interprocess communication is critically important on these architectures, so the algorithm introduces only O(n) synchronization events. No global barrier is used, and a completely asynchronous scheduling scheme is a central point of the implementation. The algorithm aims to optimize single-node performance and to minimize communication overhead. It has been successfully tested on SUN Enterprise, DEC AlphaServer, SGI Origin 2000, Cray T90, J90, and NEC SX-4 parallel computers, delivering up to 2.3 GFlop/s on an eight-processor DEC AlphaServer for medium-size semiconductor device simulations and structural engineering problems.
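The abstract does not spell out the scheduling scheme, so the following is only a minimal sketch of one common way to get barrier-free, asynchronous scheduling with a linear number of synchronization events; it is not the authors' implementation. It assumes the dependencies form a forest (such as an elimination tree) given as a hypothetical `parent` array, and that `work(j)` stands in for whatever numerical step is applied to node `j`. The names `run_async`, `parent`, `work`, and `num_workers` are illustrative.

```python
import queue
import threading


def run_async(parent, work, num_workers=4):
    """Run work(j) for every node j of a forest, children before parents.

    parent[j] is the parent of node j, or -1 for a root (assumed input
    format). Each finished node performs exactly one locked counter update,
    so the number of synchronization events grows linearly with the number
    of nodes. Workers never wait at a global barrier between nodes.
    """
    n = len(parent)
    if n == 0:
        return []

    pending = [0] * n                 # unfinished children per node
    for p in parent:
        if p >= 0:
            pending[p] += 1

    lock = threading.Lock()           # protects pending and order
    ready = queue.Queue()             # nodes whose children are all done
    for j in range(n):
        if pending[j] == 0:
            ready.put(j)
    order = []                        # completion order, for inspection

    def worker():
        while True:
            j = ready.get()
            if j is None:             # sentinel: all nodes are finished
                return
            work(j)
            with lock:
                order.append(j)
                done = len(order) == n
                p = parent[j]
                release = False
                if p >= 0:
                    pending[p] -= 1
                    release = pending[p] == 0
            if release:
                ready.put(p)          # parent becomes runnable immediately
            if done:
                for _ in range(num_workers):
                    ready.put(None)   # wake every worker so it can exit

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                      # only waits for overall completion
    return order
```

For example, `run_async([2, 2, 4, 4, -1], lambda j: None)` processes the leaves 0, 1, and 3 in some order, then node 2 once both of its children are done, and finally the root 4. A node is released as soon as its own children finish, regardless of progress elsewhere in the tree, which is the property that makes a global barrier unnecessary in this sketch.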





Copyright information

© Springer-Verlag 1999

Authors and Affiliations

  • Olaf Schenk
  • Klaus Gärtner (2)
  • Wolfgang Fichtner (1)
  1. Integrated Systems Laboratory, Swiss Federal Institute of Technology Zurich (ETH Zurich), Zurich, Switzerland
  2. Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany
