New Generalized Data Structures for Matrices Lead to a Variety of High-Performance Algorithms

  • Conference paper
Simulation and Visualization on the Grid

Part of the book series: Lecture Notes in Computational Science and Engineering (LNCSE, volume 13)

Abstract

We describe new data structures for full and packed storage of dense symmetric/triangular arrays that generalize both. Using the new data structures, one is led to several new algorithms that save “half” the storage and outperform the current block-based level-3 algorithms in LAPACK. We concentrate on the simplest forms of the new algorithms and show that for Cholesky factorization they are a direct generalization of LINPACK. This means that level-3 BLAS are not required to obtain level-3 performance. The replacements for the level-3 BLAS are so-called kernel routines, which on IBM platforms can be produced from simple, textbook-style codes by the XLF Fortran compiler; in the sequel we label these “vanilla” codes. For Cholesky on a Power3, whose peak performance is 800 MFlop/s, the new algorithm exceeds 720 MFlop/s for n ≥ 200 and reaches 735 MFlop/s. Using conventional full-format LAPACK DPOTRF with ESSL BLAS, performance first reaches 600 MFlop/s at n ≥ 600 and peaks at only 620 MFlop/s. We have also produced simple square blocked full-matrix data formats where the blocks themselves are stored in column-major (Fortran) order or row-major (C) order. The simple LU factorization algorithm with partial pivoting for this new data format is a direct generalization of the LINPACK algorithm DGEFA. Again, no conventional level-3 BLAS are required; the replacements are again so-called kernel routines. Programming for square blocked full-matrix format can be accomplished in standard Fortran through the use of three- and four-dimensional arrays; thus, no new compiler support is necessary. Finally, we mention that other, more complicated algorithms are possible, for example recursive ones. The recursive algorithms are also easily programmed via the use of tables that address where the blocks are stored in the two-dimensional recursive block array.
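As a concrete illustration of the square blocked full-matrix format, the short standard-Fortran sketch below (our own illustration, not code from the paper; the names b, nbk, a, and ab are hypothetical) copies an order-n matrix into a four-dimensional array so that each b × b block is a contiguous column-major tile:

! Minimal sketch, assuming a square matrix whose order n is an exact
! multiple of the block size b; names here are illustrative only.
program square_blocked_demo
  implicit none
  integer, parameter :: b   = 4          ! block size
  integer, parameter :: nbk = 3          ! blocks per dimension
  integer, parameter :: n   = b*nbk      ! matrix order
  double precision :: a(n, n)            ! conventional full column-major format
  double precision :: ab(b, b, nbk, nbk) ! square blocked full-matrix format
  integer :: i, j

  ! Fill a with arbitrary data.
  do j = 1, n
     do i = 1, n
        a(i, j) = dble(i) + dble(j)/dble(n)
     end do
  end do

  ! Convert full format to square blocked format: element (i,j) lands
  ! in block ((i-1)/b+1, (j-1)/b+1) at local position
  ! (mod(i-1,b)+1, mod(j-1,b)+1).
  do j = 1, n
     do i = 1, n
        ab(mod(i-1,b)+1, mod(j-1,b)+1, (i-1)/b+1, (j-1)/b+1) = a(i, j)
     end do
  end do

  ! Spot check: a(6,7) lives in block (2,2) at local position (2,3).
  print *, 'a(6,7) =', a(6,7), ' blocked copy =', ab(2, 3, 2, 2)
end program square_blocked_demo

Because Fortran stores arrays in column-major order, declaring the tile dimensions first makes each ab(:,:,ib,jb) a contiguous b × b tile. A “vanilla” kernel routine applied tile by tile therefore operates on cache-resident, contiguous data, which suggests how level-3 performance can be obtained without level-3 BLAS.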





Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gustavson, F.G. (2000). New Generalized Data Structures for Matrices Lead to a Variety of High-Performance Algorithms. In: Engquist, B., Johnsson, L., Hammill, M., Short, F. (eds) Simulation and Visualization on the Grid. Lecture Notes in Computational Science and Engineering, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57313-2_4

  • DOI: https://doi.org/10.1007/978-3-642-57313-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67264-7

  • Online ISBN: 978-3-642-57313-2
