An hierarchical approach for performance analysis of ScaLAPACK-based routines using the distributed linear algebra machine

  • Conference paper
Applied Parallel Computing Industrial Computation and Optimization (PARA 1996)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1184)

Abstract

Performance models are important in the design and analysis of linear algebra software for scalable high-performance computer systems. They can be used to estimate the overhead in a parallel algorithm and to measure the impact of machine characteristics and block sizes on execution time. We present a hierarchical approach to designing performance models for parallel linear algebra algorithms, based on a parallel machine model and the hierarchical structure of the ScaLAPACK library. This structure suggests three levels of performance models, corresponding to existing ScaLAPACK routines. As a proof of concept, we present a performance model of the high-level QR factorization routine pdgeqrf. We also derive performance models of lower-level ScaLAPACK building blocks such as pdgeqr2, pdlarft, pdlarfb, pdlarfg, pdlarf, pdnrm2, and pdscal, which are used in the high-level model for pdgeqrf. Predicted performance is compared with measurements on an Intel Paragon XP/S system. The accuracy of the top-level model is over 90% for the measured matrix and block sizes and for different process grid configurations.
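
To make the idea of layered performance models concrete, the following is a minimal sketch of how a top-level estimate might be composed from lower-level ones. It is not the model derived in the paper: the `Machine` parameters, the tree-broadcast cost, the leading-order flop counts, and all function names are assumptions chosen for illustration, and a generic latency/bandwidth model stands in for the distributed linear algebra machine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical machine parameters (illustrative, not values from the paper)."""
    t_flop: float     # seconds per floating point operation
    t_startup: float  # message startup latency in seconds
    t_word: float     # transfer time per matrix element in seconds


def t_broadcast(m: Machine, words: float, procs: int) -> float:
    """Lowest level: assumed tree broadcast of `words` elements among `procs` processes."""
    if procs <= 1:
        return 0.0
    return math.ceil(math.log2(procs)) * (m.t_startup + words * m.t_word)


def t_panel(m: Machine, rows: int, nb: int, p_rows: int) -> float:
    """Middle level: unblocked Householder QR of a rows x nb panel on one process column."""
    flops = 2.0 * rows * nb * nb / p_rows      # leading-order count, evenly split (assumption)
    comm = nb * t_broadcast(m, 1.0, p_rows)    # one small collective per column (assumption)
    return flops * m.t_flop + comm


def t_update(m: Machine, rows: int, cols: int, nb: int, p_rows: int, p_cols: int) -> float:
    """Middle level: apply a block reflector of width nb to a rows x cols trailing matrix."""
    flops = 4.0 * rows * cols * nb / (p_rows * p_cols)
    comm = t_broadcast(m, rows * nb / p_rows, p_cols)  # send the panel along process rows
    return flops * m.t_flop + comm


def t_blocked_qr(m: Machine, n_rows: int, n_cols: int, nb: int,
                 p_rows: int, p_cols: int) -> float:
    """Top level: sum the panel and update estimates over all block columns."""
    total = 0.0
    k = min(n_rows, n_cols)
    for j in range(0, k, nb):
        jb = min(nb, k - j)          # width of the current panel
        rows = n_rows - j            # rows still active
        cols = n_cols - j - jb       # trailing columns to update
        total += t_panel(m, rows, jb, p_rows)
        if cols > 0:
            total += t_update(m, rows, cols, jb, p_rows, p_cols)
    return total
```

Calling `t_blocked_qr` with different values of `nb`, `p_rows`, and `p_cols` shows how such a model could be used to explore the effect of block size and process grid shape before running on the target machine. The estimates depend entirely on the assumed parameters.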

Editor information

Jerzy Waśniewski, Jack Dongarra, Kaj Madsen, Dorte Olesen

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dackland, K., Kågström, B. (1996). An hierarchical approach for performance analysis of ScaLAPACK-based routines using the distributed linear algebra machine. In: Waśniewski, J., Dongarra, J., Madsen, K., Olesen, D. (eds) Applied Parallel Computing Industrial Computation and Optimization. PARA 1996. Lecture Notes in Computer Science, vol 1184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62095-8_20

  • DOI: https://doi.org/10.1007/3-540-62095-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62095-2

  • Online ISBN: 978-3-540-49643-4

  • eBook Packages: Springer Book Archive
