Case Study 1: Parallel Fast Givens QR Factorization

Part of the Series in Computer Science book series (SCS)


Although parallel QR factorization has been the topic of much research, available parallel algorithms scale poorly on matrices with dimensions less than 3000. As a consequence, there is little flexibility to meet stringent latency constraints by varying the number of processors. This is particularly true of parallel algorithms based on block cyclic distribution schemes, such as ScaLAPACK’s PDGEQRF (Choi et al., 1995; Blackford et al., 1997). Compounding the scalability problem, block cyclic distribution schemes are often incompatible with the data movement patterns of many applications. Some very recent work on efficient real-time redistribution techniques, however, promises to make these algorithms more attractive for high-performance signal processing applications (Park et al., 1999; Petit and Dongarra, 1999).


Keywords: Execution Time; Shared Memory; Message Passing; Hybrid Version; Minimum Execution Time



Copyright information

© Springer Science+Business Media New York 2003

Authors and Affiliations

  1. Mercury Computer Systems, Inc., Chelmsford, USA
  2. Johns Hopkins University, Baltimore, USA
