Pipelining Wavefront Computations: Experiences and Performance
Wavefront computations are common in scientific applications. Although it is well understood how wavefronts are pipelined for parallel execution, the question remains: How are they best presented to the compiler for the effective generation of pipelined code? We address this question through a quantitative and qualitative study of three approaches to expressing pipelining: programmer implemented via message passing, compiler discovered via automatic parallelization, and programmer defined via explicit parallel language features for pipelining. This work is the first assessment of the efficacy of these approaches in solving wavefront computations, and in the process, we reveal surprising characteristics of commercial compilers. We also demonstrate that a parallel language-level solution simplifies development and consistently performs well.
Keywords: Message Passing Interface; Message Passing; Array Reference; Automatic Parallelization; Stencil Computation
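As a rough illustration of the pattern the abstract refers to, the sketch below shows a two-dimensional wavefront recurrence and the order in which a block-pipelined schedule would visit it. It is not taken from the paper: the recurrence (each cell is the sum of its north and west neighbors, with the boundary set to 1), the function names, and the `procs` and `block` parameters are all assumptions chosen for illustration, and the "processors" are simulated sequentially rather than run in parallel.

```python
def wavefront_serial(n, m):
    """Fill an n x m grid where each cell depends on its north and west neighbors."""
    a = [[1] * m for _ in range(n)]  # assumed boundary values: row 0 and column 0 are 1
    for i in range(1, n):
        for j in range(1, m):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def wavefront_pipelined(n, m, procs, block):
    """Same recurrence, visited in the order a pipelined schedule would use.

    Rows are split into `procs` contiguous strips (one per hypothetical
    processor) and columns into chunks of width `block`. At step t,
    processor p works on column chunk t - p, so strip p starts one step
    after strip p - 1 and the strips overlap once the pipeline has filled.
    Returns the grid and the number of pipeline steps taken.
    """
    a = [[1] * m for _ in range(n)]
    rows_per_proc = (n + procs - 1) // procs
    num_chunks = (m + block - 1) // block
    steps = 0
    for t in range(procs + num_chunks - 1):
        # Within one step, the work of different processors is independent:
        # each reads only cells finished in an earlier step or in its own strip.
        for p in range(procs):
            c = t - p
            if c < 0 or c >= num_chunks:
                continue  # processor p is idle while the pipeline fills or drains
            row_lo = max(1, p * rows_per_proc)
            row_hi = min(n, (p + 1) * rows_per_proc)
            col_lo = max(1, c * block)
            col_hi = min(m, (c + 1) * block)
            for i in range(row_lo, row_hi):
                for j in range(col_lo, col_hi):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
        steps += 1
    return a, steps
```

In this sketch the pipelined version takes `procs + num_chunks - 1` steps, which shows the usual trade-off in choosing `block`: narrower chunks fill the pipeline sooner, but in a real message-passing implementation each chunk boundary would cost one communication.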