Abstract
One of the major challenges facing high-performance computing is producing programs that achieve acceptable performance on parallel architectures. Although many organizations have been working in this area for some time, many programs have yet to be parallelized, and some that were parallelized target systems that are now obsolete; such programs may run poorly, if at all, on the current generation of parallel computers. What is needed is a straightforward approach to parallelizing vectorizable codes that introduces no changes to the algorithm or to the codes' convergence properties. The combination of loop-level parallelism and RISC-based shared-memory SMPs has proven to be a successful approach to this problem.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Pressel, D.M., Sahu, J., Heavey, K.R. (2001). Using Loop-Level Parallelism to Parallelize Vectorizable Programs. In: Mueller, F. (ed.) High-Level Parallel Programming Models and Supportive Environments. HIPS 2001. Lecture Notes in Computer Science, vol. 2026. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45401-2_3
Print ISBN: 978-3-540-41944-0
Online ISBN: 978-3-540-45401-4