
Implementation Issues of Loop-Level Speculative Run-Time Parallelization

  • Devang Patel
  • Lawrence Rauchwerger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1575)

Abstract

Current parallelizing compilers fail to identify a significant fraction of parallelizable loops because those loops have access patterns that are complex or insufficiently defined at compile time. We advocate a novel framework for identifying parallel loops: a loop is speculatively executed as a doall while a fully parallel data dependence test checks for unsatisfied data dependences; if the test fails, the loop is re-executed serially. We present the principles of the design and implementation of a compiler that employs both run-time and static techniques to parallelize dynamic applications. Run-time optimization always represents a tradeoff between a speculative, potential benefit and a guaranteed overhead that must be paid. We introduce techniques that exploit classic compiler methods to reduce the cost of run-time optimization, thus tilting the outcome of speculation in favor of significant performance gains. Experimental results from the PERFECT, SPEC, and NCSA benchmark suites show that these techniques yield speedups not obtainable by any other known method.
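To make the framework concrete, here is a minimal sketch of its three phases: checkpointing and speculative doall execution with shadow-array marking, a fully parallel validity test, and rollback with serial re-execution on misspeculation. It is a deliberately simplified illustration in C with OpenMP, not the compiler's actual implementation; the array A, the index vectors idx_r and idx_w, and the helpers loop_body and run_speculative are hypothetical names, and the test is conservative (it flags any element both read and written, whereas the authors' test is more precise and also supports privatization and reduction parallelization).

    /*
     * Sketch of speculative loop parallelization with a shadow-array
     * dependence test. All identifiers are illustrative assumptions,
     * not names from the paper.
     */
    #include <stdlib.h>
    #include <string.h>

    #define N 1024

    /* One iteration of the candidate loop: reads A[idx_r[i]], writes
     * A[idx_w[i]]. The subscripts come from index arrays, so the access
     * pattern is unknown until run time -- the case the framework targets. */
    static void loop_body(double *A, const int *idx_r, const int *idx_w, int i)
    {
        A[idx_w[i]] = A[idx_r[i]] + 1.0;
    }

    /* Returns 1 if the speculative parallel execution was valid, 0 if the
     * loop was rolled back and re-executed serially. */
    int run_speculative(double *A, const int *idx_r, const int *idx_w)
    {
        double *backup = malloc(N * sizeof *backup);  /* checkpoint for rollback */
        unsigned char *rd = calloc(N, 1);             /* shadow: element was read */
        unsigned char *wr = calloc(N, 1);             /* shadow: element was written */
        memcpy(backup, A, N * sizeof *backup);

        /* Phase 1: speculative doall; each iteration marks the shadow arrays
         * as it touches elements. If the speculation is wrong, iterations may
         * race on A; the checkpoint makes this harmless because A is restored
         * before the serial re-execution. */
        #pragma omp parallel for
        for (int i = 0; i < N; ++i) {
            rd[idx_r[i]] = 1;
            wr[idx_w[i]] = 1;
            loop_body(A, idx_r, idx_w, i);
        }

        /* Phase 2: fully parallel test. Conservatively, any element that was
         * both read and written may carry a cross-iteration dependence. */
        int ok = 1;
        #pragma omp parallel for reduction(&&:ok)
        for (int j = 0; j < N; ++j)
            ok = ok && !(rd[j] && wr[j]);

        /* Phase 3: on failure, restore the checkpoint and re-execute serially. */
        if (!ok) {
            memcpy(A, backup, N * sizeof *backup);
            for (int i = 0; i < N; ++i)
                loop_body(A, idx_r, idx_w, i);
        }

        free(backup); free(rd); free(wr);
        return ok;
    }

In a production scheme each processor would mark private shadow structures that are merged during the test phase, keeping the marking contention-free; the global shadow arrays above merely keep the sketch short.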

Keywords

Data Dependence, Access Pattern, Parallel Loop, Speculative Execution, Loop Parallelization


Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Devang Patel (1)
  • Lawrence Rauchwerger (1)
  1. Dept. of Computer Science, Texas A&M University, College Station
