Abstract
This paper contributes to the theory and practice of automatic extraction of synchronization-free parallelism in nested loops. It extends the iteration-space slicing framework to extract slices described not only by affine (linear) forms but also by non-affine ones. A slice is represented by a set of dependent loop statement instances (iterations) whose dependence graph may have an arbitrary topology. The algorithm generates an outer loop that spawns synchronization-free slices to be executed in parallel; this loop encloses sequential loops that iterate over each slice. Experimental results demonstrate that the generated code is competitive with code generated by state-of-the-art techniques for scanning polyhedra.
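The code shape described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the loop body `a[2*i] = a[i] + 1`, the function names, and the use of a thread pool are all assumptions chosen for illustration. In this loop, iteration `i` writes the element that iteration `2*i` later reads, so each slice is the chain `s, 2s, 4s, ...` starting at an odd source `s`. The iterations of a slice grow exponentially, so they are not an affine function of the position within the slice. The outer loop spawns one task per source, and the inner loop walks that slice sequentially.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(n):
    """Reference loop: iteration i reads a[i] and writes a[2*i]."""
    a = list(range(2 * n + 1))
    for i in range(1, n + 1):
        a[2 * i] = a[i] + 1
    return a


def run_slice(a, source, n):
    """Execute one slice sequentially: iterations source, 2*source, 4*source, ..."""
    i = source
    while i <= n:
        a[2 * i] = a[i] + 1
        i *= 2  # follow the dependence i -> 2*i


def sliced(n, workers=4):
    """Outer parallel loop over slice sources, inner sequential loop per slice."""
    a = list(range(2 * n + 1))
    # Sources are iterations with no predecessor: the odd values of i.
    sources = range(1, n + 1, 2)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Slices touch disjoint array elements, so no synchronization is needed
        # between them; list() forces completion and surfaces any exception.
        list(pool.map(lambda s: run_slice(a, s, n), sources))
    return a
```

Under these assumptions, `sliced(n)` produces the same array as `sequential(n)` for any `n >= 1`, because every dependence stays inside a single slice. A production code generator would emit compiled loops (for example with OpenMP) instead of Python threads; the thread pool here only shows the structure.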
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Beletska, A., Bielecki, W., Cohen, A., Palkowski, M. (2010). Synchronization-Free Automatic Parallelization: Beyond Affine Iteration-Space Slicing. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_16
DOI: https://doi.org/10.1007/978-3-642-13374-9_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13373-2
Online ISBN: 978-3-642-13374-9
eBook Packages: Computer Science (R0)