
Synchronization-Free Automatic Parallelization: Beyond Affine Iteration-Space Slicing

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5898)

Abstract

This paper contributes to the theory and practice of automatic extraction of synchronization-free parallelism in nested loops. It extends the iteration-space slicing framework to extract slices described by not only affine (linear) but also non-affine forms. A slice is represented by a set of dependent loop statement instances (iterations) forming an arbitrary graph topology. The algorithm generates an outer loop to spawn synchronization-free slices to be executed in parallel, enclosing sequential loops iterating over those slices. Experimental results demonstrate that the generated code is competitive with that generated by state-of-the-art techniques scanning polyhedra.
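The code shape described in the abstract (an outer parallel loop that spawns slices, enclosing a sequential loop that walks each slice) can be illustrated with a small sketch. The loop below is a hypothetical example chosen for illustration and is not taken from the paper: iteration `i` writes `a[2*i]`, which iteration `2*i` later reads, so each dependence chain has the form `s, 2s, 4s, ...` for an odd source `s`. That chain is exponential in the step count and hence not an affine function of it. The function names and the use of a Python thread pool are assumptions of this sketch; the pool only demonstrates that the slices need no synchronization, not that threads give a speedup here.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(a, n):
    """Reference loop: iteration i writes a[2*i], which iteration 2*i reads."""
    for i in range(1, n + 1):
        a[2 * i] = a[i] + 1
    return a


def run_slice(a, n, source):
    """Execute one slice sequentially: iterations source, 2*source, 4*source, ..."""
    i = source
    while i <= n:
        a[2 * i] = a[i] + 1
        i *= 2  # non-affine step: the k-th iteration of the slice is source * 2**k


def sliced(a, n):
    """Outer loop over slice sources (odd iterations), one task per slice.

    Odd iterations are never the target of a dependence, so each one starts
    a slice. Distinct slices touch disjoint array elements, so the tasks
    run without any synchronization between them.
    """
    sources = range(1, n + 1, 2)
    with ThreadPoolExecutor() as pool:
        # list() forces completion and re-raises any exception from a task
        list(pool.map(lambda s: run_slice(a, n, s), sources))
    return a


if __name__ == "__main__":
    n = 20
    base = list(range(2 * n + 1))
    assert sliced(base[:], n) == sequential(base[:], n)
    print("sliced execution matches the sequential loop")
```

In this sketch the slice sources are found by inspection; an automatic framework would have to derive both the sources and the per-slice iteration pattern from the dependence relations.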

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Beletska, A., Bielecki, W., Cohen, A., Palkowski, M. (2010). Synchronization-Free Automatic Parallelization: Beyond Affine Iteration-Space Slicing. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_16

  • DOI: https://doi.org/10.1007/978-3-642-13374-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13373-2

  • Online ISBN: 978-3-642-13374-9

  • eBook Packages: Computer Science (R0)
