Polyhedral Optimizations for a Data-Flow Graph Language

Sbîrlea, Alina; Shirako, Jun; Pouchet, Louis-Noël; Sarkar, Vivek

doi:10.1007/978-3-319-29778-1_4

Conference paper
First Online: 20 February 2016

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9519)

Abstract

This paper proposes a novel optimization framework for the Data-Flow Graph Language (DFGL), a dependence-based notation for the macro-dataflow model which can be used as an embedded domain-specific language. Our optimization framework follows a “dependence-first” approach in capturing the semantics of DFGL programs in polyhedral representations, as opposed to the standard polyhedral approach of deriving dependences from access functions and schedules. As a first step, our proposed framework performs two important legality checks on an input DFGL program: checking for potential violations of the single-assignment rule, and checking for potential deadlocks. After these legality checks are performed, the DFGL dependence information is used in lieu of standard polyhedral dependences to enable polyhedral transformations and code generation, which include automatic loop transformations, tiling, and code generation of parallel loops with coarse-grain (fork-join) and fine-grain (doacross) synchronizations. Our performance experiments with nine benchmarks on Intel Xeon and IBM Power7 multicore processors show that the DFGL versions optimized by our proposed framework can deliver up to 6.9× performance improvement relative to standard OpenMP versions of these benchmarks. To the best of our knowledge, this is the first system to encode explicit macro-dataflow parallelism in polyhedral representations so as to provide programmers with an easy-to-use DSL notation with legality checks, while taking full advantage of the optimization functionality in state-of-the-art polyhedral frameworks.
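To make the two legality checks concrete, the following is a minimal sketch in Python that applies them to a small, fully enumerated set of step instances. It is not the framework described in the abstract, which works symbolically on polyhedral representations. The function names, the `(step_instance, item_key)` encoding, and the use of a cycle test as a stand-in for deadlock detection are all assumptions made for illustration.

```python
from collections import defaultdict


def single_assignment_violations(writes):
    """Return item keys that are written by more than one step instance.

    `writes` is an iterable of (step_instance, item_key) pairs
    (a hypothetical encoding, not DFGL syntax).
    """
    producers = defaultdict(set)
    for step, item in writes:
        producers[item].add(step)
    return {item: sorted(steps) for item, steps in producers.items() if len(steps) > 1}


def has_deadlock(writes, reads):
    """Return True if the producer -> consumer graph over step instances has a cycle.

    Assumes the single-assignment check has already passed, so each item
    has at most one producer. Reads of items with no producer are ignored here.
    """
    producer = {item: step for step, item in writes}
    steps = {step for step, _ in writes} | {step for step, _ in reads}
    successors = defaultdict(set)
    for step, item in reads:
        if item in producer:
            successors[producer[item]].add(step)

    # Kahn's algorithm: steps that never become ready lie on or behind a cycle.
    indegree = {step: 0 for step in steps}
    for source in successors:
        for target in successors[source]:
            indegree[target] += 1
    ready = [step for step in steps if indegree[step] == 0]
    executed = 0
    while ready:
        step = ready.pop()
        executed += 1
        for target in successors[step]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return executed < len(steps)


# A legal chain: the environment writes A[0]; S(i) reads A[i-1] and writes A[i].
chain_writes = [(("env", 0), ("A", 0))] + [(("S", i), ("A", i)) for i in range(1, 4)]
chain_reads = [(("S", i), ("A", i - 1)) for i in range(1, 4)]

# An illegal pair: S(1) and S(2) each wait for the item the other produces.
cyclic_writes = [(("S", 1), ("A", 1)), (("S", 2), ("A", 2))]
cyclic_reads = [(("S", 1), ("A", 2)), (("S", 2), ("A", 1))]

# Two step instances writing the same item key.
double_writes = [(("S", 1), ("A", 0)), (("S", 2), ("A", 0))]
```

Enumerating instances like this only works for small, fixed problem sizes; a symbolic formulation is what allows such checks to cover parametric ranges of step tags and item keys.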

Notes

1. Step I/O may comprise a list of items, and item keys may include range expressions.
2. A typical case is an env step that creates a set of step instances where the tag is a range.
3. In future work, we may consider the possibility of not treating this case as an error condition by assuming that each data item whose write is not performed in the DFGL region has an initializing write that is instead performed by the environment.
4. MKL is the best tuned library for Intel platforms. We compare against Sequential and Parallel MKL.
5. On POWER7 we use ATLAS (the sequential library), as MKL cannot run on POWER7 and a parallel library was not available.

Acknowledgments

This work was supported in part by the National Science Foundation through awards 0926127 and 1321147.

Author information

Authors and Affiliations

Rice University, Houston, USA
Alina Sbîrlea, Jun Shirako & Vivek Sarkar
Ohio State University, Columbus, USA
Louis-Noël Pouchet

Corresponding author

Correspondence to Alina Sbîrlea.

Editor information

Editors and Affiliations

North Carolina State University, Raleigh, North Carolina, USA
Xipeng Shen
North Carolina State University, Raleigh, North Carolina, USA
Frank Mueller
North Carolina State University, Raleigh, North Carolina, USA
James Tuck

About this paper

Cite this paper

Sbîrlea, A., Shirako, J., Pouchet, L.-N., Sarkar, V. (2016). Polyhedral Optimizations for a Data-Flow Graph Language. In: Shen, X., Mueller, F., Tuck, J. (eds) Languages and Compilers for Parallel Computing. LCPC 2015. Lecture Notes in Computer Science, vol 9519. Springer, Cham. https://doi.org/10.1007/978-3-319-29778-1_4

DOI: https://doi.org/10.1007/978-3-319-29778-1_4
Published: 20 February 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29777-4
Online ISBN: 978-3-319-29778-1
eBook Packages: Computer Science, Computer Science (R0)
