Locality Optimization of Stencil Applications Using Data Dependency Graphs

Orozco, Daniel; Garcia, Elkin; Gao, Guang

doi:10.1007/978-3-642-19595-2_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6548)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

This paper proposes tiling techniques based on data dependencies rather than on code structure.

The work presented here leverages and expands previous work by the authors in the domain of non-traditional tiling for parallel applications.

The main contributions of this paper are: (1) A formal description of tiling from the point of view of the data produced rather than the source code. (2) A mathematical proof of an optimum tiling, in terms of maximum reuse, for stencil applications, addressing the disparity between computation power and memory bandwidth in many-core architectures. (3) A description and implementation of our tiling technique for well-known stencil applications. (4) Experimental evidence that the proposed tiling alleviates the disparity between computation power and memory bandwidth in many-core architectures. Our experiments, performed on one of the first Cyclops-64 many-core chips produced, confirm that our approach reduces both the total number of memory operations of stencil applications and their running time.
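To make the idea of trading memory operations for reuse concrete, the following is a minimal sketch, not the tiling derived in the paper. It assumes a one-dimensional, 3-point averaging stencil with fixed end cells, and it uses a simple overlapped (halo-based) time tiling in place of the authors' technique. The function names, the stencil, and the traffic model (one unit per cell moved between main memory and a tile-local buffer) are all illustrative assumptions.

```python
def step(cells):
    """One sweep of a 3-point averaging stencil; the two end cells stay fixed."""
    out = list(cells)
    for i in range(1, len(cells) - 1):
        out[i] = (cells[i - 1] + cells[i] + cells[i + 1]) / 3.0
    return out


def naive(grid, steps):
    """Sweep the whole grid once per time step.

    Assumed traffic model: every sweep reads and writes each cell in main memory.
    """
    traffic = 0
    for _ in range(steps):
        traffic += 2 * len(grid)
        grid = step(grid)
    return grid, traffic


def time_tiled(grid, steps, width):
    """Advance each spatial tile by `steps` time steps inside a local buffer.

    Each tile loads a halo of `steps` extra cells per side (clamped at the grid
    ends). Stale halo values spread inward one cell per step, so after `steps`
    sweeps only the halo is contaminated and the tile's own cells are exact.
    """
    n = len(grid)
    result = list(grid)
    traffic = 0
    for start in range(0, n, width):
        end = min(start + width, n)
        lo = max(0, start - steps)
        hi = min(n, end + steps)
        local = grid[lo:hi]          # load tile plus halo once
        traffic += hi - lo
        for _ in range(steps):
            local = step(local)      # all reuse happens in the local buffer
        result[start:end] = local[start - lo:end - lo]
        traffic += end - start       # store only the tile's own cells
    return result, traffic


# Hypothetical problem size chosen only for illustration.
initial = [float(i % 7) for i in range(64)]
naive_result, naive_traffic = naive(initial, steps=4)
tiled_result, tiled_traffic = time_tiled(initial, steps=4, width=16)
print(naive_traffic, tiled_traffic)
```

Under these assumptions both versions produce the same grid, while the tiled version moves fewer cells to and from main memory at the cost of recomputing the halo cells of neighbouring tiles. How to choose the tile shape so that reuse is maximized is the question the paper addresses formally.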



Author information

Authors and Affiliations

Electrical and Computer Engineering Department, University of Delaware, US
Daniel Orozco, Elkin Garcia & Guang Gao


Editor information

Editors and Affiliations

Department of Computer Science, Rice University, 6100 Main Street, 77005-1892, Houston, TX, USA
Keith Cooper, John Mellor-Crummey & Vivek Sarkar


About this paper

Cite this paper

Orozco, D., Garcia, E., Gao, G. (2011). Locality Optimization of Stencil Applications Using Data Dependency Graphs. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds) Languages and Compilers for Parallel Computing. LCPC 2010. Lecture Notes in Computer Science, vol 6548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19595-2_6


DOI: https://doi.org/10.1007/978-3-642-19595-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19594-5
Online ISBN: 978-3-642-19595-2
eBook Packages: Computer Science, Computer Science (R0)
