Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array

Luzhou, Wang; Sano, Kentaro; Yamamoto, Satoru

doi:10.1007/978-3-642-28365-9_3

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7199)

Abstract

This paper presents a domain-specific language for stencil computation (DSLSC) and its compiler for our FPGA-based systolic computational-memory array (SCMA). In DSLSC, stencil computations are programmed by describing their mathematical form rather than by hand-writing an explicit, optimized procedure. The compiler automatically parallelizes stencil computations across the processing elements (PEs) of the SCMA and schedules multiply-and-add operations on the PEs, taking into account the data-reference delay through a local memory or the communication FIFOs between PEs. For arbitrary grid sizes of 2D Jacobi computation with 3x3 and 5x5 stencils, the compiler achieves high PE utilization of 85.6 % and 92.18 %, respectively, close to the ideal-case values of 87.5 % and 93.75 %.
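
For readers unfamiliar with the workload, the following is a minimal sketch of one sweep of a 2D Jacobi computation with a 3x3 stencil, written as plain sequential Python. It is not DSLSC syntax and not the compiler's output; the function name `jacobi_sweep`, the weight values, and the choice to leave boundary cells unchanged are all assumptions made for illustration. Each interior point is a weighted sum of its 3x3 neighborhood, so the inner loop body is one multiply-and-add per stencil point, which is the kind of operation the abstract says is scheduled on the PEs.

```python
def jacobi_sweep(grid, weights):
    """Apply one Jacobi sweep of a 3x3 weighted stencil to the interior points.

    Boundary cells are copied unchanged (an assumption of this sketch).
    All reads come from the old grid, all writes go to a new grid.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # One multiply-and-add per stencil point.
                    acc += weights[di + 1][dj + 1] * grid[i + di][j + dj]
            new[i][j] = acc
    return new


# Illustrative weights: average of the four direct neighbors,
# laid out inside a 3x3 window (hypothetical values).
WEIGHTS = [
    [0.0, 0.25, 0.0],
    [0.25, 0.0, 0.25],
    [0.0, 0.25, 0.0],
]

grid = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]

result = jacobi_sweep(grid, WEIGHTS)
```

Because every output point depends only on values from the previous sweep, the interior points can be computed independently of one another, which is what makes this pattern a natural candidate for distribution over an array of processing elements.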

Author information

Authors and Affiliations

Graduate School of Information Sciences, Tohoku University, 6-6-01 Aramaki Aza Aoba, Aoba-ku, Sendai, 980-8579, Japan
Wang Luzhou, Kentaro Sano & Satoru Yamamoto

Editor information

Editors and Affiliations

Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China
Oliver C. S. Choy
Department of Electronic Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong, China
Ray C. C. Cheung
Department of ECE, Virginia Tech., 302 Whittemore Hall, 24061, Blacksburg, VA, USA
Peter Athanas
Tohoku University, 6-6-01 Aramaki Aza Aoba, Aobaku, 981-8579, Sendai, Miyagi, Japan
Kentaro Sano

Cite this paper

Luzhou, W., Sano, K., Yamamoto, S. (2012). Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array. In: Choy, O.C.S., Cheung, R.C.C., Athanas, P., Sano, K. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2012. Lecture Notes in Computer Science, vol 7199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28365-9_3

DOI: https://doi.org/10.1007/978-3-642-28365-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28364-2
Online ISBN: 978-3-642-28365-9
eBook Packages: Computer Science, Computer Science (R0)
