Abstract
This paper presents a domain-specific language for stencil computation (DSLSC) and its compiler for our FPGA-based systolic computational-memory array (SCMA). In DSLSC, stencil computations are programmed by describing their mathematical form rather than by writing an explicit, hand-optimized procedure. The compiler automatically parallelizes stencil computations across the processing elements (PEs) of the SCMA and schedules multiply-and-add operations on the PEs, taking into account the data-reference delay through a local memory or the communication FIFOs between PEs. For arbitrary grid sizes of 2D Jacobi computation with 3x3 and 5x5 stencils, the compiler achieves high PE utilization of 85.6 % and 92.18 %, close to the ideal-case values of 87.5 % and 93.75 %, respectively.
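As a rough illustration of the kind of computation the abstract refers to, the following Python sketch performs 2D Jacobi sweeps with a 3x3 stencil written as a table of weights, so that each grid update is a sequence of multiply-and-add operations. It is not DSLSC syntax and not the compiler's schedule: the function names, the weight values, and the fixed-boundary treatment are all assumptions made for this example.

```python
# Illustrative sketch only: jacobi_step, make_grid, and STAR_3X3 are
# hypothetical names, and the weights are assumed, not taken from the paper.

def jacobi_step(grid, weights):
    """Apply one Jacobi sweep of a square stencil to the interior cells."""
    r = len(weights) // 2                 # stencil radius (1 for 3x3, 2 for 5x5)
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]        # boundary cells keep their old values
    for i in range(r, rows - r):
        for j in range(r, cols - r):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    w = weights[di + r][dj + r]
                    if w != 0.0:
                        # one multiply-and-add per non-zero stencil point
                        acc += w * grid[i + di][j + dj]
            new[i][j] = acc
    return new

# Assumed 3x3 stencil: average of the four nearest neighbours.
STAR_3X3 = [
    [0.0, 0.25, 0.0],
    [0.25, 0.0, 0.25],
    [0.0, 0.25, 0.0],
]

def make_grid(n, top=1.0):
    """Build an n x n grid of zeros whose top row is held at `top`."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid

grid = make_grid(5)
for _ in range(50):
    grid = jacobi_step(grid, STAR_3X3)
```

Every interior cell reads only the previous iteration's values, which is what makes the sweep a Jacobi update and lets the cell updates be distributed over parallel processing elements. A 5x5 stencil can be tried by passing a 5x5 weight table to the same function.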
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luzhou, W., Sano, K., Yamamoto, S. (2012). Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array. In: Choy, O.C.S., Cheung, R.C.C., Athanas, P., Sano, K. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2012. Lecture Notes in Computer Science, vol 7199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28365-9_3
DOI: https://doi.org/10.1007/978-3-642-28365-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28364-2
Online ISBN: 978-3-642-28365-9
eBook Packages: Computer Science, Computer Science (R0)