Lookahead Memory Prefetching for CGRAs Using Partial Loop Unrolling

Jung, Lukas Johannes; Hochberger, Christian

doi:10.1007/978-3-319-78890-6_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10824)

Included in the following conference series:

International Symposium on Applied Reconfigurable Computing

Abstract

Coarse-Grained Reconfigurable Arrays (CGRAs) have become an established approach to providing high computational performance in various environments. Several researchers have found that the achievable performance depends heavily on the interface between memory and CGRA. In this contribution we show that a smart prefetching mechanism can increase the performance of the CGRA. At the same time, it consumes fewer hardware resources and less energy than state-of-the-art prefetching mechanisms.

Notes

1. Values are based on our FPGA implementation of the system (work in progress).
2. Also called synthesis in previous publications.
3. Simply setting \(f=p=0\) and increasing \(u\) results in worse performance, because a high \(u\) decreases performance, as shown in [9].
4. Note that the number of contexts does not directly correlate with the runtime, because some contexts are executed more often as they are part of inner loops or even different kernels.

References

1. Archibald, J., Baer, J.L.: Cache coherence protocols: evaluation using a multiprocessor simulation model. ACM Trans. Comput. Syst. 4(4), 273–298 (1986)
2. Cong, J., Huang, H., Ma, C., Xiao, B., Zhou, P.: A fully pipelined and dynamically composable architecture of CGRA. In: 2014 FCCM, pp. 9–16, May 2014
3. Dahlgren, F., Stenstrom, P.: Evaluation of hardware-based stride and sequential prefetching in shared-memory multiprocessors. TPDS 7(4), 385–398 (1996)
4. Fuchs, A., Mannor, S., Weiser, U., Etsion, Y.: Loop-aware memory prefetching using code block working sets. In: 2014 MICRO, pp. 533–544, December 2014
5. Gatzka, S., Hochberger, C.: The AMIDAR class of reconfigurable processors. J. Supercomput. 32(2), 163–181 (2005)
6. Gatzka, S., Hochberger, C.: Hardware based online profiling in AMIDAR processors. In: IPDPS, p. 144b (2005)
7. Hashemi, M., Mutlu, O., Patt, Y.N.: Continuous runahead: transparent hardware acceleration for memory intensive workloads. In: 2016 MICRO, pp. 1–12, October 2016
8. Ho, C.H., Govindaraju, V., Nowatzki, T., Nagaraju, R., Marzec, Z., Agarwal, P., Frericks, C., Cofell, R., Sankaralingam, K.: Performance evaluation of a DySER FPGA prototype system spanning the compiler, microarchitecture, and hardware implementation. In: 2015 ISPASS, pp. 203–214, March 2015
9. Jung, L.J., Hochberger, C.: Feasibility of high level compiler optimizations in online synthesis. In: 2015 ReConFig, pp. 1–7, December 2015
10. Jung, L.J., Hochberger, C.: Optimal processor interface for CGRA-based accelerators implemented on FPGAs. In: 2016 ReConFig, pp. 1–7, November 2016
11. Lee, H., Nguyen, D., Lee, J.: Optimizing stream program performance on CGRA-based systems. In: Proceedings of the 52nd DAC, DAC 2015, pp. 110:1–110:6. ACM, New York (2015)
12. Prabhakar, R., Zhang, Y., Koeplinger, D., Feldman, M., Zhao, T., Hadjis, S., Pedram, A., Kozyrakis, C., Olukotun, K.: Plasticine: a reconfigurable architecture for parallel patterns. In: Proceedings of the 44th ISCA, ISCA 2017, pp. 389–402. ACM, New York (2017)
13. Ruschke, T., Jung, L.J., Wolf, D., Hochberger, C.: Scheduler for inhomogeneous and irregular CGRAs with support for complex control flow. In: 2016 IPDPSW, pp. 198–207, May 2016
14. Vahid, F., Stitt, G., Lysecky, R.: Warp processing: dynamic translation of binaries to FPGA circuits. Computer 41(7), 40–46 (2008)
15. Veredas, F.J., Scheppler, M., Moffat, W., Mei, B.: Custom implementation of the coarse-grained reconfigurable ADRES architecture for multimedia purposes. In: FPL 2005, pp. 106–111, August 2005
16. Yang, C., Liu, L., Yin, S., Wei, S.: Data cache prefetching via context directed pattern matching for coarse-grained reconfigurable arrays. In: 2016 53rd DAC, pp. 1–6, June 2016

Author information

Authors and Affiliations

Department of Electrical Engineering and Information Technology, Computer Systems Group, TU Darmstadt, Darmstadt, Germany
Lukas Johannes Jung & Christian Hochberger

Authors

Lukas Johannes Jung
Christian Hochberger

Corresponding author

Correspondence to Lukas Johannes Jung.

Editor information

Editors and Affiliations

Technological Educational Institute of Western Greece, Antirrio, Greece
Nikolaos Voros
Ruhr-Universität Bochum, Bochum, Germany
Michael Huebner
Technological Educational Institute of Western Greece, Antirrio, Greece
Georgios Keramidas
Technische Universität Dresden, Dresden, Germany
Diana Goehringer
Technological Educational Institute of Western Greece, Antirrio, Greece
Christos Antonopoulos
INESC-ID, Lisbon, Portugal
Pedro C. Diniz

About this paper

Cite this paper

Jung, L.J., Hochberger, C. (2018). Lookahead Memory Prefetching for CGRAs Using Partial Loop Unrolling. In: Voros, N., Huebner, M., Keramidas, G., Goehringer, D., Antonopoulos, C., Diniz, P. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2018. Lecture Notes in Computer Science, vol 10824. Springer, Cham. https://doi.org/10.1007/978-3-319-78890-6_8

DOI: https://doi.org/10.1007/978-3-319-78890-6_8
Published: 08 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78889-0
Online ISBN: 978-3-319-78890-6
eBook Packages: Computer Science, Computer Science (R0)
