Reconfigurable Buffer Structures for Coarse-Grained Reconfigurable Arrays

  • Éricles Sousa
  • Frank Hannig
  • Jürgen Teich
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 523)


Coarse-Grained Reconfigurable Arrays (CGRAs) have emerged as a powerful solution to speed up computationally intensive applications. Heterogeneous MPSoC architectures containing such reconfigurable accelerators have the advantage of providing high flexibility, power efficiency, and high performance. However, CGRAs may suffer from a data access bottleneck. To mitigate this problem, we present a reconfigurable buffer architecture for CGRAs. Here, the buffers can be configured at runtime to select between different schemes for memory access, i.e., addressable RAMs or pixel buffers. We showcase the benefits of our approach by prototyping, in FPGA technology, a heterogeneous MPSoC architecture containing a RISC processor and a class of CGRA called Tightly Coupled Processor Arrays (TCPAs). For basic image processing algorithms, we demonstrate that our proposed buffer structures for system integration increase memory bandwidth utilization and improve performance by up to 7% in comparison to state-of-the-art solutions for image processing.
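
To make the two access schemes named in the abstract more concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes that "addressable RAM" means reads and writes at explicit addresses, and that a "pixel buffer" can be modeled as a fixed-length delay line over a pixel stream. The names `Mode`, `ReconfigurableBuffer`, `configure`, and `push` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Mode(Enum):
    RAM = "ram"      # random access by explicit address
    PIXEL = "pixel"  # streaming access, modeled here as a delay line


class ReconfigurableBuffer:
    """Toy model of one buffer bank (hypothetical, not the paper's RTL)."""

    def __init__(self, depth, mode=Mode.RAM):
        self.depth = depth
        self.configure(mode)

    def configure(self, mode):
        # Runtime reconfiguration: same storage size, different access scheme.
        # Assumption: reconfiguring clears the stored contents.
        self.mode = mode
        self.mem = [0] * self.depth
        self.ptr = 0  # circular write pointer, only used in PIXEL mode

    def write(self, addr, value):
        if self.mode is not Mode.RAM:
            raise RuntimeError("write() requires RAM mode")
        self.mem[addr] = value

    def read(self, addr):
        if self.mode is not Mode.RAM:
            raise RuntimeError("read() requires RAM mode")
        return self.mem[addr]

    def push(self, pixel):
        # Return the pixel stored `depth` pushes ago, then store the new one.
        if self.mode is not Mode.PIXEL:
            raise RuntimeError("push() requires PIXEL mode")
        oldest = self.mem[self.ptr]
        self.mem[self.ptr] = pixel
        self.ptr = (self.ptr + 1) % self.depth
        return oldest


# Pixel-buffer mode: a delay of 4 samples over the stream 1..8.
buf = ReconfigurableBuffer(depth=4, mode=Mode.PIXEL)
delayed = [buf.push(p) for p in range(1, 9)]

# Reconfigure the same object as an addressable RAM.
buf.configure(Mode.RAM)
buf.write(2, 99)
ram_value = buf.read(2)
```

In this model, choosing `depth` equal to the image width would make `push` return the pixel directly above the current one in a row-major stream, which is the kind of data reuse that window-based image filters rely on. Whether the buffers in the paper behave this way is an assumption of the sketch.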



This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89) and Research Training Group 1773 “Heterogeneous Image Systems”. The first author is also grateful to the Brazilian National Council for Scientific and Technological Development (CNPq) for supporting his research.



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  1. Hardware/Software Co-design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
