A Flexible Multi-port Caching Scheme for Reconfigurable Platforms

Ang, Su-Shin; Constantinides, George; Cheung, Peter; Luk, Wayne

doi:10.1007/11802839_29

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3985)

Included in the following conference series:

International Workshop on Applied Reconfigurable Computing

Abstract

Memory accesses contribute substantially to aggregate system delays. It is therefore critical for designers to ensure that the memory subsystem is designed efficiently, and much work has been done on exploiting data re-use in FPGAs for algorithms that exhibit static memory access patterns. The proposed scheme enables the exploitation of data re-use for both static and non-static parallel memory access patterns through the use of a multi-port cache, whose parameters can be determined at compile time and matched to the statistical properties of the application, and in which sub-cache contentions are arbitrated by a semaphore-based system. A complete hardware implementation demonstrates that, for a motion vector estimation benchmark, the proposed caching scheme yields a cycle count reduction of 51% and an execution time reduction of up to 24%, using a Xilinx XC2V6000 FPGA on a Celoxica RC300 board. Hardware resource usage and clock frequency penalties are analyzed while varying the number of ports and the cache size. These results show how the optimum cache size and number of ports may be established for a given datapath.
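The abstract gives no implementation detail, so the following is only a minimal software sketch of the general idea: a cache split into sub-caches, with parallel ports that are serialized when they contend for the same sub-cache. Everything in it is an assumption made for illustration and is not taken from the paper: the class name `MultiPortCache`, direct-mapped sub-caches, low-order address interleaving, the fixed `miss_penalty`, and the modeling of each semaphore as simple serialization of accesses to one sub-cache.

```python
from collections import defaultdict


class MultiPortCache:
    """Toy cycle-level model of a multi-port cache built from sub-caches.

    Assumptions (hypothetical, not from the paper): direct-mapped
    sub-caches, low-order address interleaving, one access per sub-cache
    per cycle, a fixed miss penalty, and one binary semaphore per
    sub-cache, modeled here as serialization of contending ports.
    """

    def __init__(self, num_subcaches=4, lines_per_subcache=64, miss_penalty=4):
        self.num_subcaches = num_subcaches
        self.lines_per_subcache = lines_per_subcache
        self.miss_penalty = miss_penalty
        # tags[b][i] is the tag held in line i of sub-cache b (None = empty).
        self.tags = [[None] * lines_per_subcache for _ in range(num_subcaches)]
        self.hits = 0
        self.misses = 0

    def _locate(self, addr):
        # Split a word address into (sub-cache, line index, tag).
        bank = addr % self.num_subcaches
        rest = addr // self.num_subcaches
        return bank, rest % self.lines_per_subcache, rest // self.lines_per_subcache

    def access_parallel(self, addrs):
        """Serve one address per port; return the cycles the group takes."""
        busy = defaultdict(int)  # cycles each sub-cache is held in this round
        for addr in addrs:
            bank, index, tag = self._locate(addr)
            if self.tags[bank][index] == tag:
                self.hits += 1
                busy[bank] += 1
            else:
                self.misses += 1
                self.tags[bank][index] = tag  # fill the line on a miss
                busy[bank] += 1 + self.miss_penalty
        # Ports that target different sub-caches proceed in parallel; ports
        # that target the same sub-cache wait their turn, so the group
        # finishes when the most heavily loaded sub-cache is done.
        return max(busy.values(), default=0)


cache = MultiPortCache(num_subcaches=4, lines_per_subcache=8, miss_penalty=4)
cold = cache.access_parallel([0, 1, 2, 3])       # four sub-caches, all misses
warm = cache.access_parallel([0, 1, 2, 3])       # same addresses, all hits
clash = cache.access_parallel([0, 4, 8, 12])     # all four ports hit sub-cache 0
print(cold, warm, clash)                         # prints: 5 1 16
```

In this toy model, re-used data is served in a single cycle when the ports spread across sub-caches, while a conflicting access pattern is serialized. That is the kind of trade-off between number of ports, cache size, and access statistics that the abstract says is tuned at compile time; the real hardware arbitration and timing may differ.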

Author information

Authors and Affiliations

Dept. of Electrical and Electronics Engineering, Imperial College, London, UK
Su-Shin Ang, George Constantinides & Peter Cheung
Dept. of Computing, Imperial College, London, UK
Wayne Luk

Editor information

Editors and Affiliations

TU Delft, Computer Engineering Laboratory, EEMCS, The Netherlands
Koen Bertels
IST/INESC-ID, Lisboa, Portugal
João M. P. Cardoso
TU Delft, Computer Engineering Lab, Postbus 5031, 2600 GA, Delft, The Netherlands
Stamatis Vassiliadis

About this paper

Cite this paper

Ang, SS., Constantinides, G., Cheung, P., Luk, W. (2006). A Flexible Multi-port Caching Scheme for Reconfigurable Platforms. In: Bertels, K., Cardoso, J.M.P., Vassiliadis, S. (eds) Reconfigurable Computing: Architectures and Applications. ARC 2006. Lecture Notes in Computer Science, vol 3985. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11802839_29

DOI: https://doi.org/10.1007/11802839_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36708-6
Online ISBN: 978-3-540-36863-2
eBook Packages: Computer Science, Computer Science (R0)
