Skip to main content

A Flexible Multi-port Caching Scheme for Reconfigurable Platforms

  • Conference paper
Reconfigurable Computing: Architectures and Applications (ARC 2006)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3985))

Included in the following conference series:

Abstract

Memory accesses contribute sunstantially to aggregate system delays. It is critical for designers to ensure that the memory subsystem is designed efficiently, and much work has been done on the exploitation of data re-use for algorithms that exhibit static memory access patterns in FPGAs. The proposed scheme enables the exploitation of data re-use for both static and non-static parallel memory access patterns through the use of a multi-port cache, where parameters can be determined at compile time and matched to the statistical properties of the application, and where sub-cache contentions are arbitrated with a semaphore-based system. A complete hardware implementation demonstrates that, for a motion vector estimation benchmark, the proposed caching scheme results in a cycle count reduction of 51% and execution time reduction of up to 24%, using a Xilinx XC2V6000 FPGA on a Celoxica RC300 board. Hardware resource usage and clock frequency penalties are analyzed while varying the number of ports and cache size. Consequently, it is demonstrated how the optimum cache size and number of ports may be established for a given datapath.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Issenin, I., Dutt, N.: Automatic generation of affine functions for memory optimizations. In: Proceedings of the conference on Design, Automation and Test in Europe, pp. 808–813 (2005)

    Google Scholar 

  2. Kandemir, M., Choudhary, A.: Compiler-directed scratch-pad memory hierarchy design and management. In: Proceedings of the Design Automation Conference, pp. 628–633 (2002)

    Google Scholar 

  3. Udayakumaran, A., Barua, R.: Compiler-decided dynamic memory allocation for scratch-pad based embedded systems. In: Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems, pp. 276–279 (2003)

    Google Scholar 

  4. Chalidabhongse, J., Kuo, C.: Fast motion vector estimation using multiresolution-spatio-temporal correlations. IEEE transactions on circuits and systems for video technology 7(3), 477–488 (1997)

    Article  Google Scholar 

  5. Patterson, D.A., Hennessy, J.L.: Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Francisco (1996)

    MATH  Google Scholar 

  6. Kulkarni, C., Catthoor, F., Man, H.D.: Data and memory optimization techniques for embedded systems. In: Proceedings of the IPDPS Workshops on Parallel and Distributed Processing, pp. 186–193 (2000)

    Google Scholar 

  7. Panda, P., Catthoor, F., Danckaert, K., Brockmeyer, E., Kulkarni, C., Vandercappelle, A., Kjeldsberg, P.: Data and memory optimization techniques for embedded systems. IEEE Transactions on Very Large Scale Integr. Syst. 6(2), 149–206 (2001)

    Google Scholar 

  8. Ishihara, T., Fallah, F.: A way memoization technique for reducing power consumption in caches in Application Specific Integrated Procesors. In: Proceedings of the conference on Design, Automation and Test in Europe, pp. 358–363 (2005)

    Google Scholar 

  9. Nastaran, B., Park, J., Diniz, P.: A compiler analysis and algorithm for exploiting data reuse in configurable architectures with RAM blocks. In: Proceedings of the Field-Programmable Logic and Applications, pp. 1113–1115 (2004)

    Google Scholar 

  10. Guo, Z., Buyukkurt, B., Najjar, W., Vissers, K.: Optimized generation of data-paths from C codes for FPGAs. In: Proceedings of the conference on Design, Automation and Test in Europe, pp. 112–118 (2005)

    Google Scholar 

  11. Sohi, G.S., Franklin, M.: High-bandwidth data memory systems for superscalar processors. In: Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 53–62 (1991)

    Google Scholar 

  12. Edmondson, J., Rubinfield, P., Bannon, P., Benschneider, B., Berstein, D., Castelino, R., Cooper, E., Dever, D., Donchin, D., Fischer, T., Jain, A., Mehta, S., Meyer, J., Preston, R., Rajagopalan, V., Somanathan, C., Taylor, S., Wolrich, G.: Internal organization of the Alpha 21164 a 300 MHz 64-bit quad-issue CMOS RISC microprocessor. Digital Technical Journal 7(1), 119–135 (1995)

    Google Scholar 

  13. Page, I., Luk, W.: Compiling Occam into FPGAs. In: Proceedings of the Field-Programmable Logic and Applications, pp. 271–283 (1991)

    Google Scholar 

  14. Intel: Understanding memory access characteristics of motion estimation algorithms (accessed October 1, 2005), http://www.intel.com/cd/ids/developer/asmo-na/eng/182345.htm?page=2

  15. Celoxica: DK compiler (accessed October 1, 2005), http://www.celoxica.com

  16. Celoxica: RC300 board (accessed October 1, 2005), http://www.celoxica.com/rc300/default.asp

  17. Xilinx: Virtex 2 datasheet (accessed October 1, 2005), http://www.xilinx.com/bvdocs/publications/ds031.pdf

  18. Celoxica: RC300 manual (accessed October 1, 2005), http://www.celoxica.com/techlib/CEL-WO4110816VG-316.pdf

  19. Bouganis, C.S., Constantinides, G., Cheung, P.Y.K.: A novel 2-D design methodology for heterogeneous devices. In: Proceedings of the IEEE International Symposium on Field Programmable Custom Computing Machines, pp. 1–10 (2005)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ang, SS., Constantinides, G., Cheung, P., Luk, W. (2006). A Flexible Multi-port Caching Scheme for Reconfigurable Platforms. In: Bertels, K., Cardoso, J.M.P., Vassiliadis, S. (eds) Reconfigurable Computing: Architectures and Applications. ARC 2006. Lecture Notes in Computer Science, vol 3985. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11802839_29

Download citation

  • DOI: https://doi.org/10.1007/11802839_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-36708-6

  • Online ISBN: 978-3-540-36863-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics