Scalable and Efficient Linear Algebra Kernel Mapping for Low Energy Consumption on the Layers CGRA

  • Conference paper
Applied Reconfigurable Computing (ARC 2015)

Abstract

A scalable mapping is proposed for three important kernels from the numerical linear algebra domain. The mapping exploits architectural features to reach asymptotically optimal efficiency at low energy consumption. Performance and power were evaluated for input matrix sizes ranging from \(64\times 64\) to \(16384\times 16384\). Twelve architectural variants with up to \(10\times 10\) processing elements (PEs) were used to explore the scalability of the mapping and the architecture, achieving an energy increase of \(<10\,\%\) for architectures of up to \(8\times 8\) PEs, coupled with performance speed-ups of more than an order of magnitude. This enables a clean area-performance trade-off on the Layers architecture while keeping energy nearly constant across the variants.

Author information

Correspondence to Zoltán Endre Rákossy.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rákossy, Z.E., Stengele, D., Acosta-Aponte, A., Chafekar, S., Bientinesi, P., Chattopadhyay, A. (2015). Scalable and Efficient Linear Algebra Kernel Mapping for Low Energy Consumption on the Layers CGRA. In: Sano, K., Soudris, D., Hübner, M., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham. https://doi.org/10.1007/978-3-319-16214-0_25

  • DOI: https://doi.org/10.1007/978-3-319-16214-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16213-3

  • Online ISBN: 978-3-319-16214-0

  • eBook Packages: Computer Science, Computer Science (R0)
