Skip to main content

Exploiting Outer Loops Vectorization in High Level Synthesis

  • Conference paper
  • First Online:
Architecture of Computing Systems – ARCS 2015 (ARCS 2015)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9017))

Included in the following conference series:

Abstract

Synthesis of DoAll loops is a key aspect of High Level Synthesis since they allow to easily exploit the potential parallelism provided by programmable devices. This type of parallelism can be implemented in several ways: by duplicating the implementation of body loop, by exploiting loop pipelining or by applying vectorization.

In this paper a methodology for the synthesis of complex DoAll loops based on outer vectorization is proposed. Vectorization is not limited to the innermost loops: complex constructs such as nested loops, conditional constructs and function calls are supported. Experimental results on parallel benchmarks show up to 7.35x speed-up and up to 40 % reduction of area-delay product.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Altera: Quartus II (2013). http://www.altera.com

  2. Choi, J., Brown, S., Anderson, J.: From software threads to parallel hardware in high-level synthesis for fpgas. In: FPT ’13, pp. 270–277, December 2013

    Google Scholar 

  3. Cilardo, A., Gallo, L., Mazzocca, N.: Design space exploration for high-level synthesis of multi-threaded applications. Journal of Systems Architecture 59(10, Part D), 1171–1183 (2013)

    Article  Google Scholar 

  4. Cong, J., Liu, B., Neuendorffer, S., Noguera, J., Vissers, K., Zhang, Z.: High-level synthesis for fpgas: From prototyping to deployment. IEEE TCAD 30(4), 473–491 (2011)

    Google Scholar 

  5. Cong, J., Jiang, W.: Pattern-based behavior synthesis for fpga resource reduction. In: FPGA 2008, pp. 107–116. ACM, New York (2008)

    Google Scholar 

  6. Fingeroff, M.: High-Level Synthesis Blue Book. Xlibris Corporation (2010)

    Google Scholar 

  7. Gupta, S., Savoiu, N., Kim, S., Dutt, N., Gupta, R., Nicolau, A.: Speculation techniques for high level synthesis of control intensive designs. In: DAC 2001, pp. 269–272. ACM, New York (2001)

    Google Scholar 

  8. Hadjis, S., Canis, A., Anderson, J.H., Choi, J., Nam, K., Brown, S., Czajkowski, T.: Impact of fpga architecture on resource sharing in high-level synthesis. In: FPGA 2012, pp. 111–114. ACM, New York (2012)

    Google Scholar 

  9. Kurra, S., Singh, N.K., Panda, P.R.: The impact of loop unrolling on controller delay in high level synthesis. In: DATE ’07, pp. 391–396 (2007)

    Google Scholar 

  10. Mahlke, S.A., Lin, D.C., Chen, W.Y., Hank, R.E., Bringmann, R.A.: Effective compiler support for predicated execution using the hyperblock. SIGMICRO Newsl. 23(1–2), 45–54 (1992)

    Article  Google Scholar 

  11. Morvan, A., Derrien, S., Quinton, P.: Polyhedral bubble insertion: A method to improve nested loop pipelining for high-level synthesis. IEEE TCAD 32(3), 339–352 (2013)

    Google Scholar 

  12. Naishlos, D.: Autovectorization in GCC. In: GCC Developers Summit, pp. 105–118 (2004)

    Google Scholar 

  13. Nuzman, D., Zaks, A.: Outer-loop vectorization: revisited for short simd architectures. In: PACT 2008, pp. 2–11. ACM, New York (2008). http://doi.acm.org/10.1145/1454115.1454119

  14. OpenMP: Application Program Interface, version 4.0, July 2013

    Google Scholar 

  15. Papakonstantinou, A., Gururaj, K., Stratton, J.A., Chen, D., Cong, J., Hwu, W.M.W.: Efficient compilation of cuda kernels for high-performance computing on fpgas. ACM TECS 13(2), 1–26 (2013)

    Article  Google Scholar 

  16. Pilato, C., Ferrandi, F.: Bambu: A modular framework for the high level synthesis of memory-intensive applications. In: FPL 2013, pp. 1–4, September 2013

    Google Scholar 

  17. Raghunathan, V., Raghunathan, A., Srivastava, M., Ercegovac, M.: High-level synthesis with simd units. In: ASP-DAC 2002, pp. 407–413 (2002)

    Google Scholar 

  18. Xilinx: Vivado Design Suite (2013). http://www.xilinx.com

  19. Zuo, W., Liang, Y., Li, P., Rupnow, K., Chen, D., Cong, J.: Improving high level synthesis optimization opportunity through polyhedral transformations. In: FPGA 2013, pp. 9–18. ACM, New York (2013)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Marco Lattuada .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Lattuada, M., Ferrandi, F. (2015). Exploiting Outer Loops Vectorization in High Level Synthesis. In: Pinho, L., Karl, W., Cohen, A., Brinkschulte, U. (eds) Architecture of Computing Systems – ARCS 2015. ARCS 2015. Lecture Notes in Computer Science(), vol 9017. Springer, Cham. https://doi.org/10.1007/978-3-319-16086-3_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-16086-3_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16085-6

  • Online ISBN: 978-3-319-16086-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics