Exploiting Outer Loops Vectorization in High Level Synthesis

  • Marco Lattuada
  • Fabrizio Ferrandi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9017)


Synthesis of DoAll loops is a key aspect of High Level Synthesis, since such loops make it easy to exploit the potential parallelism provided by programmable devices. This type of parallelism can be implemented in several ways: by duplicating the implementation of the loop body, by exploiting loop pipelining, or by applying vectorization.
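As a rough software-level illustration of the first option, the sketch below shows a DoAll loop and a variant whose body is duplicated twice. The function names `scale_add` and `scale_add_dup2` and the computation itself are hypothetical and are not taken from the paper; how a given synthesis tool maps the two copies onto hardware is not something this sketch asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Reference DoAll loop: iteration i reads a[i], b[i] and writes only out[i],
 * so no iteration depends on any other. */
void scale_add(const int *a, const int *b, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = 3 * a[i] + b[i];
}

/* Same loop with the body duplicated twice per trip. Because the two copies
 * are independent, they could in principle be executed in parallel. */
void scale_add_dup2(const int *a, const int *b, int *out, size_t n)
{
    assert(n % 2 == 0); /* simplifying assumption of this sketch: no remainder loop */
    for (size_t i = 0; i < n; i += 2) {
        out[i]     = 3 * a[i]     + b[i];
        out[i + 1] = 3 * a[i + 1] + b[i + 1];
    }
}
```

Both functions compute the same result; the second only exposes two independent copies of the body in each trip of the loop.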

This paper proposes a methodology for the synthesis of complex DoAll loops based on outer-loop vectorization. Vectorization is not limited to the innermost loops: complex constructs such as nested loops, conditional constructs, and function calls are supported. Experimental results on parallel benchmarks show up to 7.35x speed-up and up to 40% reduction in area-delay product.
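The general idea of outer-loop vectorization can be sketched in plain C. The example below is only an illustration under stated assumptions, not the methodology of the paper: the kernel, the names `kernel_scalar` and `kernel_outer_vec`, and the lane count `LANES` are all hypothetical. The innermost `l` loop models a single vector operation over `LANES` consecutive outer iterations, and the conditional and the data-dependent inner trip count are handled with per-lane predicates.

```c
#include <assert.h>

#define LANES 4 /* hypothetical vector width */

/* Scalar reference: outer DoAll loop with a nested loop and a conditional. */
void kernel_scalar(const int *x, const int *len, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        int acc = 0;
        for (int j = 0; j < len[i]; j++) { /* trip count depends on i */
            int v = x[i] + j;
            if (v % 2 == 0)
                acc += v;
            else
                acc -= 1;
        }
        out[i] = acc;
    }
}

/* Outer-loop vectorized form: LANES outer iterations advance together. */
void kernel_outer_vec(const int *x, const int *len, int *out, int n)
{
    assert(n % LANES == 0); /* simplifying assumption: no remainder loop */
    for (int i = 0; i < n; i += LANES) {
        int acc[LANES] = {0};
        int max_len = 0;
        for (int l = 0; l < LANES; l++)
            if (len[i + l] > max_len)
                max_len = len[i + l];
        /* The inner loop runs for the longest lane; shorter lanes are masked. */
        for (int j = 0; j < max_len; j++) {
            for (int l = 0; l < LANES; l++) {
                int active = j < len[i + l];  /* predicate: lane still iterating */
                int v = x[i + l] + j;
                int even = (v % 2 == 0);      /* predicate: branch condition */
                int next = even ? acc[l] + v : acc[l] - 1;
                acc[l] = active ? next : acc[l];
            }
        }
        for (int l = 0; l < LANES; l++)
            out[i + l] = acc[l];
    }
}
```

In this sketch both branches of the conditional are evaluated for every lane and a select keeps the appropriate result, which is one common way to vectorize control flow; lanes whose inner loop has already finished keep their accumulator unchanged.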


Clock Cycle, Outer Loop, Nest Loop, Body Loop, High Level Synthesis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
