Journal of Signal Processing Systems, Volume 91, Issue 1, pp 75–91

Modeling and Analysis of FPGA Accelerators for Real-Time Streaming Video Processing in the Healthcare Domain

  • Steven van der Vlugt
  • Hadi Alizadeh Ara
  • Rob de Jong
  • Martijn Hendriks
  • Ruben Guerra Marin
  • Marc Geilen
  • Dip Goswami


Complex real-time video processing applications with strict throughput constraints are common in healthcare. The video processing chain is implemented as Field-Programmable Gate Array (FPGA) accelerators (processing blocks) that communicate through First-In First-Out (FIFO) buffers. The FIFO buffers are built from Block RAM (BRAM), which is a limited resource. A key design question is therefore how to size the FIFO buffers optimally with respect to the throughput constraint. In this paper, we use model-driven analysis and detailed hardware-level simulation to address buffer dimensioning efficiently. Using a Cyclo-Static Dataflow (CSDF) model and an optimization method, we identify and optimize the FIFO buffers. The results are confirmed using a detailed hardware-level simulation and validated by comparison with VHDL simulations. The technique is illustrated on a use case from Philips Healthcare Image Guided Therapy (IGT): the imaging pipeline of an Interventional X-Ray (iXR) system.
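The buffer dimensioning question can be illustrated with a toy model. The sketch below is not the optimization method of the paper; it is a brute-force, cycle-by-cycle simulation of a single producer and consumer connected by one bounded FIFO, with cyclo-static phases given as (tokens, cycles) pairs. The rates, execution times, and the function names `throughput` and `min_capacity` are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def throughput(capacity, prod, cons, horizon=10_000):
    """Tokens consumed per cycle for a producer -> FIFO -> consumer chain.

    prod and cons list the cyclo-static phases as (tokens, cycles) pairs,
    with cycles >= 1. The producer claims FIFO space when a firing starts
    and delivers tokens when it ends; the consumer takes tokens when a
    firing starts and frees the space when it ends. Throughput is measured
    over the second half of the run to skip the start-up transient.
    """
    tokens, free = 0, capacity
    p_phase = c_phase = 0
    p_left = c_left = 0          # remaining cycles of the current firing
    consumed = 0
    warmup = horizon // 2
    for t in range(horizon):
        # Start firings whose resource requirements are met.
        if p_left == 0 and free >= prod[p_phase][0]:
            free -= prod[p_phase][0]
            p_left = prod[p_phase][1]
        if c_left == 0 and tokens >= cons[c_phase][0]:
            tokens -= cons[c_phase][0]
            c_left = cons[c_phase][1]
        # Advance one clock cycle and complete finished firings.
        if p_left > 0:
            p_left -= 1
            if p_left == 0:
                tokens += prod[p_phase][0]
                p_phase = (p_phase + 1) % len(prod)
        if c_left > 0:
            c_left -= 1
            if c_left == 0:
                n = cons[c_phase][0]
                free += n
                if t >= warmup:
                    consumed += n
                c_phase = (c_phase + 1) % len(cons)
    return Fraction(consumed, horizon - warmup)

def min_capacity(target, prod, cons, max_capacity=64):
    """Smallest FIFO capacity that reaches the target throughput, or None."""
    for capacity in range(1, max_capacity + 1):
        if throughput(capacity, prod, cons) >= target:
            return capacity
    return None

# Hypothetical actors: a bursty producer (4 tokens, then 3 idle cycles)
# feeding a consumer that handles 1 token per cycle.
PROD = [(4, 1), (0, 3)]
CONS = [(1, 1)]

if __name__ == "__main__":
    for cap in (3, 4, 5):
        print(cap, throughput(cap, PROD, CONS))
    print("minimal capacity:", min_capacity(Fraction(1), PROD, CONS))
```

In this toy example a capacity of 3 deadlocks because the producer can never claim space for its 4-token burst, a capacity of 4 sustains 4/5 token per cycle, and one extra slot is needed to reach 1 token per cycle. A linear search is adequate here only because the model has a single buffer; a realistic pipeline with many FIFOs has a much larger design space.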


Throughput, Latency, Buffer sizing, Optimized hardware synthesis



This work has been supported by the ALMARVI European Artemis project nr. 621439.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Philips Healthcare, Best, The Netherlands
  2. TOPIC Embedded Systems, Best, The Netherlands
  3. Eindhoven University of Technology, Eindhoven, The Netherlands
  4. ESI (TNO), Eindhoven, The Netherlands
