Journal of Real-Time Image Processing, Volume 16, Issue 2, pp 305–319

An FPGA 2D-convolution unit based on the CAPH language

  • Abiel Aguilar-González
  • Miguel Arias-Estrada
  • Madaín Pérez-Patricio
  • J. L. Camas-Anzueto
Original Research Paper


Convolution is an important operation in image processing applications such as edge detection, sharpening and blurring. Convolving video streams in real time is a challenging task for PC systems; FPGA devices, however, can be used successfully for this task. This article describes the design and implementation of a reconfigurable FPGA architecture for 2D-convolution filtering. Filtered frames are computed at a rate of 103 frames per second for images of up to \(1200\times 720\) pixels. By using shift-based arithmetic and circular buffers, the developed FPGA architecture reduces hardware resource consumption by up to 98 % compared with conventional convolution implementations, provides high-speed processing and can manage a large number of different convolution kernels. In addition, using the CAPH language reduces the design time by up to 75 % compared with a plain VHDL design. Furthermore, to remain flexible with respect to the input video, the developed hardware allows the resolution of the input images to be configured from \(3\times \textit{Y}\) up to \(1200\times \textit{Y}\), and it scales to different convolution kernel sizes in a simple and systematic way. Finally, the developed FPGA architecture was implemented and validated on a Cyclone II EP2C35F672C6 FPGA embedded in an Altera DE2 development board.
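To make the combination of shift-based arithmetic and circular buffers more concrete, the following Python sketch filters a raster-scanned pixel stream with a 3×3 kernel. It is only an illustration under stated assumptions, not the CAPH design described in the article: the names `KERNEL`, `NORM_SHIFT` and `conv3x3_stream` are hypothetical, the kernel coefficients are assumed to be signed powers of two so that every multiplication becomes a shift, and the kernel is applied without flipping (which makes no difference for the symmetric smoothing kernel used here).

```python
from collections import deque

# Hypothetical 3x3 smoothing kernel [[1,2,1],[2,4,2],[1,2,1]] / 16, stored as
# (sign, left_shift) pairs so that each coefficient equals sign * 2**left_shift.
KERNEL = [
    [(1, 0), (1, 1), (1, 0)],
    [(1, 1), (1, 2), (1, 1)],
    [(1, 0), (1, 1), (1, 0)],
]
NORM_SHIFT = 4  # divide the accumulated sum by 16 with a single right shift


def conv3x3_stream(pixels, width, kernel=KERNEL, norm_shift=NORM_SHIFT):
    """Filter a raster-scanned stream of integer pixels (width >= 3).

    Returns the outputs of the valid region (no border padding) in stream order.
    """
    # A circular buffer holding two full image lines plus three pixels is
    # enough to cover every sample of the current 3x3 neighbourhood.
    buf = deque(maxlen=2 * width + 3)
    out = []
    for n, pixel in enumerate(pixels):
        buf.append(pixel)  # the oldest pixel is dropped automatically
        row, col = divmod(n, width)
        if row < 2 or col < 2:
            continue  # the 3x3 window is not yet fully inside the image
        acc = 0
        for ki in range(3):
            for kj in range(3):
                sign, shift = kernel[ki][kj]
                # Sample located (2 - ki) lines and (2 - kj) columns before
                # the newest pixel, counted from the end of the buffer.
                sample = buf[-(1 + (2 - ki) * width + (2 - kj))]
                acc += sign * (sample << shift)  # shift replaces the multiply
        out.append(acc >> norm_shift)
    return out


if __name__ == "__main__":
    frame = [16] * 16  # flat 4x4 test frame, raster order
    print(conv3x3_stream(frame, width=4))
```

Because the buffer length depends only on the image width, the same structure accepts frames of any height, which is in the spirit of the \(3\times \textit{Y}\) to \(1200\times \textit{Y}\) configurability mentioned above.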


Keywords: FPGA, CAPH, 2D-convolution

Supplementary material

11554_2015_535_MOESM1_ESM.rar (4 kb)
Supplementary material 1. (rar 4 kb)
11554_2015_535_MOESM2_ESM.rar (92.8 mb)
Supplementary material 2. (95,071 kb)
11554_2015_535_MOESM3_ESM.rar (343 kb)
Supplementary material 3. (rar 344 kb)



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Computer Science Department, Reconfigurable Computing Laboratory, Instituto Nacional de Astrofísica Óptica y Electrónica (INAOE), Tonanzintla, Mexico
  2. Division of Graduate Studies and Research, Instituto Tecnológico de Tuxtla Gutiérrez (ITTG), Tuxtla Gutiérrez, Mexico
