Convolutional neural network acceleration with hardware/software co-design

Chen, Andrew Tzer-Yeu; Biglari-Abhari, Morteza; Wang, Kevin I-Kai; Bouzerdoum, Abdesselam; Tivive, Fok Hing Chi

doi:10.1007/s10489-017-1007-z

Published: 02 August 2017

Applied Intelligence, volume 48, pages 1288–1301 (2018)

Andrew Tzer-Yeu Chen¹ (ORCID: orcid.org/0000-0002-6239-8443),
Morteza Biglari-Abhari¹,
Kevin I-Kai Wang¹,
Abdesselam Bouzerdoum²,³ &
Fok Hing Chi Tivive²

Abstract

Convolutional Neural Networks (CNNs) have a broad range of applications, such as image processing and natural language processing. Inspired by the mammalian visual cortex, CNNs have been shown to achieve impressive results on a number of computer vision challenges, but often with large amounts of processing power and no timing restrictions. This paper presents a design methodology for accelerating CNNs using Hardware/Software Co-design techniques, in order to balance performance and flexibility, particularly for resource-constrained systems. The methodology is applied to a gender recognition case study, using an ARM processor and FPGA fabric to create an embedded system that can process facial images in real-time.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, The University of Auckland, Auckland, New Zealand
Andrew Tzer-Yeu Chen, Morteza Biglari-Abhari & Kevin I-Kai Wang
School of Electrical, Computer, and Telecommunications Engineering, University of Wollongong, Wollongong, Australia
Abdesselam Bouzerdoum & Fok Hing Chi Tivive
College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Abdesselam Bouzerdoum

Corresponding author

Correspondence to Andrew Tzer-Yeu Chen.

About this article

Cite this article

Chen, AY., Biglari-Abhari, M., Wang, KK. et al. Convolutional neural network acceleration with hardware/software co-design. Appl Intell 48, 1288–1301 (2018). https://doi.org/10.1007/s10489-017-1007-z

Issue Date: May 2018
