Abstract
Object detection and recognition algorithms using deep convolutional neural networks (CNNs) tend to be computationally intensive. This presents a particular challenge for embedded systems, such as mobile robots, whose computational resources are far more limited than those of workstations. As an alternative to standard, uniformly sampled images, we propose the use of foveated image sampling to reduce image size, which speeds up CNN processing by reducing the number of convolution operations. We evaluate object detection and recognition on the Microsoft COCO database, using foveated image sampling at image sizes ranging from \(416\times 416\) to \(96\times 96\) pixels, on an embedded GPU (an NVIDIA Jetson TX2 with 256 CUDA cores). The results show that it is possible to achieve a \(4{\times }\) speed-up in frame rate, from 3.59 FPS to 15.24 FPS, using \(416\times 416\) and \(128\times 128\) pixel images respectively. For foveated sampling, this image size reduction led to only a small decrease in recall performance in the foveal region, to 92.0% of the baseline performance with full-sized images, compared to a significant decrease to 50.1% of baseline recall for uniformly sampled images, demonstrating the advantage of foveated sampling.
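The abstract's core idea, resampling an image densely at a fovea and sparsely in the periphery to shrink the CNN input, can be illustrated with a minimal radial-warp sketch. This is not the authors' exact sampling geometry: the power-law warp exponent `p`, the nearest-neighbour lookup, and the `foveate` function name are all illustrative assumptions.

```python
import numpy as np

def foveate(img, out_size=128, p=2.0, center=None):
    """Resample `img` onto a smaller out_size x out_size grid whose
    sampling density falls off with distance from the fovea.

    Output radius u in [0, 1] maps to input radius u**p (p > 1), so
    sampling is dense near the centre and sparse in the periphery.
    """
    h, w = img.shape[:2]
    cy, cx = (h / 2, w / 2) if center is None else center
    # Normalized output coordinates in [-1, 1].
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_size),
                         np.linspace(-1, 1, out_size), indexing="ij")
    r = np.sqrt(xs ** 2 + ys ** 2) + 1e-9
    # Radial warp: multiplying (xs, ys) by r**(p-1) gives radius r**p.
    scale = r ** (p - 1.0)
    # Corner samples can warp outside the image; clip to the border.
    src_y = np.clip(cy + ys * scale * (h / 2 - 1), 0, h - 1).astype(int)
    src_x = np.clip(cx + xs * scale * (w / 2 - 1), 0, w - 1).astype(int)
    return img[src_y, src_x]
```

With `p = 1` this reduces to a plain nearest-neighbour resize; larger `p` devotes more of the (for example) 128×128 output to the foveal region, which is consistent with the paper's finding that foveal recall degrades far less than with uniform downsampling.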
Notes
1. Person, bicycle, car, motorbike, aeroplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow.
2. 96 \(\times \) 96, 128 \(\times \) 128, 160 \(\times \) 160, 192 \(\times \) 192, 224 \(\times \) 224, 256 \(\times \) 256, 288 \(\times \) 288, 320 \(\times \) 320, 352 \(\times \) 352, 384 \(\times \) 384 and 416 \(\times \) 416.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Jaramillo-Avila, U., Anderson, S.R. (2019). Foveated Image Processing for Faster Object Detection and Recognition in Embedded Systems Using Deep Convolutional Neural Networks. In: Martinez-Hernandez, U., et al. Biomimetic and Biohybrid Systems. Living Machines 2019. Lecture Notes in Computer Science(), vol 11556. Springer, Cham. https://doi.org/10.1007/978-3-030-24741-6_17
Print ISBN: 978-3-030-24740-9
Online ISBN: 978-3-030-24741-6