Foveated Image Processing for Faster Object Detection and Recognition in Embedded Systems Using Deep Convolutional Neural Networks

  • Conference paper
  • First Online:
Biomimetic and Biohybrid Systems (Living Machines 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11556)

Abstract

Object detection and recognition algorithms based on deep convolutional neural networks (CNNs) tend to be computationally intensive. This presents a particular challenge for embedded systems, such as mobile robots, whose computational resources are far more limited than those of workstations. As an alternative to standard, uniformly sampled images, we propose here the use of foveated image sampling to reduce image size, and hence the number of convolution operations, making images faster to process in a CNN. We evaluate object detection and recognition on the Microsoft COCO database, using foveated image sampling at image sizes ranging from \(416\times 416\) down to \(96\times 96\) pixels, on an embedded GPU, an NVIDIA Jetson TX2 with 256 CUDA cores. The results show that it is possible to achieve a \(4{\times }\) speed-up in frame rate, from 3.59 FPS at \(416\times 416\) pixels to 15.24 FPS at \(128\times 128\) pixels. With foveated sampling, this reduction in image size led to only a small decrease in recall in the foveal region, to 92.0% of the baseline performance with full-sized images, whereas uniformly sampled images suffered a substantial drop, to 50.1% of baseline recall, demonstrating the advantage of foveated sampling.
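The key operation is foveated image sampling: resampling a full-resolution frame into a much smaller image whose pixel density is highest at a fixation point and falls off with eccentricity, preserving foveal detail while compressing the periphery. Below is a minimal Python/OpenCV sketch of this idea using an illustrative radial power-law warp; the function name, the gamma parameter and the mapping itself are assumptions for illustration, not the paper's exact sampling scheme.

```python
# Minimal sketch of foveated image sampling (illustrative only; the
# radial power-law mapping and all parameter choices are assumptions,
# not the scheme used in the paper).
import cv2
import numpy as np

def foveate(img, out_size=128, fixation=None, gamma=2.5):
    """Resample img into an out_size x out_size foveated image.

    fixation is a (row, col) point in the source image; gamma > 1
    compresses the periphery while keeping the fovea near-native.
    """
    h, w = img.shape[:2]
    cy, cx = (h / 2, w / 2) if fixation is None else fixation
    # Normalised output coordinates in [-1, 1] about the output centre.
    ys, xs = np.mgrid[0:out_size, 0:out_size]
    u = (xs - out_size / 2) / (out_size / 2)
    v = (ys - out_size / 2) / (out_size / 2)
    r = np.sqrt(u ** 2 + v ** 2) + 1e-8
    # Radial warp: output radius r maps to source radius r ** gamma,
    # so sampling is dense near the fixation point, sparse at the edges.
    scale = np.minimum(r, 1.0) ** gamma / r
    map_x = (cx + u * scale * (w / 2)).astype(np.float32)
    map_y = (cy + v * scale * (h / 2)).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)

frame = cv2.imread("frame.jpg")       # illustrative input path
small = foveate(frame, out_size=128)  # 128 x 128 foveated image
```

The small foveated image then feeds the detector in place of the full \(416\times 416\) input, which is where the reported frame-rate gain comes from.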


Notes

  1. Person, bicycle, car, motorbike, aeroplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow.

  2. \(96\times 96\), \(128\times 128\), \(160\times 160\), \(192\times 192\), \(224\times 224\), \(256\times 256\), \(288\times 288\), \(320\times 320\), \(352\times 352\), \(384\times 384\) and \(416\times 416\) pixels (see the timing sketch below).
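These sizes are all multiples of 32, the total downsampling factor of the YOLO-family detectors cited below (refs. 9-11), which therefore require input dimensions divisible by 32. To illustrate why smaller inputs raise frame rates, the following PyTorch timing sketch uses a stand-in convolutional backbone; it is not the paper's network or benchmark, only a demonstration that forward-pass time grows roughly with image area, i.e. with the number of convolution operations.

```python
# Timing sketch with a stand-in convolutional backbone (assumption:
# this is NOT the paper's detector, just an 8-layer conv stack used
# to show how inference time scales with input size).
import time
import torch
import torch.nn as nn

backbone = nn.Sequential(
    *[nn.Conv2d(3 if i == 0 else 64, 64, kernel_size=3, padding=1)
      for i in range(8)]
).eval()

with torch.no_grad():
    for s in (96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416):
        x = torch.randn(1, 3, s, s)
        backbone(x)  # warm-up pass, excluded from timing
        t0 = time.perf_counter()
        for _ in range(5):
            backbone(x)
        dt = (time.perf_counter() - t0) / 5
        print(f"{s}x{s}: {dt * 1e3:.1f} ms per forward pass")
```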

References

  1. Akbas, E., Eckstein, M.P.: Object detection through search with a foveated visual system. PLoS Comput. Biol. 13(10), e1005743 (2017)

  2. Almeida, A.F., Figueiredo, R., Bernardino, A., Santos-Victor, J.: Deep networks for human visual attention: a hybrid model using foveal vision. In: Ollero, A., Sanfeliu, A., Montano, L., Lau, N., Cardeira, C. (eds.) ROBOT 2017. AISC, vol. 694, pp. 117–128. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-70836-2_10

  3. Frintrop, S., Werner, T., Martin Garcia, G.: Traditional saliency reloaded: a good old model in new shape. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 82–90 (2015)

  4. Itti, L.: Automatic foveation for video compression using a neurobiological model of visual attention. IEEE Trans. Image Process. 13(10), 1304–1318 (2004)

  5. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48

  6. Martinez, J., Altamirano, L.: A new foveal cartesian geometry approach used for object tracking. In: Proceedings of the IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, SPPRA 2006, Innsbruck, Austria, pp. 133–139 (2006)

  7. Paszke, A., et al.: Automatic differentiation in PyTorch. In: NIPS-W (2017). Accessed 20 Oct 2018

  8. Recasens, A., Kellnhofer, P., Stent, S., Matusik, W., Torralba, A.: Learning to zoom: a saliency-based sampling layer for neural networks. arXiv preprint arXiv:1809.03355 (2018)

  9. Redmon, J.: Darknet: open source neural networks in C (2016). http://pjreddie.com/darknet/. Accessed 25 Aug 2018

  10. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)

  11. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)

  12. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)

  13. Shafiee, M.J., Chywl, B., Li, F., Wong, A.: Fast YOLO: a fast you only look once system for real-time embedded object detection in video. arXiv preprint arXiv:1709.05943 (2017)

  14. Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification tasks. Inf. Process. Manage. 45(4), 427–437 (2009)

  15. Strasburger, H., Rentschler, I., Jüttner, M.: Peripheral vision and pattern recognition: a review. J. Vision 11(5), 1–82 (2011)

  16. Tijtgat, N., Van Ranst, W., Volckaert, B., Goedemé, T., De Turck, F.: Embedded real-time object detection for a UAV warning system. In: The International Conference on Computer Vision, ICCV 2017, pp. 2110–2118 (2017)

  17. Tong, F., Li, Z.N.: Reciprocal-wedge transform for space-variant sensing. IEEE Trans. Pattern Anal. Mach. Intell. 17(5), 500–511 (1995)

  18. Traver, V.J., Bernardino, A.: A review of log-polar imaging for visual perception in robotics. Rob. Autonom. Syst. 58(4), 378–398 (2010)

  19. Wässle, H., Grünert, U., Röhrenbeck, J., Boycott, B.B.: Cortical magnification factor and the ganglion cell density of the primate retina. Nature 341(6243), 643–646 (1989)

  20. Wilson, S.W.: On the retino-cortical mapping. Int. J. Man Mach. Stud. 18(4), 361–389 (1983)

  21. Wu, B., Iandola, F.N., Jin, P.H., Keutzer, K.: SqueezeDet: unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. In: CVPR Workshops, pp. 446–454 (2017)

  22. Zhang, X., Gao, T., Gao, D.: A new deep spatial transformer convolutional neural network for image saliency detection. Des. Autom. Embed. Syst. 1–14 (2018)

Author information

Corresponding author

Correspondence to Uziel Jaramillo-Avila.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jaramillo-Avila, U., Anderson, S.R. (2019). Foveated Image Processing for Faster Object Detection and Recognition in Embedded Systems Using Deep Convolutional Neural Networks. In: Martinez-Hernandez, U., et al. (eds.) Biomimetic and Biohybrid Systems. Living Machines 2019. Lecture Notes in Computer Science (LNAI), vol 11556. Springer, Cham. https://doi.org/10.1007/978-3-030-24741-6_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-24741-6_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24740-9

  • Online ISBN: 978-3-030-24741-6

  • eBook Packages: Computer Science; Computer Science (R0)
