Speeding up inference on deep neural networks for object detection by performing partial convolution

  • Wattanapong Kurdthongmee
Original Research Paper


Real-time object detection is an expected application of deep neural networks (DNNs). It can be achieved by employing graphics processing units (GPUs) or dedicated hardware accelerators. As an alternative, this work presents a software scheme that accelerates the inference stage of DNNs designed for object detection. The scheme relies on partial processing within the consecutive convolution layers of a DNN. It uses the relationships between the locations of the components of an input feature, an intermediate feature representation, and an output feature to identify the modified components effectively. This downsizes the matrix multiplicand to cover only those modified components, which accelerates the matrix multiplication within a convolution layer. The same relationships can also be used to signal the next convolution layer about the modified components, which further reduces the overhead of the member-by-member comparison needed to identify them. The proposed scheme has been experimentally benchmarked against a conceptually similar approach, CBinfer, and against the original Darknet on the Tiny-You Only Look Once (Tiny-YOLO) network. The experiments were conducted on a personal computer with dual CPU running at 3.5 GHz, without GPU acceleration, on video data sets from YouTube. The results show that, on average, improvement ratios of 1.56 and 13.10 in detection frame rate over CBinfer and Darknet, respectively, are attainable. Our scheme was also extended to exploit GPU-assisted acceleration. On an NVIDIA Jetson TX2, it reached a detection frame rate of 28.12 frames per second (1.25\(\times\) with respect to CBinfer). In all experiments, detection accuracy was preserved at 90% of that of the original Darknet.
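The idea of recomputing only the modified components can be illustrated with a toy sketch. The code below is not the paper's implementation: it assumes a single-channel frame, a square kernel, stride 1, no padding, and no activation function between layers, and every function name in it is hypothetical. Each recomputed output position corresponds to one column of the patch matrix that a GEMM-based convolution would multiply, so restricting the work to the affected positions plays the role of downsizing the multiplicand. The set of affected outputs is then passed to the next layer in place of a fresh member-by-member comparison.

```python
# Toy sketch of change-based partial convolution (pure Python, no dependencies).
# Assumptions: single channel, square kernel, stride 1, no padding, no activation.

def conv_at(frame, kernel, r, c):
    """Cross-correlation output at position (r, c) of a 'valid' convolution."""
    k = len(kernel)
    return sum(kernel[i][j] * frame[r + i][c + j]
               for i in range(k) for j in range(k))

def full_conv(frame, kernel):
    """Reference result: recompute every output position."""
    k = len(kernel)
    rows = len(frame) - k + 1
    cols = len(frame[0]) - k + 1
    return [[conv_at(frame, kernel, r, c) for c in range(cols)]
            for r in range(rows)]

def changed_pixels(prev, curr, threshold=0):
    """Member-by-member comparison; only needed at the first layer."""
    return {(r, c)
            for r in range(len(curr)) for c in range(len(curr[0]))
            if abs(curr[r][c] - prev[r][c]) > threshold}

def affected_outputs(changed, k, out_rows, out_cols):
    """Output positions whose k x k receptive field contains a changed input."""
    affected = set()
    for r, c in changed:
        for orow in range(max(0, r - k + 1), min(out_rows, r + 1)):
            for ocol in range(max(0, c - k + 1), min(out_cols, c + 1)):
                affected.add((orow, ocol))
    return affected

def partial_conv(curr, kernel, cached_out, changed):
    """Reuse cached outputs and recompute only the affected positions.

    Returns the updated output and the affected set, which the next layer
    can treat as its own set of changed inputs.
    """
    out = [row[:] for row in cached_out]
    affected = affected_outputs(changed, len(kernel), len(out), len(out[0]))
    for r, c in affected:
        out[r][c] = conv_at(curr, kernel, r, c)
    return out, affected
```

To chain two layers, call `partial_conv` on the first layer with the result of `changed_pixels`, then call it on the second layer with the first layer's output and its returned affected set. The affected set is a conservative superset of the outputs that actually changed, so the result matches a full recomputation while touching fewer positions when few inputs change.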


Deep neural networks; DNNs object detection; Convolution; Inference acceleration



This work was supported by the Thailand Research Fund (TRF) and Walailak University, Thailand, under grant number RSA6280097.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Engineering and Technology, Walailak University, Nakhon-si-thammarat, Thailand
