Object Detection on Base of Modified Convolutional Network

  • Alexey Alexeev
  • Yuriy Matveev
  • Georgy Kukharev
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11401)


This work presents a new object detector based on a convolutional network with kernels of the NiN (Network in Network) type. Detection here means the simultaneous localization of objects in an image and their recognition. The detector can operate on images of arbitrary size; to train the network, images of \(100\times 100\) pixels are used. The proposed method is computationally efficient: the processing time of an HD frame on a single CPU core is about 300 ms. As shown in the paper, the high degree of uniformity of the network operations creates conditions for streaming parallel processing of the data on a GPU, with an estimated processing time of less than 10 ms. Our method is robust to small overlaps and to images of average quality, and it represents an end-to-end learned model whose output is the bounding boxes and classes of the objects across the whole image. An open dataset of images obtained from car dash cameras is used to evaluate the detection algorithm. A similar approach can be used to detect and count other types of objects, for example human faces; the method is not limited to a single object type and can detect a mixture of object classes simultaneously. The detector was implemented and tested in our own a3net framework, without using third-party neural network software.
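The abstract's key architectural point, that a network built from convolutions and NiN-style \(1\times 1\) "mlpconv" layers can be trained on fixed-size crops yet applied to images of arbitrary size, can be illustrated with a minimal numpy sketch. This is not the authors' a3net implementation; the layer shapes and weights below are illustrative assumptions. A spatial convolution followed by two \(1\times 1\) convolutions (the NiN block of Lin et al.) is fully convolutional, so the same weights produce a proportionally larger score map on a larger input:

```python
import numpy as np

def conv2d(x, w):
    """Valid convolution, stride 1.
    x: (H, W, Cin) input feature map; w: (kh, kw, Cin, Cout) kernel."""
    kh, kw, cin, cout = w.shape
    H, W, _ = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1, cout))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            # contract over the kernel window and input channels
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out

def nin_block(x, w_spatial, w_mlp1, w_mlp2):
    """NiN block: spatial conv + two 1x1 'mlpconv' layers, ReLU between."""
    h = np.maximum(conv2d(x, w_spatial), 0)
    h = np.maximum(conv2d(h, w_mlp1), 0)
    return np.maximum(conv2d(h, w_mlp2), 0)

rng = np.random.default_rng(0)
w_s = rng.normal(size=(3, 3, 1, 4))   # 3x3 spatial convolution
w_1 = rng.normal(size=(1, 1, 4, 4))   # 1x1 mlpconv layer
w_2 = rng.normal(size=(1, 1, 4, 2))   # 1x1 mlpconv -> 2 output channels

small = rng.normal(size=(10, 10, 1))  # "training-size" input
large = rng.normal(size=(16, 12, 1))  # arbitrary larger input, same weights
print(nin_block(small, w_s, w_1, w_2).shape)  # (8, 8, 2)
print(nin_block(large, w_s, w_1, w_2).shape)  # (14, 10, 2)
```

The same property is what lets a detector trained on \(100\times 100\) crops slide over an HD frame in one forward pass: each spatial position of the output map scores one receptive field of the input.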


Object detection · Region proposal · CNN · NiN


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. ITMO University, Saint Petersburg, Russia
  2. West Pomeranian University of Technology, Szczecin, Poland
