Small target detection based on bird’s visual information processing mechanism

Abstract

Detecting small targets in large fields of view is a challenging task. Many target detection models based on convolutional neural networks (CNNs) now achieve excellent performance. However, these CNN-based detectors are inefficient when applied to real-time detection of small targets. This paper proposes a model for detecting small targets in large fields of view, based on the tectofugal–thalamofugal–accessory optic system of birds. First, following the visual information processing mechanism of the avian tectofugal pathway, we design an unsupervised saliency algorithm that generates salient regions and suppresses background information. Second, following the information processing mechanism of the accessory optic system, we design a super-resolution (SR) analysis method that enlarges small targets and improves image resolution. Third, following the information processing mechanism of the thalamofugal pathway, we propose a CNN-based method to detect small targets. We test the model on two public datasets (VEDAI and DLR 3K), and the experimental results show that the proposed model outperforms state-of-the-art methods on small-target detection.
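
The three stages named in the abstract can be pictured with a minimal sketch. It is not the authors' implementation, and every stage is a deliberately simple stand-in: deviation from the global mean replaces the unsupervised saliency algorithm, nearest-neighbour enlargement replaces the SR method, and a brightness-threshold centroid replaces the CNN detector. All function names, thresholds, and the padding value are hypothetical and chosen only for illustration.

```python
import numpy as np

def saliency_map(img):
    """Stand-in saliency: absolute deviation from the global mean, scaled to [0, 1]."""
    dev = np.abs(img - img.mean())
    peak = dev.max()
    return dev / peak if peak > 0 else np.zeros_like(dev)

def salient_box(sal, thresh=0.5, pad=2):
    """Box (r0, r1, c0, c1) around pixels with saliency above thresh, or None."""
    rows, cols = np.nonzero(sal > thresh)
    if rows.size == 0:
        return None
    h, w = sal.shape
    return (max(int(rows.min()) - pad, 0), min(int(rows.max()) + pad + 1, h),
            max(int(cols.min()) - pad, 0), min(int(cols.max()) + pad + 1, w))

def upscale(patch, factor=4):
    """Stand-in for super-resolution: nearest-neighbour enlargement."""
    return np.repeat(np.repeat(patch, factor, axis=0), factor, axis=1)

def detect(patch, thresh=0.5):
    """Stand-in for the CNN detector: centroid of bright pixels, or None."""
    rows, cols = np.nonzero(patch > thresh)
    if rows.size == 0:
        return None
    return float(rows.mean()), float(cols.mean())

def pipeline(img, factor=4):
    """Saliency crop, then enlargement, then detection in original coordinates."""
    box = salient_box(saliency_map(img))
    if box is None:
        return None
    r0, r1, c0, c1 = box
    hit = detect(upscale(img[r0:r1, c0:c1], factor))
    if hit is None:
        return None
    # Undo the enlargement: enlarged index u covers original pixel (u + 0.5) / factor - 0.5.
    return (r0 + (hit[0] + 0.5) / factor - 0.5,
            c0 + (hit[1] + 0.5) / factor - 0.5)

if __name__ == "__main__":
    scene = np.zeros((64, 64))
    scene[30:32, 40:42] = 1.0  # a 2x2 synthetic "small target"
    print(pipeline(scene))
```

The point of the ordering is that the expensive stages (enlargement and detection) only see the small salient crop rather than the full field of view, which is the efficiency argument the abstract makes for suppressing background first.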

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) General Program (61673353), the Young Scientist Fund of NSFC (61603344), and the Key Research Projects of Henan Colleges and Universities (15A120017).

Author information

Corresponding author

Correspondence to Zhizhong Wang.

About this article

Cite this article

Wang, Z., Liu, D., Lei, Y. et al. Small target detection based on bird’s visual information processing mechanism. Multimed Tools Appl 79, 22083–22105 (2020). https://doi.org/10.1007/s11042-020-08807-8

Keywords

  • Convolutional neural network
  • Small-target detection
  • Saliency algorithm
  • Super-resolution
  • Large fields of view