Abstract
Object detection in multi-source satellite images is difficult because of the characteristics of the imaging sensors. Multi-model image fusion offers a way to improve object detection performance. This paper proposes a fusion object detection framework based on an arbitrary-oriented region convolutional neural network. First, nine pansharpening methods are used to fuse the multi-source images. Second, a novel object detection framework built on the Faster Region-based Convolutional Neural Network (Faster R-CNN) structure, which is suitable for large-scale satellite images, is applied. A Region Proposal Network generates axis-aligned bounding boxes enclosing objects in different orientations, and features are then extracted by pooling layers of different sizes. These features are used to classify the proposals, adjust the bounding boxes, and predict the inclined boxes and the objectness/non-objectness score. Smaller anchors are also considered for small objects. Finally, inclined non-maximum suppression is applied to obtain the detection results. Experimental results show that the proposed method performs better than state-of-the-art object detection techniques such as YOLO-v2 and YOLO-v3. Numerical tests validate the efficiency and effectiveness of the proposed method.
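The final step named above, inclined non-maximum suppression, can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything below is an assumption made for illustration: the box parameterization `(cx, cy, w, h, angle_deg)`, the function names `box_corners`, `clip_polygon`, `rotated_iou`, and `inclined_nms`, and the IoU threshold. The overlap of two inclined boxes is computed here by clipping one rectangle against the other (Sutherland-Hodgman), which may differ from the method used in the paper.

```python
import math

def box_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated rectangle, counter-clockwise (assumed box format)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in offsets]

def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    area = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def clip_polygon(subject, clip):
    """Sutherland-Hodgman clipping of `subject` by a convex, CCW `clip` polygon."""
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p):
            # Positive when p lies to the left of the directed edge a -> b (inside).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        input_pts, output = output, []
        for j in range(len(input_pts)):
            p, q = input_pts[j], input_pts[(j + 1) % len(input_pts)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # parameter where the segment crosses the edge
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def rotated_iou(box_a, box_b):
    """Intersection over union of two inclined boxes."""
    inter_pts = clip_polygon(box_corners(*box_a), box_corners(*box_b))
    inter = polygon_area(inter_pts) if len(inter_pts) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def inclined_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it does not overlap a higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical detections: (cx, cy, w, h, angle in degrees) with confidence scores.
boxes = [(0, 0, 4, 2, 0), (0.2, 0, 4, 2, 5), (0, 0, 4, 2, 90), (10, 10, 4, 2, 30)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = inclined_nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

Measuring overlap on the inclined boxes rather than on their axis-aligned enclosures matters for densely packed, elongated objects: two neighbouring objects at an angle can have heavily overlapping axis-aligned boxes while their inclined boxes barely touch, so axis-aligned suppression would discard a valid detection.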
Acknowledgements
This work is jointly supported by the National Natural Science Foundation of China (Grant Nos. 61673262, 61603249) and a key project of the Science and Technology Commission of Shanghai Municipality (Grant No. 16JC1401100).
Cite this article
Ya, Y., Pan, H., Jing, Z. et al. Fusion object detection of satellite imagery with arbitrary-oriented region convolutional neural network. AS 2, 163–174 (2019). https://doi.org/10.1007/s42401-019-00033-x
DOI: https://doi.org/10.1007/s42401-019-00033-x