Semantic Segmentation of Aerial Image Using Fully Convolutional Network
Dense semantic segmentation is an important task in remote sensing image analysis and understanding. Deep learning has recently been applied to pixel-level labeling tasks in computer vision and has produced state-of-the-art results. In this work, a fully convolutional network (FCN), a variant of the convolutional neural network (CNN), is employed for semantic segmentation of high-resolution aerial images. We design a skip-layer architecture that combines features from different layers of the network. This structure integrates semantic information from deep layers with appearance information from shallow layers to make better use of aerial image features. Moreover, the FCN can be trained end-to-end and produces a segmentation output of the same size as the input image. Our model is trained on the extended GE-4 aerial image dataset to adapt the FCN to the aerial image segmentation task. A full-resolution semantic segmentation is produced for each test aerial image. Experiments show that our method improves accuracy compared with several other methods.
Keywords: Semantic segmentation, Aerial images, Deep learning, Convolutional neural network, Fully convolutional network
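The skip-layer idea in the abstract, combining coarse scores from a deep layer with finer scores from a shallow layer, can be illustrated with a minimal sketch. This is not the paper's model: the function names, the toy score maps, and the use of nearest-neighbor upsampling (an FCN would typically learn its upsampling filters) are all assumptions made for illustration.

```python
# Illustrative sketch only: names, shapes, and values are assumptions,
# not the architecture or data used in the paper.

def upsample_nearest(score, factor):
    """Upsample a [class][row][col] score volume by repeating each cell."""
    out = []
    for plane in score:
        rows = []
        for row in plane:
            wide = [v for v in row for _ in range(factor)]
            rows.extend([list(wide) for _ in range(factor)])
        out.append(rows)
    return out

def fuse(deep_up, shallow):
    """Element-wise sum of two score volumes with identical shape."""
    return [
        [[d + s for d, s in zip(drow, srow)] for drow, srow in zip(dplane, splane)]
        for dplane, splane in zip(deep_up, shallow)
    ]

def argmax_labels(score):
    """Pick the highest-scoring class index at every pixel."""
    n_classes = len(score)
    height, width = len(score[0]), len(score[0][0])
    return [
        [max(range(n_classes), key=lambda c: score[c][y][x]) for x in range(width)]
        for y in range(height)
    ]

def skip_fusion_labels(deep, shallow, factor=2):
    """Upsample deep scores, add shallow scores, and label each pixel."""
    return argmax_labels(fuse(upsample_nearest(deep, factor), shallow))

# Toy input: 2 classes, a 1x2 deep score map and a 2x4 shallow score map.
deep = [[[1.0, 0.0]], [[0.0, 1.0]]]
shallow = [
    [[0.0, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
]
labels = skip_fusion_labels(deep, shallow)
print(labels)  # [[0, 0, 1, 0], [0, 0, 1, 1]]
```

In this toy case the deep map alone would label the whole right half as class 1; the shallow scores flip the top-right pixel to class 0, which is the kind of fine-detail correction a skip connection is meant to provide.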
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61501009, 61771031 and 61371134), the National Key Research and Development Program of China (2016YFB0501300, 2016YFB0501302) and the Aerospace Science and Technology Innovation Fund of CASC (China Aerospace Science and Technology Corporation).