Semantic Segmentation of Aerial Image Using Fully Convolutional Network

  • Junli Yang
  • Yiran Jiang
  • Han Fang
  • Zhiguo Jiang
  • Haopeng Zhang
  • Shuang Hao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 875)


Dense semantic segmentation is an important task in remote sensing image analysis and understanding. Deep learning has recently been applied to pixel-level labeling tasks in computer vision, where it produces state-of-the-art results. In this work, a fully convolutional network (FCN), a variant of the convolutional neural network (CNN), is employed for semantic segmentation of high-resolution aerial images. We design a skip-layer architecture that combines features from different layers of the network. This structure integrates semantic information from deep layers with appearance information from shallow layers to make better use of aerial image features. Moreover, the FCN can be trained end-to-end and produces a segmentation output of the same size as the input image. Our model is trained on the extended GE-4 aerial image dataset to adapt the FCN to the aerial image segmentation task, and a full-resolution semantic segmentation is produced for each test image. Experiments show that our method improves accuracy compared with several other methods.
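The skip-layer idea in the abstract can be illustrated with a small sketch. It is not the authors' network: it assumes the common FCN-style fusion in which coarse class scores from a deep layer are upsampled and summed with finer class scores from a shallow layer, followed by a per-pixel argmax. The function names (`upsample_nearest`, `fuse_skip`, `argmax_labels`) and the toy score values are hypothetical, and nearest-neighbour upsampling stands in for the learned upsampling a trained FCN would typically use.

```python
from typing import List

# Score maps are indexed as [class][row][col]; the values below are toy numbers.
ScoreMap = List[List[List[float]]]


def upsample_nearest(scores: ScoreMap, factor: int) -> ScoreMap:
    """Nearest-neighbour upsampling of every class plane by an integer factor."""
    return [
        [
            [row[c // factor] for c in range(len(row) * factor)]
            for row in plane
            for _ in range(factor)  # repeat each source row `factor` times
        ]
        for plane in scores
    ]


def fuse_skip(deep: ScoreMap, shallow: ScoreMap, factor: int) -> ScoreMap:
    """Upsample coarse (deep) scores and add them element-wise to finer (shallow) scores."""
    up = upsample_nearest(deep, factor)
    return [
        [[u + s for u, s in zip(u_row, s_row)] for u_row, s_row in zip(u_plane, s_plane)]
        for u_plane, s_plane in zip(up, shallow)
    ]


def argmax_labels(scores: ScoreMap) -> List[List[int]]:
    """Pick the highest-scoring class index at every pixel."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda k: scores[k][r][c]) for c in range(width)]
        for r in range(height)
    ]


# Deep layer: one coarse cell that favours class 0 everywhere (semantic context).
deep = [[[1.0]], [[0.0]]]
# Shallow layer: 2x2 detail with strong local evidence for class 1 at the top-right pixel.
shallow = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]

labels = argmax_labels(fuse_skip(deep, shallow, factor=2))
print(labels)  # [[0, 1], [0, 0]]
```

In this toy case the deep scores alone would label every pixel as class 0; adding the shallow scores restores the one pixel where local appearance disagrees, which is the kind of detail a skip connection is meant to recover.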


Semantic segmentation · Aerial images · Deep learning · Convolutional neural network · Fully convolutional network



This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61501009, 61771031 and 61371134), the National Key Research and Development Program of China (2016YFB0501300, 2016YFB0501302) and the Aerospace Science and Technology Innovation Fund of CASC (China Aerospace Science and Technology Corporation).



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Junli Yang (1)
  • Yiran Jiang (1)
  • Han Fang (1)
  • Zhiguo Jiang (2)
  • Haopeng Zhang (2)
  • Shuang Hao (3)

  1. International School, Beijing University of Posts and Telecommunications, Beijing, China
  2. Image Processing Center, School of Astronautics, Beihang University, Beijing, China
  3. Beijing Control and Electronic Technology Institute, Beijing, China
