Fully CapsNet for Semantic Segmentation

  • Su Li
  • Xiangyu Ren
  • Lu Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11257)


Fully convolutional networks (FCNs) are powerful models for semantic segmentation, but convolutional networks perform poorly when recognizing and parsing images with spatial variation. In this paper, we propose a novel capsule network called Fully CapsNet. We introduce capsules into the FCN to improve the equivariance of the network in image segmentation, and the capsule layers are connected by the dynamic routing algorithm. Compared with traditional FCN-based networks, a trained Fully CapsNet is more robust when recognizing image pixels under varying degrees of spatial variation. The effectiveness of the proposed model is verified on PASCAL VOC. The results show that Fully CapsNet outperforms the FCN in understanding both original and rotated images.
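
For readers unfamiliar with dynamic routing, the following is a minimal NumPy sketch of routing-by-agreement between two capsule layers, in the form introduced by Sabour, Frosst, and Hinton (2017). It is not the authors' implementation: the function names, the tensor shapes, and the choice of three routing iterations are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors to output capsules (illustrative sketch).

    u_hat: array of shape (n_in, n_out, dim), the prediction that each
           lower-level capsule i makes for each higher-level capsule j.
    Returns (v, c): output capsules of shape (n_out, dim) and the coupling
    coefficients of shape (n_in, n_out) used in the final iteration.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits; zeros give uniform coupling
    for _ in range(num_iters):
        # Softmax over output capsules: each input capsule distributes
        # a total coupling of 1 across the higher-level capsules.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash to get output capsules.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Raise the logit where a prediction agrees with the output capsule.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c


# Hypothetical sizes: 6 input capsules, 3 output capsules, 4-D pose vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v, c = dynamic_routing(u_hat)
print(v.shape, c.shape)  # (3, 4) (6, 3)
```

The length of each output vector can be read as the confidence that the corresponding entity is present, which is why `squash` keeps every length below 1.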


Fully convolutional network, Semantic segmentation, Capsule network, PASCAL VOC



This research was supported by NSFC (No. 61871074) and Fundamental Research Funds for the Central Universities (ZYGX2018J064).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, China
