Semantic Segmentation with Modified Deep Residual Networks

Chen, Xinze; Cheng, Guangliang; Cai, Yinghao; Wen, Dayong; Li, Heping

doi:10.1007/978-981-10-3005-5_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)

Included in the following conference series:

Chinese Conference on Pattern Recognition

Abstract

A novel semantic segmentation method is proposed that consists of three parts. (I) A simple yet effective data augmentation method is introduced that incurs no extra GPU memory cost during training. (II) A deeper residual network is constructed through three effective techniques: dilated convolution, an LSTM network, and multi-scale prediction. (III) Online hard pixel mining is adopted to improve segmentation performance. We combine these three parts to train an end-to-end network and achieve a new state-of-the-art segmentation accuracy of 79.3% on the PASCAL VOC 2012 test set at the time of submission.
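
Two of the ingredients named in the abstract, dilated convolution and online hard pixel mining, can be illustrated with a minimal sketch. The abstract does not give the authors' formulation, so everything below is an assumption made for illustration: the function names `dilated_conv1d` and `hard_pixel_loss`, the `keep_ratio` parameter, the top-k selection rule, and the reduction to one dimension and plain Python lists. It is not the paper's implementation.

```python
import math


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution (cross-correlation, as in CNNs).

    Kernel taps are spaced `dilation` samples apart, which widens the
    receptive field without adding weights.
    """
    span = (len(kernel) - 1) * dilation  # receptive field size minus one
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def hard_pixel_loss(probs, labels, keep_ratio=0.25):
    """Mean cross-entropy over the hardest pixels only.

    probs:  one list of class probabilities per pixel
    labels: one ground-truth class index per pixel
    Assumed rule: keep the top `keep_ratio` fraction of pixels by loss.
    """
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)]
    k = max(1, int(len(losses) * keep_ratio))  # always keep at least one pixel
    hardest = sorted(losses, reverse=True)[:k]
    return sum(hardest) / k
```

With a two-tap kernel, `dilation=2` makes each output combine samples two positions apart; in `hard_pixel_loss`, pixels the network already classifies confidently drop out of the average, so the loss concentrates on the difficult ones.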

Notes

1. http://host.robots.ox.ac.uk:8080/anonymous/GHOLEA.html.
2. We use the released ResNet-101 model, which is publicly available at https://github.com/KaimingHe/deep-residual-networks.


Acknowledgements

This work is supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61305048 and 61503381).

Author information

Authors and Affiliations

NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Xinze Chen, Guangliang Cheng, Yinghao Cai, Dayong Wen & Heping Li


Corresponding author

Correspondence to Heping Li.

Editor information

Editors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China
Xuelong Li
Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
Xilin Chen
Tsinghua University, Beijing, China
Jie Zhou
Nanjing University of Science and Technology, Nanjing, China
Jian Yang
University of Electronic Science and Technology, Chengdu, Sichuan, China
Hong Cheng

About this paper

Cite this paper

Chen, X., Cheng, G., Cai, Y., Wen, D., Li, H. (2016). Semantic Segmentation with Modified Deep Residual Networks. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_4

DOI: https://doi.org/10.1007/978-981-10-3005-5_4
Published: 22 October 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science, Computer Science (R0)
