Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Ji, Ge-Peng; Chou, Yu-Cheng; Fan, Deng-Ping; Chen, Geng; Fu, Huazhu; Jha, Debesh; Shao, Ling

doi:10.1007/978-3-030-87193-2_14


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12901)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention


Abstract

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs cannot fully exploit the global temporal and spatial information in successive video frames, which leads to false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which efficiently learns representations from polyp videos at real-time speed (\(\sim\)140 fps) on a single RTX 2080 GPU without post-processing. Our PNS-Net is based solely on a basic normalized self-attention block, equipped with recurrence and CNNs. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

G.-P. Ji and Y.-C. Chou contributed equally. Code: http://dpfan.net/pnsnet/.


Notes

1. We set \(H^{l}=\frac{H'}{4}\), \(W^{l}=\frac{W'}{4}\), \(C^{l}=24\), \(H^{h}=\frac{H'}{8}\), \(W^{h}=\frac{W'}{8}\), and \(C^{h}=32\).
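
As a worked instance of the arithmetic in Note 1, the short sketch below computes the resulting feature sizes. It assumes that \(H'\) and \(W'\) are the height and width of the input frame, and that the superscripts \(l\) and \(h\) mark low-level and high-level features; this page does not define them. The function name `feature_shapes` and the 256 x 448 frame size are hypothetical and chosen only for illustration.

```python
def feature_shapes(height, width):
    """Return (channels, height, width) for the two feature levels in Note 1.

    Assumption: `height` and `width` stand for H' and W' (the input frame size).
    """
    if height % 8 or width % 8:
        raise ValueError("height and width must be divisible by 8")
    low = (24, height // 4, width // 4)    # C^l = 24, H^l = H'/4, W^l = W'/4
    high = (32, height // 8, width // 8)   # C^h = 32, H^h = H'/8, W^h = W'/8
    return low, high


# Hypothetical 256 x 448 input frame (illustrative only).
low, high = feature_shapes(256, 448)
print(low)   # (24, 64, 112)
print(high)  # (32, 32, 56)
```

The divisibility check is a design choice of this sketch so that the fractions in Note 1 yield whole numbers; the source does not say how non-divisible sizes are handled.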


Author information

Authors and Affiliations

Inception Institute of AI (IIAI), Abu Dhabi, UAE
Ge-Peng Ji, Deng-Ping Fan, Geng Chen, Huazhu Fu & Ling Shao
Wuhan University, Wuhan, China
Ge-Peng Ji & Yu-Cheng Chou
SimulaMet, Oslo, Norway
Debesh Jha


Editor information

Editors and Affiliations

Erasmus MC - University Medical Center Rotterdam, Rotterdam, The Netherlands
Marleen de Bruijne
University of Basel, Allschwil, Switzerland
Philippe C. Cattin
Inria Nancy Grand Est, Villers-lès-Nancy, France
Stéphane Cotin
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Nicolas Padoy
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Tencent Jarvis Lab, Shenzhen, China
Yefeng Zheng
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Caroline Essert


About this paper

Cite this paper

Ji, GP. et al. (2021). Progressively Normalized Self-Attention Network for Video Polyp Segmentation. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham. https://doi.org/10.1007/978-3-030-87193-2_14


DOI: https://doi.org/10.1007/978-3-030-87193-2_14
Published: 21 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87192-5
Online ISBN: 978-3-030-87193-2
eBook Packages: Computer Science, Computer Science (R0)
