Abstract
Quantifying uncertainty in medical image segmentation applications is essential, as it is often connected to vital decision-making. Compelling attempts have been made to quantify the uncertainty in image segmentation architectures, e.g. by learning a density over segmentations conditioned on the input image. Typical work in this field restricts these learnt densities to be strictly Gaussian. In this paper, we propose a more flexible approach by introducing Normalizing Flows (NFs), which enable the learnt densities to be more complex and facilitate more accurate modeling of uncertainty. We demonstrate this hypothesis by adopting the Probabilistic U-Net and augmenting its posterior density with an NF, allowing it to be more expressive. Our qualitative as well as quantitative (GED and IoU) evaluations on the multi-annotated LIDC-IDRI and single-annotated Kvasir-SEG segmentation datasets show a clear improvement. This is most apparent in the quantification of aleatoric uncertainty and in the predictive performance, which increases by up to 14%. These results strongly indicate that a more flexible density model should be seriously considered in architectures that attempt to capture segmentation ambiguity through density modeling. The benefit of this improved modeling will increase human confidence in annotation and segmentation, and enable eager adoption of the technology in practice.
References
Armato, S.G., III, et al.: The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med. Phys. 38, 915–931 (2011). https://doi.org/10.1118/1.3528204
Baumgartner, C.F., et al.: PHiSeg: capturing uncertainty in medical image segmentation (2019)
van den Berg, R., Hasenclever, L., Tomczak, J.M., Welling, M.: Sylvester normalizing flows for variational inference (2019)
Jha, D., et al.: Kvasir-SEG: a segmented polyp dataset. In: Ro, Y.M., et al. (eds.) MMM 2020. LNCS, vol. 11962, pp. 451–462. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-37734-2_37
Kendall, A., Gal, Y.: What uncertainties do we need in Bayesian deep learning for computer vision? CoRR (2017). http://arxiv.org/abs/1703.04977
Kingma, D.P., Dhariwal, P.: Glow: generative flow with invertible 1x1 convolutions (2018)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes (2014)
Kobyzev, I., Prince, S., Brubaker, M.: Normalizing flows: an introduction and review of current methods. IEEE Trans. Pattern Anal. Mach. Intell. p. 1 (2020). https://doi.org/10.1109/tpami.2020.2992934
Kohl, S.A.A., et al.: A hierarchical probabilistic U-Net for modeling multi-scale ambiguities (2019)
Kohl, S.A., et al.: A probabilistic U-Net for segmentation of ambiguous images. arXiv preprint arXiv:1806.05034 (2018)
Pogorelov, K., et al.: KVASIR: a multi-class image dataset for computer aided gastrointestinal disease detection (2017). https://doi.org/10.1145/3083187.3083212
Rezende, D.J., Mohamed, S.: Variational inference with normalizing flows (2016)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Selvan, R., Faye, F., Middleton, J., Pai, A.: Uncertainty quantification in medical image segmentation with normalizing flows. In: Liu, M., Yan, P., Lian, C., Cao, X. (eds.) MLMI 2020. LNCS, vol. 12436, pp. 80–90. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59861-7_9
Sohn, K., Lee, H., Yan, X.: Learning structured output representation using deep conditional generative models. Adv. Neural. Inf. Process. Syst. 28, 3483–3491 (2015)
Appendices
A Probabilistic U-Net Objective
The loss function of the PU-Net is based on the standard ELBO and is defined as
\[
\mathcal {L}(\mathbf {s} \mid \mathbf {x}) = \mathbb {E}_{\mathbf {z} \sim Q(\mathbf {z} \mid \mathbf {x}, \mathbf {s})}\left [-\log P(\mathbf {s} \mid \mathbf {z}, \mathbf {x})\right ] + \beta \, D_{\mathrm {KL}}\!\left (Q(\mathbf {z} \mid \mathbf {x}, \mathbf {s}) \,\Vert \, P(\mathbf {z} \mid \mathbf {x})\right ),
\]
where the latent sample \(\mathbf {z}\) from the posterior distribution \(Q\) is conditioned on the input image \(\mathbf {x}\) and the ground-truth segmentation \(\mathbf {s}\), and \(P(\mathbf {z} \mid \mathbf {x})\) denotes the prior.
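The objective above can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation: function names are hypothetical, the reconstruction term is a pixel-wise binary cross-entropy, and the KL term uses the closed form for two diagonal Gaussians.

```python
import numpy as np

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(Q || P) between two diagonal Gaussians, summed over latent dims."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return 0.5 * np.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def elbo_loss(seg, probs, mu_q, logvar_q, mu_p, logvar_p, beta=1.0):
    """PU-Net-style loss: pixel-wise cross-entropy plus beta-weighted KL."""
    eps = 1e-7  # numerical safety for log
    recon = -np.sum(
        seg * np.log(probs + eps) + (1 - seg) * np.log(1 - probs + eps)
    )
    return recon + beta * kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p)
```

When posterior and prior coincide, the KL term vanishes and the loss reduces to the reconstruction cross-entropy alone.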
B Planar and Radial Flows
Normalizing Flows are trained by maximizing the log-likelihood objective
\[
\log q_K(\mathbf {z}_K) = \log q_0(\mathbf {z}_0) - \sum _{k=1}^{K} \log \left | \det \frac {\partial f_k}{\partial \mathbf {z}_{k-1}} \right |,
\]
where \(q_0\) is the base density and \(f_1, \ldots , f_K\) are the invertible flow transformations.
In the PU-Net, the objective becomes
\[
\mathcal {L}(\mathbf {s} \mid \mathbf {x}) \approx \frac {1}{N} \sum _{i=1}^{N} \Big [ -\log P(\mathbf {s} \mid \mathbf {z}_i^{K}, \mathbf {x}) + \beta \Big ( \log q_0(\mathbf {z}_i^{0} \mid \mathbf {x}, \mathbf {s}) - \sum _{k=1}^{K} \log \Big | \det \frac {\partial f_k}{\partial \mathbf {z}_i^{k-1}} \Big | - \log P(\mathbf {z}_i^{K} \mid \mathbf {x}) \Big ) \Big ],
\]
where the i-th latent sample \(\mathbf {z}_i\) from the Normalizing Flow is conditioned on the input image \(\mathbf {x}\), and ground-truth segmentation \(\mathbf {s}\).
The planar flow expands and contracts distributions along a specific direction by applying the transformation
\[
f(\mathbf {z}) = \mathbf {z} + \mathbf {u}\, h(\mathbf {w}^{\top }\mathbf {z} + b),
\]
where \(h\) is a smooth element-wise nonlinearity (typically \(\tanh \)), while the radial flow warps distributions around a specific point \(\mathbf {z}_0\) with the transformation
\[
f(\mathbf {z}) = \mathbf {z} + \beta \, h(\alpha , r)(\mathbf {z} - \mathbf {z}_0), \qquad r = \lVert \mathbf {z} - \mathbf {z}_0 \rVert , \quad h(\alpha , r) = \frac {1}{\alpha + r}.
\]
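Both flows admit a cheap log-determinant of the Jacobian, which is what makes them practical inside the objective above. The following is an illustrative NumPy sketch (function names are hypothetical, and no invertibility constraint on the parameters is enforced):

```python
import numpy as np

def planar_flow(z, u, w, b):
    """f(z) = z + u * tanh(w^T z + b); log|det df/dz| = log|1 + u^T psi|."""
    a = w @ z + b
    f = z + u * np.tanh(a)
    psi = (1.0 - np.tanh(a) ** 2) * w  # h'(a) * w
    logdet = np.log(np.abs(1.0 + u @ psi))
    return f, logdet

def radial_flow(z, z0, alpha, beta):
    """f(z) = z + beta * h(alpha, r) * (z - z0), with h = 1/(alpha + r)."""
    d = z.shape[0]
    r = np.linalg.norm(z - z0)
    h = 1.0 / (alpha + r)
    f = z + beta * h * (z - z0)
    # det = (1 + beta*h)^(d-1) * (1 + beta*h + beta*h'(r)*r)
    dh = -1.0 / (alpha + r) ** 2
    logdet = (d - 1) * np.log(1.0 + beta * h) + np.log(
        np.abs(1.0 + beta * h + beta * dh * r)
    )
    return f, logdet
```

With \(\mathbf {u} = \mathbf {0}\) (planar) or \(\beta = 0\) (radial), each flow reduces to the identity with zero log-determinant.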
C Dataset Images
Example images from the datasets used in this work are shown here. Figure 4 depicts four examples from the LIDC dataset: on the left of the figure is the 2D CT image containing the lesion, followed by the labels produced by four independent annotators. Figure 5 depicts eight examples from the Kvasir-SEG dataset, each showing an endoscopic image with its ground-truth label.
D Sample Size Dependent GED
The GED evaluation depends on the number of reconstructions sampled from the prior distribution. Figure 6 depicts this relationship for the vanilla, 2-planar and 2-radial posterior models. The uncertainty in the values originates from the variation in results across the ten-fold cross-validation runs. One can observe that with increasing sample size, both the GED and its associated uncertainty decrease. This also holds when the posterior is augmented with a 2-planar or 2-radial flow; in particular, the uncertainty in the GED evaluation decreases significantly.
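For reference, the (squared) generalized energy distance over two sets of binary masks is \(D^2_{\mathrm {GED}} = 2\,\mathbb {E}[d(S, Y)] - \mathbb {E}[d(S, S')] - \mathbb {E}[d(Y, Y')]\) with \(d = 1 - \mathrm {IoU}\). A minimal NumPy sketch (function names are hypothetical, not the paper's code):

```python
import numpy as np

def iou_distance(a, b):
    """d(a, b) = 1 - IoU for binary masks; two empty masks have distance 0."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return 0.0 if union == 0 else 1.0 - inter / union

def generalized_energy_distance(samples, annotations):
    """GED^2 = 2*E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')] over two mask sets."""
    cross = np.mean([iou_distance(s, y) for s in samples for y in annotations])
    within_s = np.mean([iou_distance(s, t) for s in samples for t in samples])
    within_y = np.mean([iou_distance(y, t) for y in annotations for t in annotations])
    return 2.0 * cross - within_s - within_y
```

When the set of model samples matches the annotator set exactly, all three expectations coincide and the GED is zero, which is why more prior samples give a tighter, lower estimate.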
E Prior Distribution Variance
We investigated whether the prior distribution captures the degree of ambiguity in the input images. For every input image \(\mathbb {X}\), we obtain a latent L-dimensional mean and standard deviation vector of the prior distribution \(P(\boldsymbol{\mu }, \boldsymbol{\sigma }\vert \mathbb {X})\). To quantify this uncertainty, we compute the mean of the latent prior variance vector, \(\mu _{LV}\), for each input image. Figure 7 shows this for several input images of the test set. As can be seen, the mean variance over the latent prior increases along with a subjective assessment of the annotation difficulty.
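The score \(\mu _{LV}\) reduces to a one-line computation on the predicted standard-deviation vector; the following sketch (hypothetical function name) shows the intended reading:

```python
import numpy as np

def mean_latent_variance(sigma):
    """mu_LV: mean of the latent prior variance vector sigma**2 for one image.

    Higher values are read as higher input ambiguity.
    """
    sigma = np.asarray(sigma, dtype=float)
    return float(np.mean(sigma ** 2))
```

An image whose prior predicts broader latent standard deviations thus receives a strictly larger ambiguity score.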
© 2021 Springer Nature Switzerland AG
Valiuddin, M.M.A., Viviers, C.G.A., van Sloun, R.J.G., de With, P.H.N., van der Sommen, F. (2021). Improving Aleatoric Uncertainty Quantification in Multi-annotated Medical Image Segmentation with Normalizing Flows. In: Sudre, C.H., et al. Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Perinatal Imaging, Placental and Preterm Image Analysis. UNSURE PIPPI 2021 2021. Lecture Notes in Computer Science(), vol 12959. Springer, Cham. https://doi.org/10.1007/978-3-030-87735-4_8