Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

  • Florin C. Ghesu (corresponding author)
  • Bogdan Georgescu
  • Eli Gibson
  • Sebastian Guendel
  • Mannudeep K. Kalra
  • Ramandeep Singh
  • Subba R. Digumarthy
  • Sasa Grbic
  • Dorin Comaniciu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11769)

Abstract

The interpretation of chest radiographs is an essential task for the detection of thoracic diseases and abnormalities. However, it is a challenging problem with high inter-rater variability and inherent ambiguity due to inconclusive evidence in the data, limited data quality, or subjective definitions of disease appearance. Current deep learning solutions for chest radiograph abnormality classification are typically limited to providing probabilistic predictions, relying on the capacity of learning models to adapt to the high degree of label noise and become robust to these sources of ambiguity. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose an automatic system that learns not only the probabilistic estimate of the presence of an abnormality, but also an explicit uncertainty measure that captures the confidence of the system in the predicted output. We argue that explicitly learning the classification uncertainty as an additional measure alongside the predicted output is essential to account for the inherent variability characteristic of this data. Experiments were conducted on two datasets of chest radiographs of over 85,000 patients. Sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC, e.g., by 8% to 0.91 with an expected rejection rate of under 25%. Eliminating training samples using uncertainty-driven bootstrapping enables a significant increase in robustness and accuracy. In addition, we present a multi-reader study showing that the predictive uncertainty is indicative of reader errors.
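
The sample-rejection idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `roc_auc` and `reject_uncertain`, the threshold `max_uncertainty`, and the toy labels, scores, and uncertainties are all hypothetical and chosen only for illustration. The sketch assumes each case comes with a predicted probability and a separate uncertainty value in [0, 1], discards cases above the uncertainty threshold, and compares the ROC-AUC before and after rejection.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC computed as the fraction of (positive, negative) pairs
    in which the positive case receives the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def reject_uncertain(
    labels: Sequence[int],
    scores: Sequence[float],
    uncertainties: Sequence[float],
    max_uncertainty: float,
) -> Tuple[List[int], List[float], float]:
    """Keep only cases whose uncertainty is <= max_uncertainty.

    Returns the kept labels, the kept scores, and the rejection rate.
    """
    kept = [
        (y, s)
        for y, s, u in zip(labels, scores, uncertainties)
        if u <= max_uncertainty
    ]
    kept_labels = [y for y, _ in kept]
    kept_scores = [s for _, s in kept]
    rejection_rate = 1.0 - len(kept) / len(labels)
    return kept_labels, kept_scores, rejection_rate


# Hypothetical toy data: two of the eight cases carry high uncertainty,
# and one of them is ranked incorrectly by the classifier.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.2, 0.6, 0.3]
uncertainties = [0.1, 0.2, 0.9, 0.3, 0.1, 0.2, 0.8, 0.3]

auc_all = roc_auc(labels, scores)
kept_labels, kept_scores, rate = reject_uncertain(
    labels, scores, uncertainties, max_uncertainty=0.5
)
auc_kept = roc_auc(kept_labels, kept_scores)

print(f"AUC on all cases:        {auc_all:.4f}")
print(f"AUC after rejection:     {auc_kept:.4f}")
print(f"Rejection rate:          {rate:.2%}")
```

In this toy setting the rejected cases happen to be the ones the classifier ranks poorly, so the ROC-AUC on the retained cases is higher. Whether that holds on real data depends on how well the learned uncertainty tracks actual errors, which is what the paper's experiments evaluate.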

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Florin C. Ghesu (1), corresponding author
  • Bogdan Georgescu (1)
  • Eli Gibson (1)
  • Sebastian Guendel (1)
  • Mannudeep K. Kalra (2, 3)
  • Ramandeep Singh (2, 3)
  • Subba R. Digumarthy (2, 3)
  • Sasa Grbic (1)
  • Dorin Comaniciu (1)

  1. Digital Technology and Innovation, Siemens Healthineers, Princeton, USA
  2. Department of Radiology, Massachusetts General Hospital, Boston, USA
  3. Harvard Medical School, Boston, USA
