A deep learning model integrating mammography and clinical factors facilitates the malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening

Abstract

Objectives

To investigate the value of full-field digital mammography-based deep learning (DL) in predicting malignancy of Breast Imaging Reporting and Data System (BI-RADS) 4 microcalcifications.

Methods

In this retrospective study, a total of 384 patients with 414 pathologically confirmed microcalcifications (221 malignant and 193 benign) were randomly allocated to the training, validation, and testing datasets (272/71/71 lesions). A combined DL model incorporating mammography and clinical variables was developed. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of the clinical model, the stand-alone DL image model, and the BI-RADS approach. The predictive performance for malignancy was also compared between the combined model and human readers (two junior and two senior radiologists).
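The allocation and the AUC metric described above can be illustrated with a minimal standard-library sketch. It assumes a plain random shuffle at the lesion level; the abstract does not say whether the split was stratified or grouped by patient, so the helper names `allocate` and `auc`, the seed, and the toy scores are all hypothetical and do not represent the authors' pipeline.

```python
import random


def allocate(lesion_ids, sizes=(272, 71, 71), seed=0):
    """Shuffle lesion IDs and cut them into training/validation/testing lists.

    Assumption: a simple lesion-level shuffle; the study's actual procedure
    (stratification, patient-level grouping) is not given in the abstract.
    """
    ids = list(lesion_ids)
    if sum(sizes) != len(ids):
        raise ValueError("split sizes must sum to the number of lesions")
    random.Random(seed).shuffle(ids)
    n_train, n_val, _ = sizes
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def auc(labels, scores):
    """AUC as the probability that a malignant lesion (label 1) receives a
    higher score than a benign one (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign lesion")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 414 lesions split into the 272/71/71 sizes reported in the abstract.
train, val, test = allocate(range(414))
print(len(train), len(val), len(test))

# Toy labels and scores, invented for illustration only.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise form of the AUC is quadratic in the number of lesions, which is acceptable for a 71-lesion testing set; a rank-based implementation would be preferable for larger datasets.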

Results

The combined DL model achieved an AUC, sensitivity, and specificity of 0.910, 85.3%, and 91.9%, respectively, in predicting malignant BI-RADS 4 microcalcifications in the testing dataset, outperforming the clinical model, the DL image model, and BI-RADS, which had AUCs of 0.799, 0.841, and 0.804, respectively. The combined model was non-inferior to the senior radiologists (p = 0.860, p = 0.800) and outperformed the junior radiologists (p = 0.155, p = 0.029). With artificial intelligence assistance, the diagnostic performance of the two junior radiologists improved, with AUCs rising from 0.816 to 0.854 (p = 0.556) and from 0.773 to 0.901 (p = 0.046), and interobserver agreement improved, with the kappa value rising from 0.331 to 0.843.
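For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and a two-reader kappa are computed from binary calls. It assumes Cohen's kappa; the abstract does not name the kappa variant used. The function names and the toy data are hypothetical and are not the study's data.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels, 1 = malignant."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    categories = set(reader_a) | set(reader_b)
    expected = sum(
        (reader_a.count(c) / n) * (reader_b.count(c) / n) for c in categories
    )
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy ground truth and model calls, invented for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(truth, calls))

# Toy calls from two hypothetical readers on four lesions.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Kappa discounts the agreement two readers would reach by chance alone, which is why it can be much lower than raw percent agreement when one category dominates.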

Conclusions

The combined deep learning model can improve the malignancy prediction of BI-RADS 4 microcalcifications in screening mammography and help junior radiologists achieve better performance, which can facilitate clinical decision-making.

Key Points

The combined deep learning model demonstrated high diagnostic power, sensitivity, and specificity for predicting malignant BI-RADS 4 mammographic microcalcifications.

The combined model achieved performance similar to that of senior breast radiologists and outperformed junior breast radiologists.

Deep learning could improve the diagnostic performance of junior radiologists and facilitate clinical decision-making.

Abbreviations

AI: Artificial intelligence

AUC: Area under the receiver operating characteristic curve

BI-RADS: Breast Imaging Reporting and Data System

CNN: Convolutional neural network

DL: Deep learning

FFDM: Full-field digital mammography

NPV: Negative predictive value

PPV: Positive predictive value

Funding

This study was supported by the Special Research Program of Shanghai Municipal Commission of Health and Family Planning on medical intelligence (No. 2018ZHYL0108), the National Key Research and Development Program of China (No. 2017YFC0109003), the National Natural Science Foundation of China (No. 81901695), the Shanghai Sailing Program (No. 19YF1433100), and the Research Fund of Hospital Project (No. 18YJ16). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Corresponding author

Correspondence to Dengbin Wang.

Ethics declarations

Guarantor

The scientific guarantor of this publication is Prof. Dengbin Wang.

Conflict of interest

Three of the authors (H.L.Z., W.X.T., and H.K.Y.) are employees of Beijing Infervision Technology Co., Ltd. No potential conflicts of interest are disclosed by the other authors.

Statistics and biometry

No complex statistical methods were necessary for this paper.

Informed consent

Written informed consent was waived by the Institutional Review Board.

Ethical approval

Institutional Review Board approval was obtained.

Methodology

• Retrospective

• Diagnostic or prognostic study

• Performed at one institution

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

ESM 1 (DOCX 310 kb)

About this article

Cite this article

Liu, H., Chen, Y., Zhang, Y. et al. A deep learning model integrating mammography and clinical factors facilitates the malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening. Eur Radiol (2021). https://doi.org/10.1007/s00330-020-07659-y

Keywords

  • Breast cancer
  • Mammography
  • Calcification
  • Deep learning
  • Screening