Abstract
Objectives
To investigate the value of full-field digital mammography-based deep learning (DL) in predicting malignancy of Breast Imaging Reporting and Data System (BI-RADS) 4 microcalcifications.
Methods
A total of 384 patients with 414 pathologically confirmed microcalcifications (221 malignant and 193 benign) were randomly allocated to the training, validation, and testing datasets (272/71/71 lesions) in this retrospective study. A combined DL model was developed incorporating mammography and clinical variables. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of the clinical model, the stand-alone DL image model, and the BI-RADS approach. The predictive performance for malignancy was also compared between the combined model and human readers (two junior and two senior radiologists).
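As an illustration of the evaluation metrics named above, the following is a minimal sketch (not the study's code) of how AUC, sensitivity, and specificity can be computed from model scores. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; the abstract does not state the operating threshold that was used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen malignant case (label 1)
    receives a higher score than a randomly chosen benign case (label 0).
    Ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called malignant.
    The default threshold is an assumption for illustration only."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = malignant, 0 = benign.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a testing set of this size; a rank-based implementation would be preferable for large datasets.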
Results
The combined DL model achieved an AUC, sensitivity, and specificity of 0.910, 85.3%, and 91.9%, respectively, in predicting malignant BI-RADS 4 microcalcifications in the testing dataset, outperforming the clinical model, DL image model, and BI-RADS approach, which had AUCs of 0.799, 0.841, and 0.804, respectively. The performance of the combined model was non-inferior to that of the senior radiologists (p = 0.860, p = 0.800), and it outperformed the junior radiologists (p = 0.155, p = 0.029). With artificial intelligence assistance, the AUCs of the two junior radiologists increased from 0.816 to 0.854 (p = 0.556) and from 0.773 to 0.901 (p = 0.046), and interobserver agreement improved, with the kappa value increasing from 0.331 to 0.843.
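The interobserver agreement reported above is expressed as a kappa value. As a hedged sketch, assuming Cohen's kappa between two readers (the abstract does not name the kappa variant), agreement corrected for chance can be computed as follows. The function name and the toy reader calls are hypothetical.

```python
def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same lesions.
    kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of lesions")
    n = len(reader_a)
    categories = set(reader_a) | set(reader_b)
    # Fraction of lesions on which the two readers gave the same call.
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    # Agreement expected by chance from each reader's marginal call rates.
    expected = sum(
        (reader_a.count(c) / n) * (reader_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both readers used a single identical category for every lesion.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy calls (hypothetical): 1 = malignant, 0 = benign.
junior_1 = [1, 1, 0, 0, 1, 0, 1, 0]
junior_2 = [1, 0, 0, 0, 1, 0, 1, 1]
print(f"kappa={cohen_kappa(junior_1, junior_2):.3f}")
```

In this toy case the readers agree on 6 of 8 lesions, but half of that agreement is expected by chance, which is why kappa is lower than raw agreement.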
Conclusions
The combined deep learning model can improve the malignancy prediction of BI-RADS 4 microcalcifications in screening mammography and help junior radiologists achieve better performance, which can facilitate clinical decision-making.
Key Points
• The combined deep learning model demonstrated high diagnostic power, sensitivity, and specificity for predicting malignant BI-RADS 4 mammographic microcalcifications.
• The combined model achieved performance similar to that of senior breast radiologists and outperformed junior breast radiologists.
• Deep learning could improve the diagnostic performance of junior radiologists and facilitate clinical decision-making.
Abbreviations
- AI: Artificial intelligence
- AUC: Area under the receiver operating characteristic curve
- BI-RADS: Breast Imaging Reporting and Data System
- CNN: Convolutional neural network
- DL: Deep learning
- FFDM: Full-field digital mammography
- NPV: Negative predictive value
- PPV: Positive predictive value
Funding
This study was supported by the Special Research Program of Shanghai Municipal Commission of Health and Family Planning on medical intelligence (No. 2018ZHYL0108), the National Key Research and Development Program of China (No. 2017YFC0109003), the National Natural Science Foundation of China (No. 81901695), the Shanghai Sailing Program (No. 19YF1433100), and the Research Fund of Hospital Project (No. 18YJ16). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics declarations
Guarantor
The scientific guarantor of this publication is Prof. Dengbin Wang.
Conflict of interest
Three of the authors (H.L.Z., W.X.T., and H.K.Y.) are employees of Beijing Infervision Technology Co., Ltd. No potential conflicts of interest are disclosed by the other authors.
Statistics and biometry
No complex statistical methods were necessary for this paper.
Informed consent
Written informed consent was waived by the Institutional Review Board.
Ethical approval
Institutional Review Board approval was obtained.
Methodology
• Retrospective
• Diagnostic or prognostic study
• Performed at one institution
Supplementary information
ESM 1 (DOCX 310 kb)
About this article
Cite this article
Liu, H., Chen, Y., Zhang, Y. et al. A deep learning model integrating mammography and clinical factors facilitates the malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening. Eur Radiol (2021). https://doi.org/10.1007/s00330-020-07659-y
Keywords
- Breast cancer
- Mammography
- Calcification
- Deep learning
- Screening