To investigate the value of full-field digital mammography-based deep learning (DL) in predicting malignancy of Breast Imaging Reporting and Data System (BI-RADS) 4 microcalcifications.
In this retrospective study, 384 patients with 414 pathologically confirmed microcalcifications (221 malignant and 193 benign) were randomly allocated to training, validation, and testing datasets (272/71/71 lesions). A combined DL model incorporating mammography and clinical variables was developed. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of the clinical model, the stand-alone DL image model, and the BI-RADS approach. The predictive performance for malignancy was also compared between the combined model and four human readers (two junior and two senior radiologists).
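The allocation and the AUC metric described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function names `split_lesions` and `auc`, the fixed seed, the lesion-level (rather than patient-level) shuffling, and the toy labels and scores are all assumptions introduced here. The AUC is computed in its pairwise (Mann-Whitney) form, as the fraction of malignant/benign pairs in which the malignant lesion receives the higher score.

```python
import random


def split_lesions(n_lesions=414, sizes=(272, 71, 71), seed=0):
    """Randomly allocate lesion indices to training/validation/testing sets.

    Assumption: allocation is done per lesion with a simple seeded shuffle;
    the source only states that allocation was random.
    """
    if sum(sizes) != n_lesions:
        raise ValueError("split sizes must sum to the number of lesions")
    indices = list(range(n_lesions))
    random.Random(seed).shuffle(indices)
    n_train, n_val, _ = sizes
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    labels: 1 = malignant, 0 = benign; scores: predicted malignancy scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


train, val, test = split_lesions()

# Toy (hypothetical) example: 3 of the 4 malignant/benign pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

Because some patients contributed more than one lesion (414 lesions in 384 patients), a patient-level split would be a reasonable alternative design; the source does not say which was used.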
In the testing dataset, the combined DL model achieved an AUC of 0.910, a sensitivity of 85.3%, and a specificity of 91.9% in predicting malignant BI-RADS 4 microcalcifications, outperforming the clinical model, the DL image model, and BI-RADS, which had AUCs of 0.799, 0.841, and 0.804, respectively. The combined model's performance was non-inferior to that of the two senior radiologists (p = 0.860, p = 0.800), and it outperformed the two junior radiologists (p = 0.155, p = 0.029). With artificial intelligence assistance, the two junior radiologists' AUCs increased from 0.816 to 0.854 (p = 0.556) and from 0.773 to 0.901 (p = 0.046), and interobserver agreement improved, with the kappa value increasing from 0.331 to 0.843.
The combined deep learning model can improve the malignancy prediction of BI-RADS 4 microcalcifications in screening mammography and help junior radiologists achieve better performance, which can facilitate clinical decision-making.
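The sensitivity, specificity, and kappa figures above can be illustrated with a short sketch. The confusion-matrix counts used here (29 true positives, 5 false negatives, 34 true negatives, 3 false positives) are an assumption: they are not reported in the source, but they sum to the 71 testing lesions and reproduce the reported 85.3% and 91.9% after rounding. The kappa function implements Cohen's kappa for two readers with binary calls; the source does not state which kappa variant was used, and the reader labels below are toy values.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers giving binary (0/1) calls on the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases where both readers give the same call.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement from each reader's marginal rate of positive calls.
    rate_a = sum(reader_a) / n
    rate_b = sum(reader_b) / n
    p_expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Assumed counts, chosen only because they match the reported percentages.
sens, spec = sensitivity_specificity(tp=29, fn=5, tn=34, fp=3)
sens_pct = round(sens * 100, 1)
spec_pct = round(spec * 100, 1)

# Toy (hypothetical) reader calls: agreement no better than chance.
kappa_chance = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 1])
# Identical calls give perfect agreement.
kappa_perfect = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 0])
```

On this scale, the reported rise in kappa from 0.331 to 0.843 corresponds to the two junior readers moving from modest agreement toward near-complete agreement in their calls.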
• The combined deep learning model demonstrated high diagnostic power, sensitivity, and specificity for predicting malignant BI-RADS 4 mammographic microcalcifications.
• The combined model achieved performance similar to that of senior breast radiologists and outperformed junior breast radiologists.
• Deep learning could improve the diagnostic performance of junior radiologists and facilitate clinical decision-making.
AUC: Area under the receiver operating characteristic curve
BI-RADS: Breast Imaging Reporting and Data System
CNN: Convolutional neural network
FFDM: Full-field digital mammography
NPV: Negative predictive value
PPV: Positive predictive value
This study was supported by the Special Research Program of Shanghai Municipal Commission of Health and Family Planning on medical intelligence (No. 2018ZHYL0108), the National Key Research and Development Program of China (No. 2017YFC0109003), the National Natural Science Foundation of China (No. 81901695), the Shanghai Sailing Program (No. 19YF1433100), and the Research Fund of Hospital Project (No. 18YJ16). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The scientific guarantor of this publication is Prof. Dengbin Wang.
Conflict of interest
Three of the authors (H.L.Z., W.X.T., and H.K.Y.) are employees of Beijing Infervision Technology Co., Ltd. No potential conflicts of interest are disclosed by the other authors.
Statistics and biometry
No complex statistical methods were necessary for this paper.
Written informed consent was waived by the Institutional Review Board.
Institutional Review Board approval was obtained.
• Diagnostic or prognostic study
• Performed at one institution
Cite this article
Liu, H., Chen, Y., Zhang, Y. et al. A deep learning model integrating mammography and clinical factors facilitates the malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening. Eur Radiol (2021). https://doi.org/10.1007/s00330-020-07659-y
- Breast cancer
- Deep learning