Deep Learning Pre-training Strategy for Mammogram Image Classification: an Evaluation Study

Abstract

In this work, we assess how pre-training strategy affects deep learning performance for the task of distinguishing false-recall from malignancy and normal (benign) findings in digital mammography images. A cohort of 1303 breast cancer screening patients (4935 digital mammogram images in total) was retrospectively analyzed as the target dataset for this study. We assessed six different convolutional neural network (CNN) model structures, pre-trained on four different imaging datasets (more than 1.4 million images in total, including ImageNet; the medical images differ in scale, modality, organ, and source), on six classification tasks to assess how the performance of CNN models varies with training strategy. Representative pre-training strategies included transfer learning with medical and non-medical datasets, layer freezing, varied network structure, and multi-view input for both binary and triple-class classification of mammogram images. The area under the receiver operating characteristic curve (AUC) was used as the model performance metric. The best-performing model across all experimental settings was an AlexNet model incrementally pre-trained on ImageNet and a large Breast Density dataset. The AUC for the six classification tasks using this model ranged from 0.68 to 0.77. In the case of distinguishing recalled-benign mammograms from others, four out of five pre-training strategies tested produced significant performance differences from the baseline model. This study suggests that pre-training strategy can produce significant performance differences, especially when distinguishing recalled-benign from malignant and benign screening patients.
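As a rough illustration of the performance metric named above, the following minimal sketch computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and applies it to a one-versus-rest task such as "recalled-benign versus others". It is not the authors' evaluation code: the function names, class label strings, and scores are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores.

    A positive/negative pair counts 1 if the positive scores higher,
    0.5 if the two scores are tied, and 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(
    labels: Sequence[Hashable],
    scores_for_target: Sequence[float],
    target_class: Hashable,
) -> float:
    """AUC for separating one class from all remaining classes."""
    binary_labels = [1 if y == target_class else 0 for y in labels]
    return binary_auc(binary_labels, scores_for_target)


# Toy data only (assumed, not taken from the study).
toy_labels = ["malignant", "benign", "recalled_benign", "recalled_benign"]
toy_scores = [0.2, 0.1, 0.9, 0.6]  # hypothetical model scores for "recalled_benign"
toy_auc = one_vs_rest_auc(toy_labels, toy_scores, "recalled_benign")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a small example; a rank-based implementation from a statistics library would be the usual choice for a cohort of several thousand images.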


Funding

This work was supported by the National Institutes of Health (NIH)/National Cancer Institute (NCI) grants (#1R01CA193603, #3R01CA193603-03S1, and #1R01CA218405), a Radiological Society of North America (RSNA) Research Scholar Grant (#RSCH1530), an Amazon AWS Machine Learning Research Award, and a University of Pittsburgh Physicians (UPP) Academic Foundation Award. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC).

Author information

Corresponding author

Correspondence to Shandong Wu.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Clancy, K., Aboutalib, S., Mohamed, A. et al. Deep Learning Pre-training Strategy for Mammogram Image Classification: an Evaluation Study. J Digit Imaging (2020). https://doi.org/10.1007/s10278-020-00369-3

Keywords

  • Breast cancer
  • Digital mammography
  • Deep learning
  • Transfer learning
  • Training strategy