
Journal of Digital Imaging, Volume 32, Issue 5, pp 888–896

Generalizable Inter-Institutional Classification of Abnormal Chest Radiographs Using Efficient Convolutional Neural Networks

  • Ian Pan
  • Saurabh Agarwal
  • Derek Merck

Abstract

Our objective is to evaluate the effectiveness of efficient convolutional neural networks (CNNs) for abnormality detection in chest radiographs and to investigate the generalizability of our models on data from independent sources. We used the National Institutes of Health ChestX-ray14 (NIH-CXR) and the Rhode Island Hospital chest radiograph (RIH-CXR) datasets in this study. Both datasets were split into training, validation, and test sets. The DenseNet and MobileNetV2 CNN architectures were used to train models on each dataset to classify chest radiographs as normal or abnormal; models trained on NIH-CXR were also designed to predict the presence of 14 different pathological findings. Models were evaluated on both NIH-CXR and RIH-CXR test sets using the area under the receiver operating characteristic curve (AUROC). DenseNet and MobileNetV2 models achieved AUROCs of 0.900 and 0.893, respectively, for normal versus abnormal classification on NIH-CXR and AUROCs of 0.960 and 0.951 on RIH-CXR. For the 14 pathological findings in NIH-CXR, MobileNetV2 achieved an AUROC within 0.03 of DenseNet for each finding, with an average difference of 0.01. When externally validated on independently collected data (e.g., RIH-CXR-trained models evaluated on NIH-CXR), model AUROCs decreased by 3.6–5.2% relative to their locally trained counterparts. MobileNetV2 achieved performance comparable to DenseNet in our analysis, demonstrating the efficacy of efficient CNNs for chest radiograph abnormality detection. In addition, models were able to generalize to external data, albeit with performance decreases that should be taken into consideration when applying models to data from different institutions.
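To make the setup above concrete, the following is a minimal PyTorch sketch of the kind of pipeline the abstract describes: a MobileNetV2 backbone with one logit for normal-versus-abnormal classification and 14 logits for the NIH-CXR findings, scored by AUROC on a held-out test set. This is an illustration only, not the authors' released code; the class name, the two-head layout, the dataloader contract, and the pretrained-weight identifier are assumptions.

# Minimal sketch (assumptions noted above), not the authors' implementation.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.metrics import roc_auc_score

NUM_FINDINGS = 14  # pathology labels in NIH ChestX-ray14

class CxrClassifier(nn.Module):
    def __init__(self, num_findings=NUM_FINDINGS):
        super().__init__()
        # ImageNet-pretrained MobileNetV2 feature extractor (1280 output channels).
        backbone = models.mobilenet_v2(weights="IMAGENET1K_V1")
        self.features = backbone.features
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Two heads: one logit for normal vs. abnormal, 14 logits for the NIH findings.
        self.abnormal_head = nn.Linear(1280, 1)
        self.findings_head = nn.Linear(1280, num_findings)

    def forward(self, x):
        feats = self.pool(self.features(x)).flatten(1)
        return self.abnormal_head(feats), self.findings_head(feats)

def evaluate_auroc(model, loader, device="cpu"):
    """Compute AUROC for the normal-vs-abnormal output on a test loader
    that yields (image batch, binary abnormality label) pairs."""
    model.eval()
    scores, labels = [], []
    with torch.no_grad():
        for images, is_abnormal in loader:
            abnormal_logit, _ = model(images.to(device))
            scores.extend(torch.sigmoid(abnormal_logit).squeeze(1).cpu().tolist())
            labels.extend(is_abnormal.tolist())
    return roc_auc_score(labels, scores)

In practice, such a model would typically be trained with a binary cross-entropy loss on both heads, and confidence intervals for the reported AUROC estimates could be obtained by bootstrap resampling of the test-set predictions.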

Keywords

Convolutional neural networks · Deep learning · Generalizability · Chest radiographs · Classification


Copyright information

© Society for Imaging Informatics in Medicine 2019

Authors and Affiliations

  1. Warren Alpert Medical School, Brown University, Providence, USA
  2. Department of Diagnostic Imaging, Rhode Island Hospital, Providence, USA
