ChestX-ray: Hospital-Scale Chest X-ray Database and Benchmarks on Weakly Supervised Classification and Localization of Common Thorax Diseases

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

The chest X-ray is one of the most commonly accessible radiological examinations for screening and diagnosis of many lung diseases. A tremendous number of X-ray imaging studies, accompanied by radiological reports, are accumulated and stored in many modern hospitals’ picture archiving and communication systems (PACS). However, it remains an open question how this type of hospital-scale knowledge database, which contains invaluable but only loosely labeled imaging informatics, can be used to facilitate data-hungry deep learning paradigms in building truly large-scale, high-precision computer-aided diagnosis (CAD) systems. In this chapter, we present a chest X-ray database, namely “ChestX-ray”, which comprises 112,120 frontal-view X-ray images of 30,805 unique patients, with eight disease image labels (each image can carry multiple labels) text-mined from the associated radiological reports using natural language processing. Importantly, we demonstrate that these commonly occurring thoracic diseases can be detected and even spatially located via a unified weakly supervised multi-label image classification and disease localization framework, which is validated using our proposed dataset. Although the initial quantitative results reported are promising, deep convolutional neural network-based “reading chest X-rays” (i.e., recognizing and locating the common disease patterns with models trained on only image-level labels) remains a strenuous task for fully automated high-precision CAD systems.
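
To make the abstract's two ideas concrete (multi-label image annotations and localization derived from image-level evidence), the following is a minimal illustrative sketch and not the chapter's implementation. It assumes the eight label names of the original ChestX-ray8 release, a hypothetical "|"-separated label string per image, and a hypothetical per-disease heatmap given as a 2D list of scores. The function names `encode_labels` and `heatmap_to_bbox` are invented for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: the eight label names of the original ChestX-ray8 release.
DISEASES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
    "Mass", "Nodule", "Pneumonia", "Pneumothorax",
]


def encode_labels(finding: str, sep: str = "|") -> List[int]:
    """Turn a separator-joined label string into a multi-hot vector.

    One image may carry several labels; names outside DISEASES
    (for example a "No Finding" marker) contribute nothing, so such
    an image maps to the all-zero vector.
    """
    present = {name.strip() for name in finding.split(sep)}
    return [1 if disease in present else 0 for disease in DISEASES]


def heatmap_to_bbox(
    heatmap: Sequence[Sequence[float]], threshold: float
) -> Optional[Tuple[int, int, int, int]]:
    """Return (row_min, col_min, row_max, col_max) enclosing every cell
    whose score is >= threshold, or None if no cell passes.

    This is a simplification: it draws a single box around all
    above-threshold cells rather than one box per connected region.
    """
    hits = [
        (r, c)
        for r, row in enumerate(heatmap)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    print(encode_labels("Effusion|Mass"))
    toy_heatmap = [
        [0.1, 0.2, 0.1],
        [0.1, 0.9, 0.8],
        [0.0, 0.7, 0.2],
    ]
    print(heatmap_to_bbox(toy_heatmap, threshold=0.5))
```

In a weakly supervised setting of the kind the abstract describes, only the multi-hot vectors would be available as training targets; the heatmap would be produced by the trained classifier, and the box would be derived from it afterwards rather than learned from box annotations.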

X. Wang: This work was done during his fellowship at the National Institutes of Health Clinical Center.

L. Lu: This work was done during his employment at the National Institutes of Health Clinical Center.

Notes

  1. https://nihcc.app.box.com/v/ChestXray-NIHCC

  2. https://uts.nlm.nih.gov/metathesaurus.html

  3. Data split files could be downloaded via https://nihcc.app.box.com/v/ChestXray-NIHCC

Acknowledgements

This work was supported by the Intramural Research Programs of the NIH Clinical Center and National Library of Medicine. We thank NVIDIA Corporation for the GPU donation.

Corresponding author

Correspondence to Xiaosong Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M. (2019). ChestX-ray: Hospital-Scale Chest X-ray Database and Benchmarks on Weakly Supervised Classification and Localization of Common Thorax Diseases. In: Lu, L., Wang, X., Carneiro, G., Yang, L. (eds) Deep Learning and Convolutional Neural Networks for Medical Imaging and Clinical Informatics. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-030-13969-8_18

  • DOI: https://doi.org/10.1007/978-3-030-13969-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13968-1

  • Online ISBN: 978-3-030-13969-8

  • eBook Packages: Computer Science, Computer Science (R0)
