A Multi-modal Deep Learning Method for Classifying Chest Radiology Exams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11804)

Abstract

Non-invasive medical imaging techniques, such as radiography or computed tomography, are extensively used in hospitals and clinics to diagnose diverse injuries and diseases. However, interpreting these images, which often results in a free-text radiology report and/or a classification, requires specialized medical professionals, leading to high labor costs and waiting lists. Automatically inferring thoracic diseases from the results of chest radiography exams, e.g., for the purpose of indexing these documents, remains a challenging task, even when images are combined with the free-text reports. Deep neural architectures can contribute to more efficient indexing of radiology exams (e.g., by associating the data with diagnostic codes), providing interpretable classification results that can guide domain experts. This work proposes a novel multi-modal approach that combines a dual path convolutional neural network for processing images with a bidirectional recurrent neural network for processing text, enhanced with attention mechanisms and leveraging pre-trained clinical word embeddings. The experimental results validate the high performance of the individual components and show promising results for the multi-modal processing of radiology examination data, particularly when the components of the model are pre-trained on large pre-existing datasets (i.e., a 10% increase in the average area under the receiver operating characteristic curve).
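
The headline result in the abstract is reported as an average of per-label areas under the ROC curve. As a minimal sketch of how such a macro-averaged AUROC can be computed for a multi-label classifier, the snippet below uses the pairwise (Mann-Whitney) formulation in plain Python. The function names `auroc` and `macro_auroc` and the toy data are hypothetical and chosen only for illustration; this is not the authors' evaluation code, and the abstract does not specify the averaging details.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC for a single label, computed pairwise.

    Equals the probability that a randomly chosen positive exam is scored
    above a randomly chosen negative one, counting ties as one half.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true: Sequence[Sequence[int]],
                y_score: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of per-label AUROC values.

    Both arguments hold one row per exam and one column per label.
    """
    n_labels = len(y_true[0])
    per_label = [
        auroc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy data: 4 exams, 2 labels (values are illustrative only).
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.3, 0.6], [0.8, 0.4], [0.1, 0.5]]
print(macro_auroc(labels, scores))  # prints 0.875
```

The pairwise loop is quadratic in the number of exams, which is fine for a sketch; a rank-based implementation from an established library would be the usual choice for large test sets.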

Notes

  1. http://keras.io/.

  2. http://github.com/nfrn/Multi-Modal-Classification-of-Radiology-Exams.

Acknowledgements

Authors from INESC-ID were partially supported through Fundação para a Ciência e Tecnologia (FCT), specifically through the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2019). We gratefully acknowledge the support of NVIDIA Corporation, with the donation of the Titan Xp GPU used in the experiments.

Author information

Corresponding author

Correspondence to Nelson Nunes.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nunes, N., Martins, B., André da Silva, N., Leite, F., J. Silva, M. (2019). A Multi-modal Deep Learning Method for Classifying Chest Radiology Exams. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_28

  • DOI: https://doi.org/10.1007/978-3-030-30241-2_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30240-5

  • Online ISBN: 978-3-030-30241-2

  • eBook Packages: Computer Science, Computer Science (R0)
