Abstract
Non-invasive medical imaging techniques, such as radiography and computed tomography, are extensively used in hospitals and clinics to diagnose a wide range of injuries and diseases. However, interpreting these images, which typically results in a free-text radiology report and/or a classification, requires specialized medical professionals, leading to high labor costs and long waiting lists. Automatically inferring thoracic diseases from the results of chest radiography exams, e.g., for the purpose of indexing these documents, remains a challenging task, even when combining the images with the free-text reports. Deep neural architectures can contribute to more efficient indexing of radiology exams (e.g., associating the data with diagnostic codes), while providing interpretable classification results that can guide domain experts. This work proposes a novel multi-modal approach that combines a dual path convolutional neural network for processing images with a bidirectional recurrent neural network for processing text, enhanced with attention mechanisms and leveraging pre-trained clinical word embeddings. The experimental results reveal interesting patterns, e.g., validating the high performance of the individual components, and show promising results for the multi-modal processing of radiology examination data, particularly when the components of the model are pre-trained with large pre-existing datasets (i.e., a 10% increase in the average area under the receiver operating characteristic curve).
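To make the multi-modal combination concrete, the sketch below illustrates a generic late-fusion scheme of the kind the abstract describes: an image branch (standing in for the dual path CNN) and a text branch (standing in for the bidirectional RNN with attention) each yield a fixed-size feature vector, which are concatenated and mapped through a dense layer with sigmoid activations to one independent probability per thoracic disease label. This is a minimal illustrative sketch, not the authors' exact architecture; the feature dimensions, label count, and random features are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder feature vectors standing in for the outputs of the two
# branches: a dual path CNN over the radiograph, and a bidirectional
# RNN with attention over the free-text report.
img_feat = rng.standard_normal(128)  # hypothetical image-branch features
txt_feat = rng.standard_normal(64)   # hypothetical text-branch features

# Late fusion: concatenate the two modality representations and apply a
# single dense output layer.  Sigmoid (rather than softmax) activations
# give one independent probability per disease label, as is standard in
# multi-label chest X-ray classification.
n_labels = 14
W = rng.standard_normal((n_labels, 128 + 64)) * 0.01
b = np.zeros(n_labels)

fused = np.concatenate([img_feat, txt_feat])
probs = sigmoid(W @ fused + b)

print(probs.shape)  # one probability per thoracic disease label
```

In practice the fusion layer and both branches would be trained jointly with a per-label binary cross-entropy loss, which is what makes the multi-label AUROC evaluation mentioned above applicable.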
Acknowledgements
Authors from INESC-ID were partially supported by Fundação para a Ciência e a Tecnologia (FCT), specifically through the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2019). We gratefully acknowledge the support of NVIDIA Corporation, with the donation of the Titan Xp GPU used in the experiments.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Nunes, N., Martins, B., André da Silva, N., Leite, F., Silva, M.J.: A Multi-modal Deep Learning Method for Classifying Chest Radiology Exams. In: Moura Oliveira, P., Novais, P., Reis, L. (eds.) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol. 11804. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30241-2_28
Print ISBN: 978-3-030-30240-5
Online ISBN: 978-3-030-30241-2