Learning Concept-Driven Document Embeddings for Medical Information Search

Nguyen, Gia-Hung; Tamine, Lynda; Soulier, Laure; Souf, Nathalie

doi:10.1007/978-3-319-59758-4_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10259)

Included in the following conference series:

Conference on Artificial Intelligence in Medicine in Europe

Abstract

Many medical tasks, such as self-diagnosis, health-care assessment, and clinical trial patient recruitment, rely on information access tools. A key step underlying these tasks is document-to-document matching, which largely fails to bridge the gap between the raw-level representation of information in documents and its high-level human interpretation. In this paper, we study how to optimize document representations with neural approaches that learn latent representations from both the words used in a document and the validated medical concepts specified in an external resource. We show experimentally that the proposed model is effective in supporting two medical search tasks: health search and clinical search for cohorts.
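To make the idea of matching documents on both words and concepts concrete, here is a minimal sketch in Python. It is not the model proposed in the paper: the paper learns latent (neural) representations, whereas this sketch substitutes plain feature counts and cosine similarity so that it stays self-contained. The function names, the example documents, and the concept identifier `CONCEPT_MI` are hypothetical placeholders, and the concept annotations are assumed to come from some external concept extractor.

```python
import math
from collections import Counter


def build_features(words, concepts=()):
    """Count words and concept identifiers in one shared feature space.

    Prefixes keep a word and a concept with the same spelling apart.
    This is a sparse, count-based stand-in for a learned embedding.
    """
    features = Counter(f"w:{w.lower()}" for w in words)
    features.update(f"c:{c}" for c in concepts)
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two hypothetical documents that describe the same condition with
# different vocabulary. "CONCEPT_MI" is a made-up identifier, not a
# real entry in any terminology.
doc_a_words = ["heart", "attack"]
doc_b_words = ["myocardial", "infarction"]
shared_concept = ["CONCEPT_MI"]

words_only = cosine(build_features(doc_a_words),
                    build_features(doc_b_words))
words_and_concepts = cosine(build_features(doc_a_words, shared_concept),
                            build_features(doc_b_words, shared_concept))

print(words_only)                    # 0.0
print(round(words_and_concepts, 3))  # 0.333
```

With words alone the two documents share no feature and score zero; once the shared concept is added they become comparable. This illustrates the vocabulary gap that the abstract describes, under the assumptions stated above.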

Notes

1. Text Retrieval Conference (http://trec.nist.gov/).
2. https://sourceforge.net/projects/cxtractor/.

Author information

Authors and Affiliations

Université de Toulouse, UPS-IRIT, 118 Route de Narbonne, 31062, Toulouse, France
Gia-Hung Nguyen, Lynda Tamine & Nathalie Souf
Sorbonne Universités-UPMC, Univ Paris 06, LIP6 UMR 7606, 75005, Paris, France
Laure Soulier

Corresponding author

Correspondence to Gia-Hung Nguyen.

Editor information

Editors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Annette ten Teije
Medical University of Vienna, Vienna, Austria
Christian Popow
University of Pennsylvania, Philadelphia, Pennsylvania, USA
John H. Holmes
University of Pavia, Pavia, Italy
Lucia Sacchi

About this paper

Cite this paper

Nguyen, GH., Tamine, L., Soulier, L., Souf, N. (2017). Learning Concept-Driven Document Embeddings for Medical Information Search. In: ten Teije, A., Popow, C., Holmes, J., Sacchi, L. (eds) Artificial Intelligence in Medicine. AIME 2017. Lecture Notes in Computer Science, vol 10259. Springer, Cham. https://doi.org/10.1007/978-3-319-59758-4_17

DOI: https://doi.org/10.1007/978-3-319-59758-4_17
Published: 30 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59757-7
Online ISBN: 978-3-319-59758-4
eBook Packages: Computer Science, Computer Science (R0)
