Learning Concept-Driven Document Embeddings for Medical Information Search

  • Conference paper
Artificial Intelligence in Medicine (AIME 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10259)

Abstract

Many medical tasks, such as self-diagnosis, health-care assessment, and clinical trial patient recruitment, rely on information access tools. A key step underlying these tasks is document-to-document matching, which mostly fails to bridge the gap between the raw-level representation of information in documents and its high-level human interpretation. In this paper, we study how to optimize document representations with neural approaches that learn latent representations from both the words used in a document and validated medical concepts specified in an external resource. We show experimentally that the proposed model is effective in support of two medical search tasks: health search and clinical search for cohorts.

Notes

  1. Text Retrieval Conference (http://trec.nist.gov/).

  2. https://sourceforge.net/projects/cxtractor/.

Author information

Corresponding author

Correspondence to Gia-Hung Nguyen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nguyen, GH., Tamine, L., Soulier, L., Souf, N. (2017). Learning Concept-Driven Document Embeddings for Medical Information Search. In: ten Teije, A., Popow, C., Holmes, J., Sacchi, L. (eds) Artificial Intelligence in Medicine. AIME 2017. Lecture Notes in Computer Science, vol 10259. Springer, Cham. https://doi.org/10.1007/978-3-319-59758-4_17

  • DOI: https://doi.org/10.1007/978-3-319-59758-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59757-7

  • Online ISBN: 978-3-319-59758-4

  • eBook Packages: Computer Science, Computer Science (R0)
