Abstract
The complexity and scale of knowledge in the biomedical domain have motivated research on mining heterogeneous data from structured and unstructured knowledge bases. Such work requires combining facts in order to formulate hypotheses or draw conclusions about domain concepts. In this work we attempt to address this problem by using indirect knowledge connecting two concepts in a graph to identify hidden relations between them. The graph represents concepts as vertices and relations as edges, derived from structured (ontologies) and unstructured (text) data. In this graph we mine path patterns that potentially characterize a biomedical relation. For our experimental evaluation we focus on two frequent relations, namely “has target” and “may treat”. Our results suggest that relation discovery using indirect knowledge is possible, with an AUC of up to 0.8. Finally, analysis of the results indicates that the models can successfully learn expressive path patterns for the examined relations.
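To make the idea of a path pattern concrete, the following is a minimal sketch and not the authors' implementation. It assumes a directed graph stored as labelled triples, and it treats a path pattern as the sequence of relation labels along a simple path between two concepts. All concept names, relation labels, and function names (`build_adjacency`, `path_patterns`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Toy labelled graph as (source, relation, target) triples.
# Every triple here is invented for illustration; none comes from the paper.
EDGES = [
    ("drugA", "inhibits", "proteinX"),
    ("proteinX", "associated_with", "diseaseY"),
    ("drugA", "interacts_with", "proteinZ"),
    ("proteinZ", "associated_with", "diseaseY"),
    ("drugB", "inhibits", "proteinZ"),
]


def build_adjacency(edges):
    """Map each vertex to its outgoing (relation, target) pairs."""
    adj = defaultdict(list)
    for src, rel, dst in edges:
        adj[src].append((rel, dst))
    return adj


def path_patterns(adj, start, goal, max_len=3):
    """Return the relation-label sequences of all simple directed paths
    from start to goal that use at most max_len edges."""
    patterns = []
    # Each stack entry: (current vertex, labels so far, vertices visited).
    stack = [(start, (), (start,))]
    while stack:
        node, labels, visited = stack.pop()
        if node == goal and labels:
            patterns.append(labels)
            continue
        if len(labels) == max_len:
            continue
        for rel, nxt in adj.get(node, []):
            if nxt not in visited:  # keep paths simple (no repeated vertex)
                stack.append((nxt, labels + (rel,), visited + (nxt,)))
    return sorted(patterns)


adj = build_adjacency(EDGES)
patterns_a = path_patterns(adj, "drugA", "diseaseY")
patterns_b = path_patterns(adj, "drugB", "diseaseY")

# Pattern counts for one concept pair; these could serve as features
# for a classifier that predicts a relation such as "may treat".
features_a = Counter(patterns_a)
print(patterns_a)
print(patterns_b)
```

In this sketch the pattern counts for a concept pair form a sparse feature vector. How the paper actually weights, selects, or models such patterns is not stated in the abstract, so the counting step is only an assumption.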
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Weissenborn, D., Schroeder, M., Tsatsaronis, G. (2014). Discovering Relations between Indirectly Connected Biomedical Concepts. In: Galhardas, H., Rahm, E. (eds) Data Integration in the Life Sciences. DILS 2014. Lecture Notes in Computer Science, vol 8574. Springer, Cham. https://doi.org/10.1007/978-3-319-08590-6_11
DOI: https://doi.org/10.1007/978-3-319-08590-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08589-0
Online ISBN: 978-3-319-08590-6
eBook Packages: Computer Science, Computer Science (R0)