Building Hybrid Representations from Text Corpora, Knowledge Graphs, and Language Models
In the previous chapter we saw how knowledge graph embedding algorithms can capture structured knowledge about the concepts and relations in a graph as embeddings in a vector space, which can then be used in downstream tasks. However, this type of approach can only capture knowledge that is explicitly represented in the graph, and therefore suffers from limited recall and domain coverage. In this chapter, we focus on algorithms that address this limitation by combining information from both unstructured text corpora and structured knowledge graphs. The first approach, Vecsigrafo, jointly learns word, lemma, and concept embeddings from large disambiguated corpora, bringing textual and symbolic knowledge representations together in a single, unified formalism for use in neural natural language processing architectures. The second, more recent approach, Transigrafo, adopts Transformer-based language models to derive concept-level contextual embeddings, achieving state-of-the-art word-sense disambiguation performance with reduced complexity.
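The core idea of joint word, lemma, and concept learning can be illustrated with a toy sketch: place all three kinds of symbol from a disambiguated corpus in one shared vocabulary and train embeddings over generalized co-occurrences, so that words, lemmas, and concepts end up in the same vector space. Everything below is an illustrative assumption rather than the chapter's actual method: the two-sentence corpus, the concept identifiers (e.g. `en#bank-finance`), the hyperparameters, and the word2vec-style skip-gram update with a single negative sample, which stands in for the real training objective.

```python
import numpy as np

# Toy disambiguated corpus: each token is (surface word, lemma, concept id).
# Concept ids are hypothetical, for illustration only; None = no concept.
corpus = [
    [("banks", "bank", "en#bank-finance"), ("lend", "lend", "en#lend"),
     ("money", "money", "en#money")],
    [("the", "the", None), ("bank", "bank", "en#bank-finance"),
     ("lends", "lend", "en#lend"), ("money", "money", "en#money")],
]

# One shared vocabulary over words, lemmas, and concepts, so all three
# kinds of symbol are embedded in the same vector space.
symbols = sorted({s for sent in corpus for tok in sent for s in tok if s})
idx = {s: i for i, s in enumerate(symbols)}

dim, lr, epochs, window = 16, 0.05, 200, 2
rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(len(symbols), dim))  # "center" embeddings
C = rng.normal(scale=0.1, size=(len(symbols), dim))  # context embeddings

def expand(tok):
    """All symbol indices attached to one token position."""
    return [idx[s] for s in tok if s]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Skip-gram with one negative sample per pair: every symbol at a position
# predicts every other symbol in its window (including the lemma and
# concept at its own position), so all embeddings are trained jointly.
for _ in range(epochs):
    for sent in corpus:
        for i, tok in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            context = [b for j in range(lo, hi) for b in expand(sent[j])]
            for a in expand(tok):
                for b in context:
                    if a == b:
                        continue
                    # positive update: push E[a] and C[b] together
                    g = lr * (1.0 - sigmoid(E[a] @ C[b]))
                    E[a], C[b] = E[a] + g * C[b], C[b] + g * E[a]
                    # one random negative sample: push apart
                    n = rng.integers(len(symbols))
                    g = lr * sigmoid(E[a] @ C[n])
                    E[a], C[n] = E[a] - g * C[n], C[n] - g * E[a]

def sim(a, b):
    """Cosine similarity between two symbols' embeddings."""
    va, vb = E[idx[a]], E[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Words, lemmas, and concepts can now be compared directly, e.g.:
print(sim("bank", "en#bank-finance"))
```

Because the surface form, lemma, and concept of each token share the same context windows, the learned space lets symbolic concepts be queried with the same vector operations as ordinary word embeddings, which is what makes such hybrid representations usable in downstream neural NLP architectures.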