Capturing Lexical, Grammatical, and Semantic Information with Vecsigrafo

Gomez-Perez, Jose Manuel; Denaux, Ronald; Garcia-Silva, Andres

doi:10.1007/978-3-030-44830-1_8

Abstract

Embedding algorithms work by optimizing the distance between a word and its context(s), generating an embedding space that encodes their distributional representation. In addition to single words or word pieces, other features that result from a deeper analysis of the text can be used to enrich such representations. These features are influenced by the tokenization strategy used to chunk the text and can include not only lexical and part-of-speech information but also annotations about the disambiguated sense of a word according to a structured knowledge graph. In this chapter we analyze the impact that explicitly adding lexical, grammatical, and semantic information during the training of Vecsigrafo has on the resulting representations, and whether this can enhance their downstream performance. To illustrate this analysis we focus on corpora from the scientific domain, where rich multi-word expressions are frequent and therefore require advanced tokenization strategies.

Author information

Authors and Affiliations

Expert System, Madrid, Spain
Jose Manuel Gomez-Perez, Ronald Denaux & Andres Garcia-Silva

About this chapter

Cite this chapter

Gomez-Perez, J.M., Denaux, R., Garcia-Silva, A. (2020). Capturing Lexical, Grammatical, and Semantic Information with Vecsigrafo. In: A Practical Guide to Hybrid Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-44830-1_8

DOI: https://doi.org/10.1007/978-3-030-44830-1_8
Published: 17 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44829-5
Online ISBN: 978-3-030-44830-1
eBook Packages: Computer Science, Computer Science (R0)
