EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph

Gottschalk, Simon; Demidova, Elena

doi:10.1007/978-3-319-98192-5_31


Conference paper
First Online: 02 August 2018


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11155)

Abstract

The provision of multilingual event-centric temporal knowledge graphs such as EventKG enables structured access to representations of a large number of historical and contemporary events in a variety of language contexts. Timelines provide an intuitive way to facilitate an overview of events related to a query entity - i.e. an entity or an event of user interest - over a certain period of time. In this paper, we present EventKG+TL - a novel system that generates cross-lingual event timelines using EventKG and facilitates an overview of the language-specific event relevance and popularity along with the cross-lingual differences.

Demo URL: http://eventkg.l3s.uni-hannover.de/eventkg_tl

1 Introduction

The amount of event-centric information regarding contemporary and historical events of global importance, such as Brexit and the migration crisis in Europe, constantly grows on the Web, in Web archives, in the news as well as within emerging event-centric collections [2] and knowledge graphs generated from these sources (e.g. [4, 7]). An important research area in this context is cross-cultural and cross-lingual event analytics (e.g. see [5, 6] for case studies, and [3] for a cross-lingual user interface). These studies aim to analyze language-specific and community-specific representations and perceptions of historical and contemporary events including their popularity and relations in a language context as well as to better understand the cross-lingual differences.

EventKG [4] - a recently proposed multilingual event-centric temporal knowledge graph incorporating over 690 thousand events in five languages - is an important knowledge source that can facilitate a variety of studies and applications related to cross-cultural and cross-lingual event analytics. However, given a query entity, i.e. an entity or an event of user interest, EventKG can contain hundreds of related events along with their descriptions in several language contexts, which makes the provision of a comprehensive cross-lingual overview and a selection of relevant events for further detailed analysis challenging.

Timelines are an intuitive way to provide an overview of events related to a query entity over a certain period of time. Timeline generation is an active research area [1], where the focus is to generate a timeline (i.e. a chronologically ordered selection) of events related to the query entity from a knowledge graph. However, existing timelines do not explicitly support a cross-lingual comparison of language-specific event representations, including their popularity and relation to the query entity in different language contexts.

EventKG+TL, presented in this paper, is a timeline generator that creates cross-lingual timelines for a query entity. It relies on EventKG to provide language-specific information on event popularity and on the relation strength between the events and the query entity. To this end, EventKG+TL conducts a language-specific event ranking and complements this ranking with a cross-lingual visual representation. The timelines generated by EventKG+TL facilitate efficient identification of relevant events based on their language-specific popularity, their relation strength and the cross-lingual differences.

2 Scenarios and Timelines

A multilingual event-centric temporal knowledge graph $kg=(L, E, R)$ is a labeled directed multigraph, where L is a set of language contexts, E is a set of nodes (i.e. events or entities), and R is a multiset of directed edges (i.e. relations).

Given a query entity $q \in E$, the timelines generated by EventKG+TL can assist users in answering questions such as:

$Q_1$: What are the most popular events related to q?
$Q_2$: Which events are the most closely related to q?
$Q_3$: Which of the most popular events are the most closely related to q?
$Q_4$: How does the popularity of the identified events and the strength of their relations to the query entity q differ across the language contexts?

EventKG+TL helps users answer these questions with respect to a particular language context $l \in L$ and enables a visual cross-lingual comparison. To answer these questions, the user of EventKG+TL can issue a timeline query that includes the following parameters:

a query entity $q \in E$;
a set of the language contexts of user interest $L' \subseteq L$;
the maximum number k of the events to be selected per language context;
the ranking criterion $rc_{i}$ to identify the top-k most relevant events among all events $E' \subset E$ related to q in kg according to the questions $Q_1-Q_3$.

The ranking criteria include:

$rc_{1}$: popularity(e, l) is the popularity of an event $e\in E'$ in $l\in L'$;
$rc_{2}$: relation strength(q, e, l) is the relation strength between the query entity q and an event $e \in E'$ in a language context $l\in L'$; and
$rc_{3}$: combined(q, e, l) is a combination of the event popularity of $e \in E'$ and the relation strength between e and the query entity q in $l\in L'$.
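
The parameters of a timeline query can be pictured as a small record. The following is only an illustrative sketch, not the system's actual data model: the names TimelineQuery and RankingCriterion, the validation rules and the language codes "en", "de" and "ru" are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class RankingCriterion(Enum):
    """The three ranking criteria rc_1 to rc_3 (hypothetical enum names)."""
    POPULARITY = "rc1"          # popularity(e, l)
    RELATION_STRENGTH = "rc2"   # relation strength(q, e, l)
    COMBINED = "rc3"            # combined(q, e, l)


@dataclass(frozen=True)
class TimelineQuery:
    """A timeline query with the four parameters listed in the text."""
    query_entity: str             # q, an identifier of a node in E
    languages: FrozenSet[str]     # L', the language contexts of user interest
    k: int                        # maximum number of events per language context
    criterion: RankingCriterion   # rc_i

    def __post_init__(self) -> None:
        # Validation rules are assumptions, not taken from the paper.
        if self.k < 1:
            raise ValueError("k must be at least 1")
        if not self.languages:
            raise ValueError("at least one language context is required")


# Hypothetical query: top-5 events per language for "Brexit" under rc_3.
query = TimelineQuery(
    query_entity="Brexit",
    languages=frozenset({"en", "de", "ru"}),
    k=5,
    criterion=RankingCriterion.COMBINED,
)
```

Making the record immutable keeps a query usable as a cache key, which may be convenient when the same timeline is requested repeatedly.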

The timelines generated by EventKG+TL complement the language-specific event ranking with a cross-lingual visual representation to address the question $Q_4$. To this end, EventKG+TL places labeled pie charts on a timeline, where each pie chart represents an individual event. The size of the pie chart corresponds to the overall (i.e. language-independent) relevance of the event according to the ranking criterion $rc_{i}$. Each slice of the pie chart represents a language context. The area of each slice is proportional to the contribution of the corresponding language context to the ranking criterion $rc_{i}$.
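
The pie chart geometry can be sketched as follows. This is a minimal illustration under one assumption that the text does not state: the overall relevance of an event is taken to be the sum of its language-specific scores. The function name pie_chart and the example scores are hypothetical.

```python
def pie_chart(scores_by_language):
    """Return (overall size, slice shares) for one event.

    scores_by_language maps a language context to the event's score
    under the chosen ranking criterion rc_i.
    Assumption: the overall relevance is the sum of these scores.
    """
    total = sum(scores_by_language.values())
    if total == 0:
        # No language contributes, so every slice is empty.
        return 0.0, {lang: 0.0 for lang in scores_by_language}
    # Each slice is proportional to the language's contribution.
    shares = {lang: score / total for lang, score in scores_by_language.items()}
    return total, shares


# Hypothetical scores for one event in three language contexts.
size, shares = pie_chart({"en": 0.5, "de": 0.25, "ru": 0.25})
```

With these example scores, the English slice covers half of the chart and the German and Russian slices a quarter each.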

Figure 1 exemplifies a Brexit timeline. We can observe that the most important event according to $rc_{3}$ is the “United Kingdom European Union membership referendum, 2016” that is nearly equally important in all considered language contexts. Some of the events are more important in the specific language contexts, e.g. “European Migrant Crisis” in the German and “Dutch Ukraine-European Union Association Agreement referendum 2016” in the Russian context.

3 Timeline Generation

The Knowledge Graph. To answer a timeline query, EventKG+TL utilizes EventKG [4]. EventKG is a multilingual RDF knowledge graph that, in V1.1, incorporates over 690 thousand events and over 2.3 million temporal relations extracted from several large-scale entity-centric knowledge graphs (i.e. Wikidata, DBpedia in five language editions and YAGO), the Wikipedia Current Events Portal (WCEP) and Wikipedia event lists. One of the key features of EventKG is the provision of event-centric information for historical and contemporary events, including their interlinking in the language-specific contexts to facilitate an assessment of relation strength and event popularity. The information on language-specific interlinking provided by EventKG is based on the corresponding Wikipedia language editions.

Event and Relation Retrieval. To retrieve relevant information from EventKG, EventKG+TL uses SPARQL queries. First, EventKG+TL retrieves the query entity q, including its existence time, if available. Second, EventKG+TL retrieves a set of events $E'\subset E$ that are connected to q via an EventKG relation as the subject or the object, along with the time information associated with these events. Third, the interlinking information related to the events in $E'$ is retrieved from EventKG’s link relations and their eventKG-s:links and eventKG-s:mentions property values.

Event Ranking and Timeline Creation. The top-k events related to q are selected according to the ranking criterion. For each event $e \in E'$ and language $l \in L'$, the language-specific relevance score is computed using the interlinking information provided by EventKG. The following link counts are used:

$count_{links}(e,l)$: Event link count, i.e. the number of links pointing to the event e in a language context l (via eventKG-s:links).
$count_{pair}(q,e,l)$: Pair count, i.e. the number of links from q to e plus the number of links from e to q in l, denoted by eventKG-s:links values.
$count_{mentions}(q,e,l)$: Mention count, i.e. the number of sentences in a language context l that jointly link to q and e, denoted by eventKG-s:mentions.

Each count is normalized to [0, 1] by dividing its value by the highest value of this count among the events in $E'$ in the respective language. This reduces the bias resulting from differences in language-specific coverage. To prevent disproportionately frequently linked events (e.g. World War II) from dominating the ranking, a smoothing parameter $\alpha$, experimentally set to 0.25, is adopted. The scores are computed as follows:

$$\text{popularity}(e,l) = \left( \frac{count_{links}(e,l)}{\max \{ count_{links}(e',l) \mid e' \in E' \}}\right)^{\alpha} \tag{1}$$

$$\begin{aligned} \text{relation strength}(q,e,l) =&\ \frac{1}{2} \cdot \left( \frac{count_{pair}(q,e,l)}{\max \{ count_{pair}(q,e',l) \mid e' \in E' \}}\right)^{\alpha} \\ &+ \frac{1}{2} \cdot \left( \frac{count_{mentions}(q,e,l)}{\max \{ count_{mentions}(q,e',l) \mid e' \in E' \}}\right)^{\alpha} \end{aligned} \tag{2}$$
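
Equations (1) and (2) can be sketched in a few lines. The function names, the dictionary-based inputs and the handling of an all-zero count (score 0) are assumptions made for illustration; only the normalization by the maximum, the exponent $\alpha = 0.25$ and the equal weighting of pair and mention counts come from the text.

```python
ALPHA = 0.25  # smoothing parameter, as stated in the text


def smoothed_ratio(count, max_count, alpha=ALPHA):
    """Normalize a count by the maximum and apply the smoothing exponent."""
    if max_count <= 0:
        # Assumption: if no event has any link, the score is 0.
        return 0.0
    return (count / max_count) ** alpha


def popularity(link_counts, event):
    """Eq. (1). link_counts maps each event in E' to count_links(e, l)
    for one fixed language context l."""
    return smoothed_ratio(link_counts[event], max(link_counts.values()))


def relation_strength(pair_counts, mention_counts, event):
    """Eq. (2). Both dictionaries map each event in E' to its count
    with respect to the query entity q in one fixed language context l."""
    pair = smoothed_ratio(pair_counts[event], max(pair_counts.values()))
    mention = smoothed_ratio(mention_counts[event], max(mention_counts.values()))
    return 0.5 * pair + 0.5 * mention


# Hypothetical link counts: event "B" has 1/16 of the links of event "A".
links = {"A": 1600, "B": 100}
pop_a = popularity(links, "A")
pop_b = popularity(links, "B")
```

In this example, event "B" has only one sixteenth of the links of event "A" but still receives a popularity of about 0.5, since $(1/16)^{0.25} = 0.5$. This illustrates how the exponent dampens the dominance of heavily linked events.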

The combined score ($rc_3$) is computed as a linear combination of the two ranking criteria, popularity and relation strength, with a weight $w$ that was set experimentally.

$$\begin{aligned} \text{combined}(q,e,l) =&\ w \cdot \text{popularity}(e,l) \\ &+ (1 - w) \cdot \text{relation strength}(q,e,l) \end{aligned} \tag{3}$$
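
Equation (3) translates directly into code. In the sketch below the weight is a required argument, because its value is not given in the text; the value 0.5 in the usage line is purely illustrative and should not be read as the authors' setting. The range check on w is likewise an assumption.

```python
def combined(popularity_score, relation_strength_score, w):
    """Eq. (3): linear combination of the two language-specific scores."""
    if not 0.0 <= w <= 1.0:
        # Assumption: w is a convex weight between 0 and 1.
        raise ValueError("w must lie in [0, 1]")
    return w * popularity_score + (1.0 - w) * relation_strength_score


# Illustrative call with a hypothetical weight of 0.5.
score = combined(1.0, 0.5, w=0.5)
```

Setting w to 1 reduces the combined score to the popularity criterion, and setting it to 0 reduces it to the relation strength criterion.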

The resulting timeline consists of a chronologically ordered list of the top-k highest ranked events per language with respect to the ranking criterion.
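
The selection and ordering step can be sketched as below. Two points are assumptions rather than statements from the paper: the events selected for the individual languages are merged into a single timeline (their union), and each event is placed by a single start date. The function name build_timeline, the tie-breaking by event identifier and the example data are hypothetical.

```python
from datetime import date


def build_timeline(scores, start_dates, k):
    """Select the top-k events per language and order them chronologically.

    scores: {language: {event: score under the chosen rc_i}}
    start_dates: {event: datetime.date}
    """
    selected = set()
    for lang_scores in scores.values():
        # Rank by descending score; ties are broken by event identifier.
        ranked = sorted(lang_scores, key=lambda e: (-lang_scores[e], e))
        selected.update(ranked[:k])
    # Assumption: one merged timeline, ordered by start date.
    return sorted(selected, key=lambda e: (start_dates[e], e))


# Hypothetical scores and dates for three events in two language contexts.
scores = {
    "en": {"a": 0.9, "b": 0.5, "c": 0.1},
    "de": {"a": 0.8, "b": 0.2, "c": 0.7},
}
start_dates = {"a": date(2016, 6, 23), "b": date(2015, 1, 1), "c": date(2017, 3, 29)}
timeline = build_timeline(scores, start_dates, k=2)
```

Here the English context contributes events "a" and "b" and the German context contributes "a" and "c", so the merged timeline contains all three events in the order "b", "a", "c".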

System Implementation. The EventKG+TL system is accessible as an HTML5 website. It is implemented using the Java Spark web framework. The timeline is visualized through the browser-based JavaScript library vis.js, the pie charts are created using the Google Charts JavaScript library, and pop-ups showing detailed event information are based on Twitter Bootstrap.

4 Demonstration

In our demonstration we will primarily show how EventKG+TL works and how users can use it to create cross-lingual timelines. To highlight the advantages of our approach, we will ask our audience to create timelines for the entities and events of their choice using EventKG+TL based on the language-specific information contained in EventKG. Through the visual cross-lingual comparison provided by EventKG+TL, the audience can get an impression of the language-specific event representations, as well as their relation to the query entity and popularity in different language contexts.

References

1. Althoff, T., Dong, X.L., Murphy, K., Alai, S., Dang, V., Zhang, W.: TimeMachine: timeline generation for knowledge-base entities. In: Proceedings of SIGKDD 2015 (2015)
2. Gossen, G., Demidova, E., Risse, T.: iCrawl: improving the freshness of web collections by integrating social web and focused web crawling. In: JCDL 2015 (2015)
3. Gottschalk, S., Demidova, E.: MultiWiki: interlingual text passage alignment in Wikipedia. TWEB 11(1), 6:1–6:30 (2017)
4. Gottschalk, S., Demidova, E.: EventKG: a multilingual event-centric temporal knowledge graph. In: Proceedings of the ESWC 2018 (2018)
5. Gottschalk, S., Demidova, E., Bernacchi, V., Rogers, R.: Ongoing events in Wikipedia: a cross-lingual case study. In: Proceedings of WebSci 2017, pp. 387–388 (2017)
6. Rogers, R.: Digital Methods. MIT Press, Cambridge (2013)
7. Rospocher, M., et al.: Building event-centric knowledge graphs from news. Web Semant. 37, 132–151 (2016)

Acknowledgements

This work was partially funded by the ERC (“ALEXANDRIA”, 339233) and BMBF (“Data4UrbanMobility”, 02K15A040).

Author information

Authors and Affiliations

L3S Research Center, Leibniz Universität Hannover, Hannover, Germany
Simon Gottschalk & Elena Demidova

Corresponding author

Correspondence to Simon Gottschalk.

Editor information

Editors and Affiliations

University of Bologna, Bologna, Italy
Aldo Gangemi
IBM Research - Almaden, San Jose, CA, USA
Anna Lisa Gentile
CNR-ISTC, Rome, Italy
Andrea Giovanni Nuzzolese
Technische Universität Dresden, Dresden, Germany
Sebastian Rudolph
Karlsruhe Institute of Technology, Karlsruhe, Germany
Maria Maleshkova
University of Mannheim, Mannheim, Germany
Heiko Paulheim
University of Aberdeen, Aberdeen, UK
Jeff Z Pan
CNR-ISTC, Rome, Italy
Mehwish Alam

About this paper

Cite this paper

Gottschalk, S., Demidova, E. (2018). EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph. In: Gangemi, A., et al. The Semantic Web: ESWC 2018 Satellite Events. ESWC 2018. Lecture Notes in Computer Science, vol 11155. Springer, Cham. https://doi.org/10.1007/978-3-319-98192-5_31

DOI: https://doi.org/10.1007/978-3-319-98192-5_31
Published: 02 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98191-8
Online ISBN: 978-3-319-98192-5
eBook Packages: Computer Science, Computer Science (R0)
