1 Introduction

Knowledge bases (KBs) are logic-based representations of both data and domain knowledge, for which there exists a rich toolset to query data and to reason about it semantically, i.e., in terms of the domain knowledge. This enables domain users to interact with modern IT systems [39] without being exposed to implementation details, as well as to make their domain knowledge available to software applications. KBs are the foundation of many modern innovation drivers and key technologies: applications range from Digital Twin engineering [31] and industry standards in robotics [23] to expert systems, e.g., in medicine [38].

The success story of KBs, however, is so far based on the use of domain knowledge about static data. The connection to transition systems and programs beyond Prolog-style logic programming has only begun to be explored, mainly triggered by tool support for developing applications that use KBs [7, 13, 28], also in a type-safe way [29, 32].

In this work, we investigate how one can use domain knowledge about dynamic processes and formalize knowledge about the order of computations to be performed. More concretely, we describe a runtime enforcement technique that uses domain knowledge to guide the selection of rules in a transition system, for example to simulate behavior with respect to domain knowledge (the guiding example of this article), or to enforce compliance of business process models with restrictions arising from the domain [41].

Approach. At the core, our approach considers the execution trace of a run, i.e., the sequence of rule applications, as a KB itself. As such, it can be combined with the KB that expresses the domain knowledge of dynamic processes (DKDP). The DKDP expresses knowledge about (partial) executions such that the execution trace must be consistent with it before and after every rule application. For example, in a simulation system for geology, the DKDP may express that a certain rock layer A is above a certain rock layer B and, thus, the event to deposit a layer must occur for B before it occurs for A. Consistency with the DKDP forms a domain invariant for the trace of a system, i.e., a trace property.

To trigger a transition rule, we use a hypothetical execution step: the execution trace is extended with a potential event and the consistency of the extended trace with the DKDP is checked. However, using this consistency invariant directly at run time can be computationally heavy, as it requires checking the consistency of a large logical theory. Thus, we give a transformation that removes the need for a hypothetical execution step and instead results in a transition system that evaluates a transformed condition on (1) the existing trace and (2) the parameters of the potential next event. This condition requires no domain-specific reasoning anymore, so the DKDP can be used to guide any transition system, including languages based on structural operational semantics. For example, it is then possible to express the invariant check as a guard for the rule that deposits layers (e.g., only deposit A if layer B has been deposited already).

It is crucial that this system is usable for both the domain user (who possesses the domain knowledge) and the programmer (who has to program the interaction with the domain knowledge), a requirement explicitly stressed by Corea et al. [16] for the use of ontologies in business process models. We, thus, carefully designed our framework to increase its usability: First, the reasoning (in the geology example above, from spatial properties of layers to temporal properties of events) is performed completely in the domain and need not be handled by the transition system, i.e., the programmer does not have to perform reasoning over the KB in the program itself. Second, the DKDP is expressed over domain events, as the domain users do not have knowledge about implementation details, such as the state organization. Furthermore, the formalization of the DKDP should not be affected by the underlying implementation details, so that the DKDP can be reused. The DKDP can reuse the aforementioned industry standards and established ontologies, as well as modeling languages and techniques from ontology engineering [17], such as OWL [42], which are established for domain modeling and more suitable for this task than correctness-focused temporal logics such as LTL [35]: domain users need not be experts in programming or verification to contribute to the system.

The transformation that replaces the hypothetical execution step with a transition system evaluating a transformed condition is also transparent to the domain users. We say a transformed guarded rule is applicable if it would not violate consistency w.r.t. the DKDP. Lastly, we provide the domain users with possibilities to query the final result, i.e., the KB of the final execution trace, and to explore possible simulations using the defined DKDP. Note that the mapping from trace to KB need not be designed manually: various (semi-)automatic mapping design strategies are discussed in the paper.

Contributions and Structure. Our main contributions are (1) a system that enforces domain knowledge to guide a transition system at runtime, and (2) a procedure that transforms such a transition system, which uses consistency with domain knowledge as an invariant, into a transition system using first-order guards over past events in a transparent way. We give preliminaries in Sec. 2 and present our running example in Sec. 3. We formalize our approach in Sec. 4 and give the transformation in Sec. 5, before we discuss (semi-)automatically generated mappings in Sec. 6. We discuss design considerations for mappings in Sec. 7 and related work in Sec. 8. Lastly, Sec. 9 concludes.

2 Preliminaries

We give some technical preliminaries for knowledge bases as well as transition systems, as far as they are needed for our runtime enforcement technique.

Definition 1

(Domain Knowledge of Dynamic Processes). Domain knowledge of dynamic processes (DKDP) is the knowledge about events and changes.

Example 1

(DKDP in Geology).

DKDP describes knowledge about temporal properties of a domain. In geology, for example, this may be the knowledge that a deposition of geological layers in the Cretaceous must happen after a deposition in the Jurassic, because the Cretaceous is after the Jurassic. Which period a layer was deposited in can be deduced from, e.g., the fossils found in it.

A description logic (DL) is a decidable fragment of first-order logic with suitable expressive power for knowledge representation [3]. We do not commit to any specific DL here, but require that for the chosen DL it is decidable to check consistency of a KB, which we define next. A knowledge base is a collection of DL axioms, over individuals (corresponding to first-order logic constants), concepts, also called classes (corresponding to first-order logic unary predicates) and roles, also called properties (corresponding to first-order logic binary predicates).

Definition 2

(Knowledge Base). A knowledge base (KB) \(\mathcal {K} = (\mathcal {R},\mathcal {T},\mathcal {A})\) is a triple of three sets of DL axioms, where the ABox \(\mathcal {A}\) contains assertions over individuals, the TBox \(\mathcal {T}\) contains axioms over concepts, and the RBox \(\mathcal {R}\) contains axioms over roles. A KB is consistent if no contradiction follows from it.

KBs can be seen as first-order logic theories, so we refrain from introducing them fully formally and instead introduce them by example throughout the article. We use the Manchester syntax [25] for DL formulas in examples to emphasize that they model knowledge, but treat them as first-order logic formulas otherwise.

Example 2

Continuing Ex. 1, the knowledge that the Jurassic is before the Cretaceous is expressed by the following ABox axiom, where \(\texttt{Jurassic}\) and \(\texttt{Cretaceous}\) are individuals, while \(\texttt{before}\) is a role.

$$\begin{aligned} \mathtt {before(Jurassic, Cretaceous)} \end{aligned}$$

The following TBox axioms express that every layer with Stegosaurus fossils has been deposited during the Jurassic. The first two axioms define the concepts \(\texttt{StegoLayer}\) (the class of things having the value Stegosaurus for their contains role) and \(\texttt{JurassicLayer}\) (the class of things having the value Jurassic for their during role). The last axiom says that \(\texttt{StegoLayer}\) is a subclass of \(\texttt{JurassicLayer}\). The bold literals are keywords, the literals \(\texttt{StegoLayer}\), \(\texttt{JurassicLayer}\) denote concepts/classes, the literals \(\texttt{contains}\), \(\texttt{during}\) denote roles/properties and the literals \(\texttt{Stegosaurus}\), \(\texttt{Jurassic}\) denote individuals.

StegoLayer EquivalentTo contains value Stegosaurus

JurassicLayer EquivalentTo during value Jurassic

StegoLayer SubClassOf JurassicLayer

The following RBox axioms express two constraints: The first line states that both the below and the before role are asymmetric. The second line states that if a deposition is from an age before the age of another deposition, then it is below that deposition. Formally, the axiom expresses that the concatenation of three roles, (a) the during role, (b) the before role, and (c) the inverse of the during role, is a subproperty of the below role. I.e., given an individual a, every individual b reachable from a by following the chain during, before, and the inverse of during is also reachable by below alone.

Asy(below)       Asy(before)

during o before o inverse(during) SubPropertyOf below
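Read as first-order formulas (under the standard translation of DL axioms; the universally quantified variables are ours), these RBox axioms state:

$$\begin{aligned} &\forall x,y.~\texttt{below}(x,y) \rightarrow \lnot \texttt{below}(y,x) \qquad \qquad \forall x,y.~\texttt{before}(x,y) \rightarrow \lnot \texttt{before}(y,x)\\ &\forall a,b,x,y.~\texttt{during}(a,x) \wedge \texttt{before}(x,y) \wedge \texttt{during}(b,y) \rightarrow \texttt{below}(a,b) \end{aligned}$$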

Knowledge-based guiding can be applied to any transition system to leverage domain knowledge during execution. States are not the focus of our work, and neither is the exact form of the rules that specify the transitions between states. For our purposes, it suffices to define states as terms, i.e., finite trees where each node is labeled with a name from a finite set of term symbols, and transition rules as transformations between schematic terms. State guards can be added but are omitted for brevity’s sake.

Definition 3

(Terms and Substitutions). Let \(\varSigma _T\) be a finite set of term labels and \(\varSigma _V\) a disjoint set of term variables. A term t is a finite tree, where each inner node is a term label and each leaf is either a term label or a term variable. The set of term variables in a term t is denoted \(\varSigma (t)\). We denote the set of all terms with T. A substitution \(\sigma \) is a map from term variables to terms without term variables. The application of a substitution \(\sigma \) to a term t, with the usual semantics, is denoted \(t\sigma \). In particular, if t contains no term variables, then \(t\sigma = t\).

Rewrite rules map one term to another by matching a subterm against the head term. The matched subterm is then rewritten by applying the substitution to the body term. Normally one would have additional conditions on the transition rules, but these are not necessary to present semantic guiding.

Definition 4

(Term Rewriting Systems). A transition rule in the term rewriting system has the form

$$\begin{aligned} t_\texttt{head} \xrightarrow {\textsf{r}} t_\texttt{body} \end{aligned}$$

where \(\textsf{r}\) is the name of the rule, and \(t_\texttt{head}, t_\texttt{body} \in T\) are the head and body terms.

A rule matches on a term t with \(\varSigma (t) = \emptyset \) if there is a subterm \(t_s\) of t such that \(t_\texttt{head}\sigma = t_s\) for a suitable substitution \(\sigma \). A rule produces a term \(t'\) by matching on subterm \(t_s\) with substitution \(\sigma \) and generating \(t'\) by replacing \(t_s\) in t with \(t_\texttt{body}\sigma '\), where \(\sigma '\) is equal to \(\sigma \) for all \(v \in \varSigma (t_\texttt{body}) \cap \varSigma (t_\texttt{head})\) and maps \(v \in \varSigma (t_\texttt{body}) \setminus \varSigma (t_\texttt{head})\) to fresh term symbols. For production, we write

$$\begin{aligned} t \xrightarrow {\textsf{r},\sigma '} t' \end{aligned}$$
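The following Python sketch (our own encoding, not part of the paper's formal development) illustrates Def. 3 and Def. 4: terms are nested tuples, term variables are strings prefixed with '?', and each production additionally emits an event in the sense of Def. 5.

```python
from itertools import count

fresh = count()  # source of fresh term symbols

def match(pattern, term, subst=None):
    """Match `pattern` (may contain variables) against the variable-free
    `term`; return a substitution or None if matching fails."""
    subst = dict(subst or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and pattern[0] == term[0] and len(pattern) == len(term):
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

def apply_subst(term, subst):
    if isinstance(term, str) and term.startswith("?"):
        # variables occurring only in the body are mapped to fresh symbols
        return subst.setdefault(term, f"sym{next(fresh)}")
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_subst(t, subst) for t in term[1:])
    return term

def step(term, rule_name, head, body):
    """Apply a rule at the root (the traversal to inner subterms is omitted
    for brevity); returns the new term and the emitted event, or None."""
    subst = match(head, term)
    if subst is None:
        return None
    return apply_subst(body, subst), ("ev", rule_name, dict(subst))

# e.g., a hypothetical deposit rule ("stack", ?s) -> ("stack", ("layer", ?s)):
new_term, event = step(("stack", "empty"), "deposit",
                       ("stack", "?s"), ("stack", ("layer", "?s")))
assert new_term == ("stack", ("layer", "empty"))
```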

3 A Scenario for Knowledge Based Guiding

To illustrate our approach, we continue with geology, namely with a simulator for the deposition and erosion of geological layers. Such a simulator is used, e.g., for hydrocarbon exploration [20]. It contains domain knowledge about the types of fossils and the corresponding geological ages, and connects spatial information about deposited layers with temporal information about their deposition. We started formalizing this DKDP in Ex. 2 and expand it below.

The core challenge is that the simulator must make sure that it does not violate domain properties. This means that it cannot deposit a layer containing fossils from the Jurassic after depositing a layer containing fossils from the Cretaceous. This information is given by the domain users as an invariant, i.e., as knowledge that the execution must be consistent with at all times.

Programming with Knowledge Bases. Our model of computation is a set of rewrite rules over some transition structure. The sequence of rule applications, recorded as events, forms the trace. The DKDP constrains the extension of the trace. This realizes a clear separation of concerns between declarative data modeling and imperative programming with, in our case, transitions.

Example 3

Let us assume four rules: a rule \(\texttt{deposit}\) that deposits a layer without fossils, a rule \(\texttt{depositStego}\) that deposits a layer with Stegosaurus fossils, an analogous rule \(\texttt{depositTRex}\) that deposits a layer with Tyrannosaurus fossils, and a rule \(\texttt{erode}\) that removes the top layer of the deposition. One example reduction sequence, for some terms \(t_i\) and with substitutions omitted, is as follows:

$$ t_0 \xrightarrow {\texttt {depositStego}} t_1 \xrightarrow {\texttt {erode}} t_2 \xrightarrow {\texttt {depositTRex}} t_3 $$

which describes the rule application of depositStego on term \(t_0\), followed by the rule application of erode on term \(t_1\), and then depositTRex on term \(t_2\).

In the domain KB, we add axioms expressing that a geological layer containing Stegosaurus fossils is deposited during the Jurassic, and that a geological layer containing Tyrannosaurus fossils is deposited during the Cretaceous.

Consider that rule depositStego may trigger on term \(t_3\).

$$ \dots t_2 \xrightarrow {\texttt {depositTRex}} t_3 \xrightarrow [?]{\texttt {depositStego}} $$

This would violate the domain knowledge, as we could derive a situation where a layer with Tyrannosaurus fossils is below a layer with Stegosaurus fossils, implying that the Cretaceous is before the Jurassic. This contradiction is captured by the knowledge base in Fig. 1. The DKDP should prevent this rule application at \(t_3\). To achieve this, i.e., to enforce domain knowledge at runtime, we must connect the trace with the KB. Specifically, we represent the trace as a KB itself: instead of operating on a KB, we record the events and generate a KB from the trace using a mapping.

For example, consider the left KB in Fig. 1. The upper part is (a part of) our DKDP about geological ages, while the lower part is the KB mapped from the trace. Together they form a KB. In the knowledge base of this example, we add one layer that contains Stegosaurus fossils for each depositStego event, and analogously for depositTRex events. We also add the below relation between two layers if their events are ordered. So, if we executed depositStego after depositTRex, there would be two layers in the KB, as shown in Fig. 1, with corresponding fossils, connected via the below relation. On the right, the KB is shown with the additional knowledge following from its axioms. In particular, we can deduce that layer2 must be below layer1 using the axioms from Sec. 2. This, in turn, makes the overall KB inconsistent, as below must be asymmetric.

Fig. 1. Left: KB as generated. Right: Inferred KB to detect inconsistency.

We stress that consistency of the execution with the DKDP is a trace property: it reasons about the events that happen, regardless of the current state. In our example, consider the situation where, after \(t_3\), rule erode triggers again, and then we consider rule depositStego, i.e., the following continuation of the trace

$$ \dots t_2 \xrightarrow {\texttt {depositTRex}} t_3 \xrightarrow {\texttt {erode}} t_4 \xrightarrow [?]{\texttt {depositStego}} $$

We still consider the layer with the Tyrannosaurus fossils in our KB, despite its erosion. Firstly, the layer may have had an effect on the execution before being removed, and, secondly, its deposition also models implicit information: it expresses the current geological era of the system, which cannot be reverted. At \(t_3\) the system is in the Cretaceous, and since depositStego models an action in the Jurassic, the trace would not represent a semantically sensible execution if depositStego were executed.

Fig. 2 illustrates the runtime enforcement of domain knowledge on traces in a more general setting. The execution itself is a reduction sequence over some terms t, where each rule application emits some event \(\textsf{ev}\), e.g., the name of the applied rule and the matched subterms. A mapping \(\mu \) is used to generate a KB from the trace. The knowledge base then contains (a) the DKDP, pictured as the shaded box, (b) the mapping of the trace so far, pictured as the unshaded box with a solid frame, and (c) the potential next event, pictured as the dashed box. Additionally, new connections may be inferred.

The mapping from a trace to a KB matches the system formalized by the domain knowledge to the system used for programming; it is the interface between domain experts and the programmer. Indeed, the mapping allows the domain users to investigate program executions without being exposed to the implementation details. Given a fixed execution, the mapping can be applied to allow the domain users to query its results (in form of the trace) using domain vocabulary.

From the program's point of view, the mapping defines an invariant over the trace, which must always hold: consistency with the domain knowledge. While this saves the domain users from learning about the implementation, it poses two challenges to the programmer: first, the mapping must be developed in addition to the rules, and second, the invariant is not specific to the rules. The extended trace caused by the execution of one single event must be checked against the full DKDP, which is not specific to any transition event. Instead of this computationally costly operation, we provide an alternative. For example, to ensure consistency when executing the rule depositStego, it suffices to evaluate the following formula on the past trace \( tr \), checking that depositTRex has not been executed yet: \(\lnot \exists i.~\textsf{rule}( tr [i]) \doteq \texttt{depositTRex}\). The condition of a rule is specific to the corresponding transition action, instead of being a general condition on all the rules.

After defining runtime enforcement of domain knowledge formally, we will return to these challenges and (a) discuss different mapping strategies, especially the (semi-)automatic generation of mappings, and (b) give a system that, for a large class of mappings, also derives local conditions.

Fig. 2. Runtime enforcement of knowledge bases on traces.

4 Knowledge Guided Transition Systems

We now introduce runtime enforcement using KBs. To this end, we define the mapping of traces to KBs formally and give the transition system that uses this lifting for consistency checking. First, we define the notion of traces.

Definition 5

(Execution Traces). An event \( ev \) for a rule r and a substitution \(\sigma \) has the form \(\textsf{ev}(r,\sigma )\), which we write as \(\textsf{ev}(r, v_1 : t_1,\dots , v_n : t_n)\), where \(v_i : t_i\) are the pairs in \(\sigma \). To record the sequence of an execution, we use traces. A trace is a finite sequence of events, where each event records the applied rule and the corresponding substitutions, if there are any.

Example 4

The trace of the rule applications in Ex. 3 is as follows, for suitable substitutions that all store the deposited or eroded layer in the variable v.

$$\begin{aligned} \langle \textsf{ev}(\texttt{depositStego}, v:\texttt{layer0}),~\textsf{ev}(\texttt{erode}, v:\texttt{layer0}),~\textsf{ev}(\texttt{depositTRex}, v:\texttt{layer1}) \rangle \end{aligned}$$

To connect executions with knowledge bases, we define mappings that transform traces into knowledge bases, given a fixed vocabulary \(\varSigma \).

Definition 6

(Mappings). A \(\varSigma \)-mapping \(\mu \) is a function from traces to knowledge bases over vocabulary \(\varSigma \).

The mapping is given by the user, who has to respect the signature of the KB formalizing the used domain knowledge. While we do not restrict the structure of mappings in general, we introduce the notion of a first-order matching mapping, which allows for optimization and automation.

Definition 7

(First-Order Matching Mapping). A first-order matching mapping \(\mu \) is defined by a set \(\{ \varphi _1 \mapsto _{N_1} ax _1,\dots ,\varphi _n \mapsto _{N_n} ax _n\}\), where each element has a first-order logic formula \(\varphi _i\) over traces as its guard, a set of individuals \(N_i\), and a set \( ax _i\) of KB axioms as its body. We write \( ax _i(N_i)\) to emphasize that the individuals of \(N_i\) occur in \( ax _i\).

The mapping is applied to a trace tr by adding all those bodies whose guard evaluates to true and replacing all members of \(N_i\) in \( ax _i\) by fresh individual names:

$$\begin{aligned} \mu (tr) = \bigcup _{tr \models \varphi _i} ax _i(N_i)[N_i \text { fresh}] \end{aligned}$$

where \(A[N \text { fresh}]\) substitutes all individuals in N with fresh names in A.

Example 5

Consider the following first-order matching mapping \(\mu \), for some role/property \(\texttt{P}\) and individuals \(\texttt{A}\), \(\texttt{B}\) and \(\texttt{C}\). The function \(\textsf{rule}( ev )\) extracts the rule name from the given event \( ev \).

$$\begin{aligned} \big \{&\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_1 \mapsto _{\emptyset } \mathtt {P(A,B)},\quad \exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_2 \mapsto _{\emptyset } \mathtt {P(B,A)},\\&\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_3 \mapsto _{\emptyset } \mathtt {P(A,C)},\quad \exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_4 \mapsto _{\emptyset } \mathtt {P(C,A)} \big \} \end{aligned}$$

Its application to a trace \(\langle \textsf{ev}(\textsf{r}_1), \textsf{ev}(\textsf{r}_1),\textsf{ev}(\textsf{r}_2) \rangle \) is the set \(\{ \mathtt {P(A,B)}, \mathtt {P(B,A)}\}\).
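As an illustration, the following Python sketch (our encoding; axioms are plain tuples and guards are predicates over the trace) implements Ex. 5. The fresh-name handling of Def. 7 is omitted, since all \(N_i\) are empty here.

```python
def rule_occurs(name):
    # exists i. rule(tr[i]) = name
    return lambda tr: any(ev[1] == name for ev in tr)

mapping = [
    (rule_occurs("r1"), {("P", "A", "B")}),
    (rule_occurs("r2"), {("P", "B", "A")}),
    (rule_occurs("r3"), {("P", "A", "C")}),
    (rule_occurs("r4"), {("P", "C", "A")}),
]

def apply_mapping(mapping, trace):
    """Union of all bodies whose guard holds on the trace (Def. 7)."""
    kb = set()
    for guard, axioms in mapping:
        if guard(trace):
            kb |= axioms
    return kb

trace = [("ev", "r1", {}), ("ev", "r1", {}), ("ev", "r2", {})]
assert apply_mapping(mapping, trace) == {("P", "A", "B"), ("P", "B", "A")}
```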

First-order matching mappings can also be applied to our running example.

Example 6

We continue with the trace from Ex. 4, extended with another event \(\textsf{ev}\)(depositStego, \(v:\texttt{layer2}\)). We check whether adding an event to the trace would result in a consistent KB by actually extending the trace for analysis. We call this a hypothetical execution step.

The following mapping, which must be provided by the user, adds the spatial information about layers w.r.t. the fossils found within them. The first-order logic formula in the guard of the mapping expresses that a \(\texttt{depositTRex}\) event is found before a \(\texttt{depositStego}\) event in the trace. Note that the set of axioms in the body faithfully describes the event structure of the trace, i.e., the mapping may produce axioms that cause inconsistency w.r.t. the domain knowledge: together with the DKDP, we can see that the extended trace is mapped to an inconsistent knowledge base by adding five axioms. Note that we do not generate one layer for each deposition event during simulation, but only two specific ones, \(\mathtt {Layer(l_1)}\) and \(\mathtt {Layer(l_2)}\) in this case, for the relevant information. One can add mapping rules for the other cases (for instance, \(\texttt{depositStego}\) before \(\texttt{depositTRex}\), only \(\texttt{depositTRex}\) events, etc.), or use a different mapping mechanism, which we discuss further in Sec. 6.

$$\begin{aligned} \exists&l_1, l_2.~\exists i_1,i_2.~\\&tr [i_1] \doteq \textsf{ev}\big (\texttt{depositTRex}, v:l_1\big ) \wedge tr [i_2] \doteq \textsf{ev}\big (\texttt{depositStego}, v:l_2\big ) \wedge i_1 < i_2 \\&\mapsto _{\mathtt {l_1}, \mathtt {l_2}}\\ \big \{&\mathtt {Layer(l_1)},\mathtt {contains(l_1,Tyrannosaurus)},\\&\mathtt {Layer(l_2)},\mathtt {contains(l_2,Stegosaurus)}, \mathtt {below(l_1,l_2)}\big \} \end{aligned}$$
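The guard of this mapping can be evaluated mechanically. A Python sketch (our encoding, reusing the event representation from the earlier sketches) could look as follows:

```python
def trex_before_stego(tr):
    """exists i1 < i2 with rule(tr[i1]) = depositTRex and
    rule(tr[i2]) = depositStego."""
    trex = [i for i, ev in enumerate(tr) if ev[1] == "depositTRex"]
    stego = [i for i, ev in enumerate(tr) if ev[1] == "depositStego"]
    return any(i1 < i2 for i1 in trex for i2 in stego)
```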

We stress again that we are interested in trace properties: a layer may still have had effects on the state despite being completely removed at some point (by an \(\texttt{erode}\) event). Thus, we must consider the deposition event of a layer to check the trace against the domain knowledge.

The guided transition system extends a basic transition system by additionally ensuring that the trace after executing a rule is mapped to a consistent knowledge base. This treats the domain knowledge as an enforced invariant, i.e., a transition is only allowed if it indeed preserves the invariant.

Definition 8

(Guided Transition System). Given a set of rules \(\mathcal {R}\), a mapping \(\mu \) and a knowledge base \(\mathcal {K}\), the guided semantics is defined as a transition system between pairs of terms t and traces \( tr \). For each rule \(\textsf{r} \in \mathcal {R}\), we have one guided rule (for consistency, cf. Def. 2):

$$\begin{aligned} \frac{t \xrightarrow {\textsf{r},\sigma } t' \qquad \mu \big ( tr \circ \textsf{ev}(\textsf{r},\sigma )\big ) \cup \mathcal {K}\text { is consistent}}{(t, tr ) \rightarrow \big (t', tr \circ \textsf{ev}(\textsf{r},\sigma )\big )} \end{aligned}$$

The set of traces generated by a rewrite system \(\mathcal {R}\) from a starting term \(t_0\) is denoted \(\textbf{H}(\mathcal {R}, \mu , \mathcal {K}, t_0)\). Execution always starts with the empty trace.

5 Well-Formedness and Optimization

The transition rule in Def. 8 uses the knowledge base directly to check consistency, and while this enables integrating domain knowledge into the system directly, it also poses challenges from a practical point of view. First, the condition of the rule application is not specific to the change of the trace and must check the consistency of the whole knowledge base, which can be computationally heavy. Second, the consistency check is performed at every step, for every potential rule application. Third, the trace must be mapped whenever it is extended, which means that the same mapping computation that has been performed in the previous step may be executed all over again.

To overcome these challenges, we provide a system that reduces consistency checking to well-formedness guards, which only require evaluating an expression over the trace, without accessing the knowledge base. These guards are transparent to the domain users: the system behaves the same as with the consistency checks on the knowledge base. At its core, we use well-formedness predicates, which characterize the relation between domain knowledge and mappings.

Definition 9

(Well-Formedness). A first-order predicate \( wf \) over traces tr is a well-formedness predicate for some mapping \(\mu \) and some knowledge base \(\mathcal {K}\) if the following holds:

$$\begin{aligned} \forall tr .~ wf (tr) \iff \mu (tr) \cup \mathcal {K}\text { is consistent} \end{aligned}$$

Using this definition, we can slightly rewrite the rule of Def. 8: for every starting term \(t_0\), the set of generated traces is the same if it is replaced by the following one

$$\begin{aligned} \frac{t \xrightarrow {\textsf{r},\sigma } t' \qquad wf \big ( tr \circ \textsf{ev}(\textsf{r},\sigma )\big )}{(t, tr ) \rightarrow \big (t', tr \circ \textsf{ev}(\textsf{r},\sigma )\big )} \end{aligned}$$

For first-order matching mappings, we can generate the well-formedness predicate by testing all possible extensions of the knowledge base upfront and defining the guards of those sets that cause inconsistency as non-well-formed.

Theorem 1

Let \(\mu \) be a first-order matching mapping for some knowledge base \(\mathcal {K}\). Let \(\texttt{Ax} = \{ ax _1,\dots , ax _n\}\) be the set of all bodies in \(\mu \). Let \(\texttt{Incons}\) be the set of all subsets of \(\texttt{Ax}\), such that for each \(A \in \texttt{Incons}\), \(\bigcup _{a\in A} a \cup \mathcal {K}\) is inconsistent. Let \(\texttt{guard}_A\) be the set of guards corresponding to each body in A. The following predicate \( wf _\mu \) is a well-formedness predicate for \(\mu \) and \(\mathcal {K}\).

$$ wf _\mu = \lnot \bigvee _{A \in \texttt{Incons}}\bigwedge _{\varphi \in {\texttt{guard}_A}}\varphi $$

Example 7

We continue with Ex. 5. Consider a knowledge base \(\mathcal {K}\) expressing that the role \(\texttt{P}\) is asymmetric. The knowledge base becomes inconsistent if the first two or the last two axioms from \(\mu \) are added to it. Thus, the generated well-formedness predicate \( wf _\mu \) is the following

$$\begin{aligned} wf _\mu (tr) \equiv \lnot \Big (&\big ((\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_1) \wedge (\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_2)\big )\vee \\&\big ((\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_3) \wedge (\exists i.~\textsf{rule}( tr [i]) \doteq \textsf{r}_4)\big )\Big ) \end{aligned}$$

The above procedure is exponential in the number of branches of the mapping. But as any superset of an inconsistent set is also inconsistent, it is not necessary to generate all subsets: it suffices to consider the following set of minimal inconsistencies instead, which can be computed by testing for inconsistency along the sets ordered by \(\subset \).

$$\begin{aligned} \mathtt {min\text {-}Incons} = \{ A \ | \ A \in \texttt{Incons} \wedge \forall A ' \in \texttt{Incons}.~ A ' \ne A \rightarrow A ' \not \subset A \} \end{aligned}$$
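A possible implementation sketch of Thm. 1 with the min-Incons optimization in Python (our encoding; the DL consistency check is abstracted as an oracle, e.g., a call-out to an external reasoner):

```python
from itertools import combinations

def min_inconsistent(bodies, is_consistent_with_kb):
    """Return subset-minimal index sets whose bodies are jointly inconsistent
    with the KB. Candidates are checked in order of increasing size, so a hit
    is minimal unless it contains an earlier hit."""
    minimal = []
    for size in range(1, len(bodies) + 1):
        for idx in combinations(range(len(bodies)), size):
            if any(set(m) <= set(idx) for m in minimal):
                continue  # supersets of inconsistent sets stay inconsistent
            union = set().union(*(bodies[i] for i in idx))
            if not is_consistent_with_kb(union):
                minimal.append(idx)
    return minimal

def make_wf(guards, minimal):
    """wf(tr) = not OR_{A in min-Incons} AND_{phi in guard_A} phi(tr)."""
    return lambda tr: not any(all(guards[i](tr) for i in A) for A in minimal)
```

For the mapping of Ex. 5 and a reasoner oracle that knows \(\texttt{P}\) to be asymmetric, `min_inconsistent` would return the index sets {0, 1} and {2, 3}, so the generated predicate coincides with \( wf _\mu \) from Ex. 7.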

If well-formedness is defined inductively, then we can give an even more specific transformation. A well-formedness predicate is inductive if checking well-formedness of a trace extended by a last event is equivalent to evaluating a formula over the trace that is specific to this event. If this is the case, then each rule, which determines the event, can have its own, highly specialized well-formedness guard, which further enhances efficiency.

Definition 10

(Inductive Well-Formedness). A well-formedness predicate \( wf \) is inductive for some set of rules \(\mathcal {R}\) if there is a set of predicates \( wf _r\) for all rules \(r \in \mathcal {R}\), such that \( wf \) can be written as an inductive definition:

$$\begin{aligned} wf (\langle \rangle )&\equiv \textsf{true}\\ wf (tr\circ ev )&\equiv wf (tr) \wedge \bigwedge _{r \in \mathcal {R}} \big ((\textsf{rule}( ev ) \doteq r) \rightarrow wf _r(tr, ev )\big ) \end{aligned}$$

in which \( wf _{r}(tr, ev )\) is the local well-formedness predicate specifically for rule r, guarded by the condition \(\textsf{rule}( ev ) \doteq r\). The predicate \( wf _r\) forms the guard for rule r. Every well-formedness predicate is equivalent to an inductive well-formedness predicate by setting \( wf _r(tr, ev ) = wf (tr\circ ev )\), but we aim to give more specific predicates per rule.

Example 8

Finishing our geological system, we can give local well-formedness predicates for all rules. For example, we can define a specific guard for rule \(\texttt{depositStego}\) expressing that the deposition of a layer containing Stegosaurus fossils is not allowed if a deposition of a layer containing Tyrannosaurus fossils is already captured in the trace tr. Compared with the approach where the whole knowledge base must be checked, this solution using inductive well-formedness reduces the complexity of the analysis significantly. For instance, the rule \(\texttt{deposit}\) does not need to concern itself with the ordering of geological ages.

$$\begin{aligned} wf _{\texttt{depositStego}}( tr , ev ) \equiv \lnot \exists i.~\textsf{rule}( tr [i]) \doteq \texttt{depositTRex} \end{aligned}$$

Definition 11

(Transition System using Well-Formedness). Let \( wf \) be an inductive well-formedness predicate for a set of rules \(\mathcal {R}\), some mapping \(\mu \), and some knowledge base \(\mathcal {K}\). We define the transformed guarded transition system with the following rule for each \(r \in \mathcal {R}\).

$$\begin{aligned} \frac{t \xrightarrow {\textsf{r},\sigma } t' \qquad wf _\textsf{r}\big ( tr , \textsf{ev}(\textsf{r},\sigma )\big )}{(t, tr ) \rightarrow \big (t', tr \circ \textsf{ev}(\textsf{r},\sigma )\big )} \end{aligned}$$

The set of traces generated by this transition system from a starting term \(t_0\) is denoted \(\textbf{G}(\mathcal {R}, wf , t_0)\). Execution always starts with the empty trace.

Note that (a) we use a specific well-formedness predicate per rule, and that (b) we do not extend the trace tr in the premise, as the rules in Def. 8 and Def. 9 do.
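The following Python sketch (our encoding; the rule and guard functions are hypothetical stand-ins) shows the shape of the guarded system of Def. 11 for the geology example. Note that the guards inspect only the past trace and the candidate event; no KB reasoning happens at run time.

```python
def no_rule_yet(name):
    return lambda tr, ev: all(e[1] != name for e in tr)

guards = {
    "depositStego": no_rule_yet("depositTRex"),  # cf. Ex. 8
    "depositTRex": lambda tr, ev: True,          # assumed unrestricted here
    "deposit":     lambda tr, ev: True,
    "erode":       lambda tr, ev: True,
}

def guided_run(term, rules, guards, pick_step, max_steps=100):
    """Apply rules while their local guards hold. `rules` maps rule names to
    functions term -> (new_term, event) or None; `pick_step` chooses among
    the applicable (name, new_term, event) candidates."""
    trace = []
    for _ in range(max_steps):
        candidates = []
        for name, apply_rule in rules.items():
            result = apply_rule(term)
            if result is not None and guards[name](trace, result[1]):
                candidates.append((name, *result))
        if not candidates:
            break
        _, term, event = pick_step(candidates)
        trace.append(event)
    return term, trace
```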

Theorem 2

Let \( wf \) be an inductive well-formedness predicate for a set of rules \(\mathcal {R}\), some mapping \(\mu \), and some knowledge base \(\mathcal {K}\). The guided system of Def. 8 and the guarded system of Def. 11 generate the same traces: \( \forall t.~\textbf{H}\big (\mathcal {R}, \mu , \mathcal {K}, t\big ) = \textbf{G}\big (\mathcal {R}, wf , t\big ) \)

We can also define determinism in terms of inductive well-formedness. An inductive well-formedness predicate \( wf \) is deterministic if, for each trace tr and event ev, at most one local well-formedness predicate \( wf _\textsf{r}( tr , ev)\) holds.

Proposition 1

(Deterministic Well-Formedness). An inductive well-formedness predicate \( wf \) with local well-formedness predicates \(\{ wf _\textsf{r}\}_{r \in \mathcal {R}}\) is deterministic, if

$$\begin{aligned} \forall tr.~\forall ev .~\bigwedge _{r \in \mathcal {R}}\Big ( wf _r(tr, ev ) \rightarrow \bigwedge _{\begin{array}{c} r' \in \mathcal {R}\\ r' \ne r \end{array}}\lnot wf _{r'}(tr, ev )\Big ) \end{aligned}$$

For deterministic predicates, only one trace is generated: \(\big |\textbf{G}\big (\mathcal {R}, wf , t\big )\big | = 1\).

When the programmer designs the mapping, the focus is on mapping enough information to derive inconsistency, ensuring that certain transition steps are not performed. If the same mapping is to be used to retrieve results from the computation, e.g., to query over the final trace, this may be insufficient. Next, we discuss mappings that preserve more, or even all, information from the trace.

6 (Semi-)Automatically Generated Mappings

The mappings we discussed so far must be defined completely by the programmer and are used to extract certain information from a trace, sufficient to enforce domain invariants at runtime. In this section, we introduce mappings that can be constructed (semi-)automatically to simplify the usage of domain invariants: transducing mappings and direct mappings leverage the structure of the trace directly. A transducing mapping is constructed semi-automatically: it applies a manually defined mapping to each event and automatically connects every pair of consecutive events in a trace using the next role in the KB. A direct mapping relates each event with its parameters and is constructed fully automatically. Both kinds of mappings are not only easier to use for the programmer, they can also be used by the domain users to access the results of the computation in terms of the domain.

A transducing mapping is semi-automatic in the sense that part of the mapping is pre-defined, and the programmer only has to define the remaining part, namely the mapping from a single event to a KB.

Formally, a transducing mapping consists of a function \(\iota \) that generates unique individual names per event and a user-defined function \(\epsilon \) that maps every event to a KB.

Definition 12

(Transducing Mapping). Let \(\iota \) be an injective function from natural numbers to individuals, and \(\epsilon \) a function from events to KBs. Let \(\texttt{next}\) be an asymmetric role. Given a trace tr, a transducing mapping \(\delta ^{\texttt{next}}_{\iota ,\epsilon }(tr)\) is defined as follows. For simplicity, we annotate the index i of an event in tr directly.

$$\begin{aligned} \delta ^{\texttt{next}}_{\iota ,\epsilon }(\langle \rangle )&= \emptyset \qquad \qquad \delta ^{\texttt{next}}_{\iota ,\epsilon }(\langle \textsf{ev}_i \rangle ) = \epsilon (\textsf{ev}_i)\\ \delta ^{\texttt{next}}_{\iota ,\epsilon }(\langle \textsf{ev}_i, \textsf{ev}_j \rangle \circ tr)&= \epsilon (\textsf{ev}_i) \cup \{\texttt{next}(\iota (i), \iota (j))\} \cup \delta ^{\texttt{next}}_{\iota ,\epsilon }(\langle \textsf{ev}_j \rangle \circ tr) \end{aligned}$$

in which the \(\circ \) operator concatenates two traces. This approach is less demanding than designing an arbitrary mapping, as the structure of the sequence between each pair of consecutive events is taken care of by the \(\texttt{next}\) role, and \(\iota \) is trivial in most cases: one can just generate a fresh node with the number as part of its individual symbol. The programmer only has to provide a function \(\epsilon \) for events.

Example 9

Our geology example can be reformulated with the following user-defined function \(\epsilon _ geo \). Let \(\iota _ geo \) map every natural number i to the symbol \(\texttt{layer}_{i}\):

$$\begin{aligned} \epsilon _ geo (\textsf{ev}_i(\textsf{depositStego}, v\!:\! l))&\!=\! \{\texttt{contains}(\iota _ geo (i), \texttt{Stegosaurus}), \texttt{Layer}(\iota _ geo (i))\}\\ \epsilon _ geo (\textsf{ev}_i(\textsf{depositTRex}, v\!:\! l))&\!=\! \{\texttt{contains}(\iota _ geo (i), \texttt{Tyrannosaurus}), \texttt{Layer}(\iota _ geo (i))\}\\ \epsilon _ geo (\textsf{ev}_i(\textsf{deposit}, v\!:\! l))&\!=\! \{\texttt{contains}(\iota _ geo (i), \texttt{Nothing}), \texttt{Layer}(\iota _ geo (i))\}\\ \epsilon _ geo (\textsf{ev}_i(\textsf{erode}))&\!=\!\emptyset \end{aligned}$$

Note that the function \(\iota _ geo (i)\) is used to generate new symbols for each event, which are then declared to be geological layers by the axiom \(\texttt{Layer}(\iota _ geo (i))\). It generalizes the set of fresh names from first-order matching mappings in Def. 7. Based on this function definition, the example in Sec. 3 can be performed using the transducing mapping \(\delta ^\texttt{below}_{\iota _ geo ,\epsilon _ geo }\), which instantiates the \(\texttt{next}\) role with \(\texttt{below}\): the connection between each pair of consecutive events in a trace, i.e., that one layer is \(\texttt{below}\) another, is added as additional axioms to the KB, and further connections are derived from the axioms in the domain knowledge.
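A sketch of \(\delta ^\texttt{below}_{\iota _ geo ,\epsilon _ geo }\) in Python (our encoding, with KB axioms represented as tuples and events as in the earlier sketches):

```python
def iota_geo(i):
    return f"layer{i}"

def epsilon_geo(i, ev):
    _, rule, _subst = ev
    fossils = {"depositStego": "Stegosaurus",
               "depositTRex": "Tyrannosaurus",
               "deposit": "Nothing"}
    if rule == "erode":
        return set()
    return {("contains", iota_geo(i), fossils[rule]),
            ("Layer", iota_geo(i))}

def transducing(trace, iota, epsilon, next_role="next"):
    """Def. 12: epsilon applied per event, plus one next-edge between the
    individuals generated for each pair of consecutive events."""
    kb = set()
    for i, ev in enumerate(trace):
        kb |= epsilon(i, ev)
        if i > 0:
            kb.add((next_role, iota(i - 1), iota(i)))
    return kb

# geology instance: instantiate the next role with below
# kb = transducing(trace, iota_geo, epsilon_geo, next_role="below")
```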

So far, the mappings from the trace to information in terms of a specific domain are defined by the programmer. To further enhance the automation of the mapping construction, we give a direct mapping that captures all information of a trace in a KB. More technically, the direct mapping directly expresses the trace structure using a special vocabulary, which captures domain knowledge about traces themselves and is independent of any application domain. We first define the domain knowledge about trace structure.

Definition 13

(Knowledge Base for Traces). The knowledge base for traces contains the concept \(\texttt{Event}\) modeling events, the concept \(\texttt{Match}\) modeling one pair of a variable and its matched term, and the concept \(\texttt{Term}\) for terms. Furthermore, the functional property \(\texttt{appliesRule}\) connects events to rule names (as strings), the property \(\texttt{match}\) connects the individuals for events with the individuals for matches (i.e., an event with the pairs v : t of a variable and the term assigned to this variable), the property \(\texttt{var}\) connects matches and variables (as strings), and \(\texttt{term}\) connects matches and terms.

Recall that KBs only support binary predicates, so we cannot avoid formalizing the concept of a match, which connects three parts: event, variable, and term. The direct mapping lessens the workload of the programmer further: it requires no additional input and can be constructed fully automatically. It is a pre-defined mapping for all programs and is defined by instantiating a transducing mapping using the \(\texttt{next}\) role and pre-defined functions \(\epsilon _ direct \) and \(\iota _ direct \) for \(\epsilon \) and \(\iota \). Additionally, we must generate fresh individuals for the matches. The formal definition of the pre-defined functions for the direct mapping is as follows.

Definition 14

(Direct Mapping). The direct mapping is defined as a transducing mapping \(\delta ^\texttt{next}_{\iota _ direct ,\epsilon _ direct }\), where the function \(\iota _ direct \) maps every natural number i to an individual \(\texttt{e}i\). The individuals \(\texttt{match}i{\_}j\) uniquely identify a match inside a trace for the jth variable of the ith event, and we regard variables as strings containing their names. Function \(\epsilon _ direct \) is defined as follows:

$$\begin{aligned} \epsilon&_ direct (\textsf{ev}_i(\textsf{r}, v_1:t_1,\dots ,v_n:t_n)) = \\&\{ \texttt{Event}(\iota _ direct (i)), \texttt{appliesRule}(\iota _ direct {(i)}, \textsf{r}) \} \cup \\&\bigcup _{j \le n} \big (\{\texttt{match}(\iota _ direct (i), \texttt{match}i{\_}j), \texttt{var}( \texttt{match}i{\_}j, v_j), \texttt{term}(\texttt{match}i{\_}j, \eta (t_j))\} \cup \delta (t_j) \big ) \end{aligned}$$

where \(\delta (t_j)\) deterministically generates the axioms for the tree structure of the term \(t_j\) according to Def. 3 and \(\eta (t_j)\) returns the individual of the head of \(t_j\).

The properties \(\texttt{match}\), \(\texttt{var}\) and \(\texttt{term}\) connect each event with its parameters. For example, the match \(v : \texttt{layer0}\) of the first event in Ex. 4 generates

$$ \mathtt {match(e1, match0{\_}1)}, \mathtt {var(match0{\_}1, ``v")}, \mathtt {term(match0{\_}1, layer0)} $$

where \(\texttt{e1}\) is the representation of the event and \(\mathtt {match0{\_}1}\) is the representation of the match in the KB. The complete direct mapping is given in the following example.

Example 10

The direct mapping of Ex. 4 is as follows. We apply the \(\epsilon _ direct \) function to all three events, where each event has one parameter.

$$\begin{aligned} \big \{&\mathtt {Event(e1)}, \mathtt {appliesRule(e1, ``depositStego")}, \mathtt {match(e1, match0{\_}1)},\\ &\mathtt {var(match0{\_}1, ``v")}, \mathtt {term(match0{\_}1, layer0)}, \mathtt {Term(layer0)},\\ &\mathtt {Event(e2)}, \mathtt {appliesRule(e2, ``erode")}, \mathtt {next(e1, e2)}, \mathtt {match(e2, match1{\_}1)},\\ &\mathtt {var(match1{\_}1, ``v")}, \mathtt {term(match1{\_}1, layer0)},\\ &\mathtt {Event(e3)}, \mathtt {appliesRule(e3, ``depositTRex")}, \mathtt {next(e2, e3)}, \mathtt {match(e3, match2{\_}1)},\\ &\mathtt {var(match2{\_}1, ``v")}, \mathtt {term(match2{\_}1, layer1)}, \mathtt {Term(layer1)} \big \} \end{aligned}$$
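A Python sketch of \(\epsilon _ direct \) (our encoding, mirroring and reusing the earlier `transducing` sketch; the term-structure axioms \(\delta (t_j)\) are reduced to a single \(\texttt{Term}\) assertion, which suffices for atomic terms):

```python
def iota_direct(i):
    return f"e{i}"

def epsilon_direct(i, ev):
    _, rule, subst = ev  # e.g. ("ev", "depositStego", {"v": "layer0"})
    kb = {("Event", iota_direct(i)), ("appliesRule", iota_direct(i), rule)}
    for j, (var, term) in enumerate(subst.items(), start=1):
        m = f"match{i}_{j}"  # one Match individual per variable of the event
        kb |= {("match", iota_direct(i), m), ("var", m, var),
               ("term", m, term), ("Term", term)}
    return kb

# the direct mapping is the transducing mapping instantiated with these:
# kb = transducing(trace, iota_direct, epsilon_direct, next_role="next")
```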

7 Discussion

Querying and Stability. The mapping can be used by the domain users to interact with the system. For one, it can be used to retrieve the result of the computation using the vocabulary of a domain. For example, the following SPARQL [44] query retrieves all depositions generated during the Jurassic:

$$\begin{aligned} {\texttt {SELECT \,\,?l\,\, WHERE \,\,\{?l\,\, a \,\,Layer.\,\, ?l\,\, during\,\, Jurassic\}}} \end{aligned}$$
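For illustration, such a query can be run, e.g., with the Python library rdflib (a sketch and an assumption on our part; the paper's implementation uses Apache Jena). Note that rdflib performs no OWL reasoning, so inferred axioms such as the during edges must already be materialized in the graph:

```python
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/geo#")  # hypothetical namespace
g = Graph()
g.add((EX.layer0, RDF.type, EX.Layer))
g.add((EX.layer0, EX.during, EX.Jurassic))  # assumed already inferred

query = """
PREFIX : <http://example.org/geo#>
SELECT ?l WHERE { ?l a :Layer . ?l :during :Jurassic }
"""
for row in g.query(query):
    print(row.l)  # -> http://example.org/geo#layer0
```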

Indeed, one of the main advantages of knowledge bases is that they enable ontology-based data access [46]: uniform data access in terms of a given domain. Another possibility is to use justifications [5]. Justifications are minimal sets of axioms responsible for entailments over a knowledge base, e.g., to find out why it is inconsistent. They are able to explain, during an interaction, why certain steps are not possible.

The programmers do not need to design a complete knowledge base: for many domains, knowledge bases are available, for example in the form of industrial standards [23, 26]. For more specific knowledge bases, clear design principles based on experience in ontology engineering are available [17]. Note that these KBs are stable and rarely change. Our system requires static domain knowledge, as changes in the DKDP could invalidate traces during execution without executing a rule; this is, thus, not a limitation if one uses stable ontologies.

The direct mapping uses a fixed vocabulary, but one can formulate the connection to the domain knowledge by using additional axioms. In Ex. 10, one can declare the deposition events to be layers. The axiom for \(\texttt{depositStego}\) events is as follows.

appliesRule value "depositStego" SubClassOf Layer and StegoLayer

The exact mapping strategy is application-specific. For example, to remove information, \(\texttt{erode}\) must be handled through additional axioms as well, for example by adding a special concept \(\texttt{RemovedLayer}\) that is defined as all layers that are matched on by some \(\texttt{erode}\) event. We next discuss some of the considerations when choosing the style of mapping, and the limitations of each.

There are, thus, two styles to connect trace and domain knowledge: one can add axioms connecting the vocabulary of traces with the vocabulary of the DKDP (direct mapping), or one can translate the trace into the vocabulary of the DKDP (first-order matching mappings, transducing mappings).

The two styles require different skills from the programmer to interact with the domain knowledge: translating the trace requires expressing it as part of the domain with a set of ABox axioms, while connecting the vocabularies requires relating general traces to the domain using TBox axioms. The latter style thus operates on a higher level of abstraction, and we conjecture that such mappings may require more interaction with the domain expert and a deeper knowledge of knowledge graphs. However, the same insights needed to define the TBox axioms are also needed to define the guards of a first-order matching mapping.

Naming Schemes. Transducing mappings and first-order matching mappings have different naming schemes. A transducing mapping, and thus a direct mapping, generates a new name per event, while a first-order matching mapping generates a fixed number of new names per rule. A transducing mapping can therefore extract quite extensive knowledge from a trace, with the direct mapping giving a complete representation of it in a KB. As discussed, this requires the user to define general axioms. A first-order matching mapping must work with fewer names and extracts less knowledge from a trace; its design requires choosing the right level of abstraction to detect inconsistencies.

Fig. 3. Runtime comparison.

Evaluation. To evaluate whether the proposed system indeed gives a performance increase, we have implemented the running example as follows: The system generates all traces up to length n, using three different transition systems: (a) the guided system (Def. 8) using the transducing mapping of Ex. 9, where we use the Apache Jena framework [2] for reasoning; (b) the guarded system (Def. 11) with a native implementation of the well-formedness predicate; and (c) the guarded system using the Z3 SMT solver [18] to check the first-order logic guards. The results are shown in Fig. 3. The native implementation of the guarded system is near instant for \(n \le 7\), while the guided system needs more than 409s for \(n=7\) and shows the expected blow-up due to the N2ExpTime-completeness of reasoning in the logic underlying OWL [30]. The guarded system based on SMT similarly shows non-linear behavior, but scales better than the guided system. For the evaluation, we ran each system for every n three times and averaged the numbers, using an Ubuntu 21.04 machine with an i7-8565U CPU and 32GB RAM. In summary, the guarded system allows for an implementation that does not rely on an external, general-purpose reasoner to evaluate the guards and increases the scalability of the system, while the guided system does not scale even for small systems and KBs.

8 Related Work

Runtime enforcement is a vast research field; for a recent overview, we refer to the work of Falcone and Pinisetty [22]. In the following, we give the related work for combinations of ontologies/knowledge bases and transition systems.

Concerning the combination of ontologies/knowledge bases and business process modeling, Corea et al. [16] point out that current approaches lack the foundation to annotate and develop ontologies together with business process rules. Our approach focuses explicitly on automating the mapping, or on supporting developers in its development in a specific context, thus satisfying requirements 1 and 7 in their gap analysis for ontology-based business process modeling. Note that most work in this domain uses ontologies for the process model itself, similar to the ontology we give in Def. 13 and Def. 14 (e.g., Rietzke et al. [36]), or the current state (e.g., Corea and Delfmann [15]), not the trace. We refer to the survey of Corea et al. for a detailed overview.

Compared with existing simulators for hydrocarbon exploration [20, 47], which formalize the domain knowledge of geological processes directly in the transition rules, we propose a general framework that formalizes the domain knowledge in a knowledge base independent of the term rewriting system. This clear separation of concerns makes it easier for domain users to use the knowledge base for simulation without being able to program.

Tight interactions between programming languages, or transition systems, and knowledge bases beyond logic programming have recently received increasing research attention. The focus of the work of Leinberger [29, 32] is the type safety of loading RDF data from knowledge bases into programming languages. Kamburjan et al. [28] semantically lift states for operations on the KB representation of the state, but are not able to access the trace. In logic programming, a concurrent extension of Golog [33] has been extended by Zarrieß and Claßen [48] to verify CTL properties with description logic assertions.

Cauli et al. [12] use knowledge bases to reason about the security properties of deployment configurations in the cloud, a high-level representation of the overall system. As for traces, Pattipati et al. [34] introduce a debugger for C programs that operates on logs, i.e., special traces. Their system operates post-execution and cannot guide the system. Al Haider et al. [1] use a similar technique to investigate logged traces of a program.

In runtime verification, knowledge bases have been investigated by Baader and Lippmann [6] in \(\mathcal {ALC}\)-LTL, which uses the description logic \(\mathcal {ALC}\) instead of propositional variables inside LTL. An overview of further temporalizations of description logics can be found in the work of Baader et al. [4]. Runtime enforcement has been applied to temporal properties over traces since its beginnings [37], but, as a recent survey by Falcone and Pinisetty [22] points out, mainly for security/safety or usage control of libraries. In contrast, our work requires the enforcement for the system to do any meaningful computation, and uses a different way to express constraints than prior work: consistency with knowledge bases.

DatalogMTL extends Datalog with MTL operators [9, 45] to enable ontology-based data access about sequences using inference rules. The ontology is expressed in these rules, i.e., it is not declarative but an additional programming layer, which we deem impractical for domain users from non-computing domains. DatalogMTL has been used for queries [10], but not for runtime enforcement.

Traces have been explored from a logical perspective mainly in the style of CTL\(^*\), TLA, and similar temporal logics. More recently, interest in more expressive temporal properties over traces of programming languages has risen and led to symbolic traces [11, 19], integrations of LTL and dynamic logic for Java-like languages [8], and trace languages based on type systems [27]. These approaches have in common that they aim for more expressive power and are geared towards better usability for programmers and simple verification calculi. They are only used for verification, not at runtime, and do not connect to formalized domain knowledge.

The guided system can be seen as a meta-computation, as put forward by Clavel et al. [14] for rewriting logic, who do not discuss the use of consistency as a meta-computation and instead program such meta-computations explicitly.

9 Conclusion

We present a framework that uses domain knowledge about dynamic processes to guide the execution of generic transition systems through runtime enforcement. We give a transformation that uses rule-specific guards instead of using the domain knowledge directly as a consistency invariant over knowledge bases. The transformation is transparent, and the domain user can interact with the system without being aware of the transformation or implementation details. To reduce the workload of the programmer, we discuss the semi-automatic design of mappings using transducing approaches and a pre-defined direct mapping. We also discuss further alternatives, such as additional axioms on the events, and the use of local well-formedness predicates for certain classes of mappings.

Future Work. We plan to investigate how our system can interact with knowledge base evolution [24], a more declarative approach for changes in knowledge bases, as well as other approaches to modeling sequences in knowledge bases [40].