1 Introduction

Inspired by natural phenomena, many new computational formalisms have been introduced to model different aspects of biology. Basic chemical reactions inspired Ehrenfeucht and Rozenberg’s reaction systems (RSs) [1, 2]: a qualitative modelling formalism based on two opposite mechanisms: facilitation and inhibition. Facilitation means that a reaction can occur only if all its reactants are present, while inhibition means that the reaction cannot occur if any of its inhibitors is present. A reaction system is a set of reactions, each determined by its reactants, inhibitors and products, over a (finite) support set of biological entities. The theory of RSs is based on three principles: no permanency, i.e., any entity vanishes unless it is sustained by a reaction; no competition, i.e., an entity is either available for all reactions or not available at all; and no counting, i.e., the exact concentration level of available entities is ignored, as if it were always high enough to activate all enabled reactions.

Dynamically, RSs exploit a discrete time model, where each state collects the entities that are present at a given time unit. The computation of the next state is a deterministic procedure. However, the overall dynamics is influenced by the entities received (nondeterministically) from the external environment, called contextual entities. Such entities join the current state of the system and participate in enabling and disabling reactions. The behaviour of a RS is hence defined as a discrete time interactive process consisting of a context sequence (the sets of entities received at each unit of time from the environment), a result sequence (the sets of entities produced at each unit of time by reactions) and a state sequence (the pointwise union of the context and result sequences). Since their introduction, RSs have proven to be a quite general computation model whose applications range from the modelling of biological phenomena [3,4,5,6] and molecular chemistry [7] to theoretical foundations of computing [8,9,10].

1.1 Towards extensible reaction systems

As is often the case for computational models, the study of RSs has to balance simplicity and expressiveness. If more accurate models are required to precisely account for certain aspects of biological systems, then RSs must be extended accordingly to increase their predictive power, possibly making their theory less straightforward or even cumbersome and more difficult to approach [11,12,13,14,15,16,17]. Moreover, the different kinds of enhancements proposed in the literature do not necessarily agree with one another.

Our long-term goal is to develop a convenient way to embed RSs in a modular and extensible formal framework, where new extensions can be accommodated with a friendly syntax, so as to match simplicity with increased expressiveness. To this aim, we plan to build on a process algebraic representation of RSs, whose dynamics can be expressed as a labelled transition system (LTS) generated by a small set of inference rules defined by structural induction (SOS style) [18, 19]. We have already exploited this technique to explore and experiment with locally scoped entities and recursively defined nondeterministic contexts, where a single LTS accounts for different evolutions of the same system. Among the main advantages of our approach we mention: (1) transparency: each transition label conveys information about all the activities connected to the execution step it describes; (2) compositionality: the behaviour of a composite system is defined in terms of the behaviours of its constituents; and (3) extensibility: RS variants can be enhanced by modifying/adding language operators and inference rules in a modular fashion.

1.2 The problem

In this paper, motivated by three quite different case studies taken from the literature, we investigate the possibility of devising general-purpose extensions of our framework to tackle quantitative features of biological systems. The three case studies concern a model of drug administration in tumour growth, a complex gene network regulating the differentiation of T lymphocytes, and the modelling of synaptic transmission between neurons; we briefly introduce them below.

Drug administration in tumour growth We develop a model for comparing the efficacy of drug administration strategies to block tumour growth. Our model is inspired by the delay differential equation model presented in [20], where delays are added to differential equations to describe the duration of the different phases of the cell cycle. In particular, we model the immune system response and a phase-specific drug able to alter the natural course of the cell cycle of tumour cells. While a general method for transforming differential equation models into RSs in such a way as to preserve all properties is likely unfeasible, we use the case study to demonstrate that, for this particular example, it is possible to exploit delays and durations to rediscover some of the phenomena also present in the differential equation model. In this case, the advantage is, of course, the simplicity offered by a discrete time model and by the key features of RSs (no permanency, no competition, no counting).

Regulating differentiation in Th cells We focus on the discrete dynamical model for differentiation in Th cells proposed in [21], which was able to reproduce the most important dynamical aspects of the regulatory process. While a previous RS encoding had to represent different levels of the same entity as separate objects (see [22]), thus requiring some arbitrary ad hoc classification, we exploit this case study to show some advantages of using numerical (discrete) concentration levels instead of (distinct) object levels in RS models.

Synaptic transmission We introduce a simple functional model with a quantitative abstraction for synaptic transmission, that is, the process that allows two neurons connected by a synapse to communicate. Communication consists of impulsive chemical signals sent from the first neuron to the second. Chemical signals take the form of neurotransmitters that are released by the first neuron and perceived by the second, and they are stimulated by ionic currents. Continuous mathematical models have been applied to the dynamics of this synaptic communication [23]. Here, we do not consider the kinetic rates of the different biological reactions, but we make a combined use of delays, durations and concentration levels to model different facets of this complex phenomenon. Although our model is very simple and deterministic, we show that the dynamics of the entire system is faithfully modelled and can be compared to more complex approaches such as [24], where a stochastic modelling method is used.

1.3 Contribution

The above case studies witness that the ability to account for reactions with different speeds, or to define different reactions according to the concentration levels of certain entities, can play a very important role in the study of biological phenomena. Therefore, in this paper we propose two conservative extensions of RSs that increase their expressiveness while preserving their simplicity as much as possible.

First, we add the possibility to express reaction delays and durations, which makes it possible to encode reactions with different speeds. A reaction with associated delay n delivers its products after n time units. The value zero for n represents the fastest reactions (0 is the smallest delay, the products being immediately available), but delays can otherwise take any integer value \(n >0\) to model slower and slower reactions. The delay is thus the time that elapses between the enabling of a reaction and its products being available in the system. Following this idea, a duration of ‘permanency’ can also be specified for each reaction: a reaction r with delay n and duration m delivers, if applicable, its products after n time units, and such products remain available for the following m time units before vanishing.

The second enhancement adds some quantitative information to each entity to model concentration levels that can influence both the facilitation and inhibition mechanisms of reactions. Each entity in a reaction comes with an approximate quantitative threshold that is necessary for enabling the reaction. We note that we still maintain a qualitative perspective on the biological system, since the concentration levels will be used to determine the set of reactions that can be applied in any step, whereas competition between different enabled reactions to ‘consume’ available reactants will not be considered.

Most importantly, although we formalise the two features separately for ease of presentation, they can be integrated and used in combination without much effort. Also, they are conservative extensions, meaning that the new features can be readily cancelled out by some default parameters whenever we just want to study their ordinary RS counterparts. This witnesses the flexibility, extensibility and modularity of an SOS-based approach.

The formal specification has been instrumental in developing a prototype implementation in Prolog that allows us to perform computational experiments and to compute and inspect the resulting LTS. Since the code follows the formal specification very closely (apart from minor optimisations), its soundness is easy to check by code inspection, and the implementation will be easy to extend if new features must be added.

The tool has been fundamental for experimenting with the case studies: even though small- to medium-sized specifications were sufficient to encode the biological systems under scrutiny, it would have been very difficult and time consuming to analyse their behaviour without automatic support.

Structure of the paper In Sect. 2, we recall the basics of RSs. In Sect. 3, we recall the syntax and operational semantics of our process algebra for RSs. The original contribution starts from Sect. 4, where we add new features to our framework: we introduce the concepts of delay and duration in Sect. 4.1 and linear patterns for expressing constraints on the concentration levels of the entities in Sect. 4.2. Section 5 describes the related work. The logic programming implementation of the new features is described in Sect. 6. In Sects. 7–9, we show how the extensions proposed in Sect. 4 can improve the study of the three biological phenomena we selected. For each case study, we report the key findings of our experimentation with the tool from Sect. 6. Section 10 discusses some related work and concludes the paper.

This article is the full version of the conference paper [25], here extended with a more detailed account of the theory behind our RS enhancements, many small examples to illustrate the syntax and semantics of our models, an in-depth inspection of the first two case studies and an entirely new example about synaptic transmission. We have also notably extended the implementation of our tool with several features, among which a new parser for our extended syntax and an automated graphical representation of the computations, which we illustrate by including some automatically generated figures and graphics in the paper.

2 Reaction systems

The theory of reaction systems (RSs) [2] was born in the field of Natural Computing to model the behaviour of biochemical reactions in living cells. While our contribution builds on a process algebraic presentation of RSs, we recall here the main concepts as introduced in the classical set-theoretic version. In the following, we use the term entities to denote generic molecular substances (e.g., atoms, ions, molecules) that may form a biochemical system.

Let S be a (finite) set of entities. A reaction in S is a triple \(a = (R,I,P)\), where \(R, I, P\subseteq S\) are finite sets such that \(R \cap I = \emptyset \). The sets R, I and P are the sets of reactants, inhibitors and products, respectively. All reactants have to be present in the current state for the reaction to take place, while the presence of any of the inhibitors blocks it. Products are the outcome of the reaction, to be released in the next state. We denote by \( rac (S)\) the set of all reactions over S. A reaction system is a pair \(\mathcal{A} = (S, A)\), where S is the set of entities and \(A \subseteq rac (S)\) is a finite set of reactions over S.

Given the current set of entities \(W\subseteq S\), the result of a reaction \(a = (R,I,P)\in rac (S)\) on W, denoted \( res _a(W)\), is given by:

$$\begin{aligned} res _a(W) \triangleq {\left\{ \begin{array}{ll} P & \text{ if } en _a(W)\\ \emptyset & \text{ otherwise } \end{array}\right. } \qquad en _a(W)\triangleq \; R \subseteq W\; \wedge \; I \# W \end{aligned}$$

where \( en _a(W)\) is called the enabling predicate for the reaction a in state W: it requires that all reactants R are present in W and that no inhibitor I is present in W, here written \(I \# W\) as a shorthand for \(I\cap W=\emptyset \) (it reads as ‘the sets I and W are disjoint’). Similarly, the result of the application of all reactions in \(\mathcal{A}\) to W, denoted \( res _A(W)\), is the obvious lifting of the above function, i.e.,

$$\begin{aligned} res _A(W) \triangleq \bigcup _{a \in A} res _a(W) .\end{aligned}$$

Since living cells are seen as open systems that react to environmental stimuli, the behaviour of a RS is formalised in terms of an interactive process. Let \(\mathcal{A} = (S, A)\) be a RS and let \(n \ge 0\). An n-step interactive process in \(\mathcal{A}\) is a pair \(\pi = (\gamma , \delta )\) s.t. \( \gamma =\{C_i\}_{i\in [0,n]}\) is the context sequence and \(\delta =\{D_i\}_{i\in [0,n]} \) is the result sequence, where \( C_{i}, D_{i} \subseteq S\) for any \(i\in [0,n]\), \(D_0 = \emptyset \), and \(D_{i+1} = res _{A}(C_{i} \cup D_{i})\) for any \(i \in [0,n-1]\). The context sequence \(\gamma \) represents the environment, while the result sequence \(\delta \) is entirely determined by \(\gamma \) and A. We call \(\tau = \{W_i\}_{i\in [0,n]}\) the state sequence, with \(W_{i} \triangleq C_{i} \cup D_{i}\), for any \(i \in [0, n]\). Note that each state \(W_{i}\) in \(\tau \) is the union of two sets: the context \(C_{i}\) at step i and the result set \(D_i= res _{A}(W_{i-1})\) from the previous step.

Example 1

We consider a toy RS defined by

$$\begin{aligned} \mathcal{A} \triangleq (S, A) \qquad S\triangleq \{\mathsf {a},\mathsf {b},\mathsf {c}\} \qquad A\triangleq \{a_1\}\end{aligned}$$

whose unique reaction is \(a_1\triangleq (\{\mathsf {a},\mathsf {b}\},\{\mathsf {c}\},\{\mathsf {b}\})\), to be written more concisely as \((\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})\). Then, we consider a 3-step interactive process \(\pi \triangleq (\gamma ,\delta )\), where

$$\begin{aligned}\gamma \triangleq \{ C_0,C_1,C_2,C_3\} ,\qquad \delta \triangleq \{D_0,D_1,D_2,D_3\},\end{aligned}$$

with \(C_0 \triangleq \{\mathsf {a},\mathsf {b}\}\), \(C_1 \triangleq \{ \mathsf {a}\}\), \(C_2 \triangleq \{\mathsf {c}\}\), and \(C_3 \triangleq \{\mathsf {c}\}\); and \(D_0 \triangleq \emptyset \), \(D_1 \triangleq \{\mathsf {b}\}\), \(D_2 \triangleq \{\mathsf {b}\}\), and \(D_3 \triangleq \emptyset \). The context sequence can be written more concisely as \(\gamma = \mathsf {a}\mathsf {b},\mathsf {a},\mathsf {c},\mathsf {c}\), and similarly, the result sequence can be shortened as \(\delta = \emptyset ,\mathsf {b},\mathsf {b},\emptyset \). Then, the resulting state sequence is \(\tau = \{W_0, W_1, W_2, W_3\}= \mathsf {a}\mathsf {b}, \mathsf {a}\mathsf {b}, \mathsf {b}\mathsf {c}, \mathsf {c}\). In fact, it is easy to check that, e.g., \(W_0 =C_0\), \(D_1 = res _{A}(W_0) = res _{A}(\{\mathsf {a},\mathsf {b}\}) = \{\mathsf {b}\}\) because \( en _{a_1}(W_0)\), and \(W_1 = C_1\cup D_1 = \{ \mathsf {a}\}\cup \{ \mathsf {b}\} = \{\mathsf {a}, \mathsf {b}\}\).
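The definitions above are directly executable. The following Python sketch (an illustrative rendering of the set-theoretic semantics, not the Prolog implementation discussed later; all names are ours) implements the enabling predicate, the result function and the computation of the result and state sequences; applied to Example 1, it reproduces the sequences \(\delta \) and \(\tau \) given above.

```python
# Illustrative sketch: entities are strings, a reaction is a triple
# (R, I, P) of sets, a reaction system is a list of such triples.

def en(reaction, W):
    R, I, P = reaction
    # Enabled iff all reactants are in W and no inhibitor is in W.
    return R <= W and I.isdisjoint(W)

def res(A, W):
    # res_A(W): union of the products of all enabled reactions.
    return set().union(*[P for (R, I, P) in A if en((R, I, P), W)])

def run(A, gamma):
    # Given the context sequence C_0..C_n, build delta and tau,
    # with D_0 = {} and D_{i+1} = res_A(C_i | D_i).
    D, delta, tau = set(), [], []
    for C in gamma:
        W = C | D
        delta.append(D)
        tau.append(W)
        D = res(A, W)
    return delta, tau

# Example 1: A = {a1} with a1 = (ab, c, b) and gamma = ab, a, c, c.
a1 = ({'a', 'b'}, {'c'}, {'b'})
delta, tau = run([a1], [{'a', 'b'}, {'a'}, {'c'}, {'c'}])
# delta is (emptyset, b, b, emptyset) and tau is (ab, ab, bc, c), as in the text.
```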

3 SOS rules for reaction systems

Inspired by process algebras such as CCS [26], in [18, 19, 27] the authors introduced an algebraic syntax for RSs and equipped it with SOS inference rules defining the behaviour of each operator. This made it possible to consider an LTS semantics for RSs, where states are terms of the algebra, each transition corresponds to a step of the RS, and transition labels retain some information on the entities needed to perform each step. In this paper, we build on the approach in [19], which we briefly summarise below.

Definition 2

(RS processes) Let S be a set of entities. An RS process \(\mathsf {P}\) is any term defined by the following grammar:

$$\begin{aligned} {\mathsf {P}} {:}{:}= [{\mathsf {M}}]\quad & {\mathsf {M}} {:}{:}= (R,I,P) \big \vert D \big \vert {\mathsf {K}} \big \vert {\mathsf {M}}{\mid } {\mathsf {M}}\\ & {\mathsf {K}} {:}{:}= {\textbf {0}} \big \vert X \big \vert C. {\mathsf {K}} \big \vert {\mathsf {K}}+ {\mathsf {K}} \big \vert {\mathsf {rec}} X. {\mathsf {K}} \end{aligned}$$

where \(R,I,P\subseteq S\) are nonempty sets of entities, \(C,D\subseteq S\) are possibly empty sets of entities, and X is a process variable.

An RS process \(\mathsf {P}\) embeds a mixture process \(\mathsf {M}\), that is an arbitrary parallel composition of reactions \((R,I,P)\), (possibly empty) sets of currently present entities D, and context processes \(\mathsf {K}\). We write \(\prod _{i\in I} \mathsf {M}_i\) for the parallel composition of all \(\mathsf {M}_i\) with \(i\in I\). For example, we let \(\prod _{i\in \{1,2\}} \mathsf {M}_i = \mathsf {M}_1 \mid \mathsf {M}_2\).

Example 3

A mixture process containing the reaction \(a_1\) from Example 1 with initial entities \(\mathsf {a}\) and \(\mathsf {b}\) can be written \((\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b}) \mid \mathsf {a} \mid \mathsf {b}\), and its RS process as \([(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b}) \mid \mathsf {a} \mid \mathsf {b}]\).

A process context \(\mathsf {K}\) is a possibly nondeterministic and recursive system: the nil context \({\textbf {0}}\) halts the computation; the prefixed context \(C.\mathsf {K}\) says that the entities in the (possibly empty) set C are immediately available to be consumed by the reactions, and then, \(\mathsf {K}\) is the context offered at the next step; the nondeterministic choice \(\mathsf {K}_1+\mathsf {K}_2\) allows the context to behave either as \(\mathsf {K}_1\) or \(\mathsf {K}_2\); X is a process variable, and \(\mathsf {rec}~X.~\mathsf {K}\) is the usual recursive operator of process algebras that intuitively corresponds to the recursive definition \(X=\mathsf {K}\) (see Example 4). We write \(\sum _{i\in I} \mathsf {K}_i\) for the nondeterministic choice between all \(\mathsf {K}_i\) with \(i\in I\). For example, we let \(\sum _{i\in \{1,2\}} \mathsf {K}_i = \mathsf {K}_1 + \mathsf {K}_2\).

Example 4

The context process \(\mathsf {a}.\mathsf {b}.{\textbf {0}}\) represents a context sequence where \(C_0=\{\mathsf {a}\}\) and \(C_1=\{\mathsf {b}\}\), while \(\mathsf {a}.(\mathsf {b}.{\textbf {0}}+ \mathsf {c}.{\textbf {0}})\) represents a nondeterministic context that initially provides the entity \(\mathsf {a}\) and then either the entity \(\mathsf {b}\) or the entity \(\mathsf {c}\) before stopping. The context process \(\mathsf {rec}~X.~\mathsf {a}.{\textbf {0}}+ \mathsf {b}.X\) represents a nondeterministic context that, recursively, either provides the entity \(\mathsf {a}\) and then halts, or the entity \(\mathsf {b}\) and iterates.
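To make the unfolding of nondeterministic and recursive contexts concrete, here is a small sketch in Python that represents context processes as nested tuples (the constructors are our own illustrative encoding of the grammar for \(\mathsf {K}\)) and enumerates the context sequences of length at most n, assuming recursion is guarded by a prefix:

```python
# Sketch: contexts as nested tuples with constructors
# ('0',), ('pre', C, K), ('sum', K1, K2), ('rec', X, K), ('var', X).

def enum(K, n, env=None):
    """List the context sequences of length at most n offered by K,
    assuming recursion is guarded (a prefix occurs before each variable)."""
    env = env or {}
    if n == 0:
        return [[]]
    tag = K[0]
    if tag == '0':                       # nil context: halt
        return [[]]
    if tag == 'pre':                     # C.K: offer C, then behave as K
        _, C, K1 = K
        return [[C] + rest for rest in enum(K1, n - 1, env)]
    if tag == 'sum':                     # K1 + K2: nondeterministic choice
        return enum(K[1], n, env) + enum(K[2], n, env)
    if tag == 'rec':                     # rec X. K: bind X to the whole term
        _, X, K1 = K
        return enum(K1, n, {**env, X: K})
    if tag == 'var':                     # X: unfold its recursive definition
        return enum(env[K[1]], n, env)

# Example 4: rec X. a.0 + b.X offers a (then halts) or b (then iterates).
K = ('rec', 'X', ('sum', ('pre', {'a'}, ('0',)), ('pre', {'b'}, ('var', 'X'))))
enum(K, 2)   # [[{'a'}], [{'b'}, {'a'}], [{'b'}, {'b'}]]
```

Up to the chosen bound, the enumeration covers exactly the alternative context sequences described in Example 4.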

The keen reader might have already noticed from the previous examples that there are different ways to represent the same concept. For example, why should we care to distinguish \([\mathsf {a} \mid (\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b}) \mid \mathsf {b}]\) from \([(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b}) \mid \mathsf {a}\mathsf {b}]\)? Or \(\mathsf {b}.{\textbf {0}}+ \mathsf {b}.{\textbf {0}}\) from \(\mathsf {b}.{\textbf {0}}\)? In fact, we prefer not to. In process algebras, this is typically achieved by defining a suitable notion of structural equivalence \(\equiv \) and by taking processes up to such equivalence. Formally, we say that \(\mathsf {P}\) and \(\mathsf {P}'\) are structurally equivalent, written \(\mathsf {P} \equiv \mathsf {P}'\), when they denote the same term up to the laws of commutative monoids (unit, associativity and commutativity) for parallel composition \(\cdot \mid \cdot \), with \(\emptyset \) as the unit, and the laws of idempotent and commutative monoids for choice \(\cdot +\cdot \), with \({\textbf {0}}\) as the unit. We also assume \(D_1\mid D_2 \equiv D_1\cup D_2\) for any \(D_1,D_2\subseteq S\).

Remark 5

The processes \(\emptyset \) and \({\textbf {0}}\) are not interchangeable: as it will become clear from the operational semantics, the process \(\emptyset \) can perform just a trivial transition to itself, while the process \({\textbf {0}}\) cannot perform any transition and stops the computation.

Definition 6

(From RSs to RS processes) Let \(\mathcal{A} = (S, A)\) be a RS, and \(\pi = (\gamma , \delta )\) an n-step interactive process in \(\mathcal{A}\), with \(\gamma = \{C_i\}_{i\in [0,n]}\) and \(\delta = \{D_i\}_{i\in [0,n]}\). For any unit of time \(i\in [0,n]\), the corresponding RS process \(\llbracket \mathcal{A},\pi \rrbracket _i\) is defined as follows:

$$\begin{aligned} \llbracket \mathcal{A},\pi \rrbracket _i \triangleq \left[ \prod _{a\in A} a~\mid ~D_i~\mid ~\mathsf {K}_{\gamma ^i}\right] \end{aligned}$$

where the context process \(\mathsf {K}_{\gamma ^i} \triangleq C_i.C_{i+1}.\cdots .C_n.{\textbf {0}}\) is the sequentialisation of the entities offered by \(\gamma ^i\triangleq \{C_j\}_{j\in [i,n]}\). We write \(\llbracket \mathcal{A},\pi \rrbracket \) as a shorthand for \(\llbracket \mathcal{A},\pi \rrbracket _0\).

Example 7

Here, we give the encoding of the reaction system \(\mathcal{A}\) from Example 1. The resulting RS process is as follows:

$$\begin{aligned} \mathsf {P}\quad & \triangleq {} \llbracket \mathcal{A},\pi \rrbracket = \llbracket (S,A),\pi \rrbracket = \llbracket (\{\mathsf {a},\mathsf {b},\mathsf {c}\},\{a_1\}),(\gamma ,\delta ) \rrbracket \\&{}= {} [(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})~\mid ~ \emptyset ~\mid ~\mathsf {K}_{\gamma }] \\ & \equiv {} [(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})~\mid ~ \mathsf {K}_{\gamma }] \\ & \equiv {} [(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})~\mid ~ \mathsf {a}\mathsf {b}.\mathsf {a}.\mathsf {c}.\mathsf {c}.{\textbf {0}}] \end{aligned}$$

Note that \(D_0=\emptyset \) is inessential and can be discarded thanks to structural congruence (because \(\emptyset \) is the unit of parallel composition).

As already exemplified, our syntax allows for more general kinds of contexts than plain sequences. Nondeterministic contexts can be used to describe several alternative experimental conditions, while recursion can be exploited to capture some regularity in the long-term behaviour of a RS. Together, they support a wide variety of in-breadth/in-depth behavioural analyses.

The behaviour of RS processes is defined as an LTS whose states are processes and whose transitions represent the possibility to move from one process configuration to another in a single unit of time. Transition labels are used to compose behaviours of separate components and to record some information about the entities involved in that move.

Definition 8

(Label) A label is a tuple \(\langle W\vartriangleright R,I,P\rangle \) with \(W,R,I,P\subseteq S\). The set of transition labels is ranged over by \(\ell \).

In a transition label \(\langle W\vartriangleright R,I,P\rangle \), we record: the set W of entities currently in the system (produced in the previous step or provided by the context); the set R of entities whose presence is assumed (either because they are needed as reactants by an applied reaction or because their presence prevents the application of some reaction); the set I of entities whose absence is assumed (either because they appear as inhibitors of an applied reaction or because their absence prevents the application of some reaction); and the set P of the products of all the applied reactions. Our LTS will be defined in such a way that any transition carries a label \(\langle W\vartriangleright R,I,P\rangle \) such that I and \(W\cup R\) are disjoint, written \(I\#(W\cup R)\).

As a convenient notation, we write \(\ell _1\otimes \ell _2\) for the component-wise union of labels. We also define a noninterference predicate over labels, written \(\ell _1\frown \ell _2\), that will be used to guarantee that there is no conflict between reactants and inhibitors of the reactions that take place in two separate parts of the system. Formally, we let:

$$\begin{aligned} \langle W_1\vartriangleright R_1,I_1,P_1\rangle \otimes \langle W_2\vartriangleright R_2,I_2,P_2\rangle\; & \triangleq\; \langle W_1\cup W_2\vartriangleright R_1\cup R_2,I_1\cup I_2,P_1\cup P_2\rangle\\ \langle W_1\vartriangleright R_1,I_1,P_1\rangle \frown \langle W_2\vartriangleright R_2,I_2,P_2\rangle\; & \triangleq\; (I_1 \cup I_2)\# (W_1\cup W_2 \cup R_1 \cup R_2) \end{aligned}$$
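Under the assumption that a label is simply a 4-tuple of sets \((W,R,I,P)\), the two operators admit a direct rendering (an illustrative sketch; the function names are ours):

```python
# Sketch: labels as 4-tuples (W, R, I, P) of sets of entities.

def compose(l1, l2):
    # The operator "otimes": component-wise union of the two labels.
    return tuple(x | y for x, y in zip(l1, l2))

def noninterference(l1, l2):
    # The predicate "frown": the pooled inhibitors must be disjoint from
    # the pooled current entities and reactants of both labels.
    W1, R1, I1, P1 = l1
    W2, R2, I2, P2 = l2
    return (I1 | I2).isdisjoint(W1 | W2 | R1 | R2)
```

For instance, the label of reaction \(a_1\) from Example 1 (which inhibits \(\mathsf {c}\)) composes without conflict with a context label offering \(\mathsf {a}\mathsf {b}\), whereas it conflicts with any label that offers or requires \(\mathsf {c}\).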

Remark 9

In Sect. 4.2, when we will present the extension with concentration levels, transition labels will be extended to include lower bounds on the availability of certain entities, and the operators \(\otimes \) and \(\frown \) will be updated accordingly.

Definition 10

(Operational semantics) The operational semantics of processes is defined by the set of SOS inference rules in Fig. 1.

Fig. 1 SOS semantics of the RS processes

The process \({\textbf {0}}\) has no transition. The rule \((\textit{Ent})\) makes available the entities in the (possibly empty) set D, then reduces to \(\emptyset \). As a special instance of \((\textit{Ent})\), we have, e.g., \(\emptyset \xrightarrow {\langle \emptyset \vartriangleright \emptyset ,\emptyset ,\emptyset \rangle }\emptyset \).

The rule \((\textit{Cxt})\) says that a prefixed context process \(C.\mathsf {K}\) makes available the entities in the set C and then reduces to \(\mathsf {K}\). The rule \((\textit{Rec})\) is the classical rule for recursion. Here, \(\mathsf {K}[^{\mathsf {rec}~X.~\mathsf {K}}/_{X}]\) denotes the process obtained by replacing in \(\mathsf {K}\) every free occurrence of the variable X with its recursive definition \(\mathsf {rec}~X.~\mathsf {K}\). For example, we can use rule \((\textit{Rec})\) to derive transitions such as \(\mathsf {rec}~X.~\mathsf {a}.\mathsf {b}.X \xrightarrow {\langle \mathsf {a}\vartriangleright \emptyset ,\emptyset ,\emptyset \rangle } \mathsf {b}.\mathsf {rec}~X.~\mathsf {a}.\mathsf {b}.X\). The rules \((\textit{Suml})\) and \((\textit{Sumr})\) select a move of either the left or the right component and discard the other process.

The rule \((\textit{Pro})\) executes the reaction \((R,I,P)\) (its reactants, inhibitors and products are recorded in the label), which remains available at the next step together with P. The rule \((\textit{Inh})\) applies when the reaction \((R,I,P)\) should not be executed; it records in the label the possible causes for which the reaction is disabled: some inhibiting entities \((J \subseteq I)\) may be present or some reactants \((Q \subseteq R)\) may be missing, with \(J \cup Q \ne \emptyset \), as at least one cause is needed to explain why the reaction is not enabled. The rule \((\textit{Par})\) puts two processes in parallel by pooling their labels and joining all the set components of the labels. The sanity check \(\ell _1\frown \ell _2\) guarantees that there is no conflict between the two labels, which are joined in the conclusion of \((\textit{Par})\), whose transition carries the label \(\ell _1\otimes \ell _2\).

Finally, the rule \((\textit{Sys})\) requires that all the processes of the systems have been considered, and also checks that all the needed reactants are actually available in the system (\(R \subseteq W\)). In fact, this constraint can only be met on top of all processes. The check that inhibitors are absent (\(I\# W\)) is not necessary, as it is embedded in rule \((\textit{Par})\) by the premise \(\ell _1\frown \ell _2\).

Fig. 2 Full SOS derivation of a transition for process \(\mathsf {P}_0\) (see Example 11)

Fig. 3 A second SOS derivation of a different transition for process \(\mathsf {P}_0\) (see Example 11)

Example 11

Let us consider the RS process \(\mathsf {P}_0\triangleq [(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})~\mid ~\mathsf {rec}~X.~\mathsf {c}.{\textbf {0}}+ \mathsf {ab}.X ]\), obtained from Example 7 by replacing the finite context with a recursive one. The process \(\mathsf {P}_0\) has two outgoing transitions, depending on the entities provided by the context. The case where the context provides \(\{\mathsf {a},\mathsf {b}\}\) is detailed in Fig. 2, where we use the shorthand \(\mathsf {K}_0 \triangleq \mathsf {rec}~X.~\mathsf {c}.{\textbf {0}}+ \mathsf {ab}.X\). Alternatively, the context \(\mathsf {K}_0\) can provide \(\{\mathsf {c}\}\), in which case we derive the transition in Fig. 3.

The following theorem (from [19]) shows that the rewrite steps of a RS exactly match the transitions of its corresponding RS process.

Theorem 12

Given a RS \(\mathcal{A} = (S, A)\) and an n-step interactive process \(\pi =(\gamma ,\delta )\), with context sequence \(\gamma = \{C_i\}_{i\in [0,n]}\), result sequence \(\delta = \{D_i\}_{i\in [0,n]}\) and state sequence \(\tau = \{W_i\}_{i\in [0,n]}\) (where, as usual, \(W_i \triangleq C_i\cup D_i\) for any \(i\in [0,n]\)), let \(\mathsf {P}_i \triangleq \llbracket \mathcal{A},\pi \rrbracket _i\). Then, \(\forall i\in [0,n-1]\):

  1. \(\mathsf {P}_i \xrightarrow {\langle W\vartriangleright R,I,P\rangle } \mathsf {P}\) implies \(W= W_i\), \(P= D_{i+1}\) and \(\mathsf {P} \equiv \mathsf {P}_{i+1}\);

  2. there exist \(R,I \subseteq S\) such that \(\mathsf {P}_i \xrightarrow {\langle W_i\vartriangleright R,I,D_{i+1}\rangle } \mathsf {P}_{i+1}\).

4 Quantitative variants of reaction systems

In the following, we will introduce two different features in reaction systems.

The first extension is along the time dimension, to handle the concepts of delay and duration (decay) for reaction products. Instead of making the products of a reaction immediately available at the next time unit and then letting them vanish in one step (as in the original framework), we now allow the possibility to specify that a reaction makes its products available after a certain number of time units and that such products do not decay after just one step but can persist longer. RS processes with delays and durations will be exploited to experiment with the first and third case studies.

The second extension adds some quantitative information for modelling concentration levels and linear constraints over them. RS processes with concentration levels will be exploited in the second and third case studies.

Both variants are obtained as simple modifications of the process algebraic framework presented in Sect. 3. For the sake of simplicity, both variations are described separately, as enhancements of the original SOS semantics, but it should be clear that they can be combined in a unique integrated framework.

4.1 Delays, durations and timed processes

In biology, it is well known that reactions occur with different frequencies. For example, since enzymes catalyse reactions, many reactions are more frequent when certain enzymes are present and less frequent when they are absent. Moreover, reactions describing complex transformations may require time before releasing their products. To capture these dynamical aspects in our framework while preserving the discrete and abstract nature of RSs, we propose a discretisation of the delay between two occurrences of a reaction by using a scale of natural numbers, from 0 (smallest delay, highest frequency) up to n (increasing delay, lower frequency).

Intuitively, the notation \(D^n\) stands for making the entities D available after n time units, and we use the shorthand D for \(D^0\), meaning that the entities are immediately available. Similarly, we can associate a delay value with the product of each reaction by writing \((R,I,P)^n\) when the product of the reaction will be available after n time units, and we write \((R,I,P)\) for \((R,I,P)^0\). The syntax for mixture processes is thus extended as below and the operational semantics is changed accordingly (see Fig. 4).

$$\begin{aligned}\mathsf {M} {}{:}{:}=(R,I,P)^n~\big \vert ~D^n~\big \vert ~ \mathsf {K}~\big \vert ~\mathsf {M}{\mid }\mathsf {M} \end{aligned}$$

In Fig. 4, we only report the rules that are new and those that override the ones in Fig. 1. Note, e.g., that the semantics of context processes is unchanged. Rule \((\textit{Tick})\) represents the passing of one time unit, while rule \((\textit{Ent})\) notifies the availability of entities whose delay has expired. Rule \((\textit{Pro})\) attaches to the product of the reaction the same delay as the one of the reaction itself, while rule \((\textit{Inh})\) is used when the reaction is not enabled.
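To make the intended dynamics concrete, here is a minimal Python sketch of a single transition step for timed processes. This is an illustrative re-implementation under our own encoding, not the authors' SOS rules or their Prolog tool: a reaction is a tuple (R, I, P, n), and a product scheduled with delay n joins the state n+1 time units after the reaction fires, so that delay 0 recovers the ordinary RS step.

```python
def step(state, pending, reactions, context=frozenset()):
    """One transition of a timed process (illustrative sketch).

    state    : entities available at the current time unit
    pending  : dict mapping residual delay -> entities scheduled for later
    reactions: tuples (R, I, P, n) of reactant/inhibitor/product sets and delay n
    """
    current = set(state) | set(context)
    produced = {}
    for (R, I, P, n) in reactions:
        # (Pro): an enabled reaction schedules its products with its own delay
        if R <= current and not (I & current):
            produced.setdefault(n, set()).update(P)
    # previously scheduled entities keep waiting alongside the new ones
    for n, ents in pending.items():
        produced.setdefault(n, set()).update(ents)
    # (Tick)/(Ent): one time unit passes; delay-0 entities become available
    next_state = produced.pop(0, set())
    next_pending = {n - 1: ents for n, ents in produced.items()}
    return next_state, next_pending
```

With delay 0 the product appears at the very next state, matching the conservative-extension claim of Proposition 13.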

Fig. 4: SOS semantics with delays and durations

Fig. 5: Two transition sequences of timed processes \(\mathsf {P}_1\) and \(\mathsf {P}_2\) (see Example 14)

In the following, we use the name timed processes for processes with delays and durations. Our extension is conservative, i.e., it does not change the semantics of processes without delays and durations. Therefore, the encoding of standard RSs described in Def. 6 still applies.

Proposition 13

Timed processes are a conservative extension of RS processes.

Example 14

Let us consider two RSs sharing the same entity set \(S=\{\mathsf {a},\mathsf {b},\mathsf {c},\mathsf {d}\}\) and the same reactions \(a_1 = (\mathsf {a},\mathsf {b},\mathsf {b})\), \(a_2=(\mathsf {b},\mathsf {a},\mathsf {a})\), \(a_3=(\mathsf {a}\mathsf {c},\mathsf {b},\mathsf {d})\), \(a_4=(\mathsf {d},\mathsf {a},\mathsf {c})\), but working with different reaction speeds. For simplicity, we assume only two speed levels are distinguished: 0 the fastest and 1 the slowest. The reaction system \(\mathcal {A}_1\) provides the following speed assignment to its reactions: \(\{a_1^1,a_2,a_3,a_4^1\}\). The reaction system \(\mathcal {A}_2\) provides the following speed assignment to its reactions: \(\{a_1,a_2^1,a_3^1,a_4\}\). We assume that the context process for both reaction systems is just \(\mathsf {K}\triangleq \mathsf {a}\mathsf {c}.\emptyset .{\textbf {0}}\). The LTSs of their corresponding timed processes are in Fig. 5, where, for brevity we let:

$$\begin{aligned} \mathsf {M}_1 \triangleq a_1^1\mid a_2\mid a_3\mid a_4^1 \qquad \mathsf {M}_2 \triangleq a_1\mid a_2^1\mid a_3^1\mid a_4 . \end{aligned}$$

Additionally, inspired by [12], we can also provide entities with a duration, i.e., entities that last a finite number of steps. To this aim, we use the syntax \(D^{[n,m]}\) to represent the availability of D for \(m>0\) time units starting after n time units from the current time. By assuming that each reaction only produces entities with the same duration, we can associate duration and delay also with reactions: \((R,I,P)^{[n,m]}\) means that all the entities in P (the products) have a delay of n but will last m steps (once they appear in the state). While we could easily define the SOS rules for the above processes, we note that durations are just syntactic sugar, because they can be encoded into timed processes as follows:

$$\begin{aligned} D^{[n,m]} \triangleq \prod _{k=n}^{n+m-1} D^k \qquad \qquad (R,I,P)^{[n,m]} \triangleq \prod _{k=n}^{n+m-1} (R,I,P)^k . \end{aligned}$$

For example, we have \(\mathsf {a}^{[2,3]} \equiv \mathsf {a}^2 \mid \mathsf {a}^3 \mid \mathsf {a}^4\) and \(\mathsf {a}^{[0,1]} \equiv \mathsf {a}^0 \equiv \mathsf {a}\).
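This desugaring is straightforward to mechanise; the following Python helper (illustrative, our own naming) expands \(D^{[n,m]}\) into the list of its delayed copies:

```python
def expand_duration(entities, n, m):
    # D^[n,m] stands for the parallel composition D^n | D^(n+1) | ... | D^(n+m-1)
    assert m > 0, "durations must be positive"
    return [(entities, k) for k in range(n, n + m)]
```

For instance, expand_duration({'a'}, 2, 3) yields the three delayed copies corresponding to \(\mathsf {a}^2 \mid \mathsf {a}^3 \mid \mathsf {a}^4\).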

4.2 Concentration levels and linear processes

Quantitative modelling of chemical reactions requires taking molecule concentrations into account. An abstract representation of concentrations that is adopted in many formalisms is based on concentration levels: rather than representing such quantities as real numbers, a finite classification is considered (e.g., low/medium/high) with a granularity that reflects the number of concentration levels at which significant changes in the behaviour of the molecule are observed. In classical RSs, the modelling of concentration levels would require using different entities for the same molecule (e.g., \(\mathsf {a}_{\texttt {l}}\), \(\mathsf {a}_{\texttt {m}}\), and \(\mathsf {a}_{\texttt {h}}\) for low, medium and high concentration of \(\mathsf {a}\), respectively). This may introduce some additional complexity due to the need to guarantee that only one of these entities is present at any time for the state to be consistent. Moreover, consistency would be put at risk whenever entities representing different levels of the same molecule (e.g., \(\mathsf {a}_{\texttt {l}}\) and \(\mathsf {a}_{\texttt {h}}\)) could be provided by the context.

We now enhance RS processes by adding some quantitative information associated with each entity of each reaction, so that levels are just natural numbers and the concentration levels of the products depend on the concentration levels of the reactants. The idea is to associate linear expressions, such as \(e=m\cdot x+n\) (where \(m\in \mathbb {N}\) and \(n \in \mathbb {N}^+\) are two constants and x stands for a variable ranging over natural numbers),Footnote 2 with the reactants and products of each reaction. In the following, we write s(e) to state that expression e is associated with entity s. Expressions associated with reactants are used as patterns to match the current levels of the entities involved in the reaction. Pattern matching allows us to find the largest value for the variable x (the same for all reactants) that is consistent with the current concentration levels. Then, the linear expressions associated with products (which can contain, again, the variable x) can be evaluated to compute the concentration levels of those entities. Expressions can be associated also with reaction inhibitors, in order to let such entities inhibit the reaction only when their concentration level is above a given threshold. However, we require inhibitor expressions to be ground; namely, they cannot contain the \(m\cdot x\) term and simply correspond to a positive natural number n. The state of the system also has to take concentration levels into account. Consequently, in the definition of states we will again exploit ground expressions: each entity in the state is paired with a natural number representing its concentration level.

Example 15

Assume that we want to write a reaction that produces \(\mathsf {c}\) with a concentration level that corresponds to the current concentration level of \(\mathsf {a}\) (but at least one occurrence of \(\mathsf {a}\) must be present), and that requires \(\mathsf {b}\) not to be present at a concentration level higher than 1. Such a reaction would be \(r_1=(R,I,P)\) where \(R=\mathsf {a}(x+1)\), \(I=\mathsf {b}(2)\) and \(P=\mathsf {c}(x+1)\). In the state \(\{\mathsf {a}(3),\mathsf {b}(1)\}\), reaction \(r_1\) is enabled by taking \(x=2\) (the maximum value for x that satisfies \(\mathsf {a}(x+1)\le \mathsf {a}(3)\)). Since \(\mathsf {b}(1)< \mathsf {b}(2)\), entity \(\mathsf {c}\) will be produced with concentration level \(\mathsf {c}(x+1)=\mathsf {c}(2+1)=\mathsf {c}(3)\). On the contrary, in the state \(\{\mathsf {a}(2),\mathsf {b}(2)\}\) the reaction \(r_1\) is not enabled, because the concentration of the inhibitor is too high (\(W(\mathsf {b})=2\nless 2=I(\mathsf {b})\)).

To formalise the above linear constraints we introduce some notation and terminology. A ground linear expression is just a natural number, and e[v/x] represents the substitution of variable x with the value v in e. A pattern \(T=\{s_1(e_1),...,s_k(e_k)\}\) is a set of associations of linear expressions to entities. We write \(T(s_i)\) for the linear expression associated with \(s_i\) in T. When s is not present in T, we let \(T(s)=0\) by default, and we write \(s\in T\) whenever \(T(s)\ne 0\).

Definition 16

(Ground patterns) A pattern \(T=\{s_1(e_1),...,s_k(e_k)\}\) is ground if \(T(s_i)\in \mathbb {N}\) for any \(s_i\in S\) and we write \(\lfloor T\rfloor \) in this case. We denote by T[v/x] the ground pattern such that \(T[v/x](s_i)=e_i[v/x]\) for all \(s_i\in S\).

A ground pattern T is unitary if \(T(s_i)\in \{0,1\}\) for any \(s_i\in S\). Given two ground patterns \(T_1,T_2\), we write \(T_1\le T_2\) if \(T_1(s)\le T_2(s)\) for all \(s\in S\).

Example 17

Let us consider the pattern \(T_1=\{\mathsf {a}(x+1),\mathsf {b}(2x+1)\}\) and the ground pattern \(T_2=\{\mathsf {a}(3),\mathsf {b}(3),\mathsf {c}(2)\}\). We have \(T_1(\mathsf {a})=x+1\), \(T_1(\mathsf {b})=2x+1\) and \(T_1[2/x] = \{\mathsf {a}(3),\mathsf {b}(5)\}\). It is immediate to verify that \(T_1[1/x] \le T_2\), while \(T_1[2/x] \nleq T_2\).

We extend the syntax of reactions \(r=(R,I,P)\) by taking I as a ground pattern, and R and P as patterns such that if R is ground then P is ground. Formally, r is valid iff \(\lfloor I\rfloor \) and \(\lfloor R\rfloor \Rightarrow \lfloor P\rfloor \).

For example, the triple \((\mathsf {a}(1),\mathsf {b}(x+1),\mathsf {c}(x+1))\) is not valid because its inhibitor pattern \(\{\mathsf {b}(x+1)\}\) is not ground, and moreover, the product pattern \(\{\mathsf {c}(x+1)\}\) is not ground while the reactant pattern \(\{\mathsf {a}(1)\}\) is ground. Vice versa, the triple \((\mathsf {a}(x+1),\mathsf {b}(1),\mathsf {c}(x+1))\) is a valid reaction. We will see later that it makes sense to allow for reactant sets R and inhibitor sets I that are not disjoint (see Example 23). As a special case, when all patterns of a reaction \(r=(R,I,P)\) are ground (respectively, unitary), we say r is ground (respectively, unitary). Unitary reactions behave as reactions of ordinary RSs. A RS whose reactions are all ground (respectively, unitary) is also called ground (respectively, unitary).

A state W is just a ground pattern. We write \(I\# W\) and overload the previously used notation for denoting disjoint sets when the inhibitor pattern I does not conflict with the state W, i.e., we let

$$\begin{aligned} I\# W \triangleq \forall s\in I.~W(s)<I(s) . \end{aligned}$$

The definition states that whenever the entity s is present in the inhibitor pattern I (i.e., \(I(s)>0\)), then the threshold required for s to inhibit the reaction is strictly larger than the available concentration of s in the current state (i.e., \(W(s)<I(s)\)).Footnote 3

At each step, starting from a given state, the semantics verifies the enabled reactions using function \( en ()\), computes the multiplicity of each reaction application (the value of x obtained by matching the current state W against the pattern R) by function \( mul ()\), and computes the resulting state by function \( res ()\). Formally, given a reaction \(a=(R,I,P)\) and a state W, we define:

  • the function \( en (a,W)\) returns 1 if the reaction is enabled, 0 otherwise

    $$\begin{aligned} en (a,W) \triangleq \left\{ \begin{array}{ll} 1 &{} \text{ if } R[0/x]\le W \text{ and } I\# W \\ 0 &{} \text{ otherwise } \end{array}\right. \end{aligned}$$
  • the function \( mul (a,W)\) returns the value v that will correctly bind x when applied to state W

    $$\begin{aligned} mul (a,W) \triangleq \left\{ \begin{array}{ll} 1 &{} \text{ if } en (a,W)=0\text { or }\lfloor R\rfloor \\ \max \{v\in \mathbb {N} \mid R[v/x]\le W\} &{} \text{ otherwise } \end{array}\right. \end{aligned}$$
  • the function \( res (a,W)\) returns the product of the reaction a on state W

    $$\begin{aligned} res (a,W) \,\triangleq\, en (a,W)\cdot P[ mul (a,W)/x] \end{aligned}$$
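The three functions can be sketched in Python as follows (an illustrative encoding, not the authors' implementation): a linear expression \(m\cdot x+n\) is a pair (m, n), patterns and states are dicts from entity names to expressions/levels, and absent entities default to 0.

```python
def ev(expr, v):          # evaluate the linear expression m*x + n at x = v
    m, n = expr
    return m * v + n

def leq(R, v, W):         # R[v/x] <= W, with W(s) = 0 for absent entities
    return all(ev(e, v) <= W.get(s, 0) for s, e in R.items())

def inh_ok(I, W):         # I # W: every inhibitor threshold strictly above the state
    return all(W.get(s, 0) < n for s, n in I.items())

def en(R, I, W):
    return 1 if leq(R, 0, W) and inh_ok(I, W) else 0

def mul(R, I, W):
    if en(R, I, W) == 0 or all(m == 0 for (m, _) in R.values()):
        return 1          # not enabled, or R is ground: x is irrelevant
    v = 0
    while leq(R, v + 1, W):
        v += 1
    return v              # the largest v such that R[v/x] <= W

def res(R, I, P, W):
    if en(R, I, W) == 0:
        return {}
    v = mul(R, I, W)
    return {s: ev(e, v) for s, e in P.items()}
```

On the data of Example 18 below (\(R=\mathsf {a}(x+1)\), \(I=\mathsf {b}(2)\), \(P=\mathsf {c}(x+1)\), \(W=\{\mathsf {a}(3),\mathsf {b}(1)\}\)), these functions return 1, 2 and {'c': 3}, respectively.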

Example 18

Consider again the previous example: \(r_1=(R,I,P)\) with \(R=\mathsf {a}(x+1)\), \(I=\mathsf {b}(2)\) and \(P=\mathsf {c}(x+1)\), and the state \(W=\{\mathsf {a}(3),\mathsf {b}(1)\}\). We compute:

  • \( en (r_1,W)= 1\), as \(R[0/x] = \mathsf {a}(1) \le \mathsf {a}(3)\) and \(W(\mathsf {b})= 1<2=I(\mathsf {b})\).

  • \( mul (r_1,W) = 2\), as \( \max \{v\in \mathbb {N} \mid R[v/x]=\mathsf {a}(v+1)\le \mathsf {a}(3)\mathsf {b}(1)=W \} = 2.\)

  • \( res (r_1,W) = en (r_1,W)\cdot P[ mul (r_1,W)/x]= 1 \cdot \mathsf {c}(2+1)= \mathsf {c}(3).\)

Once the product of each enabled reaction has been calculated, we need to compute the next state. We consider the operator \(\sqcup \) that computes the maximum between two ground patterns by computing the point-wise maximum value of each entity. Analogously, to combine inhibitor constraints, we consider the operator \(\sqcap \) that computes the minimum between two ground patterns. The two operators are defined as follows:Footnote 4

$$\begin{aligned} (T_1\sqcup T_2)(s) &{} \triangleq \max \{T_1(s),T_2(s)\} \\ (T_1\sqcap T_2)(s) &{} \triangleq \left\{ \begin{array}{ll} T_1(s) &{} \text{ if } T_2(s)=0 \\ T_2(s) &{} \text{ if } T_1(s)=0 \\ \min \{T_1(s),T_2(s)\} &{} \text{ otherwise } \end{array}\right. \end{aligned}$$

Example 19

Assume we add a new reaction \(r_2=(R',I',P')\) to the previous example, where \(R' = \mathsf {a}(x+2)\mathsf {b}(1)\), \(I'=\emptyset \), \(P'=\mathsf {c}(3x+2)\). The reaction \(r_2\) is enabled in the state \(W=\{\mathsf {a}(3),\mathsf {b}(1)\}\) and it produces

$$\begin{aligned} res (r_2,W)= en (r_2,W)\cdot P'[ mul (r_2,W)/x] = 1\cdot \mathsf {c}(3x+2)[1/x] = \mathsf {c}(5) . \end{aligned}$$

Since we already observed that \( res (r_1,W)=\mathsf {c}(3)\), we conclude that the reactions \(r_1\) and \(r_2\) in the state W produce the entities \(\mathsf {c}(3) \sqcup \mathsf {c}(5) = \mathsf {c}(5)\).
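A Python sketch of the two operators (a dict-based encoding of ground patterns, names ours):

```python
def sq_join(T1, T2):   # point-wise maximum, used to combine produced levels
    return {s: max(T1.get(s, 0), T2.get(s, 0)) for s in set(T1) | set(T2)}

def sq_meet(T1, T2):   # point-wise minimum, ignoring absent (level-0) entries
    out = {}
    for s in set(T1) | set(T2):
        a, b = T1.get(s, 0), T2.get(s, 0)
        out[s] = b if a == 0 else a if b == 0 else min(a, b)
    return out
```

On the products of Example 19, sq_join({'c': 3}, {'c': 5}) yields {'c': 5}.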

In the SOS style, the hypotheses under which a reaction is applied or inhibited are recorded in the label, and their consistency is verified by rules (Par) and (Sys). We stress here the fact that such hypotheses consist of constraints over concentration levels. If we assume that a reaction \(a=(R,I,P)\) is enabled with multiplicity v, then it must be that \(\forall s\in I.~W(s)<I(s)\) and \(\forall s\in S.~R[v/x](s)\le W(s)\), but also that \(R[v+1/x]\nleq W\). The first two constraints can already be represented in the ordinary labels. Instead, we note that the property \(R[v+1/x]\nleq W\) is quantified existentially over the entities, i.e., it is equivalent to \(\exists s\in S.~R[v+1/x](s)>W(s)\). Thus, in general, different constraints \(R_1\nleq W\) and \(R_2\nleq W\) due to the applications of different reactions cannot be combined in a single expression of the form \(R\nleq W\). To account for such constraints, we need to extend labels with a set of bounds \(B=\{R_1,...,R_n\}\) for which we shall require that, in the current state W, \(\forall i\in [1,n].~ R_i\nleq W\). To this aim, for \(B=\{R_1,...,R_n\}\) and \(\ell = \langle W\vartriangleright R,I,P\rangle \), we write \(B\nleq \ell \) iff \(\forall i\in [1,n].~ R_i\nleq W\).

Definition 20

(Bounded Labels) A bound is a set \(B=\{R_1,...,R_n\}\) of ground patterns. A bounded label is a pair \(B@ \ell \), where \(B=\{R_1,...,R_n\}\) is a set of bounds and \(\ell =\langle W\vartriangleright R,I,P\rangle \) is an ordinary label. As a special case, we abbreviate \(\emptyset @\ell \) as \(\ell \).

Our LTS will be defined in such a way that any transition will carry a bounded label \(B@\langle W\vartriangleright R,I,P\rangle \) such that \(I\# (W\sqcup R)\) and \(B\nleq \ell \).

To define the bound related to the application of a reaction when the rule \((\textit{Pro})\) is applied, we define the function \( bnd ()\) as follows:

$$\begin{aligned} bnd (R,v) \triangleq \left\{ \begin{array}{ll} \emptyset &{} \text{ if } \lfloor R\rfloor \\ \{R[v+1/x]\} &{} \text{ otherwise } \end{array}\right. \end{aligned}$$
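As an illustrative sketch, encoding a pattern as a dict from entities to pairs (m, n) standing for \(m\cdot x+n\) (our own encoding, not the authors' implementation), the function reads:

```python
def bnd(R, v):
    # bound recorded when a reaction with reactant pattern R fires with multiplicity v
    if all(m == 0 for (m, _) in R.values()):          # R is ground: no bound needed
        return []
    return [{s: m * (v + 1) + n for s, (m, n) in R.items()}]   # {R[v+1/x]}
```

For instance, with \(R=\mathsf {a}(x+1)\) applied with multiplicity 2 (as in Example 23), the recorded bound is \(\mathsf {a}(4)\).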

To handle the presence of bounds, we update the operations \(\otimes \) and \(\frown \), used to combine and to compare extended labels, as follows:

$$\begin{aligned} (B_1@\ell _1)\otimes (B_2@\ell _2) &\triangleq (B_1\cup B_2)@(\ell _1\otimes \ell _2) \\ \langle W_1\vartriangleright R_1,I_1,P_1\rangle \otimes \langle W_2\vartriangleright R_2,I_2,P_2\rangle & \triangleq \langle W_1\sqcup W_2\vartriangleright R_1\sqcup R_2,I_1\sqcap I_2,P_1\sqcup P_2\rangle \\ (B_1@\ell _1)\frown (B_2@\ell _2) & \triangleq \ell _1\frown \ell _2~\wedge ~(B_1\cup B_2)\nleq (\ell _1\otimes \ell _2) \\ \langle W_1\vartriangleright R_1,I_1,P_1\rangle \frown \langle W_2\vartriangleright R_2,I_2,P_2\rangle & \triangleq (I_1 \sqcap I_2) \# (W_1\sqcup W_2 \sqcup R_1 \sqcup R_2)\end{aligned}$$
Fig. 6: SOS semantics for concentration levels

On the surface, the syntax for linear processes is the same as the ordinary one presented in Sect. 3.

$$\begin{aligned}\mathsf {M}~&{:}{:}=(R,I,P)~\big \vert ~D~\big \vert ~\mathsf {K}~\big \vert ~\mathsf {M} {\mid } \mathsf {M}\\ \mathsf {K}~&{:}{:}={\textbf {0}}~\big \vert ~X~\big \vert ~C.\mathsf {K}~\big \vert ~\mathsf {K}+\mathsf {K}~\big \vert ~\mathsf {rec}~X.~\mathsf {K} \end{aligned}$$

The difference is that now C and D are ground patterns, and in any reaction (RIP), we require that both \(\lfloor I\rfloor \) and \(\lfloor R\rfloor \Rightarrow \lfloor P\rfloor \) hold. The operational semantics for linear processes is changed accordingly (see Fig. 6, where we only report the rules that are modified).

A linear process is ground if it contains only ground reactions. It is immediate to observe that if the reaction (RIP) is ground, any application of rule \((\textit{Pro})\) will produce a label of the form \(\emptyset @\ell \), because \( bnd (R,v)=\emptyset \) for any v when \(\lfloor R\rfloor \). Since nonempty bounds B can only be produced by rule \((\textit{Pro})\), it follows that any transition of a ground RS process will also have the form \(\emptyset @\ell \), i.e., only ordinary labels are generated by ground linear processes.

A (ground) linear process is unitary if it contains only unitary patterns. It is not difficult to see that unitary processes behave as the ordinary RS processes in Sect. 3. Hence, as for timed processes, we have the following result.

Proposition 21

Linear processes are a conservative extension of RS processes.

Example 22

The unitary process \([(\mathsf {a}(1)\mathsf {b}(1), \mathsf {c}(1),\mathsf {b}(1))~\mid ~\mathsf {rec}~X.~\mathsf {c}(1).{\textbf {0}}+ \mathsf {a(1)b(1)}.X ]\) corresponds to the RS process \(\mathsf {P}_0\triangleq [(\mathsf {a}\mathsf {b}, \mathsf {c},\mathsf {b})~\mid ~\mathsf {rec}~X.~\mathsf {c}.{\textbf {0}}+ \mathsf {ab}.X ]\) from Example 11.

Fig. 7: A transition for the linear process \(\mathsf {P}_1\) from Example 23

Example 23

Let us consider the linear process

$$\begin{aligned} \mathsf {P}_1 \triangleq [(\mathsf {a}(x+1),\mathsf {a}(4),\mathsf {a}(x+2))~\mid ~\mathsf {K}] \end{aligned}$$

where \(\mathsf {K}\triangleq \mathsf {rec}~X.\,\mathsf {a}(3) . X\). We remark that the reaction \(a=(\mathsf {a}(x+1),\mathsf {a}(4),\mathsf {a}(x+2))\) contained in \(\mathsf {P}_1\) has nondisjoint reactants and inhibitors. This makes sense because the inhibitor pattern \(\mathsf {a}(4)\) can be used to fix a boundary on the maximum concentration level of \(\mathsf {a}\) at which the reaction is still enabled. Letting \(R =\mathsf {a}(x+1)\), \(I=\mathsf {a}(4)\), \(P=\mathsf {a}(x+2)\) and \(W=\mathsf {a}(3)\), we have:

$$\begin{aligned} R[0/x] &= \mathsf {a}(1) \le \mathsf {a}(3) = W \qquad \qquad R[2/x] = \mathsf {a}(3) \le \mathsf {a}(3) = W \\ R[1/x] &= \mathsf {a}(2) \le \mathsf {a}(3) = W \qquad \qquad R[3/x] = \mathsf {a}(4) \nleq \mathsf {a}(3) = W \end{aligned}$$

Moreover \(I\# W\) holds, because the condition \( \forall s\in I.~W(s)<I(s) \) amounts to \(W(\mathsf {a})=3 < 4=I(\mathsf {a})\). Therefore, \( en (a,W) =1\), because \(R[0/x]\le W\) and \(I\# W\), \( mul (a,W) = \max \{v\in \mathbb {N} \mid R[v/x]\le W\} = 2\) and

$$\begin{aligned} res (a,W) = en (a,W)\cdot P[ mul (a,W)/x] = P[2/x] = \mathsf {a}(4) . \end{aligned}$$

Consequently, we can derive the transition \(\mathsf {P}_1 \xrightarrow {B@\ell } \mathsf {P}_2\) as shown in Fig. 7. Note that, at the next time unit, the reaction a will not be enabled, because for \(W'=\mathsf {a}(4)\sqcup \mathsf {a}(3) = \mathsf {a}(4)\) we have that \(I\# W'\) is false (the condition \( \forall s\in I.~W'(s)<I(s) \) amounts to \(W'(\mathsf {a})=4\nless 4=I(\mathsf {a})\)). Thus, by composing in parallel three transitions (derived using rules (Inh), (Ent) and (Rec), respectively), and since \(\mathsf {a}(3)\sqcup \mathsf {a}(4)=\mathsf {a}(4)\), we can conclude that \( \mathsf {P}_2 \xrightarrow {\emptyset @\langle \mathsf {a}(4)\vartriangleright \mathsf {a}(4),\emptyset ,\emptyset \rangle } \mathsf {P}_1 \).
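The computation in this example can be replayed concretely. The following self-contained Python sketch specialises \( en \), \( mul \) and \( res \) to the single entity \(\mathsf {a}\) with \(R=\mathsf {a}(x+1)\), \(I=\mathsf {a}(4)\) and \(P=\mathsf {a}(x+2)\) (the function name is ours, for illustration only):

```python
def en_mul_res(W_a):
    """en, mul and the produced level of a, for the reaction (a(x+1), a(4), a(x+2))
    in a state where a has level W_a."""
    if not (0 + 1 <= W_a and W_a < 4):    # en: R[0/x] <= W and I # W
        return 0, 1, 0                    # not enabled: mul defaults to 1, nothing produced
    v = 0
    while (v + 1) + 1 <= W_a:             # mul: largest v with R[v/x] = a(v+1) <= W
        v += 1
    return 1, v, v + 2                    # res: P[v/x] = a(v+2)
```

In the state \(\mathsf {a}(3)\) this returns (1, 2, 4), as derived above; in the next state \(\mathsf {a}(4)\) the inhibitor check fails and the reaction is no longer enabled.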

5 Related work

The model of RSs is qualitative: there is no direct representation of the number of molecules involved in biochemical reactions, nor of the rate parameters influencing the frequency of reactions. In [13], the authors introduce an extension with discrete concentrations allowing for quantitative modelling. They demonstrate that although RSs with discrete concentrations are semantically equivalent to the original qualitative RSs, they provide much more succinct representations in terms of the number of molecules being used. They then define the reachability problem for RSs with discrete concentrations and provide a suitable encoding of it in satisfiability modulo theories, together with a verification method (bounded model checking) for reachability properties. Experimental results show that verifying RSs with discrete concentrations instead of the corresponding basic RSs is more efficient.

A crucial feature of a RS is that (unless introduced from outside the system) an entity from the current state will belong also to the next state only if it is in the product set of an enabled reaction. In other words, an entity vanishes unless it is sustained by a reaction. In [12], an extension is introduced in which this property is mitigated: each entity x is provided with a duration d(x), which guarantees that x will last through at least d(x) consecutive states. The authors demonstrate that duration/decay is a result of an interaction with a ‘structured environment’, and they also investigate fundamental properties of state sequences of reaction systems with duration.

Each of the above enhancements of the RS framework requires complex changes in the syntax and semantics of the original framework, and they cannot be easily combined together. Our formal framework for RSs is more flexible, since it allows us to define extensions by simply playing with the defined SOS rules. We have shown this possibility by defining extensions with reaction delays and durations in Sect. 4.1, and with concentration levels in Sect. 4.2. Also adapting our prototype tool for RS execution as we discuss in Sect. 6 is made easier by the SOS formalisation. It is worth noting that these and other extensions can be combined and integrated in our framework by following the same approach. There are several approaches using process algebras for modelling biological systems which are based on SOS formalisations (cf. the survey [28]), but we are the first to combine the expressiveness and flexibility of process algebras and RSs.

There are some similarities between our mechanism for delays and the lazy transition systems introduced in [29] for modelling asynchronous circuits. Indeed, [29] introduces a methodology to optimise asynchronous circuits under the assumption that a gate introduces a noninstantaneous delay and that two chained gates always have a bigger delay than a single gate. This makes it possible to determine whether an event in the graph of states happens before another one, or at the same time. In order to model this behaviour, lazy transition systems distinguish between the enabling and the firing of an event in a transition system. This looks similar to the delay that we impose on some entities in our framework. The methodology in [29] makes it possible to show that some states (due to precedence between events) can never be reached, so that the state graph can be optimised. We believe that the optimisation of asynchronous circuits could be an interesting challenge for applying our framework. We also note that the work in [10] shows a tight relationship between reaction systems and synchronous circuits, while our extension with delays and durations might open the way to relating our framework with asynchronous circuits.

6 The tool

A preliminary implementation of RSs in a logic programming language (Prolog) was already presented in [19], where the intended aim was rapid prototyping. In this paper, we enrich that implementation by introducing delays/durations and the concept of concentration levels in RSs as formally defined in the previous sections. In particular, we will describe how such extensions have been integrated in the prototype, resulting in a new tool available for download.Footnote 5 Thanks to the modular nature of the SOS formalisation, the integration of new features into the existing tool is facilitated. The use of a declarative programming language significantly reduces the distance between the implementation and the mathematical specification given in Sects. 3 and 4, which is important to reduce the presence of bugs in the tool and thus offers a convenient tradeoff between efficiency and correctness.

Our interpreter allows the combined use of delay and duration with linear patterns in RS specifications and exploits DCG (Definite Clause Grammar) rules to offer a friendly syntax to users. Internally, quantitative entities are encoded in two ways, as either the term e(Entity,Delay,Level) or the term e(Entity,Delay,M,N), where the parameters M and N are the coefficients of a linear expression \({\texttt {M}}x+{\texttt {N}}\). The first format is used for ground instances, while the second for linear patterns. For efficiency reasons, sets of quantitative entities are implemented as ordered lists and RS processes are represented as tuples of the form sys(Delta,Es,Ks,Rs) where Delta is the environment that collects all recursive definitions exploited in contextual processes, Es is the set of currently available entities, Ks represents the parallel composition of all contextual processes and Rs represents the parallel composition of all reactions.

To experiment with the tool, the user can write a separate specification file, say, e.g., myspec.pl, and then change the directive for importing the specification in the main file of our tool to something like

figure b

where <path> is the global path of myspec.pl in her file system. The specification file requires the definition of four predicates, one for each of the components in sys(Delta,Es,Ks,Rs). All predicates expect a single string.

Fig. 8: A sample RS specification and its process-like syntax

To briefly account for the syntax defined by our DCG rules, a sample specification is shown in Fig. 8, together with usage instructions. Roughly, an entity a has default delay 0 and level 1; when we write a(2), it means a has delay 2 but still level 1; and when we write a(1,2), we declare that a has delay 1 and level 2. For nonground patterns, we use either the syntax a(2x+1), with implicit delay 0, or a(1,2x+1) when an explicit delay must be considered. We believe the rest of the syntax is self-explanatory. Note that, in the product set of the same reaction, we can specify different delays for each entity, as well as multiple delays for the same entity, even noncontiguous ones. When writing RS specifications, this adds a little more flexibility w.r.t. assigning the same delay and duration to a whole reaction. (Otherwise, we would need to repeat different instances of reactions with the same reactants and inhibitors but different quantitative product sets.)
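To illustrate the surface syntax just described, here is a hypothetical Python parser for single entity tokens (the actual tool uses Prolog DCG rules; the function name and the (m, n) encoding of \({\texttt {M}}x+{\texttt {N}}\) are our own assumptions):

```python
def parse_entity(tok):
    """Parse an entity token into (name, delay, (m, n)) with level = m*x + n.
    Mirrors the described syntax: a, a(2), a(1,2), a(2x+1), a(1,2x+1)."""
    if '(' not in tok:
        return tok, 0, (0, 1)                      # a -> delay 0, level 1
    name, args = tok[:-1].split('(')
    parts = args.split(',')
    def lin(a):                                    # '3' -> (0, 3); '2x+1' -> (2, 1)
        if 'x' in a:
            m, n = a.split('x+')
            return int(m), int(n)
        return 0, int(a)
    if len(parts) == 1:
        m, n = lin(parts[0])
        if m == 0:
            return name, n, (0, 1)                 # a(2) -> delay 2, level 1
        return name, 0, (m, n)                     # a(2x+1) -> delay 0, pattern
    return name, int(parts[0]), lin(parts[1])      # a(1,2) or a(1,2x+1)
```

For instance, parse_entity('a(1,2)') yields ('a', 1, (0, 2)), i.e., delay 1 and ground level 2.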

In the following three sections, we will exploit our tool to study three models of biological systems in which the new features of reaction systems introduced in this paper play an important role. In more detail, the first biological system needs timed reactions to be faithfully modelled, while the second one requires the ability to handle different concentration levels, which is accomplished by linear reactions. Finally, both features are used to simulate neural transmission in the third case study.

All figures representing the LTSs of the case studies have been generated using the primitive . There are many available options to define different colours of nodes based on their textual descriptions and to select which information is shown. Our default choice is to print the entities provided by the context as transition labels, and just the entities currently present in the system inside the nodes of the LTS. Whenever nonrecursive contexts are considered, a single maximal run can be generated using the primitive . The neuron activity figures in Sect. 9 have been obtained by automatically generating the description of the concentration levels of all entities in the states of a run through the directive , and then importing such raw data into a spreadsheet.

7 Drug administration in tumour growth

The case study presented in this section is concerned with a delay differential equation model of tumour growth as proposed in [20]. We will first model the system using timed processes and then execute some simulations to compare different drug administration strategies.

7.1 Biological phenomenon

The cell cycle is a series of sequential events leading to cell duplication. It consists of four phases: G\(_1\), S, G\(_2\) and M. The G\(_1\) phase is a resting phase (or gap period) called presynthetic phase. G\(_1\) could last as long as 48 h and is the longest phase of the cycle. The next phase is the S phase or synthetic period, where the replication of DNA occurs. This phase may last between 8 and 20 h. The cells complete the DNA replication and enter another gap period G\(_2\) called the postsynthetic phase. G\(_2\) is a preparation phase for mitosis. The first three phases (G\(_1\), S and G\(_2\)) are called interphase (I). The last phase is mitosis M in which the cells segregate the duplicated sets of chromosomes between daughter cells. Mitosis is the shortest phase of all, lasting up to 1 h. The duration of the cell cycle is very much dependent on the type of cell and their growth conditions. The most typical (human) normal cell will have a cell cycle duration of approximately 24 h, with various exceptions (e.g., liver cells can take up to a year to complete their cycle).

There are many checkpoints throughout the cell cycle that prevent the cell from completing the cycle if an abnormality is detected. Cancerous cells do not necessarily divide more rapidly than their normal counterparts, but they lose the ability to regulate the cell cycle; thus, the proliferation of these cells is not controlled. Once mitosis is completed, each daughter cell can enter the cycle again or shift into a quiescent phase during which cells do not divide for long periods. Phase-specific drugs alter the natural course of action of the active or cycling cells. Many chemotherapeutic agents acting on the S phase aim to suppress mitosis and therefore have no visible effect until the M phase.

In [20], a delay differential equation model of tumour growth has been proposed that includes the immune system response and a phase-specific drug able to alter the natural course of action of the cell cycle of the tumour cells. A delay is used to model the duration of the interphase.

7.2 On encoding drug effects on cell cycles using timed processes

Inspired by [20], we define a RS model of tumour growth using delays and durations. We consider two populations of tumour cells: those in the interphase of the cell cycle (\(\mathsf {T_I}\)) and those in the mitosis phase (\(\mathsf {T_M}\)). We assume that cells reside in the interphase for \(\sigma \) time units. Moreover, we represent the drug with entity \(\mathsf {D}\) and assume that, once received from the environment, it takes an active form \(\mathsf {D_a}\) and disappears after a delay of \(\delta \) time units. The reactions of the model are the following:

$$\begin{aligned} \mathsf {a_1}\, \triangleq\, (\mathsf {T_I},\mathsf {D_a},\mathsf {T_M})^\sigma \qquad \mathsf {a_2}\, \triangleq\, (\mathsf {T_M},\emptyset ,\mathsf {T_I}) \qquad \mathsf {a_3}\, \triangleq\, (\mathsf {D},\emptyset ,\mathsf {D_a})^{[0,\delta ]} . \end{aligned}$$

Let us assume that the system starts from a configuration in which tumour cells are in the interphase. Hence, the corresponding timed process is

$$\begin{aligned}\mathsf {P}\, \triangleq \,[\mathsf {K} \mid \mathsf {T_I} \mid \mathsf {A} ]\end{aligned}$$

where \(\mathsf {A} \,\triangleq \,\mathsf {a_1} \mid \mathsf {a_2} \mid \mathsf {a_3}\) and \(\mathsf {K}\) is a suitable context process. Now, different drug administration strategies can be simulated by providing different definitions for \(\mathsf {K}\). As a first experiment, let us consider \(\sigma =1\) and \(\delta =2\) and two different context processes \(\mathsf {K_0}\, \triangleq \,\mathsf {rec}~X.\,\emptyset . X\) and \(\mathsf {K_3}\, \triangleq \,\mathsf {rec}~X.\,\mathsf {D} .\emptyset . \emptyset . \emptyset . X\).
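For intuition, the recursive context processes \(\mathsf {K_0}\) and \(\mathsf {K_3}\) can be read as infinite streams of entity sets, one set per time unit. A minimal Python sketch (ours, outside the formal framework; names are illustrative):

```python
from itertools import cycle, islice

# rec X. 0.X         -- K0: the environment never provides any entity
# rec X. D.0.0.0.X   -- K3: the drug D is provided every 4 time units
def K0():
    return cycle([frozenset()])

def K3():
    return cycle([frozenset({'D'}), frozenset(), frozenset(), frozenset()])

# Unfolding K3 for five steps shows the recursion restarting: the fifth
# context set provides the drug again.
print(list(islice(K3(), 5)))
```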

The context \(\mathsf {K_0}\) describes the case when no drug is administered; in this case, tumour cells execute the cell cycle infinitely:

$$\begin{aligned}\mathsf {P}[\,^\mathsf {K_0}\!/_\mathsf {K}] = & [ \mathsf {K_0} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_I}\vartriangleright \mathsf {T_I},\mathsf {D_a}\mathsf {T_M}\mathsf {D},\mathsf {T_M}\rangle }\, [ \mathsf {K_0} \mid \mathsf {T_M^1} \mid \mathsf {A} ] \xrightarrow {\langle \emptyset \vartriangleright \emptyset ,\mathsf {T_I}\mathsf {T_M}\mathsf {D},\emptyset \rangle }\\ & [ \mathsf {K_0} \mid \mathsf {T_M} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_M}\vartriangleright \mathsf {T_M},\mathsf {T_I}\mathsf {D},\mathsf {T_I}\rangle } [ \mathsf {K_0} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_I}\vartriangleright \mathsf {T_I},\mathsf {D_a}\mathsf {T_M}\mathsf {D},\mathsf {T_M}\rangle } \ldots \end{aligned} $$

The context \(\mathsf {K_3}\) describes the case when the drug is administered every 4 time units. Here we observe that the cell cycle is interrupted after two interphase–mitosis cycles and the cell dies (note that the last state cannot evolve into a state describing the cell in any phase):

$$\begin{aligned} {\mathsf {P}}[\,^\mathsf {K_3}\!/_\mathsf {K}] =\, & [ \mathsf {K_3} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_I}\mathsf {D}\vartriangleright \mathsf {T_I}\mathsf {D},\mathsf {D_a}\mathsf {T_M},\mathsf {T_M}\mathsf {D_a}\rangle } [ \emptyset . \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a^1} \mid \mathsf {D_a} \mid \mathsf {T_M^1} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\vartriangleright \mathsf {D_a},\mathsf {T_I}\mathsf {T_M}\mathsf {D},\emptyset \rangle }\\ & [ \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a} \mid \mathsf {T_M} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\mathsf {T_M}\vartriangleright \mathsf {D_a}\mathsf {T_M},\mathsf {T_I}\mathsf {D},\mathsf {T_I}\rangle } [ \emptyset . \mathsf {K_3} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_I}\vartriangleright \mathsf {T_I},\mathsf {D_a}\mathsf {T_M}\mathsf {D},\mathsf {T_M}\rangle }\\ & [ \mathsf {K_3} \mid \mathsf {T_M^1} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D}\vartriangleright \mathsf {D},\mathsf {T_I}\mathsf {T_M},\mathsf {D_a}\rangle } [ \emptyset . \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a^1} \mid \mathsf {D_a} \mid \mathsf {T_M} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_M}\mathsf {D_a}\vartriangleright \mathsf {D_a}\mathsf {T_M},\mathsf {T_I}\mathsf {D},\mathsf {T_I}\rangle } \\ & [ \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\mathsf {T_I}\vartriangleright \mathsf {D_a},\mathsf {T_M}\mathsf {D},\emptyset \rangle } [ \emptyset . \mathsf {K_3} \mid \mathsf {A} ] \ldots \end{aligned}$$
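The behaviour just derived can be cross-checked with a small simulator. The sketch below is ours (not the tool from the paper) and implements one plausible reading of delays and durations: a product with delay \(\sigma\) becomes available \(\sigma+1\) steps after the reaction fires, and a product with duration \(\delta\) stays available for \(\delta\) consecutive steps; entity names and the encoding of contexts as functions of time are our own conventions.

```python
from collections import defaultdict

def run_timed_rs(reactions, context, steps):
    """Simulate a timed RS. Each reaction is a tuple
    (reactants, inhibitors, products, delay, duration): when enabled at
    time t, each product is present from time t+1+delay for `duration`
    consecutive steps."""
    pending = defaultdict(set)   # time -> entities scheduled for that time
    trace = []
    for t in range(steps):
        state = pending.pop(t, set()) | context(t)
        trace.append(frozenset(state))
        for react, inhib, prod, delay, dur in reactions:
            if react <= state and not (inhib & state):
                for k in range(dur):
                    pending[t + 1 + delay + k] |= prod
    return trace

# Tumour model with sigma = 1 (interphase delay) and delta = 2 (drug duration)
sigma, delta = 1, 2
reactions = [
    ({'TI'}, {'Da'}, {'TM'}, sigma, 1),  # a1: interphase -> mitosis, inhibited by Da
    ({'TM'}, set(),  {'TI'}, 0, 1),      # a2: mitosis -> interphase
    ({'D'},  set(),  {'Da'}, 0, delta),  # a3: drug activation, Da lasts delta steps
]

# Contexts as functions of time; TI at t = 0 plays the role of the initial state
K0 = lambda t: {'TI'} if t == 0 else set()
K3 = lambda t: ({'TI'} if t == 0 else set()) | ({'D'} if t % 4 == 0 else set())

def alive(trace):
    return any('TI' in s or 'TM' in s for s in trace[-4:])

print(alive(run_timed_rs(reactions, K0, 20)))  # True: the cell cycles forever
print(alive(run_timed_rs(reactions, K3, 20)))  # False: the drug kills the cell
```

Under these conventions the simulation reproduces the derivation: with \(\mathsf {K_3}\) the cell completes two interphase–mitosis cycles and then disappears, while with \(\mathsf {K_0}\) it cycles forever.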
Fig. 9

Different drug administration strategies with delay \(\sigma =1\) and duration \(\delta =2\)

We have performed several experiments with different drug administration strategies (including the two just discussed) using our tool, where the reaction system \(\mathsf {a_1},\mathsf {a_2},\mathsf {a_3}\) is run in parallel with \(\mathsf {T_I}\) and with the nondeterministic context \(\sum _{i\in [0,3]}\mathsf {K}_i\), where \(\mathsf {K}_1=\mathsf {rec}~X.\,\mathsf {D}.\emptyset . X\) and \(\mathsf {K}_2= \mathsf {rec}~X.\,\mathsf {D} .\emptyset . \emptyset . X \). Each branch of the tree in Fig. 9 depicts the evolution of the system driven by a different context: from left to right, we see the effect of \(\mathsf {K_1}\), then \(\mathsf {K_2}\), followed by \(\mathsf {K_3}\) and finally \(\mathsf {K_0}\). To improve readability, each transition is labelled with just the entities provided by the context (a label 0 indicates that no entity is provided at that step), and each state is labelled with just the available entities, in the format Entity(Delay,Level); context processes and reactions are hidden. For example, da(0,1) da(1,1) tm(1,1) stands for \(\mathsf {D_a}\mid \mathsf {D_a}^1\mid \mathsf {T_M}^1\). A state labelled with 0 contains no entities.

Fig. 10

Different drug administration strategies with delay \(\sigma =1\) and duration \(\delta =3\)

As expected, in the evolution with context \(\mathsf {K_3}\) the cell dies after performing two cycles of interphase and mitosis (observe that in the final cycle only the drug remains in circulation). More interestingly, the drug strategy described by context \(\mathsf {K_2}\) (one drug administration followed by two consecutive pauses) does not lead to the death of the cell. Finally, when the drug is administered as described by context \(\mathsf {K_1}\), the result is the death of the cell after just one complete cycle of interphase and mitosis.

The parameter \(\delta \) can be varied to test alternative drug activation times. Of course, if the drug remains in circulation for a longer time, its effects are stronger. For example, with duration \(\delta = 3\), the experiment with context \(\mathsf {K_3}\) describes the case in which the drug remains active between two consecutive administrations, leading the cell to die at an earlier stage (after just one cycle):

$$\begin{aligned}{\mathsf {P}}[\,^\mathsf {K_3}\!/_\mathsf {K}] =\, & [ \mathsf {K_3} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {T_I}\mathsf {D}\vartriangleright \mathsf {T_I}\mathsf {D},\mathsf {D_a}\mathsf {T_M},\mathsf {T_M}\mathsf {D_a}\rangle } [ \emptyset . \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a^{[0,3]}} \mid \mathsf {T_M^1} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\vartriangleright \mathsf {D_a},\mathsf {T_I}\mathsf {T_M}\mathsf {D},\emptyset \rangle } \\ & [ \emptyset . \emptyset . \mathsf {K_3} \mid \mathsf {D_a^{[0,2]}}\mid \mathsf {T_M} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\mathsf {T_M}\vartriangleright \mathsf {D_a}\mathsf {T_M},\mathsf {T_I}\mathsf {D},\mathsf {T_I}\rangle } [ \emptyset . \mathsf {K_3} \mid \mathsf {D_a^{[0,1]}} \mid \mathsf {T_I} \mid \mathsf {A} ] \xrightarrow {\langle \mathsf {D_a}\mathsf {T_I}\vartriangleright \mathsf {D_a},\mathsf {T_M}\mathsf {D},\emptyset \rangle } \\ & [ \mathsf {K_3} \mid \mathsf {A} ] \ldots \end{aligned}$$
Fig. 11

Different drug administration strategies with delay \(\sigma =3\) and duration \(\delta =3\)

Repeating the other experiments with all the administration strategies represented by the nondeterministic context \(\sum _{i\in [0,3]}\mathsf {K}_i\), we depict the results in Fig. 10. They show that the effect of the drug is much stronger when \(\delta =3\): as expected, both drug administration strategies \(\mathsf {K_1}\) and \(\mathsf {K_3}\) stop the cell cycle, this time after just one cycle of interphase and mitosis. Moreover, now even the strategy \(\mathsf {K_2}\) leads to the death of the cell, as desired.

Finally, one may wonder what the effects are of increasing the value of \(\sigma \), which represents a longer interphase. The results of such experiments are reported in Fig. 11, showing that the administration strategies \(\mathsf {K_1}\), \(\mathsf {K_2}\) and \(\mathsf {K_3}\) are still successful in killing the cell.

8 Regulating differentiation in Th cells

Fig. 12

Differentiation of Th cells

The case study presented in this section is concerned with a Boolean network model of the regulatory process for differentiation in Th cells, as proposed in [21] and recently translated into a RS model [22]. While the RS model from [22] is able to reproduce the most important dynamic aspects of the regulatory process, it must encode different levels of the same entity as separate objects. Here we show that the ability of linear processes to deal directly with concentration levels offers a more natural and simpler way to represent this biological phenomenon.

8.1 Biological phenomenon

The immune system is composed of various cell types, including antigen-presenting cells and B and T lymphocytes. Among the latter, T cells can be further sub-classified into T helper 1 (Th1) and T helper 2 (Th2) cells, originating from a common precursor Th0. The molecules secreted by Th1 cells lead to inflammatory immune responses, while those secreted by Th2 cells intervene in humoral immune responses. Importantly, molecules produced by mature Th cells promote their own differentiation and at the same time inhibit the differentiation of cells of the other type. This is illustrated in Fig. 12: Th1 differentiation has \(\text {IFN-}\gamma \) as its principal promoter (a positive relation, represented by a standard arrow from \(\text {IFN-}\gamma \) to Th1) and \(\text {IL-4}\) as inhibitor (a negative relation, represented by an arrow ending with a rhombus directed from \(\text {IL-4}\) to Th1); conversely, Th2 differentiation has \(\text {IL-4}\) as principal promoter and \(\text {IFN-}\gamma \) as inhibitor.

A complex gene network regulates the differentiation of Th0 cells. Studying the molecular mechanisms of this differentiation process is relevant since enhanced Th1 and Th2 responses may cause autoimmune and allergic diseases, respectively.

Fig. 13

Graphical representation of the Boolean network

While a number of molecules were known to participate in this process, before [21] it was not clearly understood how they regulate each other to ensure differentiation. In [21], a Boolean network model of this regulatory process was conceived from the large amount of molecular data available in the literature. The proposed network includes 17 nodes regulating the differentiation of the Th0 precursor [30, 31].

Fig. 14

Boolean functions modelling the differentiation of Th cells from time t to time \(t+1\)

The structure of the Boolean network which describes how substances influence each other is depicted in Fig. 13, where, as before, standard arrows describe a promoting relation while arrows ending with a rhombus represent an inhibiting relation.

The update functions that describe how each influenced substance changes at the following step are summarised in Fig. 14. Without going into all the details, the particularity of this system is that some substances (the ones coloured in grey in Fig. 13) admit different levels of activation: an object can be inactive, activated at a medium level of concentration or activated at the high (maximum) level of concentration. This is the reason why, in Fig. 14, different update functions are used to describe the behaviour of a single object: they describe how the different concentration levels of the substance influence the other entities. For example, the behaviour of the object \(\text {IFN-}\gamma \) (coloured grey in Fig. 13) is described by two update functions in Fig. 14: one that updates \(\text {IFN-}\gamma \) at the medium level (\(\text {IFN-}\gamma \text {-m}\)) and a second one that updates \(\text {IFN-}\gamma \) at the high level (\(\text {IFN-}\gamma \text {-h}\)). The same holds for all the entities coloured in grey in Fig. 13, since they admit different concentration levels.

The above model identifies two key pathways involving \(\text {IFN-}\gamma \) and \(\text {IL-4}\). In the pathway involving \(\text {IFN-}\gamma \), Th1 cells produce \(\text {IFN-}\gamma \), which acts on a membrane receptor (\(\text {IFN-}\gamma \text {R}\)). The transduction of the \(\text {IFN-}\gamma \)/\(\text {IFN-}\gamma \text {R}\) signal acts via \(\text {STAT-1}\), which can also be activated by \(\text {IFN-}\beta \) via \(\text {IFN-}\beta \text {R}\). \(\text {STAT-1}\) cannot be activated by \(\text {IL-4}\), but \(\text {STAT-1}\) itself modulates the \(\text {IL-4}\) signal through other molecules. Further down the \(\text {IFN-}\gamma \) signal transduction pathway is \(\text {SOCS-1}\), a molecule that is highly expressed in Th1 cells, but not in Th0 and Th2 cells. \(\text {IFN-}\gamma \) strongly induces \(\text {SOCS-1}\) via a \(\text {STAT-1}\)-dependent pathway. \(\text {SOCS-1}\), in turn, influences both the \(\text {IFN-}\gamma \) and \(\text {IL-4}\) pathways. Finally, it is known that \(\text {SOCS-1}\) is able to block the capacity of \(\text {IL-4R}\) to generate signalling in response to \(\text {IL-4}\). \(\text {T-bet}\) is a transcription factor detected in Th1, but not in Th0 or Th2 cells. Its expression is up-regulated by \(\text {IFN-}\gamma \), via \(\text {STAT-1}\). In turn, \(\text {T-bet}\) is an \(\text {IFN-}\gamma \) activator, thus creating an indirect positive feedback. Furthermore, it has been shown that \(\text {T-bet}\) is able to induce the transcription of its own gene.

The second pathway, involving \(\text {IL-4}\), starts with the binding of \(\text {IL-4}\) to its receptor, \(\text {IL-4R}\), which is highly expressed in Th2 cells. The \(\text {IL-4R}\) signal is transduced by \(\text {STAT-6}\), which in turn activates \(\text {GATA-3}\). \(\text {GATA-3}\) is capable of inducing \(\text {IL-4}\), thus establishing a positive feedback loop. The influence of the \(\text {IL-4}\) pathway on the \(\text {IFN-}\gamma \) pathway is mediated by \(\text {GATA-3}\), since \(\text {T-bet}\) is down-regulated by \(\text {GATA-3}\) expression. Conversely, \(\text {T-bet}\) is capable of inhibiting \(\text {GATA-3}\). This mutual inhibition ensures that Th1 and Th2 cells express either one or the other molecule (\(\text {T-bet}\) in Th1 and \(\text {GATA-3}\) in Th2), but not both.

Apart from the two key pathways above, there are other molecules that affect the differentiation of Th0 cells, which we do not describe here.

8.2 On replacing distinct objects by concentration levels in linear processes

A standard closed RS (that is, a RS with empty environment) that uses different entities to model the different levels of the grey nodes in Fig. 13 was already defined in [22], where the authors translated the Boolean network into a RS able to reproduce the dynamics of the update functions in Fig. 14. However, this encoding was ad hoc, because it required the introduction of different objects to deal with different levels of the same entity. For example, two different objects \(\text {IFN-}\gamma \text {-m}\) and \(\text {IFN-}\gamma \text {-h}\) were needed to represent \(\text {IFN-}\gamma \) at the medium and high levels. The use of two different objects to model different levels of activation required particular care when considering sets of entities: not all subsets are meaningful, since a substance can be activated at the medium or at the high level, but not both. The artificial concept of valid state was introduced to prevent entities representing different levels of the same object from being present at the same time.

We aim to show that linear processes (from Sect. 4.2) allow us to seamlessly model the concentration levels that are the key feature of this biological system. We express the different concentration levels with values \(\{1,2\}\), where 1 stands for medium and 2 for high. For example, the state \(\{\text {STAT-1}(1), \text {T-bet} (2)\}\) expresses that we have a medium concentration of \(\text {STAT-1}\) and a high concentration of \(\text {T-bet}\). The resulting linear RS contains the 26 reactions in Fig. 15 that model the system described in Fig. 14. Since durations are not needed, we exploit the syntax from Sect. 4.2: a linear pattern is written as \(\mathsf {a}(m\cdot x+n)\), while a ground pattern is written just as \(\mathsf {a}(n)\).

Fig. 15

Reactions with concentration levels

We now show that the linear pattern framework has many advantages over approaches where the different levels are modelled using different objects. In the following, we assume that the medium concentration level of an entity is represented by a new entity whose name is obtained by appending the suffix \(\text {-m}\) to the name of the original entity, while the high concentration level is represented by a second new entity whose name is obtained by appending the suffix \(\text {-h}\).

For example, when using two objects such as \(\text {STAT-1-h}\) and \( \text {STAT-1-m}\) to model the different levels of \(\text {STAT-1}\), a reaction like \((\{\text {GATA-3}\},\{ \text {STAT-1-h}, \text {STAT-1-m} \},\{ \text {IL-4}\})\) needs to be included in the reaction system to describe the production of \(\text {IL-4}\), which is inhibited by \( \text {STAT-1}\) at any concentration. Such a rule must include two objects as inhibitors (one for each level of \(\text {STAT-1}\), using suffixes -h and -m), which seems somewhat artificial, because both objects refer to the same entity, although they carry different names. On the contrary, in our framework based on linear patterns the very same constraint can be modelled by the following unary reaction: \((\;\text {GATA-3}(1)\;,\;\text {STAT-1}(1)\;,\; \text {IL-4}(1)\;)\).

Indeed, such a reaction is enabled in any state containing \( \text {GATA-3}\) but no \( \text {STAT-1}\) at any concentration level, as desired. Moreover, even if the need emerged to distinguish further levels of \(\text {STAT-1}\) (e.g., low, or very high), the above reaction would remain valid. Similarly, let us consider the reaction for the production of \(\text {SOCS-1}\), which is enabled when \(\text {T-bet}\) is present at any concentration level. When the different levels are modelled using different objects, we need to write two different reactions, one for each level, namely \((\{ \text {T-bet-h} \}\;,\;\emptyset \;,\;\{\text {SOCS-1}\})\) and \((\{ \text {T-bet-m}\}\;,\;\emptyset \;,\;\{ \text {SOCS-1}\})\). Instead, using concentration levels, the production of \( \text {SOCS-1}\) can be expressed by the single unary reaction \((\;\text {T-bet}(1)\;,\;\emptyset \;,\;\text {SOCS-1}(1)\;)\). As with the previous rule, even if the need emerged to distinguish further levels of \(\text {T-bet}\), the above reaction would remain unchanged.

But there are even more interesting consequences of replacing different objects by different concentration levels. In fact, it frequently happens that the level of a produced entity depends on the level of some reactant. For example, using different objects, we need two reactions such as \((\{ \text {IFN-}\gamma \text {R-m} \}\;,\;\emptyset \;,\;\{ \text {STAT-1-m}\})\) and \( (\{\text {IFN-}\gamma \text {R-h}\}\;,\;\emptyset \;,\;\{\text {STAT-1-h}\})\) to express the fact that the levels of \(\text {IFN-}\gamma \text {R}\) and \(\text {STAT-1}\) are correlated. Using linear patterns, one reaction suffices: \((\;\text {IFN-}\gamma \text {R}(x+1)\;,\;\emptyset \;,\;\text {STAT-1}(x+1)\;)\).
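The pattern mechanics can be made concrete with a small Python sketch. It is ours, not the paper's formal semantics, and rests on two stated assumptions: a ground pattern \(\mathsf{a}(n)\) is satisfied by any level \(\ge n\), and each reaction uses at most one linear variable \(x\); entity names are abbreviated.

```python
def match(level, pat):
    """pat = (m, n) encodes the pattern m*x + n; ground patterns have m = 0.
    Returns (matched?, x). Assumption: a ground reactant or inhibitor a(n)
    is satisfied by any level >= n."""
    m, n = pat
    if m == 0:
        return level >= n, 0
    if level >= n and (level - n) % m == 0:
        return True, (level - n) // m
    return False, 0

def step(reactions, state):
    """One step of a linear RS: state maps entity -> concentration level
    (absent entities have level 0); competing productions keep the max."""
    nxt = {}
    for react, inhib, prod in reactions:
        x, enabled = 0, True
        for ent, pat in react.items():
            ok, xb = match(state.get(ent, 0), pat)
            enabled = enabled and ok
            if pat[0] != 0:
                x = xb                      # bind the linear variable
        if enabled and not any(match(state.get(e, 0), p)[0]
                               for e, p in inhib.items()):
            for ent, (m, n) in prod.items():
                nxt[ent] = max(nxt.get(ent, 0), m * x + n)
    return nxt

# (IFN-gR(x+1), 0, STAT-1(x+1)): the produced level follows the reactant level
r1 = ({'IFN-gR': (1, 1)}, {}, {'STAT-1': (1, 1)})
# (GATA-3(1), STAT-1(1), IL-4(1)): inhibited by STAT-1 at any level
r2 = ({'GATA-3': (0, 1)}, {'STAT-1': (0, 1)}, {'IL-4': (0, 1)})

print(step([r1], {'IFN-gR': 2}))               # {'STAT-1': 2}
print(step([r2], {'GATA-3': 1, 'STAT-1': 2}))  # {}: the reaction is inhibited
```

Note how a single reaction `r1` covers every correlated level, and a single inhibitor pattern in `r2` blocks the reaction at any level of STAT-1, exactly the two advantages discussed above.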

As RS specifications can grow very large and complex, keeping the number of reactions as small as possible has many advantages: reactions are easier to write, maintain, change, study and extend, and they are also more flexible to experiment with. For example, imagine a situation where one wants to compare models based on different levels of some entities, without knowing in advance how many levels it is most convenient to use: building on the above examples, it should be evident that using linear processes the comparison can be conducted without writing a new specification for each possible combination of levels.

Our prototype implementation allows us to compute the LTS. We performed an in silico experiment that shows all the paths leading to Th1 differentiation (characterised by the presence of T-bet as the marker of the differentiation of the cell into the Th1 form) and to Th2 differentiation (characterised by the presence of GATA-3 as the marker of the differentiation of the cell into the Th2 form).

Fig. 16

Evolution of \(Th_0\) cell

Note that there are two possible paths that lead to the expression of Th1. The first one is driven by the up-regulation of \(\text {IFN-}\gamma \), which is expressed at the maximal level in the initial state: \(\text {IFN-}\gamma (2)\). The second path leading to the expression of Th1 is driven by the initial expression of both \(\text {IL-12}\) and \(\text {IL-18}\). The initial state in this case is \(\text {IL-12}(1)\mid \text {IL-18}(1)\), and after 9 steps the system reaches the same stable state. Instead, the evolution leading to Th2 is activated by the initial state \(\text {IL-4}(1)\) and after 6 evolution steps reaches the stable state \(\text {IL-4}(1)\mid \text {GATA-3}(1)\mid \text {STAT-6}(1)\mid \text {IL-4R}(1)\).

Our tool allows us to inspect all different evolution paths by starting with the reaction system in parallel with the initial context

$$\begin{aligned}\mathsf {K} \,\triangleq\,\text {IFN-}\gamma (2).\mathsf {K}_{\emptyset } \;+\; \text {IL-12}(1)\,\text {IL-18}(1).\mathsf {K}_{\emptyset } \;+\; \text {IL-4}(1).\mathsf {K}_\emptyset \end{aligned}$$

where \(\mathsf {K}_\emptyset \,\triangleq\,\mathsf {rec}~X.~\emptyset .X\) is the trivial recursive process that provides an empty context at any step. The corresponding LTS is given in Fig. 16, where it is possible to observe that the evolution path triggered by the context \(\text {IL-12}(1)~ \text {IL-18}(1)\) takes a few steps before joining the path driven by the up-regulation of \(\text {IFN-}\gamma \). For the differentiation leading to Th1, it can be observed that IL-4 needs to be inactive; on the contrary, for the differentiation leading to Th2, \(\text {IFN-}\gamma \) needs to be inactive.

9 Synaptic transmission

The case study presented in this section requires the combined use of durations and concentration levels in a single RS. It is concerned with the modelling of synaptic transmission in neural network communication. Our goal is to show that with RSs it is possible to approximate spiking behaviours obtained by kinetic and stochastic models such as those defined in [24, 32].

9.1 Biological phenomenon

Synaptic transmission is the process that allows two neurons connected by a synapse to communicate. Communication consists of impulsive chemical signals sent from the first neuron to the second. Chemical signals take the form of neurotransmitters that are released by the first neuron and perceived by the second one, and their release is stimulated by ionic currents.

The macroscopic dynamics of the currents involved in synaptic transmission can be described by means of kinetic models in which all the essential processes are expressed in terms of reactions. Synaptic transmission can be described as a two-phase phenomenon. The first phase (presynaptic release) is the release of neurotransmitters by the first neuron. It is stimulated by a calcium current that promotes the release of neurotransmitters into the synaptic cleft from the vesicles in which they are contained. The second phase (postsynaptic uptake) involves different receptors (e.g., \(\text {AMPA}\), \(\text {NMDA}\), \(\text {GABA}_\text {A}\) and \(\text {GABA}_\text {B}\)) that react to the availability of the transmitters with the creation of new currents in the second neuron. Different receptors generate currents with different intensities and rise/decay times.

As an example of synaptic transmission we mention the one mediated by the calyx of Held, which is a large synapse in the mammalian auditory brainstem circuit that synapses onto the cell body of the principal neuron of the medial nucleus of the trapezoid body (\(\text {MNTB}\)). The functional communication between the active sites of the calyx of Held and the principal neuron of \(\text {MNTB}\) is implemented by the release of a large number of synaptic vesicles containing glutamate.

Roughly, vesicle release (exocytosis) depends on the amount of calcium, \(\textit{Ca}^{2+}\), in the presynaptic site. After exocytosis has taken place, the vesicles release their content, which in turn binds the receptors of the postsynaptic side. Here, the \(\text {AMPA}\) receptors bind the neurotransmitter released by the vesicles, the glutamate, and this causes a chain of reactions inside the postsynaptic side that ends with the change of the receptor configuration to its open form.

For a more detailed description of the biological phenomenon and of mathematical models of the dynamics of synaptic transmission, we refer to [23, 24, 32].

9.2 A discrete model of neural communication

In this paper, we present a simple functional model with a quantitative abstraction that does not consider the kinetic rates of the different biological reactions. We describe three neural network examples, consisting of one, two and three neurons, respectively. The dynamics described by our model can be qualitatively compared with those described in [24, 32].


A single-neuron model

Fig. 17

Rules for the pre- and the postsynaptic sides activity of a neuron

The reactions in Fig. 17 describe the synaptic activity of the presynaptic side, together with that of the membrane receptor on the postsynaptic side. The presynaptic activity is characterised by the growth of calcium, which doubles its quantity at each step until the threshold 10 is reached; this is modelled by reaction (r1). Reactions (r4) and (r5) model the formation of the vesicles, \(\text {Ve}^*\), which are then ejected via exocytosis, releasing their content \(\text {T}\) by reaction (r7). Reaction (r8) models the binding of an external neurotransmitter \(\text {T}\) and the consequent opening of the neural receptor, which changes its own state from \(\text {c}\), closed, to \(\text {o}\), open, and (r9) models the closure of the receptor. The remaining reactions (r2), (r3) and (r6) model the permanency of the vesicles, \(\text {Ve}\), the calcium ligand, \(\text {X}\), and the closed receptor \(\text {c}\). By abuse of notation, we indicate as \(\text {T}\) both the whole content of the vesicle released by the presynaptic side and the neurotransmitter that binds the receptors on the postsynaptic membrane (causing the neuron to send the \(\text {T}\) signal to itself).

Fig. 18

Neuron activities: a one neuron; b synaptic transmission between two neurons

Given the initial state \(\text {c}(1) \mid \text {Ca}(1) \mid \text {X}(10) \mid \text {Ve}(5) \mid \text {T}(1)\), the LTS showing the cyclic behaviour of the neuron is shown in Fig. 18a, where the colour of each node depends on the status (closed/open) of the receptor.

Fig. 19

Activity chart of one neuron

Figure 19 shows the peaks of the calcium quantity that activate the release of the neurotransmitter, which in turn causes the opening of the receptor; then, the amount of calcium starts to rise again.
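The cyclic behaviour in the chart can be reproduced with a deliberately simplified state machine. This is our own abstraction of the dynamics described above, not the rule set of Fig. 17: in particular, the handling of the threshold (calcium keeps its last doubled value while the receptor is open, then restarts from 1) is our assumption.

```python
def neuron_step(ca, receptor_open):
    """One time unit of the simplified single-neuron cycle: below the
    threshold, calcium doubles (cf. r1); once the threshold 10 is reached,
    exocytosis releases the transmitter and opens the receptor (cf. r7, r8);
    the open receptor then closes and calcium restarts (cf. r9)."""
    if receptor_open:
        return 1, False        # closure: calcium restarts from 1
    if ca >= 10:
        return ca, True        # threshold reached: the receptor opens
    return ca * 2, False       # calcium doubles at each step

ca, opened = 1, False
trace = []
for _ in range(12):
    trace.append((ca, opened))
    ca, opened = neuron_step(ca, opened)
print(trace[:7])
# calcium peaks, the receptor opens once per cycle, then the cycle repeats
```

Plotting the first component of `trace` gives the same qualitative shape as the chart: repeated calcium peaks, each followed by a single receptor opening.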


A two-neuron model Here, we consider two neurons such that the neurotransmitter released by the first neuron activates the receptor of the second one, and vice versa. In this model, we assume the two neurons have different speeds: the receptor of the second neuron is slower to close. This implies that it remains open for a longer time, which allows a greater quantity of calcium to be produced. In Fig. 20, we only present the modified rules for the two neurons. Positive delays are represented as superscripts. We use subscripts 1 and 2 to distinguish the entities belonging to neurons one and two, respectively.

The opening of the receptor of neuron one is modelled by reaction \((r_18)\), and the opening of the receptor of neuron two is described by reaction \((r_28)\). Reactions \((r_29a)\) and \((r_29b)\) model the increase of calcium stimulated by the open receptor, whose closure also depends on reaction \((r_28)\) (using delay 2). Given the initial state \(\text {c}(1) \mid \text {Ca}(1) \mid \text {X}(10) \mid \text {Ve}(5) \mid \text {T}(1)\), the LTS showing the cyclic behaviour of the two neurons is shown in Fig. 18b, where the states in which the receptors of the two neurons are open are coloured differently.

Finally, the effect of the interaction between the two neurons is shown by charts (1)–(4) of Fig. 21: in (1), the calcium in the second neuron, ca2, grows more quickly than ca1; in (2), the neurotransmitter T\(_2\) remains active longer than T\(_1\); and consequently, in (4) the receptor of neuron two remains open longer than the receptor of neuron one, in chart (3).

Fig. 20

Model of two interacting neurons: receptor of neuron two is slower in closing

Fig. 21

Charts for the activity of two neurons

Fig. 22

Rules for three interacting neurons connected to form a ypsilon

Fig. 23

Charts for the activity of three interacting neurons


A three-neuron model The three neurons in this example are assumed to form a network with a ypsilon structure: the neurotransmitters of neurons one and two, T\(_1\) and T\(_2\), respectively, interact with the two receptors of neuron three, c31 and c32, respectively. Then, the neurotransmitter T\(_3\) of neuron three interacts with the receptors of neurons one and two, c\(_1\) and c\(_2\), respectively. As done for the case of two neurons, we use subscripts 1, 2 and 3 to denote the entities belonging to each of the three neurons, and in Fig. 22 we only give the rules that change with respect to the previous cases. In particular, reactions \(r_39a\), \(r_39b\) and \(r_39c\) manage the opening of one or both of the two receptors of neuron three. The charts in Fig. 23 show that when only neuron one releases a neurotransmitter, T\(_1\) in chart (2), only receptor (a) of neuron three is open, as shown by chart (5). Chart (1) shows that when only one receptor of neuron three is open, the increase of calcium in neuron three is slower than when both receptors are open. Please note that in chart (2), the second and third activations of neurotransmitter T\(_1\) (in blue) are overlaid by the activations of neurotransmitter T\(_2\) (in orange).

10 Conclusions and future work

In this paper, we presented the formal theory of timed and linear RS processes, developed a tool in which both extensions are integrated, and used this tool to investigate some biological pathways, gene regulation and neural networks.

As future work, we plan to exploit our framework to deepen the study of quantitative extensions of RSs without abandoning the discrete and abstract nature of RSs. Moreover, the availability of a formal semantics will allow us to study and apply formal analysis techniques aimed at assessing dynamical properties of the modelled biological systems, like logical properties and behavioural equivalences.

Finally, we plan to investigate the applicability of abstract interpretation techniques [33,34,35] to study properties of quantitative reaction systems by exploiting under- and over-approximations of current states, which is useful to make the analysis of large systems tractable.