figure a

1 Introduction

Temporal behaviors are sequences of actions and observations in time generated by various systems and the environment around us. A temporal pattern is a set of compositions of different temporal behaviors satisfying some relations among their components such as precedence and coincidence or possessing some properties such as repetition and a certain duration. Searching good [bad, desirable, undesirable] patterns over their temporal behaviors is an important task while we reason about systems and the environment.

Timed regular expressions (TREs) [2] extend regular expressions, a well-established formalism for specifying sequences of symbols, with the notion of real-time and timing constraints. Many patterns requiring both qualitative and quantitative temporal properties can be specified by TREs in a compact and natural way. Given a TRE that specifies a temporal pattern and a real-time behavior the problem of timed pattern matching is defined as locating all segments that satisfy the expression. This problem has been solved by an offline algorithm in [14]. It is further endowed with an online algorithm that incrementally matches patterns over streaming behaviors [15].

In this paper, we describe Montre a new tool for timed pattern matching whose applications are numerous and diverse. First of all, Montre can naturally check execution traces of software and hardware systems against real time properties specified in TRE (e.g. [5, 7]), thus complementing temporal logic based property checkers such as [1, 4, 11]. Further, Montre can be used for specification mining such as [3, 8] as matching is a basic task for mining. Outside the verification context, Montre has a potential use in temporal data mining [10] and (vehicle or human) trajectory data mining [9, 16] as it can label time segments with meaningful tags such as overtaking (another car) or sprinting. To illustrate our tool in action, we present such an example from the domain of sports analytics in Sect. 3 where we find all sprints of a soccer player.

Fig. 1.
figure 1

The work flow and extent of the monitoring tool Montre

2 Tool Description

The tool Montre essentially incorporates online and offline timed pattern matching algorithms extended with some practical features such as anchors and a Boolean layer. It takes a timed behavior and a timed regular expression as inputs, and produces a finite set of two dimensional zones representing the (possibly uncountable) set of segments that watch the pattern. Montre provides a standard text-based interface for easy integration with other tasks such as data preparations and visualization as we consider them necessary but outside the scope of Montre. In Fig. 1, we illustrate the work flow and extent of Montre, and we give details for each component in the following.

Table 1. Montre timed regular expression syntax

Timed Behaviors. A timed behavior is a sequence of time segments where each segment has a duration value and is associated with a set of propositional variables that hold continuously in the segment. In general, we assume all propositions are concurrent. For example, \(\mathtt (3,pq);(2,q);(2,p)\) is a timed behavior with 3 segments over propositions p and q. It means that p and q evaluate to true for the first 3 time units, then q is true for 2 more time units, and then p is true for 2 time units again. We assume behaviors start at time 0; therefore, the example behavior can be alternatively stated such that p holds from 0 to 3 and then 5 to 7 while q holds from 0 to 5.

Timed Regular Expressions. An atomic timed regular expression corresponds to a Boolean expression over a set of propositions, denoted by letters p, q, r. These propositions can stand for predicates over real-valued variables. Usual Boolean operators (!), (||), (&&) are used to build Boolean expressions. We say that an atomic expression occurs on a time period \((t,t')\) if the corresponding Boolean expression holds from t to \(t'\) continuously. Complex timed regular expressions are built from other expressions by using tre operators: sequential composition (concatenation) (;), time restriction (%), choice (|), coincidence (&) and zero-or-more repetition (*). Further, we add one-or-more repetition (+) and two anchoring (\(\mathtt{<:}\) and \(\mathtt{:>}\)) operators to the set of operators. Typically parentheses are used to group expressions. We summarize all Boolean and tre operations in Montre in Table 1.

Zones. For a proposition p that holds from \(t_{1}\) to \(t_{2}\), all sub-periods of \((t_1, t_2)\) satisfy the expression p. As shown in Fig. 2-(i), such a set of matches \(\{(t, t')\ |\ t_{1}\le t < t' \le t_{2}\}\) can be represented on a two-dimensional plane as a triangular zone. Then the match set of any atomic expression would be a union of such triangular zones. A triangular zone is a special case of zones, which constitutes a restricted class of convex polygons defined by orthogonal and diagonal constraints as shown Fig. 2-(ii). Zones are basic data objects for timed pattern matching as unions of zones are closed under Boolean and regular operations. It follows that the match set of any timed regular expression over a timed behavior can be representable by a finite union of zones.

Fig. 2.
figure 2

(i) A triangular zone.   (ii) A zone in general.

Implementation. Montre is a command line programFootnote 1 that uses structured text files for input/output specification. When invoked Montre parses the timed regular expression passed as an argument and starts to reads the input file. According to flags set by the user Montre would run in either online or offline mode. For online mode it is possible that the input can be given interactively using the command line or directed from another process as usual. At its core, Montre contains our efficient zone manipulation library, libmontre, called dynamically by top-level online and offline timed pattern matching algorithms. As Boolean and regular operations over sets of zones are intensive numerical computations, we have implemented libmontre in C++. In the implementation, we use an integer-valued time model where all time values are represented by integers for efficiency and accuracy reasons. For the majority of applications, integers give us sufficient precision and range; and a proper scaling can be found.

We implement timed pattern matching algorithms in PureFootnote 2, a functional programming language based on term rewriting with a support for native code compilation and native calls to dynamic libraries. For the online algorithm [15], built upon derivatives of regular expressions [12, 13], we extensively use the rewriting functionality when deriving an expression with respect to a newly observed segment. The offline algorithm [14] is a recursive computation over the syntax tree of the expression; therefore, the role of Pure’s rewriting engine is minimal. The worst case complexity is polynomial in the size of input behavior and expression for the offline approach. For the online approach it is polynomial in the size of the behavior and exponential in expression. In practice, however, we realistically assume patterns to be much shorter than behaviors and somewhat sparse in them. Then we expect a linear-time performance in the size of input behavior for both algorithms. Under these assumptions, Montre can process timed behaviors with a size of 1M segments in a few seconds (offline) and a few hundred seconds (online).

3 An Illustrative Example

We present an example use of Montre on a data set obtained by tracking positions of players in a real soccer match. In this example, we find all sprints performed by a single player where a sprint is formally specified by a timed regular expression over speed and acceleration behaviors. The data are obtained by a computer vision algorithm with a frame rate of 10 Hz so we have a xy-coordinate for each player on the field at every 100 milliseconds. Therefore we use milliseconds as our base time unit for behaviors and expressions.

Table 2. Speed and acceleration thresholds [6].

In order to specify a pattern for sprints, we need to address two issues in order: (1) how to categorize continuous speed and acceleration axes, and (2) which composition of these categories defines a sprinting effort best. Clearly, there are no universal answers for these questions so we rely on the study [6] in the following. First, we partition speed and acceleration axes into four categories (near-zero, low, medium, and high), and we associate a letter for each category in Table 2. For example, a period of medium speed, denoted by r, means the speed value resides between 3.7 and 6 m/s during the period.

Often a sprint effort is characterized by any movement above a certain speed threshold for a limited time. This gives us our first sprint pattern such that a period of high speed between 1–10 s, formally written as follows:

figure b

Above we use anchor operators from both sides on the proposition s to obtain only maximal periods that satisfy s; otherwise, any sub-period satisfies the pattern as well. The operator % specifies that the duration is restricted to be in 1000 and 10000 milliseconds. Alternatively we may want to find other efforts starting with high acceleration but not reaching top speeds necessarily. This gives us our second sprint pattern such that a period of high acceleration followed by a period of medium or high speed between 1–10 s, formally written as follows:

figure c

Notice that we do not use the right-anchor on g. This allows a medium or high speed period to overlap with a high acceleration period as it is usually the case that they are concurrent. Writing an equivalent pattern using classical regular expressions over a product alphabet would be a very tedious task partly due to a requirement to handle such interleavings explicitly (and the lack of timing constraints). For TREs all propositions are considered to be concurrent by definition, which results in concise and intuitive expressions. Finally we give a third pattern to find rather short but intense sprints such that

Fig. 3.
figure 3

The trajectory of a soccer player for 45 min on the field, and his sprinting periods found by Montre for patterns P1-P3.

figure d

Then we visualize all sprints found by Montre for patterns P1-P3 in Fig. 3 over the behavior of a single player during one half of the game (45 min.) containing 27 K data points that reduces to timed behaviors of 5K segments after pre-processing. Note that we used Python to prepare data and visualize results.

4 Conclusions

Timed regular expressions can define many timed properties and Montre is the first tool to check such properties and detect timed patterns. Its performance is satisfactory for such monitoring tasks but we note that there is still some room for optimization especially for the online algorithm. The example we presented illustrates a complete Montre experience from raw data to visualization. As seen defining good patterns and categories are important to achieve intended results but it is not always obvious what a good pattern is. Such patterns should be found in the future using (unsupervised) pattern mining methods. We believe Montre would provide a good starting point for such research as it encapsulates timed pattern matching with an easy-to-use interface.