1 Introduction

The analysis of trajectories of so-called Moving Point Objects (MPO) is relevant for many important applications [13]. One goal of the interpretation of locomotion patterns is the identification and temporal positioning of cattle activities. The focus of this paper is a method for the robust identification of activities based on two sensors, one spatial (position) and one non-spatial (heart rate). Whereas the limited accuracy of the indoor positioning sensor in the third dimension does not allow a reliable distinction between lying and standing, the additional information from the heart rate sensor does. Thus, in the proposed approach, the non-spatial sensor supports the identification of activities with locomotion and, in addition, enables the identification of states without locomotion.

For measuring the spatial trajectories of moving objects, multiple sensor principles are available. In many cases, such trajectories are recorded by mobile phones with self-locating functionality [4] or cameras [5], but in specific local scenarios indoor positioning systems are applied [6]. These spatial sensors generate time series of coordinates, which are analyzed for activity information [7]. Interpreting such trajectories is defined as deriving semantic information from the geometric features, meaning generating high-level knowledge from low-level data [8].

In this study, we evaluate the data from the novel integration of a positioning system and a heart rate sensor to observe dairy cattle with minimal interference. We apply a novel integration of Support Vector Machines with Conditional Random Fields including an explicit representation of the duration. The distinction between states such as lying and standing is difficult since the positioning system provides no z-coordinates. Our approach provides added value for research in cattle welfare through an automated and reliable identification of relevant activity patterns. It goes beyond unconnected time stamps and was designed for a reliable estimation of the sequence and duration of activity states using robust sensors. The two sensor types complement each other in recording information about the activity patterns and internal states.

In dairy research there is a general interest in analyzing the heart rate and, especially, the heart rate variability to get access to wellbeing and stress [9]. However, without a reliable distinction between resting and activity the analysis of the heart rate data is difficult.

The data set containing the trajectories and heart signals of dairy cows was used to derive a sequence of activity states. The map of the barn and samples of the two simultaneously recorded signals, including the corresponding activity states, are shown in Fig. 1. Different approaches for the interpretation of movement patterns have been published [10, 11]. Some methods were developed for unconstrained movement, e.g. the routes of migratory birds [11] or the scale-dependency of cow trajectories [12]. Recently, machine learning methods like SVM [13, 14] and CRF [15] have been increasingly adopted. However, in many scenarios available spatial constraints, e.g. road graphs or the borders of a playing field, improve the position accuracy and simplify the interpretation. Furthermore, known areas or points of interest, like train stations or feeding stations in animal monitoring, support a reliable interpretation [10].

Fig. 1

The sensors attached to a cow (b): the Local Position Measurement (LPM) transponder attached to the neck belt and the heart rate sensor under the chest strap. A filtered sample of both signal types with the corresponding activity states: the LPM trajectory in the research barn (a) and the heart frequency (c). The map of the barn shows two additional trajectories to indicate potential investigations of social interactions

However, the interpretation of movement patterns, whether in a local or a global context, is not sufficient for the detection of activities without or with only small movements below the resolution of the positioning system. Such activities require further information recorded by additional sensors. Therefore, the integration of heterogeneous data from multiple sensors including different measuring frequencies, scales and reference systems is required. Such an approach is still a challenge for data analysis methods.

In recent years there has been a vivid debate in European countries, especially in Germany, on intensive livestock farming, and the pressure on farmers to ensure animal welfare is increasing. The big question is how to identify, objectify and quantify animal welfare, and how to provide optimal conditions in stable design and management for both welfare and performance. To this end, continuous monitoring and interpretation of the derived data is becoming increasingly important. Animal behavior, represented by sequences of activities and locomotion patterns, provides sensitive indicators for this kind of research.

Several studies have used accelerometers to monitor animal movement patterns [16, 17]. Martiskainen et al. [18] used the data from a three-dimensional accelerometer and multiclass support vector machine classifiers to develop a method for measuring behavior patterns. Although this method achieves a high prediction quality, information regarding the cow position in 3D space and its position relative to other objects like feeding tables or cubicles is absent. In the open air, it is possible to track the movement patterns of the animals using the signals of the Global Positioning System (GPS [19]). Early approaches to continuous monitoring of cattle with multiple sensors were discussed in [20].

However, under a roof, the usefulness of such systems is limited since obstacles (walls, ceilings, etc.) weaken the signals [21]. For indoor applications, a tracking system based on electromagnetic waves, such as the ABATEC system [22], has been used (Fig. 1). It is a local position measurement (LPM) system developed for field sports [6] and can determine the positions of moving objects in real time. This system achieves an accuracy which is superior to GPS and is specified by the manufacturer with a standard deviation of 2 cm [22]. However, for measurements in a barn a standard deviation of up to 20 cm was reported in [23]. Neisen et al. [24] used the LPM system to monitor cow behavior and movement patterns. In a cow barn, shadings and reflections resulting from metal surfaces and the relatively small distances between the antennas greatly impair sensor signals (Fig. 1). The signal-to-noise ratio declines, the outlier rate exceeds 10 % and the height coordinate is no longer available [23]. These data characteristics complicate the reliable analysis of recorded trajectories.

A promising approach to get reliable activity information even from such unreliable position signals is the integration of additional sensors in the analysis. The heart rate and the heart rate variability are strongly related to animal welfare [9] and are strong indicators for the physical stress of the body. As summarized by Langbein et al. [25] the heart rate indicates changes of the sympathetic tone, without considering parasympathetic influences. On the other hand, parasympathetic influences and changes in vagal activity are registered by the heart rate variability (HRV). To compare processes (or animals) with each other, heart rate data are related to specific states [26]. States like lying, in which the heart rate of the individual is relatively stable and which are less affected by short-term influences, are best suited for the interpretation of the cardiac response of an animal.

The interpretation of heart signals is challenging as these signals simultaneously represent various processes, short- and long-term as well as relevant and irrelevant ones. A promising approach for defining heart rate properties is the use of Recurrence Quantification Analysis (RQA), which specifies the characteristics of nonlinear systems [27]. Mohr et al. [28] used RQA to assess stress levels in calves and cows based on the heart rate variability. Non-invasive and thus non-stress-inducing techniques like heart rate measurements enable the assessment of stress levels without affecting the heart rate parameters themselves [29].

We present a prediction method that uses signals from spatial (LPM) and non-spatial (heart rate) sensors for the prediction of activity state sequences (Fig. 2). In the domain of machine learning this kind of problem is solved by sequence labeling algorithms [30]. These algorithms have been successfully applied in DNA sequencing [31], natural speech recognition [32] and protein structure prediction [33]. These tasks require, on the one hand, the interpretation of signals and, on the other hand, the modeling of the temporal neighborhood within the label sequence. A common characteristic of these methods is the use of graphical models [34], often in the form of a Hidden Markov Model (HMM [35]). This approach has been extended and improved to the discriminative modeling realized by the Conditional Random Field (CRF [36]). In contrast to HMMs, CRFs do not require strong assumptions on statistical independence, which are unlikely to hold in most cases, and thus provide more reliable predictions. In machine learning, especially in image processing and robotics, they have become very popular in recent years.

Fig. 2

“Digital Dairy Diary” of a cow extracted by the combination of SVM and CRF based on a heart signal and the recorded trajectory. The pattern “activity” comprises the activity patterns “walking”, “feeding”, and “drinking”

CRFs, however, do rely on the Markov assumption that all necessary information on history is given in the current state. They do not provide a notion of time or duration, which, however, is crucial in the given scenario. Due to the involved strain cattle tend to remain in the state of lying for some time instead of standing up immediately. If this inertia is not modelled appropriately, this results in an erratic behavior of the model. Thus, CRFs had to be enhanced appropriately.

The application field of animal monitoring in barns requires an analysis method which is able to achieve the fine balance between a comprehensive data interpretation, including spatial and stochastic prior knowledge, and the extent of intervention caused by the sensor handling. These demands call for a suitable combination and partial extension of state-of-the-art methods. The underlying methodological concept is not limited to animal science but allows integrating various sensors and arbitrary distributions of state durations in various scenarios.

The contribution of this article is twofold. First, it introduces a novel sensor combination for an improved identification of activity states and locomotion patterns. Second, SVM and CRF are combined [37] in a novel way. State transitions are modeled by a stochastic finite-state machine. This is realized by extending the concept of the CRF by an explicit notion of duration, which makes it possible to account appropriately for the perseverance in specific states. The enhanced CRF concept incorporates prior knowledge about state durations and prohibits activity transitions which are impossible due to spatial constraints.

The manuscript is structured as follows: Section 2 introduces the scenario and the applied sensor setup. Section 3 explains the analysis methodology including feature extraction, SVM and CRF, and presents the extension of the CRF by duration-aware probabilities for activity transitions. Section 4 provides the final results of the analysis, interim results and the Resting Pulse Rate (RPR) as a first application. The results are discussed in Section 5 and a conclusion about the method is provided in Section 6.

2 Data acquisition

The introduced combination of two sensor types yields reliable information about the six behavior patterns described in Section 2.4. The sensor systems used are the local positioning system (LPM) from Abatec and heart rate sensors from Polar. The LPM provides information on position and locomotion and is described in Section 2.2. The heart rate sensor permits the derivation of information on local physical exertion and long-term stress load and is described in Section 2.3.

2.1 Spatial context

The study was conducted at the Frankenforst research station of the University of Bonn. A herd of 65 German Holstein Friesian cows is loose-housed in a two-row open free-stall barn with cubicles and concrete floor.

2.2 Sensor for automatic recording of 3D location

For the automatic recording of the cow position in real time, a local position measurement system (LPM, http://www.abatec-ag.com) based on runtime measurement using radio waves (around 5.8 GHz) and triangulation was used. The system was developed for sports like soccer and modified in order to adapt it to the conditions in a free-stall barn. To cover the area and to minimize the shadings and reflections due to the barn equipment, 12 antennas (so-called base stations) with known coordinates were mounted on the walls and under the roof. Furthermore, one reference transponder with a defined position was located centrally. It sends a continuous signal to the base stations to synchronize them. In this experiment each cow wore a head-collar with a fixed transponder on the top. At pre-defined intervals one specific antenna (called the master base station) triggers the transponder to transmit a signal, and hence the position can be calculated.

Under ideal conditions, the system estimates the 3D position of a transponder with a frequency of up to 1000 Hz and a standard deviation of 2 cm [22]. A cow barn provides sub-optimal conditions due to the high amount of metal present, wet surfaces and the relatively small distances between the base stations. Therefore, antenna positions were optimized regarding 2D accuracy, with the corresponding disadvantage that vertical distances could no longer be measured (Fig. 1).

After the optimization, we evaluated the LPM system from Abatec AG by observing 10 control points for a week on a 24 h basis. Every 10 seconds the measured coordinate was recorded. Huge differences were observed between the best point in the middle of the barn, with a standard deviation of 9 mm (all points within a 1 m radius), and the worst point at the border of the barn, with a standard deviation of 2.15 m (98 % of the data was outside the 1 m radius and 35 % even outside the 3 m radius). The mean standard deviation over all points on all days was 19.2 cm, as most of the control points were in the center of the barn. In summary, the accuracy depends strongly on the position and varies between precise and an accuracy that does not allow single lying boxes to be associated.

2.3 Heart sensor

For the measurement of the heart rate and the heart rate variability, sensors from Polar Electro GmbH (Büttelborn, Germany) were used (Polar Equine RS800, Polar Equine RS800CX). The electrodes, integrated in a chest belt (Polar Equine WearLink® W.I.N.D. transmitter), measure electrical heart signals and transmit them wirelessly to a receiver with integrated data logger attached to a neck belt. An elastic surcingle and conductive ultrasound gel were used to improve the contact of the sensors. The data were downloaded once a day to the corresponding software (Polar ProTrainer 5 Equine Edition - version 5.35.165) using an infrared interface.

2.4 Experimental design

The observations were taken in three periods at the same time of day: in the first two periods 4 cows were observed for at least 4 h and in the last period 5 cows were observed for at least 4 h. Overall, 43 valid time series for 12 different cows were recorded. The model was trained and evaluated using the data gathered in the first two observation periods. In these two periods more than 100 h with approximately 180,000 individual measurements of position and heart signal were recorded and annotated. In the last measurement period the local position measurement system was deactivated, because in this period the resting pulse rate was to be tested for stability and relevance for parameters like pregnancy. Therefore, the annotation was limited to the distinction between the states “lying” and “standing”.

The direct observations of cow behavior were made with regard to five activity states (periods I and II) defined in a protocol and used as labels for the training data. The behavior states used were “standing”, “lying”, “walking”, “feeding” and “drinking”. Synchronized digital stop-watches were used to measure the start time of each behavior state in seconds. A change of behavior was used as the start time of a new behavior category. The patterns “standing up” and “lying down” could not be annotated precisely enough. Hence, their point in time was deduced afterwards from the switch between “standing” and “lying”, and their duration was set to 30 s (a plausible estimate from our own observations and in accordance with [38]).

3 Identification of spatio-temporal activity sequences

The data analysis which derives the sequence of activity states is divided into three steps (Fig. 3). In the first step, spatiotemporal features were extracted from filtered sensor data. In the second step, preliminary probabilities for activity states were derived by a multiclass Support Vector Machine (SVM [39]). In the third step, these probabilities were linked with each other and combined with contextual knowledge by a graphical model, more precisely by a Conditional Random Field (CRF [36]).

Fig. 3

Data analysis workflow including feature extraction, data fusion, SVM probabilities and final state sequence classification using the CRF model

3.1 Feature extraction

The position and heart rate signals were observed by two independent sensors with different measurement rates. The LPM provides coordinates at intervals of around 2 s and the heart parameters were derived once per second. Both signals were synchronized with an internet time server for the calculation of joint features (Table 1). Due to the higher measurement frequency of the heart rate sensor, the heart rate measurements were associated with the temporally closest position measurement. The result was a synchronized data set with the sampling frequency of the Polar heart rate sensor.
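The association of each heart rate sample with the temporally closest position can be sketched as follows. This is a minimal illustration and not the authors' implementation; the function names, the toy time stamps and the tie-breaking rule (ties go to the earlier position) are assumptions made here.

```python
from bisect import bisect_left

def nearest_position(pos_times, positions, t):
    """Return the position whose time stamp is closest to t.

    pos_times must be sorted in ascending order. Ties go to the earlier
    position (an arbitrary choice made for this sketch).
    """
    i = bisect_left(pos_times, t)
    if i == 0:
        return positions[0]
    if i == len(pos_times):
        return positions[-1]
    before, after = pos_times[i - 1], pos_times[i]
    return positions[i] if after - t < t - before else positions[i - 1]

def synchronize(hr_times, hr_values, pos_times, positions):
    """Build one (time, heart rate, position) row per heart rate sample."""
    return [(t, hr, nearest_position(pos_times, positions, t))
            for t, hr in zip(hr_times, hr_values)]

# Toy data: positions roughly every 2 s, heart rate once per second.
pos_times = [0.0, 2.0, 4.1]
positions = [(1.0, 1.0), (1.2, 1.1), (1.5, 1.3)]
hr_times = [0, 1, 2, 3, 4]
hr_values = [62, 63, 61, 64, 66]
rows = synchronize(hr_times, hr_values, pos_times, positions)
```

The resulting rows keep the one-second rate of the heart rate signal, so each position is reused for about two consecutive rows.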

Table 1 List of all features extracted and used as input for the SVM classifier

As the signals of the introduced sensors were noisy and included outliers, e.g. measured positions outside the stall barn or heart frequencies over 250 bpm, the data sets were filtered before features for classification were extracted. A Gaussian and a nonlinear median filter were used for the data smoothing. For the smoothing of the positions, a median filter width of 30 observation periods and a Gaussian filter width of 40 periods were used. For the heart rate sensor both filters were used with a filter width of 200 periods. The median filter applied to the position signal was triggered by the condition that the distance between the current and the preceding position was larger than 3 m, which is a strong indication of an outlier.
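A possible reading of this two-stage filtering is sketched below. The widths (30, 40) and the 3 m trigger are taken from the text; everything else is an assumption of this sketch, in particular that the median is taken over the preceding cleaned positions, that the jump is measured against the preceding cleaned position, and that the Gaussian standard deviation is one sixth of the filter width.

```python
import math
from statistics import median

def replace_jumps(points, window=30, max_jump=3.0):
    """Replace a position by the median of the preceding cleaned positions
    when it lies more than max_jump metres from the preceding one."""
    cleaned = [points[0]]
    for x, y in points[1:]:
        px, py = cleaned[-1]
        if math.hypot(x - px, y - py) > max_jump:
            recent = cleaned[-window:]
            x = median(p[0] for p in recent)
            y = median(p[1] for p in recent)
        cleaned.append((x, y))
    return cleaned

def gaussian_smooth(values, width=40):
    """Weighted moving average with Gaussian weights over `width` samples.
    sigma = width / 6 is an assumption; the text only gives the width."""
    half = width // 2
    sigma = width / 6.0
    offsets = range(-half, half + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(values):  # shrink the window at the borders
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed

# Toy track with one outlier far outside the barn.
track = [(0.0, 0.0)] * 5 + [(50.0, 50.0)] + [(0.0, 0.0)] * 5
cleaned = replace_jumps(track)
xs = gaussian_smooth([p[0] for p in cleaned])
```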

Subsequently, spatial and non-spatial features were extracted for each time step from the synchronized signals (Table 1). In the evaluation of the trajectories, absolute coordinates on their own do not contain much information, as the different zones of the barn are close to each other. Prior knowledge about the structure of the stall barn has to be included in the form of a map. The distances to characteristic objects like feeders and the water trough were used as features. Additionally, the movement, specified by a significant change in position and the movement speed, and the information about the current barn zone (cubicle, walking alley and feeding alley) were included as features.
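As an illustration of such map-based features, the following sketch computes the distances to two landmarks and the movement speed for one time step. The landmark coordinates and feature names are hypothetical and only serve the example; the real barn map is not given in the text.

```python
import math

# Hypothetical landmark coordinates in metres (not taken from the barn map).
LANDMARKS = {"feeder": (2.0, 10.0), "water_trough": (15.0, 3.0)}

def spatial_features(prev_xy, xy, dt):
    """Distances to the landmarks and speed between two consecutive positions."""
    feats = {f"dist_{name}": math.dist(xy, pos) for name, pos in LANDMARKS.items()}
    feats["speed"] = math.dist(prev_xy, xy) / dt  # metres per second
    return feats

feats = spatial_features((0.0, 0.0), (3.0, 4.0), dt=2.0)
```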

RMSSD (“root mean square of successive differences”) was used as HRV parameter [25] and was added together with the HR to the feature set. Additional features were gained from Recurrence Quantification Analysis (RQA; [27]), namely Recurrence, Determinism, Entropy, MaxLine and Trend. These features were successfully applied by Mohr et al. [28] for stress assessment in cows. Thus, the same parameter settings were used and the features were calculated with the CRP Toolbox 3.19 from the Potsdam Institute for Climate Impact Research [40].
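For reference, RMSSD is the square root of the mean squared difference between successive inter-beat (RR) intervals. A minimal sketch, with a hypothetical function name and toy intervals in milliseconds (the window length used in the study is not stated here):

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences of RR intervals."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("at least two RR intervals are required")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Successive differences are 10, -20 and 15 ms.
value = rmssd([800, 810, 790, 805])
```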

Overall, 13 features (Table 1) were combined in a feature vector. A feature vector is provided each second and used as input for the SVM.

3.2 Machine Learning method for spatio-temporal data analysis

3.2.1 Support Vector Machine (SVM)

SVMs are a state-of-the-art method for supervised classification and provide linear and nonlinear discrimination functions [39, 41]. Besides their ability to provide nonlinear discriminators, their specific advantage is that they generalize well from the training data to new unobserved cases. SVMs contain “kernel” functions specifying distance metrics in the space of observations [42]. The most common kernels are radial basis functions (rbf), which were also used in our study.

The model of an SVM consists of a subset of weighted data instances, the so-called support vectors (SV). They define a hyperplane separating two classes, whose position is determined by the principle of maximum margin. In the separable case the margin can be understood intuitively as the smallest distance between the hyperplane and a training data point. This principle of maximum margin facilitates the learning of models with excellent generalization performance and robustness.

In recent years SVMs have been successfully applied in various fields, e.g. remote sensing and genetics. This method was used for tasks from precision agriculture [43] ranging from the detection of meat and bone meal [44] to determination of biotic stress in agrarian plants [45]. The application of SVMs for activity pattern prediction of dairy cows with accelerometers was examined by Martiskainen et al. [18].

The accuracy and performance of the SVM depend on the selection of a suitable model. The model type is controlled by the chosen kernel function, the corresponding parameters and the penalty weight C [41]. For the given task, the rbf kernel achieves the best result due to the nonlinear class boundaries included in the data. The combined optimization of the kernel width σ and the penalty weight C is done by a 2D grid optimization (C = 2⁻⁵ … 2¹⁰, σ = 2⁻¹⁰ … 2⁵) in conjunction with cross validation.
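The grid itself can be written down in a few lines. The sketch below assumes an exponent step of 1, which the text does not state, and uses a placeholder scoring function in place of the cross-validated SVM accuracy.

```python
import math
from itertools import product

# Powers of two over the stated ranges; the exponent step of 1 is an assumption.
C_GRID = [2.0 ** e for e in range(-5, 11)]
SIGMA_GRID = [2.0 ** e for e in range(-10, 6)]

def grid_search(score):
    """Return the (C, sigma) pair with the highest score.

    `score(C, sigma)` stands for the cross-validated accuracy of an SVM
    trained with these parameters.
    """
    return max(product(C_GRID, SIGMA_GRID), key=lambda cs: score(*cs))

def toy_score(C, sigma):
    """Placeholder with a single optimum at C = 2^3 and sigma = 2^-2."""
    return -abs(math.log2(C) - 3) - abs(math.log2(sigma) + 2)

best = grid_search(toy_score)
```

In practice `toy_score` would be replaced by a function that trains and validates the classifier on the training time series only.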

The classification and probability estimation were done by an rbf-SVM (LIBSVM, version 3.11 [46]). As input data, the features shown in Table 1 were normalized and used. For the application of the graphical model in the next section, the presented method requires class probabilities for every observation. These probabilities were estimated following Lin et al. [47] by scaling the decision values, i.e. the distances to the discriminating function. The scaling maps the decision values in (−∞, +∞) to posterior probabilities in [0, 1] for every class. The class probabilities of a complete time series were derived as follows: first, an SVM model was derived from the remaining time series (including cross-validation for parameter optimization). Then, for each time step in the current time series, the probabilities for each class were predicted. This process was repeated for each available time series. In this way it was guaranteed that only general patterns included in every time series are used for prediction and that no over-fitting effects are included in the model.
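The scaling of a decision value to a probability is commonly done with a sigmoid of the form P = 1 / (1 + exp(A·f + B)). The sketch below shows only this binary mapping; the parameter values A and B are hypothetical (in practice they are fitted on training data), and the combination of pairwise probabilities into multiclass probabilities is not shown.

```python
import math

def sigmoid_probability(decision_value, a=-1.5, b=0.0):
    """Map a decision value f to a probability 1 / (1 + exp(a*f + b)).

    a and b are hypothetical; a < 0 makes the probability grow with f.
    The two branches avoid overflow for large |a*f + b|.
    """
    z = a * decision_value + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

p_boundary = sigmoid_probability(0.0)   # a point on the decision boundary
p_far = sigmoid_probability(3.0)        # a point far on the positive side
```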

The result of this analysis step is a probability for each activity state in each observation period for all observed time series.

3.2.2 Conditional random field

Graphical Models [48] - with CRFs as one particular instance - are an established method to represent prior knowledge in models. The data and the unknown conclusions are represented by nodes linked by edges denoting possible dependencies. This general approach has been adapted to several kinds of graphical models differentiated in graphical models with directed and undirected edges.

Bayesian networks use directed edges to represent the conditional influence of one node on another [49]. Markov random fields (MRFs) use undirected edges to define the relation between random variables that follow the Markov property. The Markov property states that a variable is conditionally independent of all other variables given its neighbor nodes. The CRF includes edges of both types, as shown in Fig. 4 [36]. Attached to the links are distributions which describe the relation of the linked nodes and are represented by distribution parameters.

Fig. 4

Graphical representation of a chain-structured Conditional Random Field (CRF) connecting the locally measured observations “Ob_i” with the sequence of hidden activity patterns “Pat_i”. The undirected links between the observation nodes and the activity nodes are realized by the SVM probabilities and the directed links between the activity nodes by a duration-aware transition matrix

The developed CRF model is composed of two node types, observation nodes and unobserved label nodes, and two edge types: the undirected edge between the observations and the hidden behavior patterns, and the directed edge between succeeding behavior pattern nodes (Fig. 4). The observation nodes and the undirected edges were realized by an SVM model [37]. The parameters of the corresponding probability functions were optimized during the SVM optimization.

For our specific application there is still the problem of the Markov assumption included in the inference step. The Markov assumption states that all the knowledge about the preceding sequence of activities is representable by the current state. In fact, if a cow is “lying”, there is a high probability that the subsequent state is again “lying”, a much lower probability that it is “standing up”, but a probability of zero that it is “standing”. The latter is caused by the constraint that the state “standing up” has to be in between. The high probability for remaining in “lying” is caused by an inherent inertia: after “lying down” the cow will rest for a while and - as a rule - not “stand up” again immediately. This inertia can be modeled neither by HMMs nor by CRFs as they are. For this reason we have adapted the CRF model by incorporating a time counter. We record the number of time steps in a given state and adapt the a priori probability for a state transition by the duration of the current state. The Viterbi algorithm [50], which infers the most probable label sequence, has been modified accordingly (Alg. 1). It was enhanced to estimate the most probable duration d_i of the active state i. This duration is used to adapt the transition probabilities and to represent the inertia of the cow (Fig. 5). The original Viterbi algorithm relies on constant transition probabilities, which imply a geometric cumulative distribution function (CDF) of the state durations.
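The last statement can be made explicit with a short derivation. Assume, for illustration only, that p denotes the constant probability of remaining in the current state from one time step to the next (the symbols p and D are introduced here and do not appear in the original notation). The duration D of a stay then follows a geometric distribution:

$$ P\left(D=d\right)={p}^{d-1}\left(1-p\right),\kern1em P\left(D\le d\right)=1-{p}^d,\kern1em d=1,2,\dots $$

Since P(D = d) decreases monotonically in d, the shortest possible stay is always the most probable one, whatever the value of p. This is the opposite of the inertia observed for states like “lying” and motivates a transition probability that depends on the duration.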

Fig. 5

Visualization of the activity states and the possible state transitions. The colors distinguish the short (orange) and long (green) duration states from the inter-states (grey). The constraints and different state durations are realized by a sparse and duration aware transition matrix

The Viterbi algorithm was enhanced by a tracing step executed for every node, in which the most probable sequence based on the preceding observations is determined (Algorithm 1 [51]). From this label sequence the duration d_i was determined as the number of steps since the last state change:

$$ d_i = i - \max \left\{ k \;\middle|\; \mathrm{state}_k \ne \mathrm{state}_i,\ k = i-1, \ldots, 0 \right\} $$
(1)
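Read as code, Eq. 1 walks backwards from step i until a different state is found. The sketch below is an illustration with a hypothetical function name; the handling of a sequence without any earlier state change (the duration is counted from the start of the sequence) is an assumption, since Eq. 1 leaves this case open.

```python
def current_duration(states, i):
    """Number of steps since the last state change before index i (Eq. 1)."""
    k = i - 1
    while k >= 0 and states[k] == states[i]:
        k -= 1
    # k is now the largest index with a different state, or -1 if none exists.
    return i - k

labels = ["standing", "lying", "lying", "lying"]
d = current_duration(labels, 3)
```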

The local tracing can be truncated if a state change is reached or, alternatively, if congruence with the preceding local tracing is found. The extracted duration is used for the adaptation of the transition matrix a by multiplying the transition probabilities with the value of a cumulative distribution function (CDF) at position d_i:

$$ a_{dur}[s, i] = \mathrm{CDF}(d_i) \cdot a[s, i] $$
(2)

The application for activity pattern recognition requires the definition of functions of d_i for the transition matrix. Two different kinds of transition functions are used, one for fast state transitions and one for a longer persistence in a state. The resulting semi-Markov process is shown in Fig. 5, where possible transitions are shown by arrows and the different types of states are marked by node and edge color. An example of a short persistence in a state is “drinking” (orange) and of a long persistence is “lying” (green).

A rough estimation of the parameters describing these duration distributions (CDFs) was sufficient because of the high variability of the individual state durations in the data. For states with short persistence, like “feeding”, “drinking” and “walking”, a linear model (Eq. 3) is used as transition function, and for states with longer persistence, like “lying” and “standing”, a sigmoid function (Eq. 4). The sigmoid suppresses extremely short state sequences and is constant at higher d. The linear model also tends to suppress state durations of a few seconds because these seem implausible, but in contrast to the sigmoid model, the transition probability keeps growing with the duration. This transition function therefore reduces the probability of a longer persistence in a short state.

$$ \mathrm{CDF}_{short}(d) = 500 \cdot d \cdot 10^{-6} $$
(3)
$$ \mathrm{CDF}_{long}(d) = \frac{1}{1 + e^{-0.03 \cdot (10 \cdot d - 500)}} $$
(4)
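Eqs. 2 to 4 can be evaluated directly. In the sketch below the two functions follow Eqs. 3 and 4; capping the linear function at 1 and assigning the remaining probability mass to the self-transition are assumptions of this sketch, as is the toy transition row.

```python
import math

def cdf_short(d):
    """Linear transition function of Eq. 3 (capped at 1, an assumption)."""
    return min(1.0, 500 * d * 1e-6)

def cdf_long(d):
    """Sigmoid transition function of Eq. 4."""
    return 1.0 / (1.0 + math.exp(-0.03 * (10 * d - 500)))

def adapt_row(row, state, d, cdf):
    """Scale the transitions leaving `state` by cdf(d) as in Eq. 2.

    Assumption: the probability mass that is not used for leaving the
    state is assigned to remaining in it, so the row still sums to 1.
    """
    scale = cdf(d)
    adapted = {dst: scale * p for dst, p in row.items() if dst != state}
    adapted[state] = 1.0 - sum(adapted.values())
    return adapted

# Toy row for the state "lying" after 50 time steps in that state.
row = adapt_row({"lying": 0.9, "standing_up": 0.1}, "lying", 50, cdf_long)
```

With these parameters the sigmoid is close to zero for the first few time steps, equals 0.5 at d = 50 and saturates near 1 soon afterwards, whereas the linear function grows slowly and steadily with d.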

Another advantage of using a transition matrix is the possibility to introduce contextual knowledge about fundamental movement constraints of the animal in the stall environment. It is therefore possible to exclude specific transitions from the resulting sequence, e.g. from “lying” to “standing”. In this manner, it can be enforced that the pattern “standing up” occurs in between (Fig. 5). Hence, the generated time series of labels is in every case valid from the viewpoint of the physical constraints of the cattle in the stall.

Alg 1. Adapted Viterbi-algorithm for determination of the most probable sequence considering the duration dependent transition matrix and the local determination of the duration of the current state.

These extensions are introduced into the Viterbi algorithm as shown in Alg. 1. The original algorithm covers lines 1 to 11 and lines 22 and 23. Lines 12 to 20 contain the determination of the most probable duration d (lines 16 to 20) of the most probable current state lok_max (lines 13 and 14). The transition matrix a_dur is then adapted to the determined duration in line 21. These adapted transition probabilities are used in the next iteration of the Viterbi algorithm (lines 3 to 11).

The resulting CRF model contains an SVM classification [37] for feature interpretation and models the transition probabilities between neighboring state nodes by a transition matrix depending on the duration of the current state.
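The idea of a duration-dependent transition matrix inside Viterbi decoding can be illustrated by the following sketch. It is a simplified variant and not a reproduction of Alg. 1: instead of tracing back from the locally most probable state, it keeps for every state the run length on the best path ending in that state. All names, the toy probabilities, the choice of the linear function for the inter-states and the rule that unused leaving probability stays in the current state are assumptions.

```python
import math

def log(p):
    return math.log(p) if p > 0.0 else -math.inf

def duration_aware_viterbi(emissions, init, trans, cdf):
    """emissions: per time step a dict state -> probability (e.g. SVM output).
    init: dict state -> prior. trans: dict (src, dst) -> base probability for
    src != dst; missing pairs are forbidden. cdf: dict state -> function of
    the run length that scales the transitions leaving that state."""
    states = list(init)
    delta = {s: log(init[s]) + log(emissions[0][s]) for s in states}
    run = {s: 1 for s in states}  # run length of s on the best path ending in s
    backpointers = []
    for obs in emissions[1:]:
        new_delta, new_run, ptr = {}, {}, {}
        for dst in states:
            best_score, best_src = -math.inf, dst
            for src in states:
                scale = cdf[src](run[src])
                if src == dst:
                    # Assumption: what is not used for leaving stays in src.
                    leaving = sum(scale * p for (a, _), p in trans.items() if a == src)
                    prob = max(0.0, 1.0 - leaving)
                else:
                    prob = scale * trans.get((src, dst), 0.0)
                score = delta[src] + log(prob)
                if score > best_score:
                    best_score, best_src = score, src
            new_delta[dst] = best_score + log(obs[dst])
            new_run[dst] = run[dst] + 1 if best_src == dst else 1
            ptr[dst] = best_src
        delta, run = new_delta, new_run
        backpointers.append(ptr)
    state = max(states, key=delta.get)
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]

slow = lambda d: 1.0 / (1.0 + math.exp(-0.03 * (10 * d - 500)))  # Eq. 4
fast = lambda d: min(1.0, 500 * d * 1e-6)                        # Eq. 3, capped
cdf = {"lying": slow, "standing": slow, "standing_up": fast, "lying_down": fast}
trans = {("lying", "standing_up"): 0.1, ("standing_up", "standing"): 0.5,
         ("standing", "lying_down"): 0.1, ("lying_down", "lying"): 0.5}
init = {s: 0.25 for s in cdf}
typical = {"lying": 0.7, "standing": 0.1, "standing_up": 0.1, "lying_down": 0.1}
outlier = {"lying": 0.1, "standing": 0.7, "standing_up": 0.1, "lying_down": 0.1}
emissions = [typical] * 5 + [outlier] + [typical] * 4
path = duration_aware_viterbi(emissions, init, trans, cdf)
```

In this toy run a single observation that favors “standing” in the middle of a lying phase does not change the decoded sequence, because a direct transition from “lying” to “standing” is not contained in the transition table and leaving “lying” after a few steps is strongly suppressed.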

4 Results and discussion

The validation of the classification model was done by cross validation on the described data set. This was done not on the level of single observations but on the level of time series. In this approach, the whole time series to be classified is excluded from the training data, which is taken from the remaining time series. Otherwise, the strong similarity between neighboring points would prevent an unbiased evaluation of the result quality.
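The splitting scheme on time series level can be sketched as follows; the generator name and the series identifiers are hypothetical.

```python
def leave_one_series_out(series_ids):
    """Yield (training ids, held-out id) so that every time series is
    classified by a model that has never seen any of its observations."""
    for held_out in series_ids:
        train = [s for s in series_ids if s != held_out]
        yield train, held_out

splits = list(leave_one_series_out(["cow1_p1", "cow2_p1", "cow1_p2"]))
```

Each split trains on all other time series and predicts the held-out one, so neighboring observations of the same series never end up on both sides of the split.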

For building a “Digital Dairy Diary”, the most probable sequence of labels given the observations and the physical constraints has to be stored. One sample time series is presented in Fig. 6, showing the accuracy of the classification by the SVM and its improvement by the additional use of the graphical model. SVMs as state-of-the-art classifiers achieve on their own an overall accuracy of 79.01 % (average over all states) for the classification of activity patterns. Thus, the SVM outperforms the widespread Naive Bayes approach (Acc = 56.33 %) and is comparable to the Random Forest classifier (Acc = 78.01 %).

Fig. 6 Prediction of activity states showing the different characteristics of the SVM classification result and the result of the graphical model. From top to bottom: (a) the manual annotation, (b) prediction of the SVM and (c) prediction of the CRF. The provided prediction accuracies are related to the shown time series. The pattern “activity” comprises the active patterns “walking”, “feeding” and “drinking”

The SVM results were improved by the graphical model (Fig. 6 and Table 2). The CRF introduces sequence information and can be adapted to the task of activity pattern determination by including time-dependent transition functions. Implausibly short sequences caused by outliers would interrupt longer sequences and reduce the accuracy of the extracted parameters. This is taken into account by introducing the semi-Markov process. The additional duration dependency has drastically reduced the number of state changes. Because the focus is on such uninterrupted sequences, additional quality measures are used: besides the accuracy in percent, the “Longest Common Subsequence” (LCS) and all “Common Subsequences” (CS) demonstrate the improved result quality of the CRF.
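The quality measures can be illustrated with a short sketch. The text does not give formal definitions of LCS and CS. The sketch therefore assumes that a common subsequence is a maximal run of consecutive time steps in which prediction and annotation agree, and that LCS is the longest such run. The function names and the label sequences are made up for illustration.

```python
def accuracy(annotation, prediction):
    """Fraction of time steps with matching labels."""
    hits = sum(1 for a, p in zip(annotation, prediction) if a == p)
    return hits / len(annotation)


def state_changes(labels):
    """Number of transitions between different consecutive labels."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def common_runs(annotation, prediction):
    """Lengths of maximal runs in which both sequences agree.

    Assumed interpretation of the "Common Subsequences" (CS).
    """
    runs, current = [], 0
    for a, p in zip(annotation, prediction):
        if a == p:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


# Hypothetical labels: L = lying, S = standing.
annotation = ["L"] * 5 + ["S"] * 3
prediction = ["L", "L", "S", "L", "L", "S", "S", "S"]

runs = common_runs(annotation, prediction)
lcs = max(runs)                   # longest common run
mean_cs = sum(runs) / len(runs)   # mean length of all common runs
```

A single outlier barely changes the accuracy but splits the lying phase in two, which is what the run-based measures and the number of state changes capture.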

Table 2 Comparison of classification results of the SVM and the CRF

The improvement is based on the smoothing mechanism of the CRF which is controlled by defined probability distribution functions (Fig. 5 and Table 2). In this way, the model is able to handle the high noise level with regard to specific states and underlying movement models.

The proposed method determines the activity patterns in a physically plausible order and creates an activity state sequence that is valid given the boundary conditions of the barn and the cattle physiology. In the cases of “standing up” and “lying down” in Fig. 5, the point in time was correctly determined according to the state transition model in Fig. 4. The model enforces this behavior through the included contextual knowledge.

The general trends in all analyzed time series correspond to the properties of the visualized time series in Fig. 5. The mean accuracies of the different states and the model quality parameters are shown in Table 2, confirming the desired effects. The model is able to distinguish reliably between “standing” and “lying”, although there is no height information from the positioning system. The resulting accuracies of the individual states reveal the difficulties of the monitoring task. Some states are very similar, e.g. “feeding”, “standing” and “walking”, which causes many misclassifications, and “drinking”, an extremely rare state in the data set, is never predicted (Table 3). The low accuracies of “walking” and “standing” are related to varying granularities of annotation. Typically, the cow switches multiple times between “standing”, “walking” and “feeding” during a visit to the feeding alley. Multiple feeding stations are visited, and in between the states “standing” and “walking” occur (Fig. 1). Therefore, the temporal accuracy and the level of summarization of the annotation have the highest impact on the validation result. The model captures these sequences by the most probable state series, enabling a high level of interpretability (Fig. 5).

Table 3 Accuracy partition of the “standing” states

Overall, the achieved accuracies should be evaluated in the light of high sensor noise, especially in the case of positioning systems. Standard deviations of more than 4 m prevent a reliable assignment to a box and considerably complicate the state determination. The reliable determination of the “lying” and “standing” states shows the potential of the proposed method even under harsh conditions.

The sequence of activity patterns (especially the count and length of the resting phases during a day) provides useful information about the physiological condition of individual cattle. Table 3 shows the improvement of the classification by the CRF. Slight improvements in all accuracies are accompanied by a significantly improved ability to interpret the classification results automatically (shown by LCS and mean CS). Long sequences like “lying” are classified without single errors, which enables the robust extraction of parameters like “longest lying sequence” or “count of standing up sequences”. The developed transition model (Fig. 4) supports the robust determination of long lying periods by the explicit modeling of “standing up” and “lying down”. The difference between slow walking and standing is, from a technical point of view, imprecise and, therefore, the annotation quality varies strongly. The smoothed state summary shown was intended and, as the longer states were in focus, does not reduce the value of the results for application.
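Once the label sequence is free of single-frame interruptions, parameters such as “longest lying sequence” or “count of standing up sequences” follow from a simple run-length encoding. The sketch below assumes a list of state labels at a fixed sampling interval. The function names and the sample sequence are hypothetical.

```python
from itertools import groupby


def run_lengths(labels):
    """Collapse a label sequence into (state, length) runs."""
    return [(state, sum(1 for _ in group)) for state, group in groupby(labels)]


def longest_run(labels, state):
    """Length (in samples) of the longest uninterrupted run of `state`."""
    return max((n for s, n in run_lengths(labels) if s == state), default=0)


def count_runs(labels, state):
    """Number of separate runs of `state`."""
    return sum(1 for s, _ in run_lengths(labels) if s == state)


# Hypothetical smoothed label sequence (one label per sample).
labels = (
    ["lying"] * 4
    + ["standing up"]
    + ["standing"] * 2
    + ["lying down"]
    + ["lying"] * 6
    + ["standing up"] * 2
    + ["standing"]
)

longest_lying = longest_run(labels, "lying")
standing_up_count = count_runs(labels, "standing up")
```

Multiplying the run lengths by the sampling interval yields durations. A single misclassified sample inside a lying phase would halve the longest run, which is why the smoothing matters for these parameters.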

The presented method overcomes the limitation of absent height information, which is essential for distinguishing “standing” and “lying”. The combination of two sensors (heart rate and location sensor) and the analysis of their signals with a combination of two classifiers significantly improves the classification results.

4.1 Automated derivation of resting pulse rate

During the study it became obvious that the variance of the heart rate (HR) while lying (Resting Pulse Rate, RPR) of an individual cow was distinctly lower than the variance within the whole herd. To make this explicit, the mean and the standard deviation of the median heart rate while lying on the three observation days were calculated per individual animal (Table 4).
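The calculation can be sketched as follows, assuming that the heart rate samples classified as “lying” are already grouped per animal and observation day. The function name and the sample values are invented for illustration and are not data from the study. The use of the sample standard deviation is likewise an assumption.

```python
from statistics import mean, median, stdev


def resting_pulse_rate(daily_lying_hr):
    """Mean and standard deviation of the daily median HR while lying.

    `daily_lying_hr` maps an observation day to the HR samples (bpm)
    recorded while the animal was classified as "lying".
    """
    daily_medians = [median(hr) for hr in daily_lying_hr.values() if hr]
    if len(daily_medians) < 2:
        raise ValueError("at least two observation days are required")
    # Sample standard deviation; the paper does not state which variant was used.
    return mean(daily_medians), stdev(daily_medians)


# Invented example values for one animal on three days.
cow = {
    "day1": [70, 72, 71],
    "day2": [73, 74, 72],
    "day3": [69, 71, 70],
}
rpr_mean, rpr_sd = resting_pulse_rate(cow)
```

Using the daily median rather than the daily mean keeps short disturbances during a lying phase from shifting the resting value.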

Table 4 Cattle specific Mean Resting Pulse Rate (RPR) and its standard deviation derived from at least three plausible observations

The mean resting pulse rate of the ten individual animals varied from 64.9 bpm to 81.0 bpm. The mean standard deviation of an individual cow was 1.96 bpm (within the range of 0.29 bpm to 3.66 bpm). Furthermore, the average heart rate per cow depended significantly on the respective pregnancy stage (p = 0.049; Kruskal-Wallis H test): on average, 65.0 bpm was recorded in the first 100 days (n = 2), 69.8 bpm for days 101–200 of pregnancy (n = 2) and 75.3 bpm in the last pregnancy period starting at day 200 (n = 4). Cow no. 44, with the longest pregnancy duration (day 218), had the highest resting pulse rate of 81.0 bpm (Fig. 7).

Fig. 7 Visualization of the relation between day of pregnancy and automatically derived RPR (Table 4). A significant correlation was verified

5 Conclusion and perspectives

We presented an approach for the classification of spatio-temporal activity states based on multiple sensors. We applied the method to a data set of dairy cows, but it can be generalized to the detection of activity sequences based on heterogeneous spatial and non-spatial sensor signals. We have shown that the method uses the non-spatial signals to identify activity states that have no spatial effect, such as the distinction between standing and lying without height information.

We have shown that the proposed method is able to improve the determination of activities by considering contextual knowledge about distribution information and constraints. The model design supports the inclusion of contextual knowledge by considering time- and space-dependent activity properties through the adaptation of the transition probabilities. Furthermore, an explicit movement model, as required for a Kalman filter, is not needed to combine spatial and non-spatial features. The machine learning algorithms derive the relations between the features and the activity states directly from the data. In this way, the result quality benefits particularly from the time series characteristic of the data. Additional prior knowledge about the inertia of the moving object improves the result quality further.

An advantage of combining two discriminative classification methods is that the unknown statistical distribution of the data does not have to be modeled; wrong assumptions about this distribution would otherwise decrease the model quality. Another advantage of the presented two-stage model is the improved interpretability of the result. The interim result of the SVM can be visualized (Fig. 6) and each step can be optimized on its own. The model is suited to determining the activity pattern by combining the position with an additional signal source in various scenarios. This task is highly relevant, e.g. for the increasing requirements for monitoring high-performance dairy cows.

The field of application for such a system will be strongly related to the public debate on animal welfare and behavior science. With minimal interference, extensive experiments on the response of cattle to different housing conditions and cowshed layouts can derive wellbeing standards based on objective, reproducible and automated observations.

Furthermore, the final result enables a clearer view of the individual properties and supports an automatic interpretation as the preliminary task for a “Digital Dairy Diary” (Fig. 2). Activity profiles recorded on a 24/7 basis can then be used for reconstructing the case history of disorders. The definition of “normal behavior” derived from a wider database will enable the detection of deviations, leading to a root cause analysis.

Moreover, it enables an automatic evaluation of the heart rate signal by detecting comparable situations on different days. An example of such an application is the automatic determination of a representative pulse rate while “lying” (the so-called resting pulse rate) without additional visual annotation. First analyses show a significant correlation between the resting pulse rate and physiological states. In human medicine, the heart rate measured during the resting phase (the resting pulse rate) is likewise used for clinical assessment purposes [52]. The resting pulse rates of the individual animals remained stable across the days of measurement and are thus a meaningful data source (Table 4). In human physiology it is known that the heart rate is affected by certain underlying conditions, regardless of internal and external stressors; these conditions include sex, age and pregnancy [52, 53]. In dairy cows it has been shown that the heart rate increases significantly with positive variations in body weight and over the course of the day [26]. Our own results show a statistically significant increase of the heart rate with increasing days of gestation in pregnant dairy cows. This relation is already known in human physiology [53], but has not yet been considered in studies of dairy cows. Further studies with a larger number of animals should be undertaken so that this presumption can be examined statistically in depth.

The sensor selection reflects a compromise between the amount of information and the degree of interference with normal cattle behavior. Great importance has been attached to the simultaneous sensing of non-redundant signals: the inner condition (heart rate sensor) and the outer condition (position sensor). The two sensor types complement each other and give an integrated view of the individual characteristics of the cattle. Further studies may take advantage of this potential.

The integration of additional sensors, such as rumination sensors or pedometers, into this model is possible and straightforward by expanding the feature vector. A combination with accelerometers would probably improve the accuracy of the positions derived from the LPM system by reducing the local noise and increasing the short-term precision. With this additional information, the classification quality is expected to increase. However, additional sensors require additional synchronization and increase the effort for continuous monitoring.