Background

The use of acoustic telemetry has expanded rapidly in recent years. Acoustic telemetry has been widely adopted in studies of marine mammals, reptiles (e.g., sea turtles, crocodiles), and fish, and has become an important tool in the study of spatial ecology in marine and freshwater systems [1–3]. Large-scale cooperative telemetry networks are now deployed around the world [4–6], including the Great Lakes Acoustic Telemetry Observation System [7], the Atlantic Cooperative Telemetry Network [8], and the Ocean Tracking Network [9].

A key objective of acoustic telemetry studies is localization of sources using data from arrays of receivers, i.e., estimation of the location of an individual source from detections at one or more receivers of an array. When regarded formally as an estimation problem, localization is essentially statistical triangulation, and it can be done when signals are obtained from an array of receivers configured so that multiple detections of the same signal are possible [10–15]. Localization may be based on simple detection history information (the pattern of receivers at which detections occur) and also auxiliary information on time delay of arrival at different receivers, or signal strength. Localization of sources is crucial for studies of spatial ecology, resource selection, and density estimation. The precision of localization is therefore an important objective function in many acoustic studies, where researchers often consider trade-offs in the number and spacing of receivers to optimize precision of localizations subject to logistical and financial constraints [16].

Multiple types of data can provide information important to localization from acoustic arrays. First, detections of signals at fixed receivers provide information on an individual source location. Non-detections also provide information on the location of an individual; however, most localization approaches do not use “observed zeros” (that is, locations of receivers where detections did not occur), and in some cases do not use occasions when individuals are detected at fewer than 3 receivers. Second, the location of an individual at occasion t should be informed to some extent by data from previous and subsequent occasions, with the degree of information decreasing as the time interval between observed source locations increases. Most methods of localization, however, do not explicitly integrate information about movement processes to inform the localization of acoustic sources (but see [17]).

Integrating movement processes and spatial capture-recapture (SCR, [18]) data in terrestrial systems has led to important methodological improvements that are relevant to localization in acoustic telemetry (e.g., [19–21]). Indeed, localization is analogous to inference about the activity or home-range center in SCR models, and therefore SCR ideas have been adapted to accommodate data obtained by acoustic sampling methods [15, 19, 22, 23]. The benefit of this SCR-based view of localization, what we refer to as statistical localization, is that it allows in situ estimation of parameters related to detection range along with simultaneous localization of source locations, and potentially other parameters that describe the detection process, the rate at which signals are produced, the distribution of individuals, and their movement through time [21, 24–26]. Moreover, SCR-based statistical localization uses all available detection history information, including the observed non-detections, and sources which are detected by only 1 or 2 receivers. Herein, we propose to modify and extend these SCR ideas to describe a general conceptual framework for localization in acoustic telemetry systems which integrates detection data from receiver arrays with explicit models of individual movement and signal (or “cue”) rates. The important advance of our work is recognition that integrating an explicit movement model with the estimation of source locations from spatial encounter histories introduces additional information into the localization process. In general, we believe that integrating an explicit movement model to link locations through time will improve localization, provide deeper insight into movement dynamics, resource selection and individual behavior, and allow coarser receiver spacing which may improve receiver array design.
Moreover, formulating the localization process in terms of a spatially explicit model of individual distribution, movement, and signal rates may lead to solutions to some similar outstanding problems in applications of acoustic monitoring related to estimation of density, resource selection, and movement.

Methods

Data structure and model

Let \(\mathbf{u}_{t}\) be the unknown location at the time the tth signal was produced. Signals are produced at t=1,2,…,T occasions and may be detected by one or more receivers in an array. Let \(\Delta_{t}\) be the interval between signal transmissions (hereafter “signals” or “transmissions”). In some cases \(\Delta_{t}\) is constant for all tags and prescribed by design but, in practice, when many tags are deployed at the same frequency the interval is often set to be random in order to introduce an offset in detection times of individuals and avoid interference among tags. For example, an individual tag might be set to emit a signal on a random schedule with a minimum of 50 seconds and a maximum of 100 seconds between signals. Thus \(\Delta_{t} \sim \text{Uniform}(50,100)\).

For demonstration purposes, we assume a Markovian movement process conditional on Δt according to (e.g., Brownian motion)

$$ \mathbf{u}_{t} \sim \text{Normal}(\mathbf{u}_{t-1}, \sigma_{u}^{2} | \Delta_{t} |) $$
(1)

although we note that a number of alternative movement models are possible [27]. The observed data for each of the T occasions are the locations of the receivers that detected the individual (including possibly none). To be consistent with SCR terminology we call this the spatial encounter history and denote it by the vector \(\mathbf{y}_{t}\), with elements \(y_{j,t}=1\) if the individual tag was detected by receiver j at transmission t and \(y_{j,t}=0\) otherwise. Receivers have coordinates \(\mathbf{x}_{j}\) which are fixed by design. In some cases we might have time-difference-of-arrival (TDOA) information but, in practice, such information may not be available and, instead, only a coarse summary of arrival time is given (e.g., rounded to whole second, [15]).
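As a rough illustration of the data-generating process in Eq. 1, the following Python sketch draws random signal intervals and Brownian-motion locations for a single tag. The function name `simulate_track`, the choice to place the first signal at time 0, and the reading of Eq. 1 as independent Normal steps in each coordinate are assumptions made for illustration; they are not taken from the authors' R code.

```python
import math
import random

def simulate_track(u0, sigma_u, a, b, total_time, rng):
    """Simulate signal times and source locations for one tag.

    Intervals are Uniform(a, b). Each coordinate of a step is drawn as
    Normal(0, sigma_u**2 * delta), an isotropic reading of Eq. 1.
    """
    times = [0.0]      # assumption: the first signal occurs at time 0
    locations = [u0]
    while True:
        delta = rng.uniform(a, b)             # interval to the next signal
        if times[-1] + delta > total_time:
            break
        step_sd = sigma_u * math.sqrt(delta)  # step SD grows with sqrt(interval)
        x, y = locations[-1]
        locations.append((x + rng.gauss(0.0, step_sd),
                          y + rng.gauss(0.0, step_sd)))
        times.append(times[-1] + delta)
    return times, locations

rng = random.Random(1)
times, locations = simulate_track((5.0, 5.0), sigma_u=0.25, a=50.0, b=100.0,
                                  total_time=3600.0, rng=rng)
intervals = [t1 - t0 for t0, t1 in zip(times, times[1:])]
print(len(times), min(intervals), max(intervals))
```

Because the intervals are random, the number of signals emitted in a fixed study period is itself a random outcome, which is the feature that later makes missed signals difficult to count.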

Localization based on the observed encounter history

Using only the spatial encounter history information, localization is achieved by application of Bayes’ rule to compute the conditional probability of ut given the observed encounter history yt, i.e., the posterior probability distribution of the latent source location ut,

$$ \Pr(\mathbf{u} | \mathbf{y}) = \frac{ \Pr(\mathbf{y} | \mathbf{u})\Pr(\mathbf{u}) } { \int \Pr(\mathbf{y} | \mathbf{u})\Pr(\mathbf{u}) d\mathbf{u} } $$
(2)

This requires specification of two probability distributions: (1) Pr(u) is the probability distribution of the source location which, lacking specific additional information, can be taken to be uniform over the planar region in the vicinity of the acoustic array (see Footnote 1). If explicit habitat features are available then parameters which allow for non-uniformity can be estimated [28, 29]. (2) We also require the probability distribution of the spatial encounter history Pr(y|u) which, for binary detection encounter data, is determined by a set of detection probabilities pj,t, normally taken to be a homogeneous function of distance between the source location u and the receiver locations xj. For example, the normal kernel is commonly used in distance sampling [30] and spatial capture-recapture applications:

$$ \ p_{j,t} \equiv \Pr(y_{j,t} = 1 | \mathbf{u}_{t}) = p_{0} \exp\left(- ||\mathbf{u}_{t} - \mathbf{x}_{j}||^{2} / (2\sigma_{det}^{2})\right) \ $$
(3)

which has parameters p0 and \(\sigma _{det}^{2}\). In acoustic telemetry applications a logistic model is often used [31]

$$ \text{logit}(p_{j,t}) = \alpha_{0} + \alpha_{1} ||\mathbf{u}_{t} - \mathbf{x}_{j}|| $$
(4)

or functions that allow detection to remain high at distances close to a receiver before declining [30] e.g.,

$$ \ p_{j,t} = 1-\exp\left(- \left(||\mathbf{u}_{t} - \mathbf{x}_{j}||^{2} / \sigma_{det}\right)^{-\theta} \right) $$
(5)

The parameters of various encounter models can be estimated by maximum likelihood without difficulty [19, 28, 32] and used as a plug-in or empirical Best Unbiased Predictor (BUP) [33] of the latent variable ut based on the posterior distribution (Eq. 2). From a practical standpoint the functional form of the detection model is not important, but note that these standard models are a function of some power of Euclidean distance and may take a variety of forms [18, 28].
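The three detection functions in Eqs. 3–5 can be written directly as functions of the source-to-receiver distance d. The sketch below transcribes each equation as printed; the function names and the numerical values of `alpha0`, `alpha1`, and `theta` are hypothetical and chosen only to show the shape of each curve.

```python
import math

def half_normal(d, p0, sigma_det):
    """Eq. 3: p0 * exp(-d^2 / (2 * sigma_det^2))."""
    return p0 * math.exp(-d * d / (2.0 * sigma_det ** 2))

def logistic(d, alpha0, alpha1):
    """Eq. 4: logit(p) = alpha0 + alpha1 * d."""
    return 1.0 / (1.0 + math.exp(-(alpha0 + alpha1 * d)))

def hazard(d, sigma_det, theta):
    """Eq. 5 as printed: 1 - exp(-(d^2 / sigma_det)^(-theta)); needs d > 0."""
    return 1.0 - math.exp(-((d * d / sigma_det) ** (-theta)))

# Tabulate the three curves over a few distances (illustrative values only).
for d in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(d,
          round(half_normal(d, p0=0.562, sigma_det=0.75), 4),
          round(logistic(d, alpha0=2.0, alpha1=-3.0), 4),
          round(hazard(d, sigma_det=0.75, theta=2.0), 4))
```

Under these illustrative settings the third function stays close to 1 near the receiver before dropping, which is the "shoulder" behavior described in the text.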

In practice, formal statistical estimation of the parameters from observed acoustic data is seldom done. Instead, the parameters are prescribed based on estimates from controlled “range testing” studies (e.g., [31]) in which transmitters are located at prescribed distances from one or more receivers and the detection of broadcast signals is modeled given known source locations. However, we believe that a better approach is to use formal models of the observed encounter history data in a spatial capture-recapture framework and estimate detection parameters directly from the observed data (e.g., following [19, 20]).
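To make the encounter-history localization of Eq. 2 concrete, the following sketch evaluates the posterior of a single source location on a discrete grid, using the half-normal model of Eq. 3 with plug-in parameter values and a uniform prior over the grid. The 3 × 3 receiver layout, the grid resolution, and the function names are assumptions for illustration; a grid sum stands in for the integral in the denominator.

```python
import math

def detection_prob(u, x, p0, sigma_det):
    """Half-normal detection probability (Eq. 3) for source u and receiver x."""
    d2 = (u[0] - x[0]) ** 2 + (u[1] - x[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma_det ** 2))

def grid_posterior(y, receivers, grid, p0, sigma_det):
    """Discrete version of Eq. 2 with a uniform prior over the grid cells."""
    weights = []
    for u in grid:
        likelihood = 1.0
        for yj, xj in zip(y, receivers):
            p = detection_prob(u, xj, p0, sigma_det)
            # Detections contribute p; non-detections contribute (1 - p).
            likelihood *= p if yj == 1 else (1.0 - p)
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 3 x 3 array with unit spacing; only the corner receiver (1, 1) detects.
receivers = [(float(i), float(j)) for i in range(1, 4) for j in range(1, 4)]
grid = [(0.1 * i, 0.1 * j) for i in range(41) for j in range(41)]
y = [1, 0, 0, 0, 0, 0, 0, 0, 0]
posterior = grid_posterior(y, receivers, grid, p0=0.562, sigma_det=0.75)
best = grid[max(range(len(grid)), key=posterior.__getitem__)]
print(best)
```

With a single detection, the non-detections at neighboring receivers shift the posterior mode away from the interior of the array, which illustrates why the "observed zeros" carry information about location.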

Movement-assisted localization

An obvious shortcoming of the approach described above is that localization of ut only uses information at occasion t. In practice, occasions with only 1 or 2 detections are sometimes not localized due to insufficient data, or alternatively, detection data are binned into longer time windows until a sufficient number of detections are available for localization. In the latter case the localization is producing an estimate of average location during the full time window. The obvious trade-off from a design standpoint is more detections at t improves localization, which is often accomplished by placing receivers closer together, thus requiring more receivers to sample a given area and greater cost. On the other hand, increasing the receiver spacing produces fewer detections, and fewer and more imprecise localizations.

We propose to resolve this trade-off formally by extending the localization model described above through the integration of an explicit movement model to simultaneously estimate the movement and detection processes. For example, instead of assuming Pr(ut) is uniform we replace this assumption with the Markovian movement assumption given above, that is

$$ \ \mathbf{u}_{t}|\mathbf{u}_{t-1} \sim \text{Normal}(\mathbf{u}_{t-1}, \sigma_{u}^{2} | \Delta_{t} |)\ $$
(6)

Then, in effect, the data from the previous (and subsequent) interval provide information about ut via the prior distribution Pr(ut|ut−1).

Use of this movement model based prior distribution is effectively an informative prior in the sense that it restricts potential states of ut to be in the vicinity of the previous state where the extent of this vicinity is determined by the parameter \(\sigma ^{2}_{u}\) as well as the sampling interval Δt. Thus, note that as Δt increases, the information provided by previous states diminishes rapidly and the prior tends to a uniform (non-informative) prior defined by the state-space.

When the signal schedule is known (see below), so that the non-detections are in effect observed, then the observed data are the spatial encounter histories yt for a given individual, for each t=1,2,…,T. This may include “all zero” observations (yt is a vector of zeros, indicating non-detection). Then, the joint distribution of the observed data and the latent movement trajectory u1,…,uT is

$$ {\begin{aligned} \Pr(\mathbf{y}_{1},\ldots,\mathbf{y}_{T}, \mathbf{u}_{1}, \ldots, \mathbf{u}_{T}) &= \prod_{t} \Pr(\mathbf{y}_{t} | \mathbf{u}_{t}, \sigma_{det}, p_{0})\\&\Pr(\mathbf{u}_{t} | \mathbf{u}_{t-1}, \sigma^{2}_{u}| \Delta_{t} |) \ \end{aligned}} $$
(7)
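A minimal sketch of the log of the joint distribution in Eq. 7 is given below, combining the half-normal encounter model (Eq. 3) with the movement model (Eq. 6). It assumes that the first location has a flat prior (so its term is dropped), that movement is isotropic, and that `deltas[t]` holds the interval preceding signal t; these conventions and all names are illustrative assumptions.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a univariate Normal(mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def log_joint(ys, us, deltas, receivers, p0, sigma_det, sigma_u):
    """Log of Eq. 7 for one tag, conditioning on the first location us[0]."""
    total = 0.0
    for t, (y, u) in enumerate(zip(ys, us)):
        # Encounter model: Bernoulli detections at every receiver.
        for yj, (rx, ry) in zip(y, receivers):
            d2 = (u[0] - rx) ** 2 + (u[1] - ry) ** 2
            p = max(p0 * math.exp(-d2 / (2.0 * sigma_det ** 2)), 1e-300)
            total += math.log(p) if yj == 1 else math.log1p(-p)
        # Movement model: Normal step with variance sigma_u^2 * delta_t.
        if t > 0:
            var = sigma_u ** 2 * deltas[t]
            total += log_normal_pdf(u[0], us[t - 1][0], var)
            total += log_normal_pdf(u[1], us[t - 1][1], var)
    return total

receivers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]   # third occasion is all-zero
deltas = [0.0, 1.5, 1.2]                          # deltas[0] is unused
smooth = [(0.1, 0.1), (0.2, 0.1), (0.3, 0.2)]
jumpy = [(0.1, 0.1), (3.0, 3.0), (0.3, 0.2)]
lj_smooth = log_joint(ys, smooth, deltas, receivers, 0.562, 0.75, 0.25)
lj_jumpy = log_joint(ys, jumpy, deltas, receivers, 0.562, 0.75, 0.25)
print(lj_smooth, lj_jumpy)
```

A trajectory with an implausibly large jump receives a much lower joint log density than a smooth one, even though the all-zero occasion alone says little about location.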

The inference objective is to jointly estimate the model parameters p0,σdet, and σu as well as the latent trajectory u1,…,uT. For this we adopt a Bayesian approach based on Markov chain Monte Carlo (MCMC) as described in “Inference for the movement-assisted localization model” section. When multiple tags are in operation simultaneously, data from all tags can be pooled for joint estimation of the parameters and latent trajectories from all encounter history data. In this case the encounter data are yi,t and the trajectories are ui,t for individual i and tth signal. Then the joint distribution to be analyzed is:

$$ {\begin{aligned} &\Pr(\mathbf{y}_{1,1},\ldots,\mathbf{y}_{N,T}, \mathbf{u}_{1,1}, \ldots, \mathbf{u}_{N,T}) = \prod_{i} \prod_{t}\\&\quad \Pr(\mathbf{y}_{i,t} | \mathbf{u}_{i,t}, \sigma_{det}, p_{0})\Pr(\mathbf{u}_{i,t} | \mathbf{u}_{i,t-1}, \sigma^{2}_{u}| \Delta_{i,t} |) \ \end{aligned}} $$
(8)

Inference for the movement-assisted localization model

In general, it is challenging to express the likelihood of the observed data (the spatial encounter histories including auxiliary information such as arrival time) as a function of the model parameters, because the model is formulated explicitly in terms of latent (unobserved) locations, the ut variables. The problem has many analogs in the statistical literature, as a Hidden Markov model [34], which suggests promise toward achieving a solution to constructing the likelihood. Instead, we suggest that Bayesian analysis of the model using standard methods of Markov chain Monte Carlo (MCMC) is relatively straightforward. This is facilitated by the use of existing software packages such as JAGS [35] which requires little more than a pseudo-code representation of the model. The model described in the previous section, where the interval duration schedule is known, is shown in Panel 1. This illustrates the simplicity of the movement-assisted localization framework which amounts to formal integration of a movement model with a spatial encounter model.

Here, we seek to estimate the model parameters, which include the detection range parameter σdet, the detection probability parameter p0 and also the movement variance parameter \(\sigma ^{2}_{u}\). However, the key quantities we wish to estimate in the context of localization are the posterior distributions for the locations of the tag at each signal occasion t=1,2,…,T given all available data. In the context of Bayesian analysis, localization is naturally achieved using the posterior distribution obtained by MCMC sampling. A key step in the construction of a Metropolis-within-Gibbs algorithm is that the full-conditional distribution for ut can be constructed by noting that,

$$ \ \pi(\mathbf{u}_{t} | \mathbf{y}_{t}, \ldots) \propto \Pr(\mathbf{y}_{t} | \mathbf{u}_{t}) \Pr(\mathbf{u}_{t} | \mathbf{u}_{t-1}) \Pr(\mathbf{u}_{t+1} | \mathbf{u}_{t}) \ $$
(9)

This does not have a convenient simplified form (to the best of our knowledge) but it does emphasize the point that information about the state of ut derives not only from the data yt but also from the previous (ut−1) and subsequent (ut+1) locations. In turn, those previous and subsequent locations are informed by detection data from t−1 and t+1, respectively. Thus our generalized localization model is using “all the data” in a manner that is prescribed by the specific movement and detection model imposed upon the system.
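The full conditional in Eq. 9 can be sampled with a random-walk Metropolis step. The sketch below is a simplified stand-in for the kind of update the text describes, not the authors' algorithm: the proposal standard deviation `step_sd`, the 3 × 3 receiver layout, the fixed parameter values, and the function names are all assumptions.

```python
import math
import random

def log_detection(y, u, receivers, p0, sigma_det):
    """Log Pr(y_t | u_t) under the half-normal model (Eq. 3)."""
    total = 0.0
    for yj, (rx, ry) in zip(y, receivers):
        d2 = (u[0] - rx) ** 2 + (u[1] - ry) ** 2
        p = max(p0 * math.exp(-d2 / (2.0 * sigma_det ** 2)), 1e-300)
        total += math.log(p) if yj == 1 else math.log1p(-p)
    return total

def log_move(u_to, u_from, sigma_u, delta):
    """Log density of an isotropic bivariate Normal step (Eq. 6)."""
    var = sigma_u ** 2 * delta
    d2 = (u_to[0] - u_from[0]) ** 2 + (u_to[1] - u_from[1]) ** 2
    return -math.log(2.0 * math.pi * var) - d2 / (2.0 * var)

def log_full_conditional(t, u, us, ys, deltas, receivers, p0, sigma_det, sigma_u):
    """Log of Eq. 9, up to an additive constant."""
    lp = log_detection(ys[t], u, receivers, p0, sigma_det)
    if t > 0:
        lp += log_move(u, us[t - 1], sigma_u, deltas[t])
    if t < len(us) - 1:
        lp += log_move(us[t + 1], u, sigma_u, deltas[t + 1])
    return lp

def metropolis_update(t, us, ys, deltas, receivers, p0, sigma_det, sigma_u, step_sd, rng):
    """One random-walk Metropolis update of us[t]; returns True if accepted."""
    cur = us[t]
    prop = (cur[0] + rng.gauss(0.0, step_sd), cur[1] + rng.gauss(0.0, step_sd))
    args = (us, ys, deltas, receivers, p0, sigma_det, sigma_u)
    log_ratio = log_full_conditional(t, prop, *args) - log_full_conditional(t, cur, *args)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        us[t] = prop
        return True
    return False

rng = random.Random(7)
receivers = [(float(i), float(j)) for i in range(1, 4) for j in range(1, 4)]
centre_only = [0, 0, 0, 0, 1, 0, 0, 0, 0]          # detected at receiver (2, 2)
ys = [centre_only, [0] * 9, centre_only]            # middle occasion is all-zero
us = [(2.0, 2.0), (2.5, 2.5), (2.2, 2.2)]           # neighbours held fixed here
deltas = [0.0, 1.5, 1.5]
draws, accepted = [], 0
for it in range(3000):
    accepted += metropolis_update(1, us, ys, deltas, receivers, 0.562, 0.75, 0.25, 0.2, rng)
    if it >= 1000:
        draws.append(us[1])
mean_x = sum(d[0] for d in draws) / len(draws)
mean_y = sum(d[1] for d in draws) / len(draws)
print(accepted, mean_x, mean_y)
```

Even though the middle occasion has no detections, the sampled locations remain concentrated between the two neighboring locations, which is the sense in which the movement model borrows information across occasions.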

Unknown duration between signals

In many acoustic telemetry applications the time interval between signals will not be known. Rather, devices are programmed to emit a signal on a random schedule, e.g., Δt∼Uniform(a,b), but the timing and number of such signals is not registered. In this case, when the interval between observed signals is long, there is uncertainty in the number of missed signals. The challenge of random intervals between signals in acoustic telemetry has many similarities to the estimation of cue rates in passive acoustic studies [23]. In both instances, an individual (or an individual’s tag) is producing a cue at some partially known rate, but the number of undetected cues remains unknown. For example, if the signal schedule for a telemetry tag is random with an interval of between 90 and 150 seconds and a 10 minute gap between detections is observed, then there are “missed signals” which have to be accommodated to achieve unbiased estimates of the detection process parameters. Clearly it is uncertain whether the 10 minute interval contained 3, 4, 5 or 6 signals that were missed. In this case, one option for including the data obtained at the 10 minute interval is to set Δt=10 and then Eq. 6 is correct for the observed interval. The effect of missed signals in this approach, however, is to decrease the information about the current location ut provided by previous and subsequent locations and to produce bias in estimates of the true detection parameters, because all missed signals are ignored.

Alternatively, we develop an MCMC algorithm that treats the number of missed signals and the interval duration between each successive missed signal as random variables that are estimated as part of the model. To do this, we integrate an additional sub-model to describe the signal rate and associated interval durations. The specific form of the signal rate sub-model can be tailored to any study (e.g., programmed acoustic tags versus passive cetacean cues), but generally requires two stochastic components to account for: (i) the number of missed signals and (ii) the interval length between missed signals. Broadly, our approach proceeds by first identifying the number of possible missed signals during each observed interval and initiating locations for all possible latent trajectories. During each MCMC iteration, a Metropolis-Hastings update proposes n, the number of missed signals for a given interval (and the respective trajectory), then accepts or rejects n conditional on all other model parameters. This approach requires that the Δt are not excessively small, so that the number of missed signals has reasonable lower and upper bounds. If this is not possible, the total number of missed signals could be summed across fixed intervals and localizations modeled as the average location during an interval [36] (see Discussion). Additional details on the signal rate sub-model are described below, while the R code is provided in Additional file 1.

The signal rate sub-model can be tailored to different protocols and formulated based on “interval duration” (Δt; e.g., defined transmitter settings in telemetry studies) or number of cues per unit time (n; e.g., animal calls). In most applications, assuming a distribution for interval duration also induces a distribution for the total number of missed signals and vice versa. For demonstration purposes, we develop a signal rate sub-model for acoustic tags programmed with signal intervals Δt∼Uniform(a,b). The constraints for estimating the number of missed signals (n) and their associated intervals (Δt,1:n) quickly become complex because, for any observed interval (Δobs, i.e., the length of the gap), there is a known minimum and maximum number of missed signals, a fixed minimum and maximum interval duration, and a requirement that the intervals must sum to Δobs. In our example, we use a normal approximation for the sum of uniform random variables to link Δt and n. Specifically, we assume:

$$ \ \Delta_{obs} \sim \text{Normal}(n(a+b)/2, n(b-a)^{2}/12)\ $$
(10)

where Δobs is the observed gap length and n is the latent number of missed signals during Δobs. In this example, n is an estimated parameter, while a, b, and Δobs are given as data. This approach greatly improved MCMC efficiency and provided reasonable estimates for our study system. However, a variety of approaches are possible depending on the study system. For example, the signal rate sub-model for n and Δt could be formulated as a Poisson process where n∼Poisson(λ) so that Δt∼Exponential(λ). This Poisson formulation may be particularly advantageous for studies focused on animal calls or cues such as the monitoring of whales or, in terrestrial systems, birds or primates. Changes in n, however, also cause dimensionality changes in the latent trajectory u as n is equivalent to the number of locations where a signal was emitted. As such, trajectories for all possible values of n are monitored and updated during each MCMC iteration. A Metropolis-Hastings update is used to accept or reject a proposed n conditional on its associated trajectory and other model parameters. For example, larger values of n are likely to be rejected when an individual is towards the interior of the array and detection probability is high. Conversely, in areas of low detection probability, n is primarily influenced by prior information on signal rate and duration of the observed interval (here, a, b, the Normal approximation equation, and Δobs). Numerous statistical and ecological extensions to this general signal rate sub-model are possible, however, this relatively simple example demonstrates the concept of a sub-model for signal rate. Inclusion of the signal rate sub-model is not possible in JAGS (that we are aware of) and instead we developed a custom MCMC algorithm in R (Additional file 1).
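The Normal approximation in Eq. 10 can be evaluated directly to see how an observed gap length informs n. The sketch below scores a set of candidate values of n using Eq. 10 alone, with n entering exactly as it does in that equation and with equal prior weight on each candidate; both the equal weighting and the function names are assumptions. It deliberately omits the detection and trajectory terms on which the Metropolis-Hastings update described above also conditions.

```python
import math

def log_gap_density(delta_obs, n, a, b):
    """Log density of Eq. 10: Normal(n(a+b)/2, n(b-a)^2/12) evaluated at delta_obs."""
    mean = n * (a + b) / 2.0
    var = n * (b - a) ** 2 / 12.0
    return -0.5 * math.log(2.0 * math.pi * var) - (delta_obs - mean) ** 2 / (2.0 * var)

def candidate_weights(delta_obs, a, b, n_values):
    """Normalized weights over candidate n, assuming equal prior weight on each."""
    logs = [log_gap_density(delta_obs, n, a, b) for n in n_values]
    top = max(logs)                        # subtract the max for numerical stability
    raw = [math.exp(v - top) for v in logs]
    total = sum(raw)
    return {n: r / total for n, r in zip(n_values, raw)}

# Example from the text: intervals of 90-150 s and an observed 10 minute (600 s) gap.
weights = candidate_weights(600.0, 90.0, 150.0, [3, 4, 5, 6])
for n, w in weights.items():
    print(n, round(w, 4))
```

In the full model these gap-length weights would be combined with the likelihood of the associated latent trajectory, so that, for example, large n becomes unlikely where detection probability is high.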

Illustration: Simulated data

We simulated 100 data sets for a system that involved 25 acoustically tagged individuals, subjected to sampling at 100 receivers on a 10 × 10 grid with unit spacing (Fig. 1). The R code for simulating this population and sampling array, as well as for fitting all the models and post-processing the output, is given in Additional file 1. The initial location of each tagged individual was distributed randomly over the area in the vicinity of the array as shown in Fig. 1. Each signal interval was generated as a uniform random variable where a = 1 and b = 2 time units (e.g., minutes) and sampling occurred across 150 time units (therefore the number of potential signals of each individual is a random outcome). The time scale here is arbitrary and the system can be rescaled by increasing or decreasing the time interval. We simulated detections using the half-normal model with \(p_{0}=\text{logit}^{-1}(0.25)=0.562\) and detection radius parameter σdet=0.75 in the standardized units shown in Fig. 1. The standard deviation of the Brownian motion movement process, σu, was set at 0.25. We selected these particular parameter settings so that the probability of detecting an individual within the array was >0.90, but quickly decreased as distance from the array increased (Fig. 2). As such, individual locations near the center of the array were detected at 0–6 receivers, while locations on the periphery were detected at 0–2 receivers (Fig. 2). The true movement trajectories and detection locations of two individuals are shown in Fig. 2. We see that signals produced at interior locations rarely go undetected, individuals at interior locations are often detected at >1 receiver, and the number of detections decreases as individuals move towards the periphery of the array.
This situation illustrates one of the important motivations for using an explicit movement model in localization: the observed detections are distinctly biased toward the interior of the receiver array where sampling is more intensive. Therefore, localizations using classical methods will also necessarily be biased toward areas of higher sampling intensity. This sampling bias must be accounted for in studies of movement and resource selection unless receiver placement itself is random with respect to habitat structure. In an SCR framework, however, signals that produce zero detections provide information on the detection process, and ignoring “all-zero” occasions may bias parameter estimates [18].
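The stated simulation settings can be checked with a short calculation of the probability that a signal is detected by at least one receiver. The sketch below assumes receivers at integer coordinates 1 to 10 in each dimension (the text specifies only a 10 × 10 grid with unit spacing) and independent detections across receivers; the function name is illustrative.

```python
import math

P0 = 1.0 / (1.0 + math.exp(-0.25))   # inverse logit of 0.25
SIGMA_DET = 0.75
# Assumption: receivers sit at integer coordinates (1..10, 1..10).
RECEIVERS = [(float(i), float(j)) for i in range(1, 11) for j in range(1, 11)]

def p_any_detection(u):
    """Probability that a signal emitted at u is detected by >= 1 receiver."""
    p_none = 1.0
    for rx, ry in RECEIVERS:
        d2 = (u[0] - rx) ** 2 + (u[1] - ry) ** 2
        p_none *= 1.0 - P0 * math.exp(-d2 / (2.0 * SIGMA_DET ** 2))
    return 1.0 - p_none

center = p_any_detection((5.5, 5.5))    # middle of the array, between receivers
outside = p_any_detection((5.5, 13.0))  # three units beyond the last receiver row
print(round(P0, 3), round(center, 3), round(outside, 4))
```

Under these assumptions the interior probability exceeds 0.90 while the probability three units outside the array is close to zero, in line with the pattern described for Fig. 2.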

Fig. 1 Experimental array used for simulated encounter data. Receiver array (blue x’s) has unit spacing. Initial positions (red dots) of 25 tagged individuals were simulated randomly on the plot region defined by buffering the receiver array by 5 σdet units (here 3.75 units). Subsequent positions of each individual were simulated by Brownian motion as described in the main text

Fig. 2 Location specific detection probability (probability of detecting a signal at ≥1 receiver) under current simulation settings (σdet = 0.75, p0 = 0.56; left). Receiver locations are denoted by blue x’s. True trajectories and encounter histories of two tagged individuals (right) where black dots denote locations when the acoustic tag signaled (filled [detected], open [not detected]). Locations with detections are shown in red, scaled to the number of detections (0 - 5 detections per location). The parameters σdet and p0 are easily varied to affect the expected probability of detection and the rate at which detection probability declines with distance from a receiver

We analyzed each simulated dataset using four modeling approaches: one that used independent localization based on detected locations, and three types of movement-assisted localization models. First, the independent localization model did not involve the movement model and used only occasions with >0 detections. Second, we fit a movement-assisted localization model using only detection occasions (hereafter “movement-assisted localization detection-only model”). Third, we fit a movement-assisted localization model where we assumed all interval durations (Δt) were known even when the tag was not detected (hereafter “movement-assisted localization known-interval model”). Although this approach allows estimation of all undetected signal locations, we buffered the first and last detection occasions by five intervals to prevent an inordinate number of estimated leading and trailing locations. Finally, we fit a movement-assisted localization model where the number of missed signals and their associated intervals are unknown and are instead estimated as part of the model (hereafter “movement-assisted localization unknown-interval model”). While this movement-assisted localization unknown-interval model is more complex, it likely describes the most realistic field sampling protocol. We compared these four modeling approaches to investigate biases in the SCR framework, improvements in localization due to the integration of a movement model, effects of ignoring all-zero occasions, and technological (or statistical) considerations if we can assume the number of unobserved signals is known or partially informed by transmitter settings.

We fit the first three models using the software JAGS [35] accessed through R version 3.5.2 [37], using the jagsUI package [38], with specifications similar to those shown in Panel 1 (see Additional file 1 for the full R script). We ran three parallel Markov chains for 15 000 iterations with 5000 burn-in iterations and 2000 adaptation iterations. Chains were thinned by 2 to reduce the size of model output. The movement-assisted localization unknown-interval model required a custom MCMC algorithm implemented in R (see Additional file 1 for the full R script) due to the challenges of the signal rate sub-model for updating the latent number of missed signals and their associated intervals and locations. Due to slower mixing, the movement-assisted localization unknown-interval model used three chains of 10 000 burn-in iterations and 50 000 saved iterations thinned by 10. We summarize posterior distributions as medians and 2.5 and 97.5 percentiles (95% CRI). We evaluated relative frequentist bias, using posterior means as point estimates, for higher level parameters (σdet,σu,p0). We expect some bias in the independent localization model and detection-only model as these methods discard the all-zero encounter occasions. To characterize localization efficiency, we evaluated three metrics: (i) the error of mean posterior location estimates relative to the true locations, (ii) the precision of the posterior localizations, and (iii) localization credible interval coverage. For the posterior means, we computed the Euclidean distance between each posterior mean and the true location (RMSE). For precision of the full posteriors, we calculated the Euclidean distance between each posterior estimate and the true location (hereafter precision). In both cases, smaller values are preferred. We also recorded the proportion of occasions in which the true location was within the 95% highest posterior density kernel, which we report as localization coverage.
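The first two localization metrics can be sketched as follows. The text does not spell out how the per-occasion distances are aggregated, so this sketch assumes a root mean square across occasions for RMSE and a simple mean across all posterior samples for precision; the function names and the toy data are hypothetical. Coverage based on a highest posterior density kernel is not shown.

```python
import math

def posterior_mean(samples):
    """Coordinate-wise mean of a list of (x, y) posterior samples."""
    n = len(samples)
    return (sum(s[0] for s in samples) / n, sum(s[1] for s in samples) / n)

def distance(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def rmse_of_means(posteriors, truths):
    """Root mean squared distance between posterior means and true locations."""
    sq = [distance(posterior_mean(s), u) ** 2 for s, u in zip(posteriors, truths)]
    return math.sqrt(sum(sq) / len(sq))

def precision(posteriors, truths):
    """Mean distance between every posterior sample and its true location."""
    dists = [distance(s, u) for samples, u in zip(posteriors, truths) for s in samples]
    return sum(dists) / len(dists)

# Toy example: two occasions, two posterior samples each.
posteriors = [[(0.0, 0.0), (2.0, 0.0)], [(1.0, 1.0), (1.0, 3.0)]]
truths = [(1.0, 0.0), (1.0, 1.0)]
rmse = rmse_of_means(posteriors, truths)
prec = precision(posteriors, truths)
print(rmse, prec)
```

The toy example shows why the two metrics differ: the first occasion has a posterior mean exactly at the truth (zero error) even though every individual sample is one unit away, so precision penalizes posterior spread that RMSE of the means does not.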

Results

Simulation results

Both the movement-assisted localization known-interval and unknown-interval models returned relatively unbiased estimates of the movement and detection parameters (Table 1). The independent localization and movement-assisted localization detection-only models displayed positive bias in p0 due to the exclusion of all-zero occasions (Table 1). In the detection-only model, σdet and σu displayed -1% and -9% relative bias, respectively (Table 1). The unknown-interval model reduced the relative bias in σdet and σu to 0% and -8.0%, respectively (Table 1), while reducing bias in p0 from 16% to -1% and providing additional inference to locations with zero detections.

Table 1 Mean and relative bias of movement (σu) and detection (σdet,p0) parameters from 100 simulated data sets analyzed using the independent localization model (Ind. loc) and three forms of movement-assisted localization using: (Mvmt DO) detection occasions only, (Mvmt KI) known signal intervals, or (Mvmt UI) unknown signal intervals, which are modeled as random variables

We observed large benefits in our primary objective of improving localization by integrating an underlying movement model. First, the three movement-assisted localization modeling approaches reduced RMSE by nearly one-half relative to the independent localization model when the number of detections was <3, and still provided noticeable improvements when the number of detections was >3 (Table 2, Fig. 3). Movement-assisted localization models achieved similar levels of RMSE and precision using 1 - 2 detections as the independent localization model achieved with 5 - 6 detections (Table 2). Credible interval coverage of ut was generally >0.90 for all models across all levels of detections (1 - 6 detections per occasion; Table 2).

Fig. 3 Latent trajectories of two individuals (red) with detection and non-detection locations (filled and open red dots, respectively; see Fig. 2 for additional details) and model-specific posterior localizations (2000 posterior samples; black points). Modeling approaches include (a) the independent localization model and (b - d) three forms of movement-assisted localization using: (b) detection occasions only, (c) assuming all time intervals are known, or (d) modeling the unknown time intervals as random variables. Receiver array is denoted by blue x’s

Table 2 Root mean squared error of posterior means (RMSE), precision of full posterior (Prec; smaller is better), and credible interval coverage (Cov) for localization of ut from 100 simulated data sets analyzed using the independent localization model (Ind loc) and three forms of movement-assisted localization: (Mvmt DO) detection occasions only, (Mvmt KI) assuming the signal interval is known, or (Mvmt UI) modeling the unknown signal interval as a random variable. Values are presented as a function of the number of receivers where the signal was detected. Only the known-interval model localizes to a specific ut when there are 0 detections. Due to their rarity, signals detected at 7 or 8 receivers were excluded from the results

Inference to locations with zero detections is possible in both the movement-assisted localization known-interval and unknown-interval models. The known-interval model returned relatively precise estimates of zero-detection locations and achieved close to nominal credible interval coverage (0.94, Table 2). Similarly, the unknown-interval model accurately estimated the number of missed signals per individual (relative bias = 2%) across a wide range of values (<5 to >100 missed signals per individual; Fig. 4). The unknown-interval model provides a set of posterior locations conditional on the latent number of missed signals (rather than a fixed number of locations as in the known-interval model), and as such RMSE and precision for specific zero-detection locations are not directly comparable (Table 2). Minimally biased estimates and close to nominal credible interval coverage of upper level parameters, missed signals, and localization (Table 1, Table 2, Fig. 4) together demonstrate the capacity of the unknown-interval model to simultaneously estimate movement, signal rate, and detection processes from acoustic data even when the numbers of missed signals are unknown.

Fig. 4

Number of true missed signals per simulated individual (100 simulated data sets of 25 individuals; y-axis) and the number of estimated missed signals from the movement-assisted localization unknown interval model (x-axis). Diagonal line is a 1:1 line. Relative bias = 2%

Integrating the movement process dramatically improved precision of posterior localizations while still maintaining nominal or close to nominal credible interval coverage (Table 2, Fig. 3). The increase in localization precision from the independent localization model to the movement-assisted localization detection-only model is due directly to the integration of the movement model (Fig. 3a vs b). The known-interval model further extends inference to locations with zero detections; thus the full posterior trajectory is slightly larger than that of the detection-only model (Fig. 3c vs b). Finally, the unknown-interval model relaxes the requirement of a known number of missed signals and the interval between those signals (Fig. 3d). Interestingly, localization at the level of the full trajectory was only minimally influenced by an unknown signal interval in these examples (Fig. 3c vs d).

The primary influence of the movement model is to restrict locations to regions consistent with the movement trajectory and to prevent unrealistically large movements between detection occasions, which improves accuracy and precision of the full trajectory (Fig. 3). Animations of occasion-specific localizations are provided in Additional file 2, further demonstrating the improved localization from movement-assisted localization methods at the single-occasion and full trajectory levels. For the unknown-interval model, animations present the marginal distribution of \(u_t\) when intervals consist of multiple zero detection occasions.
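The way a movement model restricts plausible locations can be illustrated with a small grid approximation. The Python sketch below is not the model fitted in this paper; it assumes a half-normal detection function, a Gaussian random-walk step from a previous location that is treated as known (in the full model that location is itself latent), and hypothetical parameter values and function names. It compares the posterior spread of a single location with and without the movement term, given a detection at only one of four receivers.

```python
import math


def detection_prob(u, x, p0=0.9, sigma_det=4.0):
    """Half-normal detection probability (assumed form) at receiver x for a source at u."""
    d2 = (u[0] - x[0]) ** 2 + (u[1] - x[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma_det ** 2))


def grid_posterior(grid, receivers, history, u_prev=None, sigma_move=1.5):
    """Posterior over grid cells from a 0/1 detection history.

    With u_prev=None the prior is uniform (independent localization);
    otherwise a Gaussian random-walk step from u_prev is used as the prior.
    """
    weights = []
    for u in grid:
        w = 1.0
        for x, y in zip(receivers, history):
            p = detection_prob(u, x)
            w *= p if y == 1 else (1.0 - p)  # non-detections also carry information
        if u_prev is not None:
            step2 = (u[0] - u_prev[0]) ** 2 + (u[1] - u_prev[1]) ** 2
            w *= math.exp(-step2 / (2.0 * sigma_move ** 2))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def spread(grid, post):
    """Posterior expected squared distance from the posterior mean location."""
    mx = sum(p * u[0] for u, p in zip(grid, post))
    my = sum(p * u[1] for u, p in zip(grid, post))
    return sum(p * ((u[0] - mx) ** 2 + (u[1] - my) ** 2) for u, p in zip(grid, post))


grid = [(x, y) for x in range(-10, 21) for y in range(-10, 21)]
receivers = [(0, 0), (10, 0), (0, 10), (10, 10)]
history = [1, 0, 0, 0]  # detected at the first receiver only

post_ind = grid_posterior(grid, receivers, history)
post_move = grid_posterior(grid, receivers, history, u_prev=(1, 1))
print(spread(grid, post_ind), spread(grid, post_move))
```

In this toy setting the movement term concentrates the posterior near the previous location, so the spread is smaller than under the uniform prior even though the detection data are identical.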

Discussion

We described a localization approach for acoustic telemetry data that combines classical ideas of movement modeling [27] and localization of sources from receiver arrays. Similar ideas exist in the field of spatial capture-recapture [18, 39]. In fact, localization of sources in acoustic telemetry is precisely analogous to estimation of an individual’s activity center in terrestrial spatial capture-recapture and the movement of activity centers through time [18, 21, 25, 26]. The relevance of the movement model, however, is of much greater importance in acoustic telemetry studies where a high frequency of observations can introduce considerably more information about individual locations over short time periods. Conversely, in many terrestrial capture-recapture studies spatial sampling is much coarser and temporal observations usually occur at a much lower frequency, thus the movement process itself often provides less information about individual locations and model parameters [21].

Localization from receiver array data only requires the spatial encounter history as we have demonstrated in this paper. However, it is possible to use auxiliary information on signal strength [22], time-difference-of-arrival (TDOA), or even both signal strength and TDOA simultaneously [15, 20]. Such auxiliary information often improves the precision of localizations, and can be included in the localization model as independent data with contribution to the likelihood depending on the nature of the auxiliary data [17, 20]. For example, TDOA is incorporated into the localization model by regarding arrival time of the tth signal at detector j as a normal random variable with mean \(E(x_{t,j} \mid u_{t}) = \beta _{0t} + \beta _{1}\|x_{j} - u_{t}\|\) or some similar function of distance, and normal measurement error of arrival time with variance \(\sigma _{err}^{2}\). Here, \(\beta _{0t}\) is the time the tth sound was generated and \(\beta _{1}\) is the inverse of the speed of sound in water or air depending on the context. Because \(\beta _{0t}\) is not known, the likelihood can be expressed in terms of the time difference between arrival at j and the mean arrival time. This requires that at least 2 measurements of arrival time be obtained for each localization [20]. Any model for localization based on auxiliary data, however, can be integrated into our movement-assisted localization framework.
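The mean-centering idea for arrival times can be sketched as follows. This Python example is a simplified illustration under stated assumptions, not the likelihood of [20]: it returns the log-likelihood of a candidate source location only up to an additive constant that does not depend on location, the function name and default values are hypothetical, and the speed of sound in water is taken as approximately 1500 m/s. Subtracting the expected travel time from each arrival time leaves the unknown emission time plus error, and centering those residuals on their mean removes the emission time.

```python
import math


def tdoa_loglik(u, receivers, arrival_times, beta1=1.0 / 1500.0, sigma_err=0.001):
    """Log-likelihood of source location u, up to a constant not depending on u.

    beta1 is the inverse speed of sound (s/m); sigma_err is the SD of arrival-time
    measurement error (s). The emission time cancels when residuals are centered.
    """
    if len(arrival_times) < 2:
        raise ValueError("at least 2 arrival times are required")
    resid = [t - beta1 * math.hypot(u[0] - x[0], u[1] - x[1])
             for x, t in zip(receivers, arrival_times)]
    mean_resid = sum(resid) / len(resid)
    ss = sum((r - mean_resid) ** 2 for r in resid)
    return -ss / (2.0 * sigma_err ** 2)


# Toy example: noise-free arrival times from a known source (coordinates in metres).
receivers = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
true_u = (300.0, 400.0)
emission_time = 12.5  # unknown in practice
times = [emission_time + math.hypot(true_u[0] - x[0], true_u[1] - x[1]) / 1500.0
         for x in receivers]

print(tdoa_loglik(true_u, receivers, times))          # approximately 0 (the maximum)
print(tdoa_loglik((700.0, 700.0), receivers, times))  # strongly negative
```

Because only centered residuals enter the calculation, adding any constant to all arrival times leaves the value unchanged, which is why the emission time need not be known.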

Classical localization based on hyperbolic positioning algorithms requires that individuals are detected simultaneously at multiple receivers, and improvements in localization precision are primarily achieved through increased density of receivers. Alternatively, we described a method that uses information on previous and subsequent locations of an individual to produce improved localizations with fewer detections, including even zero detections. Conceptually, our approach to movement-assisted localization is based on formulating the model for an individual’s spatial-temporal detection history conditional on the individual’s latent movement trajectory (i.e., its sequence of locations). This formulation is similar to specification of the likelihood in spatial capture-recapture models and we feel that it has a number of benefits to modeling acoustic telemetry data, including joint inference about the entire trajectory of an individual and model parameters based on all available data. Additionally, this approach allows for in situ estimation of the detection range parameter(s) in the model simultaneously with estimation of the latent trajectory. The framework can be extended to allow heterogeneity in detection range due to environment and other factors. For example, detection probability can be expressed as a function of covariates such as depth or distance from shore, or non-Euclidean effective distance models which allow for sound attenuation in heterogeneous environments [32].
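As one hedged illustration of covariate-dependent detection, the Python sketch below lets baseline detection probability vary with depth on the logit scale while detection declines with distance following a half-normal curve. The functional form, the coefficient names (`alpha0`, `alpha_depth`), and all default values are assumptions chosen for illustration; they are not estimates from this study.

```python
import math


def detection_prob_cov(distance, depth, alpha0=1.0, alpha_depth=-0.05, sigma=200.0):
    """Detection probability as a function of distance (m) and a depth covariate (m).

    Baseline probability follows a logit-linear model in depth (assumed form);
    the decline with distance is half-normal with scale sigma (assumed form).
    """
    p0 = 1.0 / (1.0 + math.exp(-(alpha0 + alpha_depth * depth)))
    return p0 * math.exp(-distance ** 2 / (2.0 * sigma ** 2))


for d in (0.0, 100.0, 300.0):
    print(d, round(detection_prob_cov(d, depth=10.0), 3))
```

Other covariates, or a non-Euclidean effective distance in place of `distance`, could be substituted in the same structure.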

A second important application of our model is in developing improved models of resource selection, habitat use, and identification of movement or migration corridors. Resource selection models usually regard individual locations as fixed known points, which is a reasonable assumption in classical terrestrial GPS-based telemetry (although see [40]). However, in acoustic telemetry applications locations of individuals are unobserved and inferred based on detections at known receiver locations. Considerable localization uncertainty reduces the power to detect important habitat effects. Our approach estimates the posterior distribution of the individual’s trajectory (Fig. 3), which yields a characterization of the latent point pattern of true locations of the individual during the study. The underlying point pattern of true locations can then be used to obtain a posterior characterization of any particular resource selection model conditional on the point locations. Thus, a coherent accounting for uncertainty in locations can propagate through to the inference about resource selection and movement processes, providing a solution to explicitly accommodating the variable precision of localizations in models of resource selection and similar applications. Moreover, our movement-assisted localization approach accommodates inherent bias in acoustic localization data that is due to estimated locations being biased toward areas of high receiver density (e.g., toward the interior of an array vs. the edge).
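Propagating localization uncertainty into a habitat-use summary can be sketched by evaluating the summary separately for each posterior draw of the trajectory. The Python example below is a minimal illustration: the function name, the habitat rule, and the three toy trajectories are hypothetical, and a real analysis would use a full resource selection model rather than a simple proportion.

```python
import statistics


def habitat_use_posterior(trajectory_draws, in_habitat):
    """For each posterior draw of a trajectory, compute the proportion of locations in a habitat."""
    return [sum(1 for u in traj if in_habitat(u)) / len(traj)
            for traj in trajectory_draws]


def in_west(u):
    """Hypothetical habitat rule: the habitat of interest is the region with x < 5."""
    return u[0] < 5


# Toy posterior draws of one individual's trajectory (not real output).
draws = [
    [(1, 1), (2, 1), (6, 1), (7, 2)],
    [(1, 2), (4, 1), (4, 2), (8, 2)],
    [(2, 1), (6, 1), (6, 2), (7, 1)],
]

use = habitat_use_posterior(draws, in_west)
print(use, statistics.mean(use))
```

The spread of the values in `use` reflects how uncertainty in the latent locations carries through to the habitat-use summary, rather than being hidden by treating a single estimated trajectory as known.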

A third potential application of this framework is for modeling the number of active tags in the vicinity of the array (i.e., abundance). In practice, the number of tagged individuals available for capture is rarely known. This is precisely the problem that spatial capture-recapture resolves by introducing parameter(s) to describe the density of the underlying point process, and it is a key motivating factor for viewing the problem as one of SCR. Here, SCR provides a framework not only to localize detected transmitters, but also to estimate the density and distribution of undetected transmitters. Further, it may then be possible to explicitly model the effect of density dependence on detection probability. When multiple tags of the same frequency exist in the vicinity of a receiver array, interference can induce a density dependent decrease in detection probability. We believe this is an important research area which can be resolved using our formulation of the movement-assisted localization model.

SCR methods provide a hierarchical framework to describe the three primary components of acoustic telemetry: (1) the movement of individuals, (2) the signal or cue rate, and (3) the detection process. Our simulation study described a relatively simple example to demonstrate SCR as a method to analyze acoustic telemetry data, and by no means explores all the possible applications or challenges. The greatest challenges to these methods arise from an unknown number of missed signals and the development of an appropriate signal rate sub-model. In many acoustic telemetry systems, the inter-signal duration is defined by transmitter settings and can encompass large periods of time. Under these settings, model complexity quickly increases due to the high number of possibly missed signals (n), the inter-signal intervals (Δt), and associated changing dimensions of trajectory locations (\(u_t\)). Broadly speaking, this is not a deficiency in the application of SCR methods but a restriction on the specific model described herein. For example, one approach to situations where tracking specific Δt is difficult or even unnecessary for inference is to standardize Δt to a fixed interval (e.g., 10-minute or 1-hr periods) and model localizations as the average location during an interval [36]. Overall, most of these challenges can be addressed by modifications to the modeling approaches we described, while retaining the general SCR framework and hierarchical approaches developed herein. Alternatively, some of these sampling challenges may be addressed by design or technological advances (e.g., by fixing the sampling interval or retaining data on the transmission sequence). In most situations, resolving uncertainties through careful planning and design is highly preferable to trying to resolve uncertainties with statistical models.
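Standardizing Δt to a fixed interval amounts to binning detections into equal-width time windows before modeling. The Python sketch below shows one way this might be done; the function name, the input format of (time in seconds, receiver identifier) pairs, and the 10-minute default are assumptions for illustration, not a description of the approach in [36].

```python
from collections import defaultdict


def bin_detections(detections, interval_s=600.0, t0=0.0):
    """Count detections per receiver within fixed-width time intervals.

    detections: iterable of (time_s, receiver_id) pairs.
    Returns {interval_index: {receiver_id: count}}; intervals with no
    detections are absent and correspond to all-zero occasions.
    """
    binned = defaultdict(lambda: defaultdict(int))
    for time_s, receiver_id in detections:
        k = int((time_s - t0) // interval_s)
        binned[k][receiver_id] += 1
    return {k: dict(counts) for k, counts in sorted(binned.items())}


# Toy detection log (hypothetical receiver names and times).
detections = [(30.0, "A"), (45.0, "B"), (610.0, "A"), (1900.0, "C"), (1950.0, "C")]
binned = bin_detections(detections)
print(binned)
```

In this toy log, interval 2 contains no detections; under a fixed-interval formulation it would enter the model as an occasion with zero detections at every receiver.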

Our application of movement-assisted localization focused on acoustic telemetry data; however, similar concepts are applicable to acoustic monitoring systems aimed at detecting signals, calls, or cues. In acoustic monitoring applications signal intervals are stochastic, being equivalent to the signal rate of individuals in the population. It is therefore essential to develop models for cue rate in order to make progress toward developing unbiased abundance and demographic parameter estimates from acoustic data. Our unknown-interval model does just this, as it estimates the number of signals (e.g., signals emitted from a tag [active] or cues produced by an animal [passive calls, clicks, singing, etc.]) and the duration between signals as part of the model. Prior information on cue rates may arise in the form of transmitter settings or auxiliary data on cue rates. In our example, integrating the latent movement, detection, and cue rate processes allowed estimation of all parameters with minimal data. Our movement-assisted localization framework is a particularly promising approach to estimate detection probability, a key step to accurate abundance estimation in any system [41]. Further, describing these processes through an underlying point process model and implementation in a Bayesian framework provides tremendous flexibility to integrate auxiliary information to improve parameter estimation and explore additional ecological processes (e.g., aerial- or boat-based count data to inform abundance, call rate data to inform cue rates, telemetry data to inform movement). Joint efforts from ecologists, statisticians, and engineers will be vital to solving these challenges and advancing the utility of acoustic studies to address pressing challenges in the study of movement, survival, and abundance in both the aquatic and terrestrial realms.

Finally, our movement-assisted localization framework provides a host of opportunities to evaluate integrated, multi-objective study designs. In general, the design of receiver arrays can be based on objectives that involve maximizing the probability of detecting a tagged individual or statistical precision of detection parameters [18, 42]. Combining a state-space movement model with a sub-model for the detection process (i.e., a positioning model; [17]) dramatically improved localization precision, whereby only 1 - 2 detections per occasion resulted in similar precision as 5 - 6 detections in the independent localization model. As such, study designs may consider coarser receiver spacing and broader spatial coverage without sacrificing localization precision. Overall, movement-assisted localization provides a flexible, transparent framework that is easily adapted to aquatic telemetry systems and studies interested in using acoustic data to investigate ecological processes.

Conclusions

Understanding animal movement and space-use is crucial for effective conservation and management of species. Acoustic telemetry data are increasingly used to study aquatic ecosystems and face many similar logistical and modeling challenges as spatial capture-recapture studies in terrestrial environments. Our results demonstrate a unifying framework to model acoustic telemetry data based on an underlying spatial capture-recapture model integrated with explicit movement and signal rate sub-models. This approach improves localization estimates and can be adapted to a variety of species- and study-specific settings such as unknown signal (or cue) rates, imperfect detection, and the integration of habitat data to inform individual- and population-level movement and space-use.