# Sequential Monte Carlo Methods for Joint Detection and Tracking of Multiaspect Targets in Infrared Radar Images

## Abstract

We present in this paper a sequential Monte Carlo methodology for joint detection and tracking of a multiaspect target in image sequences. Unlike the traditional contact/association approach found in the literature, the proposed methodology enables integrated, multiframe target detection and tracking that incorporates the statistical models for target aspect, target motion, and background clutter. Two implementations of the proposed algorithm are discussed using, respectively, a resample-move (RS) particle filter and an auxiliary particle filter (APF). Our simulation results suggest that the APF configuration slightly outperforms the RS filter in scenarios of stealthy targets.

## Keywords

Target Motion; Sequential Monte Carlo; Background Clutter; Joint Detection; Image Grid

## 1. Introduction

This paper investigates the use of sequential Monte Carlo filters [1] for joint multiframe detection and tracking of randomly changing multiaspect targets in a sequence of heavily cluttered remote sensing images generated by an infrared airborne radar (IRAR) [2]. For simplicity, we restrict the discussion primarily to a single target scenario and indicate briefly how the proposed algorithms could be modified for multiobject tracking.

Most conventional approaches to target tracking in images [3] are based on suboptimal decoupling of the detection and tracking tasks. Given a reference target template, a two-dimensional (2D) spatial matched filter is applied to a single frame of the image sequence. The pixel locations where the output of the matched filter exceeds a prespecified threshold are then treated as initial estimates of the true position of detected targets. Those preliminary position estimates are subsequently assimilated into a multiframe tracking algorithm, usually a linearized Kalman filter, or alternatively discarded as false alarms originating from clutter.

Depending on its level of sophistication, the spatial matched filter design might or might not take into account the spatial correlation of the background clutter and random distortions of the true target aspect compared to the reference template. In any case, however, in a scenario with dim targets in heavily cluttered environments, the suboptimal association of a single-frame matched filter detector and a multiframe linearized tracking filter is bound to perform poorly [4].

As an alternative to the conventional approaches, we introduced in [5, 6] a Bayesian algorithm for joint multiframe detection and tracking of known targets, fully incorporating the statistical models for target motion and background clutter and overcoming the limitations of the usual association of single-frame correlation detectors and Kalman filter trackers in scenarios of stealthy targets. An improved version of the algorithm in [5, 6] was later introduced in [7] to enable joint detection and tracking of targets with unknown and randomly changing aspect. The algorithms in [5, 6, 7] were however limited by the need to use discrete-valued stochastic models for both target motion and target aspect changes, with the "absent target" hypothesis treated as an additional dummy aspect state. A conventional hidden Markov model (HMM) filter was used then to perform joint minimum probability of error multiframe detection and maximum a posteriori (MAP) tracking for targets that were declared present in each frame. A smoothing version of the joint multiframe HMM detector/tracker, based essentially on a 2D version of the forward-backward (Baum-Welch) algorithm, was later proposed in [4]. Furthermore, we also proposed in [4] an alternative tracker based on particle filtering [1, 8] which, contrary to the original HMM tracker in [7], assumed a continuous-valued kinematic (position and velocity) state and a discrete-valued target aspect state. However, the particle filter algorithm in [4] enabled tracking only (assuming that the target was always present in all frames) and used decoupled statistically independent models for target motion and target aspect.

To better capture target motion, we drop in this paper the previous constraint in [5, 6, 7] and, as in the later sections of [4], allow the unknown 2D position and velocity of the target to be continuous-valued random variables. The unknown target aspect is still modeled, however, as a discrete random variable defined on a finite set of symbols, where each symbol is a pointer to a possibly rotated, scaled, and/or sheared version of the target's reference template. In order to integrate detection and tracking, building on our previous HMM work in [7], we extend that set to include an additional dummy state that represents the absence of a target of interest in the scene. The evolution over time of the target's kinematic and aspect states is then described by a *coupled* stochastic dynamic model where the sequences of target positions, velocities, and aspects are *mutually dependent*.

Contrary to alternative feature-based trackers in the literature, the proposed algorithm in this paper detects and tracks the target directly from the raw sensor images, processing pixel intensities only. The clutter-free target image is modeled by a nonlinear function that maps a given target centroid position into a spatial distribution of pixels centered around the (quantized) centroid position, with shape and intensity being dependent on the current target aspect. Finally, the target is superimposed on a structured background whose spatial correlation is captured by a noncausal Gauss-Markov random field (GMrf) model [9, 10, 11]. The GMrf model parameters are adaptively estimated from the observed data using an approximate maximum likelihood (AML) algorithm [12].

Given the problem setup described in the previous paragraph, the optimal solution to the integrated detection/tracking problem requires the recursive computation, at each frame, of the joint posterior distribution of the target's kinematic and aspect states conditioned on all observed frames from the initial instant up to the current instant. Given, however, the inherent nonlinearity of the observation and (possibly) motion models, the exact computation of that posterior distribution is generally not possible. We then resort to mixed-state particle filtering [13] to represent the joint posterior by a set of weighted samples (or particles) such that, as the number of particles goes to infinity, their weighted average converges (in some statistical sense) to the desired minimum mean-square error (MMSE) estimate of the hidden states. Following a sequential importance sampling (SIS) [14] approach, the particles may be drawn recursively from the coupled prior statistical model for target motion and aspect, while their respective weights may be updated recursively using a likelihood function that takes into account the models for the target's signature and for the background clutter.

We propose two different implementations for the mixed-state particle filter detector/tracker. The first implementation, which was previously discussed in a conference paper (see [15]), is a resample-move (RS) filter [16] that uses particle resampling [17] followed by a Metropolis-Hastings move step [18] to combat both particle degeneracy and particle impoverishment (see [8]). The second implementation, which was not included in [15], is an auxiliary particle filter (APF) [19] that uses the current observed frame to preselect those particles from the previous instant which, when propagated through the prior dynamic model, are more likely to generate new samples with high likelihood. Both algorithms are original with respect to the previous particle filtering-based tracking algorithm that we proposed in [4], where the problem of joint detection and tracking with coupled motion and aspect models was not considered.

Related Work and Different Approaches in the Literature

Following the seminal work by Isard and Blake [20], particle filters have been extensively applied to the solution of visual tracking problems. In [21], a sequential Monte Carlo algorithm is proposed to track an object in video subject to model uncertainty. The target's aspect, although unknown, is assumed, however, to be fixed in [21], with no dynamic aspect change. On the other hand, in [22], an adaptive appearance model is used to specify a time-varying likelihood function expressed as a Gaussian mixture whose parameters are updated using the EM [23] algorithm. As in our work, the algorithm in [22] also processes image intensities directly, but, unlike our problem setup, the observation model in [22] does not incorporate any information about the spatial correlation of image pixels, treating each pixel instead as an independent observation. A different Bayesian algorithm for tracking nonrigid (randomly deformable) objects in three-dimensional images using multiple conditionally independent cues is presented in [24]. Dynamic object appearance changes are captured by a mixed-state shape model [13] consisting of a discrete-valued cluster membership parameter and a continuous-valued weight parameter. A separate kinematic model is used in turn to describe the temporal evolution of the object's position and velocity. Unlike our work, the kinematic model in [24] is assumed statistically independent of the aspect model.

Rather than investigating solutions to the problem of multiaspect tracking of a single target, several recent references, for example, [25, 26], use mixture particle filters to tackle the different but related problem of detecting and tracking an unknown number of multiple objects with different but fixed appearance. The number of terms in the nonparametric mixture model, which represents the posterior of the unknowns, is adaptively changed as new objects are detected in the scene and initialized with a new associated observation model. Likewise, the mixture weights are also recursively updated from frame to frame in the image sequence.

Organization of the Paper

The paper is divided into 6 sections. Section 1 is this introduction. In Section 2, we present the coupled model for target aspect and motion and review the observation and clutter models, focusing on the GMrf representation of the background and the derivation of the associated likelihood function for the observed (target + clutter) image. In Section 3, we detail the proposed detector/tracker in the RS and APF configurations. The performance of the two filters is discussed in Section 4 using simulated infrared airborne radar (IRAR) data. A preliminary discussion on multitarget tracking is found in Section 5, followed by an illustrative example with two targets. Finally, we present in Section 6 the conclusions of our work.

## 2. The Model

In the sequel, we present the target and clutter models that are used in this paper. We use lowercase letters to denote both random variables/vectors and realizations (samples) of random variables/vectors; the proper interpretation is implied in context. We use lowercase *p* to denote probability density functions (pdfs) and uppercase *P* to denote the probability mass functions (pmfs) of discrete random variables. A separate symbol is used to denote the probability of an event in the σ-algebra of the sample space.

State Variables

Frames are indexed by nonnegative integers, and a superscript is used to denote the transpose of a vector or matrix. The kinematic state of the target at a given frame is defined as the four-dimensional continuous (real-valued) random vector that collects the two position coordinates and the two velocity components of the target's centroid in a system of 2D Cartesian coordinates. On the other hand, the target's aspect state at a given frame is assumed to be a discrete random variable that takes values in a finite set. One element of that set is a dummy state that denotes that the target is absent at that frame, and each of the remaining symbols is in turn a pointer to one possibly rotated, scaled, and/or sheared version of the target's reference template.

### 2.1. Target Motion and Aspect Models

The random sequence of joint kinematic and aspect states is modeled as a first-order Markov process specified by the pdf of the initial kinematic state, the transition pdf of the kinematic state, the transition probabilities of the aspect state, and the initial aspect probabilities.

Aspect Change Model

Assume that, at any given frame, for any aspect state, the clutter-free target image lies within a bounded rectangle. The size of that rectangle is fixed by four parameters: the maximum vertical pixel distances in the target image when we move away, respectively, up and down, from the target centroid and, analogously, the maximum horizontal pixel distances in the target image when we move away, respectively, left and right, from the target centroid.

We define an *extended grid* that contains all possible target centroid locations for which at least one target pixel still lies in the sensor image. The aspect transition probabilities in (2) are specified on that grid; the expressions involve a matrix of aspect transition parameters and the spatial resolutions of the image in the two coordinate directions.

One of the parameters in (2) denotes the probability of a new target entering the image once the previous target became absent. For simplicity, we restrict the discussion in this paper to the situation where there is at most one target of interest present in the scene at each image frame. The aspect probabilities adopted for an entering target correspond to assuming the worst-case scenario where, given that a new target entered the scene, there is a uniform probability that the target will take any of the possible aspect states. Finally, the remaining term in (2) is the probability of a target moving out of the image at the current frame given its kinematic and aspect states at the previous frame.

Motion Model

### 2.2. Observation Model and Likelihood Function

Next, we discuss the target observation model. Previous references mentioned in Section 1, for example, [21, 22, 24, 25, 26], are concerned mostly with video surveillance of near objects (e.g., pedestrian or vehicle tracking), or other similar applications (e.g., face tracking in video). For that class of applications, effects such as object occlusion are important and must be explicitly incorporated into the target observation model. In this paper, by contrast, the emphasis is on a different application, namely, detection and tracking of small, quasipoint targets that are observed by remote sensors (usually mid- to high-altitude airborne platforms) and move in highly structured, generally smooth backgrounds (e.g., deserts, snow-covered fields, or other forms of terrain). Rather than modeling occlusion, our emphasis is instead on additive natural clutter.

Image Frame Model

The clutter-free target image is specified by a set of *target signature coefficients* dependent on the aspect state. Specifically, following [4], the clutter-free target image in (8) is written as a sum, over the target rectangle around the centroid, of elementary matrices weighted by the signature coefficients. Each elementary matrix has the dimensions of the sensor image, and its entries are all equal to zero, except for a single element which is equal to 1.

For a given fixed template model, the coefficients in (8) are the target signature coefficients corresponding to that particular template. The signature coefficients are the product of a binary parameter, which defines the target shape for each aspect state, and a real coefficient, which specifies the pixel intensities of the target, again for the various states in the aspect alphabet. For simplicity, we assume that the pixel intensities and shapes are deterministic and known at each frame for each possible value of the aspect state. In particular, if the aspect state takes the dummy value denoting absence of target, then the target image function in (7) reduces to the identically zero matrix, indicating that sensor observations consist of clutter only.

*Remark.* Equation (8) assumes that the target's template is entirely located within the sensor image grid. Otherwise, for targets that are close to the image borders, the summation limits in (8) must be changed accordingly to take into account portions of the target that are no longer visible.

Clutter Model

As indicated in Section 1, the background clutter is represented by a noncausal GMrf model [9, 10, 11]. The model is stated in terms of expectations (or expected values) of random variables/vectors and of a term that is zero unless a stated condition holds.

Likelihood Function Model

For a present target, the likelihood function depends on two quantities, referred to as the *data term* and the *energy term*. On the other hand, for the absent target state, the likelihood function reduces to the likelihood of the absent target state, which corresponds to the probability density function of the observed frame assuming that the observation consists of clutter only.

A matrix associated with the GMrf clutter model has a *block-tridiagonal* structure. That structure is written using Kronecker products of identity matrices and of matrices whose entries are nonzero only under a stated index condition and are equal to zero otherwise.

Owing to that structure, the data term can be computed efficiently as the output of a *spatial matched filter*, see (15), applied to the output of a *differential operator*, see (16), with Dirichlet (identically zero) boundary conditions.
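As an illustration of how a Kronecker-structured, block-tridiagonal matrix and a differential operator with Dirichlet boundary conditions relate, the following is a minimal sketch. It assumes the common first-order GMrf form in which each pixel is coupled only to its four nearest neighbors; the function names and the parameters `beta_h` and `beta_v` are hypothetical and are not taken from the expressions in this section.

```python
import numpy as np

def first_offdiag(n):
    """n x n matrix with ones on the first upper and lower diagonals."""
    return np.eye(n, k=1) + np.eye(n, k=-1)

def gmrf_potential_matrix(rows, cols, beta_h, beta_v):
    """Assumed first-order GMrf matrix built from Kronecker products.

    Pixels are stacked row by row into a long vector, so kron(I, H)
    couples horizontal neighbors and kron(H, I) couples vertical ones.
    """
    eye_r, eye_c = np.eye(rows), np.eye(cols)
    return (np.kron(eye_r, eye_c)
            - beta_h * np.kron(eye_r, first_offdiag(cols))
            - beta_v * np.kron(first_offdiag(rows), eye_c))

def differential_operator(image, beta_h, beta_v):
    """Same operation applied directly on the image grid.

    Zero padding implements the Dirichlet (identically zero) boundary.
    """
    p = np.pad(image, 1)  # constant zero border
    return (image
            - beta_h * (p[1:-1, :-2] + p[1:-1, 2:])
            - beta_v * (p[:-2, 1:-1] + p[2:, 1:-1]))

# Hypothetical sizes and parameters, for illustration only.
A = gmrf_potential_matrix(3, 4, beta_h=0.2, beta_v=0.1)
img = np.arange(12, dtype=float).reshape(3, 4)
filtered = differential_operator(img, 0.2, 0.1)
```

Under these assumptions, multiplying the long-vector image by the matrix and filtering the image on the grid give the same result, which is why the matrix never has to be formed explicitly.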

Similarly, the energy term can also be computed efficiently by exploiting the block-banded structure of the clutter model. The resulting expression is the difference between the autocorrelation of the signature coefficients and their lag-one cross-correlations weighted by the respective GMrf model parameters. Before we leave this section, we make two additional remarks.

Remark 2.

As before, (15) is valid for centroid positions for which the target template lies entirely within the sensor image grid. For centroid positions close to the image borders, the summation limits in (15) must be varied accordingly (see [6] for details).

Remark 3.

Within our framework, a crude non-Bayesian single-frame maximum likelihood target detector could be built by simply evaluating the likelihood map for each aspect state and finding the maximum over the image grid of the sum of likelihood maps weighted by the a priori probability of each aspect state (usually assumed to be identical). A target would then be considered present if the weighted likelihood peak exceeded a certain threshold. In that case, the likelihood peak would also provide an estimate for the target location. The integrated joint detector/tracker presented in Section 3 outperforms, however, the decoupled single-frame detector discussed in this remark by fully incorporating the dynamic motion and aspect models into the detection process and enabling multiframe detection within the context of a track-before-detect philosophy.
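The crude single-frame detector of Remark 3 can be sketched as follows. This is only an illustration: the array layout (one likelihood map per aspect state), the function name, and the threshold value are assumptions, and the likelihood maps themselves are taken as given.

```python
import numpy as np

def single_frame_detector(likelihood_maps, priors, threshold):
    """Weighted sum of per-aspect likelihood maps followed by peak thresholding.

    likelihood_maps: array of shape (num_aspects, rows, cols), assumed given.
    priors: a priori probability of each aspect state, shape (num_aspects,).
    """
    combined = np.tensordot(priors, likelihood_maps, axes=1)  # (rows, cols)
    flat_peak = np.argmax(combined)
    peak_index = tuple(int(i) for i in np.unravel_index(flat_peak, combined.shape))
    peak_value = float(combined[peak_index])
    detected = peak_value > threshold
    # The peak location doubles as the position estimate when a target is declared.
    return detected, peak_index, peak_value

# Toy example with two aspect states on a 3 x 4 grid (hypothetical numbers).
maps = np.zeros((2, 3, 4))
maps[0, 1, 2] = 4.0
maps[1, 1, 2] = 2.0
maps[1, 0, 0] = 1.0
detected, location, peak = single_frame_detector(maps, np.array([0.5, 0.5]), 1.0)
```

Because each frame is processed in isolation, this detector cannot accumulate evidence over time, which is the limitation that the multiframe detector/tracker of Section 3 addresses.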

## 3. Particle Filter Detector/Tracker

### 3.1. Sequential Importance Sampling

- (1) *Initialization.* For each particle:
  - (i) Draw the initial kinematic state and the initial aspect state from their respective initial distributions.
  - (ii) Assign the initial importance weight and initialize the frame index.
- (2) *Importance Sampling.* For each particle:
  - (i) Draw the new aspect state according to (2).
  - (ii) Draw the new kinematic state according to (4) or (5).
  - (iii) Update the importance weight using the likelihood function in Section 2.2.
- (3) After all particles have been processed:
  - (i) Normalize the weights such that they sum to one.
  - (ii) For each particle, store the updated kinematic state, aspect state, and normalized weight.
  - (iii) Increment the frame index and go back to step 2.
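The steps above can be sketched as a single update in which the prior dynamic model is used as the importance function, so that the weight update reduces to a multiplication by the likelihood. The callables `sample_aspect`, `sample_kinematics`, and `likelihood` are hypothetical stand-ins for (2), (4) or (5), and the likelihood function of Section 2.2; the toy models in the usage lines are illustrative only.

```python
import numpy as np

def sis_step(states, aspects, weights, sample_aspect, sample_kinematics,
             likelihood, frame, rng):
    """One sequential importance sampling update (prior used as proposal)."""
    n = len(weights)
    new_states = np.empty_like(states, dtype=float)
    new_aspects = np.empty_like(aspects)
    new_weights = np.empty(n)
    for j in range(n):
        # Draw the aspect first, then the kinematic state given the aspect.
        new_aspects[j] = sample_aspect(states[j], aspects[j], rng)
        new_states[j] = sample_kinematics(states[j], aspects[j], new_aspects[j], rng)
        # Weight update: previous weight times the likelihood of the new sample.
        new_weights[j] = weights[j] * likelihood(frame, new_states[j], new_aspects[j])
    new_weights /= new_weights.sum()  # normalize so the weights sum to one
    return new_states, new_aspects, new_weights

# Toy usage: fixed aspect, random-walk motion, Gaussian-shaped likelihood
# around a 2D "observation" (all hypothetical).
rng = np.random.default_rng(0)
states0 = rng.normal(size=(50, 4))          # columns: x, vx, y, vy
aspects0 = np.ones(50, dtype=int)
weights0 = np.full(50, 1.0 / 50)
s1, z1, w1 = sis_step(
    states0, aspects0, weights0,
    lambda s, z, r: z,
    lambda s, z_old, z_new, r: s + r.normal(scale=0.1, size=4),
    lambda y, s, z: np.exp(-0.5 * np.sum((s[[0, 2]] - y) ** 2)),
    np.zeros(2), rng)
```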

### 3.2. Resample-Move Filter

The sequential importance sampling algorithm in Section 3.1 is guaranteed to converge asymptotically with probability one; see [27]. However, due to the increase in the variance of the importance weights, the raw SIS algorithm suffers from the "particle degeneracy" phenomenon [8, 14, 17]; that is, after a few steps, only a small number of particles will have normalized weights close to one, whereas the majority of the particles will have negligible weight. As a result of particle degeneracy, the SIS algorithm is inefficient, requiring the use of a large number of particles to achieve adequate performance.

Resampling Step

A possible approach to mitigate degeneracy is to resample from the existing particle population with replacement according to the particle weights [17]. Formally, after the normalization of the importance weights, we draw, for each new particle, an index into the existing population with probability equal to the corresponding normalized weight, and build a new particle set by copying the selected particles. After the resampling step, the newly selected trajectories are approximately distributed (see, e.g., [28]) according to the mixed posterior pdf, so that we can reset all particle weights to the same uniform value.
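A minimal sketch of resampling with replacement according to the normalized weights is given below. The function name and array layout are assumptions made for illustration.

```python
import numpy as np

def multinomial_resample(states, aspects, weights, rng):
    """Draw particle indices with replacement, with probability equal to the weights."""
    n = len(weights)
    idx = rng.choice(n, size=n, replace=True, p=weights)
    # Selected particles are copied and all weights are reset to 1/n.
    return states[idx], aspects[idx], np.full(n, 1.0 / n)

# Degenerate toy case: all the weight sits on particle 2, so every
# resampled particle is a copy of it.
toy_states = np.arange(16, dtype=float).reshape(4, 4)
toy_aspects = np.array([1, 2, 3, 4])
toy_weights = np.array([0.0, 0.0, 1.0, 0.0])
rs, rz, rw = multinomial_resample(toy_states, toy_aspects, toy_weights,
                                  np.random.default_rng(0))
```

The toy case also shows the side effect that the move step is meant to correct: after resampling, the population may consist of copies of very few distinct particles.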

Move Step

Although particle resampling according to the weights reduces particle degeneracy, it also introduces an undesirable side effect, namely, loss of diversity in the particle population, as the resampling process generates multiple copies of a small number of high-weight particles or, in the extreme case, of only one. A possible solution, see [16], to restore sample diversity without altering the sample statistics is to move the current particles to new locations using a Markov chain transition kernel that is invariant to the conditional mixture pdf from which the particles were drawn. Provided that the invariance condition is satisfied, the new particle trajectories remain distributed according to that pdf, and the associated particle weights may be kept equal to the uniform value. A Markov chain that satisfies the desired invariance condition can be built using the following Metropolis-Hastings strategy [15, 18].

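For each particle:

- (i) Draw a candidate aspect state according to (2).
- (ii) Draw a candidate kinematic state according to (4) or (5).
- (iii) Accept or reject the candidate according to the Metropolis-Hastings acceptance test; if the candidate is rejected, the particle keeps its current state.
- (iv) Reset the particle weight to the uniform value.

End-For.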
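The move step can be sketched as follows under one specific assumption: the candidate is drawn from the prior transition model conditioned on the particle's state at the previous frame. In that case the Metropolis-Hastings ratio reduces to a ratio of likelihoods. This acceptance rule is our assumption for illustration, and the callables are hypothetical stand-ins for (2), (4) or (5), and the likelihood of Section 2.2.

```python
import numpy as np

def mh_move(prev_states, prev_aspects, cur_states, cur_aspects,
            sample_aspect, sample_kinematics, likelihood, frame, rng):
    """Metropolis-Hastings move with the prior transition as proposal (assumed)."""
    n = len(cur_states)
    moved_states = np.array(cur_states, dtype=float)
    moved_aspects = np.array(cur_aspects)
    for j in range(n):
        # Propose a fresh state from the prior, starting at the previous frame.
        z_new = sample_aspect(prev_states[j], prev_aspects[j], rng)
        s_new = sample_kinematics(prev_states[j], prev_aspects[j], z_new, rng)
        lik_old = likelihood(frame, cur_states[j], cur_aspects[j])
        lik_new = likelihood(frame, s_new, z_new)
        # With this proposal the acceptance probability is min(1, lik_new / lik_old).
        ratio = 1.0 if lik_old == 0 else min(1.0, lik_new / lik_old)
        if rng.random() < ratio:
            moved_states[j] = s_new
            moved_aspects[j] = z_new
    return moved_states, moved_aspects, np.full(n, 1.0 / n)

# Toy check: a flat likelihood accepts every proposal.
rng = np.random.default_rng(0)
prev_s = np.zeros((5, 4))
prev_z = np.ones(5, dtype=int)
m_s, m_z, m_w = mh_move(prev_s, prev_z, prev_s.copy(), prev_z.copy(),
                        lambda s, z, r: z,
                        lambda s, z_old, z_new, r: s + 1.0,
                        lambda y, s, z: 1.0,
                        None, rng)
```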

### 3.3. Auxiliary Particle Filter

- (1) *Pre-sampling Selection Step.* For each particle:
  - (i) Draw a candidate aspect state according to (2).
  - (ii) Draw a candidate kinematic state according to (4) or (5).
  - (iii) Compute the first-stage importance weight using the likelihood function model in Section 2.2.
- (2) *Importance Sampling with Auxiliary Particles.* For each particle:
  - (i) Sample an auxiliary particle index with probability given by the first-stage importance weights.
  - (ii) Sample the new aspect state according to (2).
  - (iii) Sample the new kinematic state according to (4) or (5).
  - (iv) Compute the second-stage importance weight.

  End-For.
  - (v) Normalize the weights such that they sum to one.
- (3) *Post-sampling Selection Step.* For each particle:
  - (i) Draw a particle index with probability given by the normalized second-stage weights.
  - (ii) Copy the selected kinematic state and aspect state, and reset the weight to the uniform value.

  End-For.
  - (iii) Increment the frame index and go back to step 1.
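The three steps above can be sketched as one function. The sketch follows the standard auxiliary particle filter recipe, in which the second-stage weight is the likelihood of the new sample divided by the first-stage likelihood of its parent; that weight formula, the assumption of equally weighted incoming particles, and the callables are ours, used for illustration.

```python
import numpy as np

def apf_step(states, aspects, sample_aspect, sample_kinematics,
             likelihood, frame, rng):
    """One auxiliary particle filter update for equally weighted input particles."""
    n = len(states)
    # (1) Pre-sampling selection: score each particle through a pilot draw.
    first_stage = np.empty(n)
    for j in range(n):
        z_pilot = sample_aspect(states[j], aspects[j], rng)
        s_pilot = sample_kinematics(states[j], aspects[j], z_pilot, rng)
        first_stage[j] = likelihood(frame, s_pilot, z_pilot)
    # (2) Importance sampling from the preselected (auxiliary) parents.
    parents = rng.choice(n, size=n, p=first_stage / first_stage.sum())
    new_states = np.empty_like(states, dtype=float)
    new_aspects = np.empty_like(aspects)
    second_stage = np.empty(n)
    for j, k in enumerate(parents):
        new_aspects[j] = sample_aspect(states[k], aspects[k], rng)
        new_states[j] = sample_kinematics(states[k], aspects[k], new_aspects[j], rng)
        # Selected parents always have a positive first-stage likelihood.
        second_stage[j] = (likelihood(frame, new_states[j], new_aspects[j])
                           / first_stage[k])
    second_stage /= second_stage.sum()
    # (3) Post-sampling selection: resample so the output is equally weighted.
    keep = rng.choice(n, size=n, p=second_stage)
    return new_states[keep], new_aspects[keep]

# Toy usage with hypothetical models.
rng = np.random.default_rng(1)
a_states = rng.normal(size=(40, 4))
a_aspects = rng.integers(1, 5, size=40)
out_s, out_z = apf_step(
    a_states, a_aspects,
    lambda s, z, r: z,
    lambda s, z_old, z_new, r: s + r.normal(scale=0.2, size=4),
    lambda y, s, z: np.exp(-0.5 * np.sum((s[[0, 2]] - y) ** 2)),
    np.array([0.5, -0.5]), rng)
```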

### 3.4. Multiframe Detector/Tracker

At each frame, we obtain a *Monte Carlo estimate* of the posterior probability of target absence by dividing the number of particles whose aspect state is the dummy absent-target state by the total number of particles. The minimum probability of error test to decide between the target-present and target-absent hypotheses at that frame is then approximated by a decision rule based on that estimate.

Finally, if the target-present hypothesis is accepted, the estimate of the target's kinematic state at that instant is obtained from the Monte Carlo approximation of the posterior mean of the kinematic state, which is computed by averaging the particles whose aspect state is not the absent-target state.
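A minimal sketch of the detection and estimation step is given below. It assumes equally weighted particles (as obtained after a selection step), a hypothetical label `0` for the absent-target state, and the usual form of the minimum probability of error rule, which declares a target present when the estimated probability of absence is below one half; the threshold is our assumption.

```python
import numpy as np

def detect_and_estimate(states, aspects, absent_label=0):
    """Fraction-of-particles detector followed by a conditional-mean estimate."""
    # Monte Carlo estimate of the posterior probability of target absence.
    p_absent = float(np.mean(aspects == absent_label))
    # Assumed decision rule: choose the more probable hypothesis.
    present = p_absent < 0.5
    estimate = None
    if present:
        # Average only the particles that carry a present-target aspect.
        estimate = states[aspects != absent_label].mean(axis=0)
    return present, p_absent, estimate

# Toy example: one of four particles says "absent".
d_states = np.array([[0.0, 0.0, 0.0, 0.0],
                     [1.0, 1.0, 1.0, 1.0],
                     [3.0, 3.0, 3.0, 3.0],
                     [5.0, 5.0, 5.0, 5.0]])
d_aspects = np.array([0, 1, 1, 2])
present, p_abs, est = detect_and_estimate(d_states, d_aspects)
```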

## 4. Simulation Results

In this section, we quantify the performance of the proposed sequential Monte Carlo detector/tracker, both in the RS and APF configurations, using simulated infrared airborne radar (IRAR) data. The background clutter is simulated from real IRAR images from the MIT Lincoln Laboratory database, available at the CIS website at Johns Hopkins University. An artificial target template representing a military vehicle is added to the simulated image sequence. The simulated target's centroid moves in the image from frame to frame according to the simple white-noise acceleration model in [3, 4]. A total of four rotated, scaled, or sheared versions of the reference template are used in the simulation.
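For readers unfamiliar with the white-noise acceleration model, the following sketch simulates a centroid trajectory using its standard discrete-time form. The state ordering (position and velocity in each coordinate), the function names, and the parameter values are hypothetical and are not the settings used in our simulations.

```python
import numpy as np

def wna_matrices(T, q):
    """Transition and process-noise covariance matrices for a 2D target.

    T is the frame interval and q the acceleration noise intensity.
    State ordering (assumed): [x, vx, y, vy].
    """
    F1 = np.array([[1.0, T], [0.0, 1.0]])
    Q1 = q * np.array([[T**3 / 3.0, T**2 / 2.0], [T**2 / 2.0, T]])
    # The two coordinates evolve independently with identical dynamics.
    return np.kron(np.eye(2), F1), np.kron(np.eye(2), Q1)

def simulate_track(s0, n_frames, T, q, rng):
    """Propagate the state and add correlated Gaussian process noise."""
    F, Q = wna_matrices(T, q)
    chol = np.linalg.cholesky(Q)
    track = [np.asarray(s0, dtype=float)]
    for _ in range(n_frames):
        track.append(F @ track[-1] + chol @ rng.standard_normal(4))
    return np.array(track)

# Hypothetical parameter values, for illustration only.
F, Q = wna_matrices(T=0.5, q=2.0)
track = simulate_track([60.0, 2.0, 15.0, 2.0], n_frames=10, T=0.5, q=2.0,
                       rng=np.random.default_rng(0))
```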

The target's aspect changes from frame to frame following a known discrete-valued hidden Markov chain model where the probability of a transition to an adjacent aspect state is equal to 40%. All four templates are equally likely at frame zero. The initial positions of the target's centroid at instant zero are assumed to be uniformly distributed, respectively, between pixels 50 and 70 in the first coordinate and pixels 10 and 20 in the second coordinate. The initial velocities in the two coordinates are in turn Gaussian-distributed with identical means and a small standard deviation.

Finally, the background clutter for the moving target sequence was simulated by adding a sequence of synthetic GMrf samples to a matrix of previously stored local means extracted from the database imagery. The GMrf samples were synthesized using correlation and prediction error variance parameters estimated from real data using the algorithms developed in [11, 12]; see [4] for detailed pseudocode.

Two video demonstrations of the operation of the proposed detector/tracker are available for visualization by clicking on the links in [29]. The first video (peak target-to-clutter ratio, or PTCR ≈ 10 dB) illustrates the performance over 50 frames of an 8000-particle RS detector/tracker implemented as in Section 3.2, whereas the second video (PTCR ≈ 6.5 dB) demonstrates the operation over 60 frames of a 5000-particle APF detector/tracker implemented as in Section 3.3. Both video sequences show a target of interest that is tracked inside the image grid until it disappears from the scene; the algorithm then detects that the target is absent and correctly indicates that no target is present. Next, once a new target enters the scene, that target is acquired and tracked accurately until, in the case of the APF demonstration, it leaves the scene and no target detection is once again correctly indicated.

Both video demos show the ability of the proposed algorithms to (1) detect and track a present target both inside the image grid and near its borders, (2) detect when a target leaves the image and indicate that there is no target present until a new target appears and (3), when a new target enters the scene, correctly detect that the target is present and track it accurately. In the sequel, for illustrative purposes only, we show in the paper the detection/tracking results for a few selected frames using the RS algorithm and a dataset that is different from the one shown in the video demos.

The RMS estimation errors are computed *excluding* the divergent realizations. Our simulation results suggest that, despite the reduction in the number of particles from 8000 to 5000, the APF tracker still outperforms the RS tracker, showing similar RMS error performance with a slightly lower divergence rate. For both filters, in the nondivergent realizations, the estimation error is higher in the initial frames and decreases over time as the target is acquired and new images are processed.

## 5. Preliminary Discussion on Multitarget Tracking

We have considered so far a single target with uncertain aspect (e.g., random orientation or scale). In theory, however, the same modeling framework could be adapted to a scenario where we consider multiple targets with known (fixed) aspect. In that case, the discrete state, rather than representing a possible target model, could denote instead a possible multitarget configuration hypothesis. For example, if we knew a priori that there is a given maximum number of targets in the field of view of the sensor at each time instant, then the discrete state would take a finite number of possible values corresponding to the hypotheses ranging from "no target present" to "all targets present" in the image frame at each instant. The kinematic state, on the other hand, would have variable dimension depending on the value assumed by the discrete state, as it would collect the centroid locations of all targets that are present in the image given a certain target configuration hypothesis. Different targets could be assumed to move independently of each other when present and to disappear only when they move out of the target grid as discussed in Section 2. Likewise, a change in target configuration hypotheses would result in new targets appearing in uniformly random locations as in (5).

The main difficulty associated with the approach described in the previous paragraph is however that, as the number of targets increases, the corresponding growth in the dimension of the state space is likely to exacerbate particle depletion, thus causing the detection/tracking filters to diverge if the number of particles is kept constant. That may render the direct application of the joint detection/tracking algorithms in this paper unfeasible in a multitarget scenario. The basic tracking routines discussed in the paper may be still viable though when used in conjunction with more conventional algorithms for target detection/acquisition and data association. For a review of alternative approaches to multitarget tracking, mostly for video applications, we refer the reader to [30, 31, 32, 33].

### 5.1. Likelihood Function Modification in a Multitarget Scenario

In the likelihood function for the multitarget case, each target contributes a long-vector representation of its clutter-free image under a given target configuration hypothesis; this vector is assumed to be identically zero for target configurations under which that target is not present. The sum of the data terms corresponds to the sum of the outputs of different correlation filters, one matched to each of the possible (fixed) target templates, taking into account the spatial correlation of the clutter background. The energy terms, on the other hand, are constant for most possible locations of the targets on the image grid, except when one or both of the targets involved are close to the image borders. Finally, the cross-energy terms between two distinct targets are zero for present targets that are sufficiently far apart and therefore, most of the time, do not affect the computation of the likelihood function. The cross-energy terms must be taken into account, however, for overlapping targets; in that case, they may be computed efficiently by exploiting the sparse structure of the quantities involved. For details, we refer the reader to future work.
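As a rough illustration of the data and energy terms, the sketch below assumes white (uncorrelated, unit-variance) Gaussian clutter, so that all inner products are unweighted. The paper's clutter model is spatially correlated, which would replace the plain inner products with ones weighted by the inverse clutter covariance. Under the white-clutter assumption, the log-likelihood ratio against the "no target" hypothesis is the sum of the data terms minus one half of the sum of all self-energy and cross-energy terms. The helper names are hypothetical.

```python
def inner(a, b):
    """Plain inner product of two long vectors (white-clutter assumption)."""
    return sum(x * y for x, y in zip(a, b))


def place_template(template, top, left, height, width):
    """Row-major long vector of a template placed at (top, left).

    Pixels falling outside the grid are clipped, so the energy drops
    when the target is close to the image borders.
    """
    image = [0.0] * (height * width)
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            rr, cc = top + r, left + c
            if 0 <= rr < height and 0 <= cc < width:
                image[rr * width + cc] = float(value)
    return image


def log_likelihood_ratio(observation, target_images):
    """Data terms minus half of all (self and cross) energy terms.

    An absent target is represented by an all-zero vector and contributes nothing.
    """
    data_terms = sum(inner(observation, h) for h in target_images)
    energy_terms = sum(inner(hi, hj) for hi in target_images for hj in target_images)
    return data_terms - 0.5 * energy_terms


if __name__ == "__main__":
    square = [[1, 1], [1, 1]]
    far_a = place_template(square, 0, 0, 6, 6)
    far_b = place_template(square, 3, 3, 6, 6)
    near_b = place_template(square, 1, 1, 6, 6)
    print("cross-energy, separated targets:", inner(far_a, far_b))
    print("cross-energy, overlapping targets:", inner(far_a, near_b))
```

Because each placed template is nonzero only on a small patch, the cross-energy term vanishes whenever the patches do not intersect, which is why it can usually be skipped.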

### 5.2. Illustrative Example with Two Targets

… ≈ 12.5 dB, that preliminary acquisition was done by applying the differential filter in (16) to the initial frame and then feeding the output of the differential filter to a bank of two spatial matched filters as in (15), designed according to the signature coefficients of targets 1 and 2, respectively. The outputs of the two matched filters, minus the corresponding energy terms for targets 1 and 2, respectively, are finally added together and thresholded to provide the initial estimates of the locations of the two targets. Note that the cross-energy terms discussed in Section 5.1 may be ignored in this case since we assume that the two targets are initially sufficiently far apart. Frames 1 and 10 of the simulated cluttered sequence with the two targets are shown in Figures 7(a) and 7(b) for illustration purposes.
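The acquisition step can be sketched as follows, under several loudly stated assumptions: the clutter is treated as white, so the differential filter in (16) is skipped; each energy term is subtracted with a factor of one half, as in a Gaussian log-likelihood ratio; and the location estimate for each target is taken as the peak of its own matched-filter output. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
def correlate(image, template):
    """Valid-mode 2D correlation of a list-of-lists image with a template."""
    th, tw = len(template), len(template[0])
    out = []
    for r in range(len(image) - th + 1):
        row = []
        for c in range(len(image[0]) - tw + 1):
            row.append(sum(image[r + i][c + j] * template[i][j]
                           for i in range(th) for j in range(tw)))
        out.append(row)
    return out


def acquire(image, templates, threshold):
    """Return one peak position per template, or None if below threshold.

    The decision statistic adds, over all templates, the peak matched-filter
    output minus half the template energy (cross-energy terms ignored).
    """
    total = 0.0
    positions = []
    for template in templates:
        energy = sum(v * v for row in template for v in row)
        scores = correlate(image, template)
        best_score, best_pos = max(
            (score - 0.5 * energy, (r, c))
            for r, row in enumerate(scores)
            for c, score in enumerate(row))
        total += best_score
        positions.append(best_pos)
    return positions if total > threshold else None


if __name__ == "__main__":
    frame = [[0.0] * 8 for _ in range(8)]
    frame[1][1] = frame[2][2] = 3.0  # target 1, diagonal signature
    frame[5][5] = frame[6][4] = 3.0  # target 2, anti-diagonal signature
    bank = [[[3, 0], [0, 3]], [[0, 3], [3, 0]]]
    print(acquire(frame, bank, threshold=10.0))
```

Ignoring the cross-energy terms here mirrors the remark in the text that they can be dropped when the two targets start sufficiently far apart.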

## 6. Conclusions and Future Work

We discussed in this paper a methodology for joint detection and tracking of multiaspect targets in remote sensing image sequences using sequential Monte Carlo (SMC) filters. The proposed algorithm enables integrated, *multiframe* target detection and tracking incorporating the statistical models for target motion, target aspect, and spatial correlation of the background clutter. Due to the nature of the application, the emphasis is on detecting and tracking small, remote targets under additive clutter, as opposed to tracking nearby objects possibly subject to occlusion.

Two different implementations of the SMC detector/tracker were presented using, respectively, a resample-move (RS) particle filter and an auxiliary particle filter (APF). Simulation results show that, in scenarios with heavily obscured targets, the APF and RS configurations have similar tracking performance, but the APF algorithm has a slightly smaller percentage of divergent realizations. Both filters, on the other hand, were capable of correctly detecting the target in each frame, including accurately declaring the absence of a target when the target left the scene and, conversely, detecting a new target when it entered the image grid. The multiframe track-before-detect approach allowed for efficient detection of dim targets that may be nearly invisible in a single frame but become detectable when seen across multiple frames.

The discussion in this paper was restricted to targets that assume only a finite number of possible aspect states defined on a library of target templates. As an alternative for future work, an appearance model similar to the one described in [24] could be used instead, allowing the discrete-valued aspect states to denote different classes of continuous-valued target deformation models, as opposed to fixed target templates. Similarly, the framework in this paper could also be modified to allow for multiobject tracking as indicated in Section 5.

## Notes

### Acknowledgment

Part of the material in this paper was presented at the 2005 IEEE Aerospace Conference.

## References

- 1. Doucet A, Godsill S, Andrieu C: On sequential Monte Carlo sampling methods for Bayesian filtering. *Statistics and Computing* 2000, 10(3):197-208. doi:10.1023/A:1008935410038
- 2. Bounds JK: The Infrared airborne radar sensor suite. In *RLE Tech. Rep. 610*. Massachusetts Institute of Technology, Cambridge, Mass, USA; 1996.
- 3. Bar-Shalom Y, Li X: *Multitarget-Multisensor Tracking: Principles and Techniques*. YBS Publishing, Storrs, Conn, USA; 1995.
- 4. Bruno MGS: Bayesian methods for multiaspect target tracking in image sequences. *IEEE Transactions on Signal Processing* 2004, 52(7):1848-1861. doi:10.1109/TSP.2004.828903
- 5. Bruno MGS, Moura JMF: Optimal multiframe detection and tracking in digital image sequences. *Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey* 5: 3192-3195.
- 6. Bruno MGS, Moura JMF: Multiframe detector/tracker: optimal performance. *IEEE Transactions on Aerospace and Electronic Systems* 2001, 37(3):925-945. doi:10.1109/7.953247
- 7. Bruno MGS, Moura JMF: Multiframe Bayesian tracking of cluttered targets with random motion. *Proceedings of the International Conference on Image Processing (ICIP '00), September 2000, Vancouver, BC, Canada* 3: 90-93.
- 8. Arulampalam MS, Maskell S, Gordon N, Clapp T: A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. *IEEE Transactions on Signal Processing* 2002, 50(2):174-188. doi:10.1109/78.978374
- 9. Moura JMF, Balram N: Recursive structure of noncausal Gauss-Markov random fields. *IEEE Transactions on Information Theory* 1992, 38(2):334-354. doi:10.1109/18.119691
- 10. Moura JMF, Bruno MGS: DCT/DST and Gauss-Markov fields: conditions for equivalence. *IEEE Transactions on Signal Processing* 1998, 46(9):2571-2574. doi:10.1109/78.709549
- 11. Moura JMF, Balram N: Noncausal Gauss Markov random fields: parameter structure and estimation. *IEEE Transactions on Information Theory* 1993, 39(4):1333-1355. doi:10.1109/18.243450
- 12. Schweizer SM, Moura JMF: Hyperspectral imagery: clutter adaptation in anomaly detection. *IEEE Transactions on Information Theory* 2000, 46(5):1855-1871. doi:10.1109/18.857796
- 13. Isard M, Blake A: A mixed-state condensation tracker with automatic model-switching. *Proceedings of the 6th International Conference on Computer Vision, January 1998, Bombay, India* 107-112.
- 14. Doucet A, Freitas JFG, Gordon N: An introduction to sequential Monte Carlo methods. In *Sequential Monte Carlo Methods in Practice*. Edited by: Doucet A, Freitas JFG, Gordon NJ. Springer, New York, NY, USA; 2001.
- 15. Bruno MGS, de Araújo RV, Pavlov AG: Sequential Monte Carlo filtering for multi-aspect detection/tracking. *Proceedings of the IEEE Aerospace Conference, March 2005, Big Sky, Mont, USA* 2092-2100.
- 16. Gilks WR, Berzuini C: Following a moving target—Monte Carlo inference for dynamic Bayesian models. *Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 2001, 63(1):127-146. doi:10.1111/1467-9868.00280
- 17. Gordon N, Salmond D, Ewing C: Bayesian state estimation for tracking and guidance using the bootstrap filter. *Journal of Guidance, Control, and Dynamics* 1995, 18(6):1434-1443. doi:10.2514/3.21565
- 18. Robert CP, Casella G: *Monte Carlo Statistical Methods, Springer Texts in Statistics*. Springer, New York, NY, USA; 1999.
- 19. Pitt MK, Shephard N: Filtering via simulation: auxiliary particle filters. *Journal of the American Statistical Association* 1999, 94(446):590-599. doi:10.2307/2670179
- 20. Isard M, Blake A: Condensation—conditional density propagation for visual tracking. *International Journal of Computer Vision* 1998, 29(1):5-28. doi:10.1023/A:1008078328650
- 21. Li B, Chellappa R: A generic approach to simultaneous tracking and verification in video. *IEEE Transactions on Image Processing* 2002, 11(5):530-544. doi:10.1109/TIP.2002.1006400
- 22. Zhou SK, Chellappa R, Moghaddam B: Visual tracking and recognition using appearance-adaptive models in particle filters. *IEEE Transactions on Image Processing* 2004, 13(11):1491-1506. doi:10.1109/TIP.2004.836152
- 23. Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. *Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 1977, 39(1):1-38.
- 24. Giebel J, Gavrila DM, Schnörr C: A Bayesian framework for multi-cue 3D object tracking. *Proceedings of the 8th European Conference on Computer Vision (ECCV '04), May 2004, Prague, Czech Republic* 4: 241-252.
- 25. Vermaak J, Doucet A, Pérez P: Maintaining multi-modality through mixture tracking. *Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV '03), October 2003, Nice, France* 2: 1110-1116.
- 26. Okuma K, Taleghani A, de Freitas N, Little JJ, Lowe DG: A boosted particle filter: multitarget detection and tracking. *Proceedings of the 8th European Conference on Computer Vision (ECCV '04), May 2004, Prague, Czech Republic* 3021: 28-39.
- 27. Geweke J: Bayesian inference in econometric models using Monte Carlo integration. *Econometrica* 1989, 57(6):1317-1339. doi:10.2307/1913710
- 28. Liu JS, Chen R, Logvinenko T: A theoretical framework for sequential importance sampling with resampling. In *Sequential Monte Carlo Methods in Practice*. Edited by: Doucet A, Freitas JFG, Gordon NJ. Springer, New York, NY, USA; 2001:225-246.
- 29. Video Demonstration 1 & 2 http://www.ele.ita.br/~bruno
- 30. Ng W, Li J, Godsill S, Vermaak J: Tracking variable number of targets using sequential Monte Carlo methods. *Proceedings of the IEEE/SP 13th Workshop on Statistical Signal Processing, July 2005, Bordeaux, France* 1286-1291.
- 31. Ng W, Li J, Godsill S, Vermaak J: Multitarget tracking using a new soft-gating approach and sequential Monte Carlo methods. *Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), March 2005, Philadelphia, Pa, USA* 4: 1049-1052.
- 32. Hue C, Le Cadre J-P, Pérez P: Tracking multiple objects with particle filtering. *IEEE Transactions on Aerospace and Electronic Systems* 2002, 38(3):791-812. doi:10.1109/TAES.2002.1039400
- 33. Czyz J, Ristic B, Macq B: A color-based particle filter for joint detection and tracking of multiple objects. *Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), March 2005, Philadelphia, Pa, USA* 217-220.

## Copyright information

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.