Multicamera object tracking using surprisal observations in visual sensor networks
Abstract
In this work, we propose a multicamera object tracking method with surprisal observations based on the cubature information filter in visual sensor networks. In multicamera object tracking approaches, multiple cameras observe an object and exchange the object’s local information with each other to compute the global state of the object. The information exchange among the cameras is subject to bandwidth and energy constraints. Thus, allowing only a desired number of cameras with the most informative observations to participate in the information exchange is an efficient way to meet the stringent requirements of bandwidth and energy. In this paper, the concept of surprisal is used to calculate the amount of information associated with the observations of each camera. Furthermore, a surprisal selection mechanism is proposed that lets the cameras decide independently whether their observations are informative or not. If the observations are informative, the cameras calculate the local information vector and matrix based on the cubature information filter and transmit them to the fusion center. These cameras are called surprisal cameras. The fusion center computes the global state of the object by fusing the local information from the surprisal cameras. Moreover, the proposed scheme ensures that, on average, only a desired number of cameras participate in the information exchange. The proposed method shows a significant improvement in tracking accuracy over multicamera object tracking with randomly selected or fixed cameras for the same average number of transmissions to the fusion center.
Keywords
Kalman filters, Information filters, State estimation, Information entropy
1 Introduction
Object tracking is an extensively studied topic in visual sensor networks (VSN). A VSN is a network composed of smart cameras; they capture, process, and analyze the image data locally and exchange extracted information with each other [1]. The main applications of a VSN are indoor and/or outdoor surveillance, e.g., airports, massive waiting rooms, forests, deserts, inaccessible locations, and natural environments [2]. In general, the typical task of a VSN is to detect and track specific objects. The objects are usually described by a state that includes various characteristics of the objects such as position, velocity, appearance, behavior, shape, and color. These states can be used to detect and track the objects. Recursive state estimation algorithms are predominantly used to track objects in a VSN [3].
In [4, 5, 6, 7, 8, 9, 10, 11], the authors presented several Kalman filter (KF)-based object tracking methods. An extended Kalman filter (EKF)-based object tracking method is proposed in [12]. The unscented Kalman filter (UKF) is applied for visual contour tracking in [13] and object tracking in [14]. In terms of object tracking in a VSN, the cubature Kalman filter (CKF) is primarily applied in our previous work [15]. In [16, 17, 18, 19, 20, 21, 22, 23, 24], the authors presented particle filter (PF)-based object tracking. The object tracking methods based on these conventional Bayesian filters have varying degrees of complexity and accuracy.
In general, the performance of the tracking algorithms suffers from different adverse effects such as the distance or orientation of the camera, and occlusions. However, a VSN with overlapping fields of view (FOVs) is capable of providing multiple observations of the same object simultaneously. The authors in [25] presented a distributed and collaborative sensing mechanism to improve the observability of the objects by dynamically changing the camera’s pan, tilt, and zoom. Other examples of distributed object tracking methods are presented in [26] and [27].
Recently, information filters have emerged as suitable methods for multisensor state estimation [28]. In information filtering, the information vector and matrix are computed and propagated over time instead of the state vector and its error covariance. The information matrix is the inverse of the state error covariance matrix. The information vector is the product of the information matrix and the state vector. Information filters have an inherent information fusion mechanism, which makes them more suitable for multicamera object tracking. A more detailed description of information filters is given in Section 3. The authors in [29] and [30] presented information-weighted consensus-based distributed object tracking with an underlying KF or a distributed maximum likelihood estimation. In our work [31], we have presented a robust cubature information filter (CIF)-based distributed object tracking method in VSNs. However, the limited processing, communication, and energy capabilities of the cameras in a VSN present a major challenge.
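The two definitions given above (the information matrix as the inverse of the error covariance, the information vector as the product of the information matrix and the state) can be illustrated with a minimal Python sketch. The helper names `to_information` and `to_state` and the restriction to 2×2 matrices are assumptions made for this example only; they are not taken from the paper.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def to_information(x, p):
    """State form (x, P) -> information form (y, Y) with Y = P^-1, y = Y x."""
    y_mat = inv2(p)
    return matvec(y_mat, x), y_mat


def to_state(y_vec, y_mat):
    """Information form (y, Y) -> state form (x, P) with P = Y^-1, x = P y."""
    p = inv2(y_mat)
    return matvec(p, y_vec), p


# Example: a position estimate with different uncertainty per axis.
y_vec, y_mat = to_information([1.0, 2.0], [[2.0, 0.0], [0.0, 4.0]])
x_back, p_back = to_state(y_vec, y_mat)
```

Converting back and forth recovers the original estimate, which is why a filter can propagate either representation.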
Nowadays, VSNs tend to evolve into large-scale networks with limited bandwidth and energy reservoirs. This allows a large number of cameras to observe a single object. In spite of the improved tracking accuracy, exchanging a large number of observations among the cameras increases the communication overhead and energy consumption. Hence, allowing only a desired number of cameras to participate in the information exchange is a way to meet the stringent requirements of bandwidth and energy.
Estimating an object’s state with a selected set of cameras is a well-investigated topic. Several camera selection mechanisms have been proposed in the literature to minimize and/or maximize different metrics such as estimation accuracy, monitoring area, number of transmissions, and amount of data transfer. In [32], the authors presented an object tracking method based on a fuzzy automaton for handing over tracking between cameras to expand the monitoring area. This method selects a single best camera to control and track the objects by comparing its rank with the neighboring cameras. This method cannot select multiple cameras, and the cameras have to communicate with each other to select the best camera. In [33], the authors presented an efficient camera-tasking approach to minimize the visual hull area (maximal area that could be occupied by objects) for a given number of objects and cameras. They also presented several methods to select a subset of cameras based on the positions of the objects and cameras to minimize the visual hull area. If the objects are recognized in the vicinity of a certain location, then a subset of cameras that is best suited to observe this location performs the tracking. This method is capable of selecting multiple cameras but not the desired number of cameras on average. In [34], the authors presented a framework for dynamically selecting a subset of cameras to track people in a VSN with limited network resources to achieve the best possible tracking performance. However, the camera selection decision is made at the fusion center (FC) based on training data, and the selection is broadcast to the cameras in the VSN. Hence, this selection process does not depend on the true observations.
The observations received by the cameras in the VSN are typically realizations of a random variable. Hence, they contain a varying degree of information about the state of the object. They can be broadly classified into informative and non-informative observations. The non-informative observations do not contribute significantly to the tracking accuracy. Hence, a camera selection strategy that allows only a desired number of cameras with the most informative observations to participate in the information exchange and discards the cameras with non-informative observations is an efficient way to meet the requirements of bandwidth and energy.
In [35], the authors presented an entropy-based algorithm that dynamically selects multiple cameras to reduce transmission errors and subsequently communication bandwidth. In this work, the cameras in the VSN use the extended information filter (EIF) as the local filter and calculate the expected information gain (EIG) in the form of a logarithmic ratio of the expected and posterior information matrices. If the information gain is greater than the cost of transmissions, then the cameras participate in the information fusion. The calculated EIG in this method does not depend on the measurements directly, and the cluster head has to run an optimization step to select the best possible cameras at each step. Moreover, this method is not capable of selecting only a desired number of cameras on average. In [36], a camera set is selected based on an individual image quality metric (IQM) for spherical objects. The cameras that detect the spherical target are ranked in ascending order based on their value of the local IQM, and the required number of cameras with the highest IQM are chosen. This approach is limited to spherical objects. However, it can be easily extended to non-spherical objects. The major disadvantage of this method is that either all the cameras in the VSN or the FC must know the IQM of all the other cameras in the VSN. Hence, this method does not allow the cameras to make independent decisions, which restricts its scalability.
In our work, a multicamera object tracking method based on the CIF is proposed in which the cameras can take independent decisions on whether or not to participate in information exchange. Furthermore, the proposed method also ensures that on average, only a desired number of cameras participate in the information exchange to meet bandwidth requirements. We model the state of an object utilizing a dynamic state representation that includes its position and velocity on the ground plane. Further, we consider a VSN with overlapping FOVs; thus, multiple cameras can observe an object simultaneously. Each camera in the VSN has a local CIF on board. Hence, they can calculate the local information metrics (information contribution vector and matrix) based on their observations. The cameras that can observe a specific object form a cluster (observation cluster) with an elected fusion center (FC). In this paper, we consider the concept of surprisal [37] to evaluate the amount of information in the observations received by the cameras in the VSN. The surprisal of the measurement residual indicates the amount of new information received from the corresponding observation. The observations of a camera are informative only if the corresponding surprisal of the measurement residual is greater than a threshold. The threshold is calculated as a function of the ratio of the number of desirable cameras and the total number of cameras in the observation cluster. This ensures that on average, only the desired number of cameras are selected as the cameras with informative observations (surprisal cameras). The surprisal cameras calculate the local information metrics based on the CIF and transmit them to the FC. Then, the FC fuses the surprisal local information metrics to achieve the global state by using the inherent fusion mechanism of the CIF. 
The proposed selection mechanism only requires the knowledge of the total number of cameras in the observation cluster and the desired number of cameras. Further, we compare the proposed multicamera object tracking method with surprisal cameras with multicamera object tracking with random and fixed cameras using simulated and experimental data.
The paper is organized as follows: Section 2 describes the considered VSN with motion and observation models. Section 3 presents theoretical concepts of information filtering. Section 4 describes the camera selection based on the surprisal of the measurement residual and the calculation of the surprisal threshold. Section 5 explains the proposed CIF-based multicamera object tracking with surprisal cameras. Section 6 evaluates the proposed method based on simulation and experimental data. Finally, Section 7 presents the conclusions.
2 System model
where v _{ i,k } is an IID measurement noise vector with covariance R _{ i,k }. The measurement function h _{ i,k } is the nonlinear homography function which converts the object’s coordinates from the ground to the image plane. The considered motion model (1) and measurement model (4) are adapted from [27].
3 Information filtering
where \(\widehat{\mathbf{x}}_{k-1\mid k-1}\) and P _{ k−1|k−1} are the estimated global state vector and error covariance matrix at time k−1. At time k and camera c _{ i }, the information filter has two steps: a time update and a measurement update.
3.1 Time update
where \(\widehat{\mathbf{x}}_{i,k\mid k-1}\) and P _{ i,k|k−1} are the predicted state vector and the error covariance matrix, respectively.
3.2 Measurement update
where \(\widehat{\mathbf{z}}_{i,k\mid k-1}\) is the predicted measurement. In this work, the CIF is used at the cameras to track the objects locally. We refer to Appendix 1: time update (TU), Appendix 2: measurement update (MU), and [38] for the CIF algorithm.
3.3 Information fusion
where \(\widehat{\mathbf{y}}_{o,k\mid k-1}\) and Y _{ o,k|k−1} are the predicted information vector and matrix at the FC, respectively.
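In information filters, fusion is commonly written as a sum: the information contributions of the participating cameras are added to the predicted information vector and matrix at the FC. The following sketch assumes this standard additive form. The function name `fuse_information` and the nested-list matrices are hypothetical choices for illustration, not the paper's notation.

```python
def fuse_information(y_pred, Y_pred, contributions):
    """Add information contributions (i, I) to the predicted (y, Y).

    y_pred        : predicted information vector (list of floats)
    Y_pred        : predicted information matrix (nested lists)
    contributions : iterable of (i_vec, I_mat) pairs, one per camera
    Returns new lists; the inputs are left unchanged.
    """
    y_post = list(y_pred)
    Y_post = [list(row) for row in Y_pred]
    n = len(y_post)
    for i_vec, I_mat in contributions:
        for r in range(n):
            y_post[r] += i_vec[r]
            for c in range(n):
                Y_post[r][c] += I_mat[r][c]
    return y_post, Y_post


# Two cameras contribute; a camera that stays silent adds nothing.
y_post, Y_post = fuse_information(
    [1.0, 2.0],
    [[1.0, 0.0], [0.0, 1.0]],
    [([0.5, 0.5], [[2.0, 0.0], [0.0, 2.0]]),
     ([1.0, 0.0], [[1.0, 0.0], [0.0, 0.0]])],
)
```

Because the update is a plain sum, a camera that does not transmit simply drops out of the sum, which is what makes selective transmission straightforward in this framework.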
4 Surprisal camera selection
VSNs usually have limited bandwidth and energy reservoirs. Therefore, it might be necessary that only a desired number of cameras (a subset) transmit their local information to the FC. On the other hand, this can lead to decreased tracking accuracy. A better tracking accuracy can be achieved by selecting the cameras based on the information associated with their observations. This strategy improves the accuracy of the global state estimation under the given bandwidth and energy constraints. The information content associated with the observations can be calculated by applying the concept of self-information or surprisal.
4.1 Surprisal
where Pr(x) is the probability of the outcome x and the base of the logarithm can be considered as 2, 10, or e. In this paper, the surprisal is calculated with the natural logarithm (base e) for the sake of mathematical simplification. The surprisal of the outcome of a random variable depends only on the probability of the corresponding outcome Pr(x). A highly probable outcome of a random variable is less surprising and vice versa.
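As a brief illustration, the sketch below computes the surprisal of an outcome as the negative natural logarithm of its probability, which is the usual definition of self-information. The function name `surprisal` is chosen for this example.

```python
import math


def surprisal(prob):
    """Self-information of an outcome with probability `prob`, in nats."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log(prob)


# A rare outcome carries more information than a likely one.
rare = surprisal(0.01)
likely = surprisal(0.9)
```

A certain outcome (probability 1) has zero surprisal, and the surprisal grows without bound as the probability approaches zero.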
4.2 Surprisal of measurement residual
The cameras with sufficiently informative measurements are called surprisal cameras. The threshold χ _{ k } has to be defined based on the bandwidth and energy constraints in such a way that at each time k, on average, only a given number of cameras are selected as surprisal cameras.
4.3 Surprisal threshold
where \(\mathrm{F}_{\chi^{2}_{n_{\mathbf{z}}}}\) is the cumulative distribution function of the chi-square distribution \(\chi^{2}_{n_{\mathbf{z}}}\) with n _{ z } degrees of freedom.
Hence, the surprisal threshold β _{ k } at time k can be calculated by using the knowledge of the number of cameras in the observation cluster C _{ k } and the number of desirable surprisal cameras l _{ k }. Thus, the cameras c _{ i } in the cluster can independently decide whether their local observations are informative or not.
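One way to read the selection rule is sketched below under loudly stated assumptions: the measurement residual is taken to be zero-mean Gaussian, the measurement is two-dimensional (n_z = 2, e.g., image-plane coordinates), and the test is applied to the normalized squared residual, which is then chi-square distributed with 2 degrees of freedom. For 2 degrees of freedom the chi-square CDF is 1 − exp(−x/2), so the threshold that passes a fraction l _{ k }/C _{ k } of the cameras has a closed form. All function names are hypothetical, and this is not claimed to be the paper's exact threshold formula.

```python
import math
import random


def chi2_threshold_2dof(num_desired, cluster_size):
    """Threshold t with Pr(d >= t) = num_desired / cluster_size, d ~ chi2(2)."""
    if not 0 < num_desired <= cluster_size:
        raise ValueError("need 0 < num_desired <= cluster_size")
    return -2.0 * math.log(num_desired / cluster_size)


def normalized_sq_residual(e, p_zz):
    """e^T P_zz^-1 e for a 2-vector e and a 2x2 innovation covariance."""
    (a, b), (c, d) = p_zz
    det = a * d - b * c
    return (d * e[0] ** 2 - (b + c) * e[0] * e[1] + a * e[1] ** 2) / det


def is_surprisal_camera(e, p_zz, threshold):
    """Local decision: transmit only if the residual is surprising enough."""
    return normalized_sq_residual(e, p_zz) >= threshold


def empirical_selection_rate(num_desired, cluster_size, trials=20000, seed=0):
    """Fraction of simulated unit-covariance Gaussian residuals that pass."""
    rng = random.Random(seed)
    threshold = chi2_threshold_2dof(num_desired, cluster_size)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    hits = sum(
        is_surprisal_camera([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)],
                            identity, threshold)
        for _ in range(trials)
    )
    return hits / trials
```

Each camera needs only the two cluster sizes to evaluate the threshold, so the decision requires no exchange of residuals between cameras. Under the stated assumptions, `empirical_selection_rate(2, 5)` should come out close to 2/5.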
5 Multicamera object tracking with surprisal cameras (MOTSC)

Surprisal threshold calculation: The surprisal threshold can be calculated with the knowledge of the size C _{ k } of the observation cluster and desired size l _{ k } of the surprisal cluster as shown in (25). Hence, the FC which knows this information calculates and broadcasts the surprisal threshold whenever the observation and surprisal cluster sizes change.

Local filtering: The FC performs the local estimation based on its measurement z _{ o,k } by using the onboard CIF. Firstly, the FC predicts the information vector and matrix \(\left(\widehat{\mathbf{y}}_{o,k\mid k-1}, \mathbf{Y}_{o,k\mid k-1}\right)\) from the prior global information vector and matrix \(\left(\widehat{\mathbf{y}}_{k-1\mid k-1}, \mathbf{Y}_{k-1\mid k-1}\right)\) as shown in Appendix 1: time update (TU). Then, it computes the information contribution vector and matrix (i _{ o,k },I _{ o,k }) by using its own local observations z _{ o,k } as shown in Appendix 2: measurement update (MU).

Information fusion: The FC receives a set of information contribution metrics (i _{ i,k },I _{ i,k }) where i=1,2,⋯,l _{ k } from the surprisal cameras in the cluster. The global information vector and information matrix \(\left(\widehat{\mathbf{y}}_{k\mid k}, \mathbf{Y}_{k\mid k}\right)\) are obtained by fusing the received surprisal information contributions and its own information contributions (i _{ f,k },I _{ f,k }) with the predicted information vector and matrix \(\left(\widehat{\mathbf{y}}_{k\mid k-1}, \mathbf{Y}_{k\mid k-1}\right)\).

Global state dissemination: After the information vector and matrix \(\left(\widehat{\mathbf{y}}_{k\mid k}, \mathbf{Y}_{k\mid k}\right)\) are computed, the FC broadcasts them in the network. Hence, the cameras in the network have the global knowledge, which can be used as prior information for the local filtering in the time step k+1.

Time update: The camera predicts the information vector and matrix \(\left(\widehat{\mathbf{y}}_{i,k\mid k-1}, \mathbf{Y}_{i,k\mid k-1}\right)\) from the prior global information vector and matrix \(\left(\widehat{\mathbf{y}}_{k-1\mid k-1}, \mathbf{Y}_{k-1\mid k-1}\right)\) using the CIF time update as shown in Appendix 1: time update (TU).

Surprisal update: Each camera receives the surprisal threshold β _{ k } from the FC whenever the observation and/or surprisal cluster size changes. Upon receiving the measurement z _{ i,k }, each camera c _{ i } calculates the corresponding measurement residual and innovation covariance (e _{ i,k },P _{ zz,i,k }). The proposed surprisal threshold rule in Section 4.3 is used to determine whether it is a surprisal camera or not. If the camera is a surprisal camera, the information contribution vector and matrix (i _{ i,k },I _{ i,k }) are calculated according to (9) and (10). Thereafter, the information metrics are transmitted to the FC. If the camera is not a surprisal camera, then the surprisal update is aborted.
After the surprisal update, each camera c _{ i } in the network receives the global information \(\left(\widehat{\mathbf{y}}_{k\mid k}, \mathbf{Y}_{k\mid k}\right)\) from the FC. Hence, each camera in the network has the knowledge of the global state of the object, which can also be used as the prior information in the local estimation for the next time step k+1.
In this paper, the FC is assumed to be fixed and not affected by node failures. It is also assumed that the delays in transmitting local information to the FC are all less than the sampling interval of the cameras. Thus, the FC can fuse the arriving information contributions in time. The communication links in the network are assumed to be perfect. Hence, the only cause of a missing information metric from a camera is that the corresponding observations are not informative enough.
6 Results
In this section, the efficiency of the proposed MOTSC method is evaluated based on simulation and experimental data. In our approach, the efficiency is defined in terms of the sum of the root mean square errors (RMSEs) between the estimated global state and the ground truth in the x and y directions. Moreover, the energy and bandwidth efficiency are calculated in terms of the average number of transmissions from the cameras in the observation cluster to the FC.
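The accuracy metric described above can be sketched as follows: the RMSE is computed separately for the x and y coordinates over all time steps and the two values are added. The function names and the representation of a track as a list of (x, y) tuples are assumptions for illustration; whether the paper additionally averages over simulation runs is not specified here.

```python
import math


def rmse(estimates, truths):
    """Root mean square error between two equally long sequences."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


def summed_xy_rmse(estimated_track, true_track):
    """Sum of the RMSE in x and the RMSE in y over a track of (x, y) points."""
    rmse_x = rmse([p[0] for p in estimated_track], [p[0] for p in true_track])
    rmse_y = rmse([p[1] for p in estimated_track], [p[1] for p in true_track])
    return rmse_x + rmse_y


# Example with a two-step track.
error = summed_xy_rmse([(1.0, 1.0), (2.0, 2.0)], [(0.0, 1.0), (2.0, 4.0)])
```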
6.1 Simulation results
6.1.1 Scenario 1
In this scenario, the accuracy of the CIF- and EIF-based object tracking methods in the VSN is compared. In this comparison, the proposed surprisal selection method is not employed. Hence, all the cameras in the observation cluster participate in the information fusion. In the above-mentioned simulation setup, each camera calculates the local information metrics based on the local observations. The information metrics from the local cameras are fused at the FC. Moreover, the process noise covariance Q _{ k } and measurement noise covariance R _{ k } are considered to be known to all the cameras in the cluster. The cluster is also assumed to be fully connected with perfect communication links to the FC.
6.1.2 Scenario 2

Multicamera object tracking with random cameras (MOTRC): A random subset of cameras in the observation cluster transmit their local information metrics to the FC independent of the information contained in their measurements.

Multicamera object tracking with fixed cameras (MOTFC): A fixed subset of cameras in the observation cluster transmit their local information metrics to the FC.

Multicamera object tracking with best cameras (MOTBC): All the cameras in the observation cluster C _{ k } send their surprisal of the measurement residual to the FC. The FC ranks the cameras in ascending order of their surprisal score and informs the l _{ k } best cameras to share their local information metrics. Then, the informed cameras send their local information metrics to the FC. The total number of transmissions to and from the FC involved in this method is C _{ k }+2l _{ k }. The MOTBC method is adapted from [36].

Multicamera object tracking method with active sensing cameras (MOTAC): The FC activates or deactivates the cameras from participating in information exchange by maximizing a reward-cost utility function as given in [35]. The reward is the expected information gain (EIG). At each time k, the FC evaluates the utility function for all possible activated and deactivated camera combinations before activating the best cameras to participate in the information fusion. Refer to [35] for complete details.
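The communication cost stated for MOTBC in the list above can be made concrete with a small arithmetic sketch. The function names are hypothetical. The MOTBC count C _{ k }+2l _{ k } is taken from the text; the count for the subset-based schemes only reflects that, on average, l _{ k } cameras transmit their information metrics to the FC, and it ignores any broadcasts from the FC.

```python
def motbc_transmissions(cluster_size, num_desired):
    """Per-step transmissions to and from the FC for MOTBC: C_k + 2 * l_k.

    C_k surprisal scores go to the FC, l_k notifications come back,
    and l_k information metrics are then sent to the FC.
    """
    return cluster_size + 2 * num_desired


def subset_transmissions(num_desired):
    """Average per-step transmissions to the FC when l_k cameras report."""
    return num_desired


# Example: 5 cameras observe the object and 2 should report.
best_cost = motbc_transmissions(5, 2)
subset_cost = subset_transmissions(2)
```

The gap between the two counts grows with the cluster size, which is the overhead a centralized ranking pays for seeing every camera's score.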
6.2 Experimental results
7 Conclusions
In this work, a multicamera object tracking method with surprisal cameras in a VSN is proposed. The cameras in the VSN that can observe an object form an observation cluster with a fixed FC. However, due to bandwidth constraints and energy limitations, it is usually desirable to have only a subset of cameras exchanging their local information with the fusion center. In our approach, each camera runs a local object tracking algorithm based on the onboard CIF. Each camera independently determines whether its observations are informative enough or not by using the surprisal of its measurement residual. Only if a camera’s measurements are informative enough (surprisal cameras) does it calculate and transmit the local information vector and matrix to the fusion center. The global state of the object is obtained by fusing the local information from the surprisal cameras at the fusion center. The proposed scheme also ensures that, on average, only a desired number of cameras participate in the information exchange. The proposed multicamera object tracking with surprisal cameras shows a considerable improvement in tracking accuracy over multicamera object tracking with random and fixed cameras for the same number of transmissions to the fusion center.
8 Endnote
^{1} In general, the surprisal is defined for discrete random variables (DRVs). Hence, we consider the innovation to be a DRV.
9 Appendices
The multisensor CIF consists of three main steps: a time update and a measurement update at each sensor i and time k, followed by the information fusion described in Section 3.3.
9.1 Appendix 1: time update (TU)
 1. Calculate the state estimate $$ \widehat{\mathbf{x}}_{k-1\mid k-1} = \mathbf{Y}^{-1}_{k-1\mid k-1}\widehat{\mathbf{y}}_{k-1\mid k-1}. $$
 2. Compute the cubature points for m=1,2,…,2n _{ x }, where n _{ x } is the length of the state vector and ξ _{ m } represents the mth intersection point of the surface of the n-dimensional unit sphere and its axes $$ \mathbf{cp}_{m,k-1\mid k-1} = \sqrt{\mathbf{Y}^{-1}_{k-1\mid k-1}}\,\xi_{m} + \widehat{\mathbf{x}}_{k-1\mid k-1}. $$
 3. Propagate the cubature points through the motion model $$ \mathbf{x}^{*}_{m,i,k\mid k-1} = \mathbf{f}_{i,k}\left(\mathbf{cp}_{m,k-1\mid k-1}\right). $$
 4. Calculate the predicted state as $$ \widehat{\mathbf{x}}_{i,k\mid k-1} = \frac{1}{2n_{\mathbf{x}}}\sum^{2n_{\mathbf{x}}}_{m=1} \mathbf{x}^{*}_{m,i,k\mid k-1}. $$
 5. Calculate the predicted error covariance as $$ \mathbf{P}_{i,k\mid k-1} = \mathbf{M}_{i,k\mid k-1}\mathbf{M}^{T}_{i,k\mid k-1} + \mathbf{Q}_{i,k}, $$ where Q _{ i,k } is the process noise covariance. The predicted weighted centered matrix M _{ i,k|k−1} is given as $$ \mathbf{M}_{i,k\mid k-1} = \frac{1}{\sqrt{2n_{\mathbf{x}}}} \left[\mathbf{x}^{*}_{1,i,k\mid k-1} - \widehat{\mathbf{x}}_{i,k\mid k-1} \quad \mathbf{x}^{*}_{2,i,k\mid k-1} - \widehat{\mathbf{x}}_{i,k\mid k-1} \quad \cdots \quad \mathbf{x}^{*}_{2n_{\mathbf{x}},i,k\mid k-1} - \widehat{\mathbf{x}}_{i,k\mid k-1}\right]. $$
 6. Compute the predicted information matrix and predicted information vector $$ \mathbf{Y}_{i,k\mid k-1} = \mathbf{P}^{-1}_{i,k\mid k-1}, $$ $$ \widehat{\mathbf{y}}_{i,k\mid k-1} = \mathbf{Y}_{i,k\mid k-1}\widehat{\mathbf{x}}_{i,k\mid k-1}. $$
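The time update steps above can be sketched in plain Python. This is a sketch under stated assumptions, not the authors' implementation: the cubature points follow the standard construction of [38], in which the unit directions are scaled by the square root of the state dimension; the matrix square root is taken as a Cholesky factor; and the function works on the state and covariance directly, so the conversions to and from information form (steps 1 and 6) are left out. All function names are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(a[r][r] - s)
            else:
                low[r][c] = (a[r][c] - s) / low[c][c]
    return low


def cubature_points(x, p):
    """The 2n points x +/- sqrt(n) * (column j of the Cholesky factor of p)."""
    n = len(x)
    low = cholesky(p)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            points.append([x[r] + sign * scale * low[r][j] for r in range(n)])
    return points


def time_update(x, p, motion_model, q):
    """Predicted state and covariance from the propagated cubature points."""
    n = len(x)
    propagated = [motion_model(cp) for cp in cubature_points(x, p)]
    count = len(propagated)  # equals 2n
    x_pred = [sum(pt[r] for pt in propagated) / count for r in range(n)]
    p_pred = [
        [sum((pt[r] - x_pred[r]) * (pt[c] - x_pred[c]) for pt in propagated)
         / count + q[r][c] for c in range(n)]
        for r in range(n)
    ]
    return x_pred, p_pred


# Example: 1-D constant-velocity motion, state = [position, velocity].
def constant_velocity(state, dt=1.0):
    return [state[0] + dt * state[1], state[1]]


x_pred, p_pred = time_update(
    [0.0, 1.0], [[1.0, 0.0], [0.0, 1.0]],
    constant_velocity, [[0.1, 0.0], [0.0, 0.1]],
)
```

For a linear motion model such as the one in the example, the cubature prediction coincides with the ordinary Kalman prediction, which gives a convenient way to check an implementation.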
9.2 Appendix 2: measurement update (MU)
 1. Calculate the cubature points $$ \mathbf{cp}_{m,i,k\mid k-1} = \sqrt{\mathbf{P}_{i,k\mid k-1}}\,\xi_{m} + \widehat{\mathbf{x}}_{i,k\mid k-1}. $$
 2. Propagate the cubature points through the observation function $$ \mathbf{z}^{*}_{m,i,k\mid k-1} = \mathbf{h}_{i,k}\left(\mathbf{cp}_{m,i,k\mid k-1}\right). $$
 3. Calculate the predicted measurement $$ \widehat{\mathbf{z}}_{i,k\mid k-1} = \frac{1}{2n_{\mathbf{x}}}\sum^{2n_{\mathbf{x}}}_{m=1} \mathbf{z}^{*}_{m,i,k\mid k-1}. $$
 4. Calculate the measurement residual $$ \mathbf{e}_{i,k} = \mathbf{z}_{i,k} - \widehat{\mathbf{z}}_{i,k\mid k-1}. $$
 5. Calculate the cross covariance $$ \mathbf{P}_{\mathbf{xz},i,k\mid k-1} = \frac{1}{2n_{\mathbf{x}}}\sum^{2n_{\mathbf{x}}}_{m=1} \mathbf{cp}_{m,i,k\mid k-1}\mathbf{z}^{*T}_{m,i,k\mid k-1} - \widehat{\mathbf{x}}_{i,k\mid k-1}\widehat{\mathbf{z}}^{T}_{i,k\mid k-1}. $$
 6. Calculate the information contribution matrix $$ \mathbf{I}_{i,k} = \mathbf{Y}_{i,k\mid k-1}\mathbf{P}_{\mathbf{xz},i,k\mid k-1}\mathbf{R}^{-1}_{i,k}\mathbf{P}^{T}_{\mathbf{xz},i,k\mid k-1}\mathbf{Y}^{T}_{i,k\mid k-1}, $$ where R _{ i,k } is the measurement noise covariance matrix.
 7. Compute the information contribution vector $$ \mathbf{i}_{i,k} = \mathbf{Y}_{i,k\mid k-1}\mathbf{P}_{\mathbf{xz},i,k\mid k-1}\mathbf{R}^{-1}_{i,k}\left(\mathbf{e}_{i,k}+\mathbf{P}^{T}_{\mathbf{xz},i,k\mid k-1}\mathbf{Y}^{T}_{i,k\mid k-1}\widehat{\mathbf{x}}_{i,k\mid k-1}\right). $$
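As a sanity check on steps 5 to 7, consider the special case of a linear observation function \(\mathbf{h}_{i,k}(\mathbf{x}) = \mathbf{H}\mathbf{x}\). This linear model and the symbol H are assumptions introduced only for this check. With the standard cubature points of [38], the cross covariance then equals \(\mathbf{P}_{i,k\mid k-1}\mathbf{H}^{T}\) and the predicted measurement equals \(\mathbf{H}\widehat{\mathbf{x}}_{i,k\mid k-1}\). Since \(\mathbf{Y}_{i,k\mid k-1} = \mathbf{P}^{-1}_{i,k\mid k-1}\) is symmetric, the contributions reduce to the familiar form of the linear information filter:

$$ \mathbf{I}_{i,k} = \mathbf{Y}_{i,k\mid k-1}\mathbf{P}_{i,k\mid k-1}\mathbf{H}^{T}\mathbf{R}^{-1}_{i,k}\mathbf{H}\mathbf{P}_{i,k\mid k-1}\mathbf{Y}^{T}_{i,k\mid k-1} = \mathbf{H}^{T}\mathbf{R}^{-1}_{i,k}\mathbf{H}, $$

$$ \mathbf{i}_{i,k} = \mathbf{H}^{T}\mathbf{R}^{-1}_{i,k}\left(\mathbf{z}_{i,k} - \mathbf{H}\widehat{\mathbf{x}}_{i,k\mid k-1} + \mathbf{H}\widehat{\mathbf{x}}_{i,k\mid k-1}\right) = \mathbf{H}^{T}\mathbf{R}^{-1}_{i,k}\mathbf{z}_{i,k}. $$

In this case the contribution matrix depends only on the sensor model and noise, while the contribution vector carries the measurement itself.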
Notes
Acknowledgements
This work was supported in part by the EACEA Agency of the European Commission under EMJD ICE FPA no. 20100012. The work has also been supported in part by the ERDF, KWF, and BABEG under grant KWF20214/21530/32602 (ICE Booster). It has been performed in the research cluster Lakeside Labs.
References
 1. B Rinner, M Quaritsch, W Schriebl, T Winkler, W Wolf, The evolution from single to pervasive smart cameras. Paper presented at the 2nd ACM/IEEE international conference on distributed smart cameras (IEEE, Stanford, CA, 2008).
 2. S Soro, W Heinzelman, A survey of visual sensor networks. Adv. Multimed (2009).
 3. A Yilmaz, O Javed, M Shah, Object tracking: a survey. ACM Comput. Surv. 38(4), 1–45 (2006).
 4. SK Weng, CM Kuo, SK Tu, Video object tracking using adaptive Kalman filter. J. Visual Commun. Image Represent. 17(6), 1190–1208 (2006).
 5. H Wang, D Suter, K Schindler, C Shen, Adaptive object tracking based on an effective appearance filter. IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1661–1667 (2007).
 6. WJ Liu, YJ Zhang, Edge-colour-histogram and Kalman filter-based real-time object tracking. J. Tsinghua Univ. (Sci. Technol). 48(7) (2008).
 7. R Olfati-Saber, NF Sandell, Distributed tracking in sensor networks with limited sensing range. Am. Control Conf, 3157–3162 (2008).
 8. C Soto, S Bi, AK Roy-Chowdhury, Distributed multi-target tracking in a self-configuring camera network. IEEE Conf. Comput. Vis. Pattern Recogn, 1486–1493 (2009).
 9. HM Wang, LL Huo, J Zhang, Target tracking algorithm based on dynamic template and Kalman filter. IEEE Int. Conf. Commun. Softw. Netw, 330–333 (2011).
 10. B Song, C Ding, AT Kamal, JA Farrell, AK Roy-Chowdhury, Distributed camera networks. IEEE Signal Process. Mag. 28(3), 20–31 (2011).
 11. SY Chen, Kalman filter for robot vision: a survey. IEEE Trans. Ind. Electron. 59(11), 4409–4420 (2012).
 12. R Rosales, S Sclaroff, Improved tracking of multiple humans with trajectory prediction and occlusion modeling. IEEE CVPR Workshop Int. Vis. Motion (1998).
 13. P Li, T Zhang, B Ma, Unscented Kalman filter for visual curve tracking. Image and Vision Comput. 22(2), 157–164 (2004).
 14. M Meuter, U Iurgel, SB Park, A Kummert, The unscented Kalman filter for pedestrian tracking from a moving host. IEEE Intell. Veh. Symp, 37–42 (2008).
 15. VP Bhuvana, M Schranz, M Huemer, B Rinner, Distributed object tracking based on cubature Kalman filter. Asilomar Conf. Signals, Syst. Comput, 423–427 (2013).
 16. K Nummiaro, E Koller-Meier, L Van Gool, An adaptive color-based particle filter. Image and Vision Comput. 21(1), 99–110 (2003).
 17. K Okuma, A Taleghani, N Freitas, JJ Little, DG Lowe, A boosted particle filter: multitarget detection and tracking. Eur. Conf. Comput. Vis (2004).
 18. Y Rui, Y Chen, Better proposal distributions: object tracking using unscented particle filter. IEEE Comput. Soc. Conf. Comput. Vis Pattern Recognit. 2, 786–793 (2001).
 19. CC Wang, C Thorpe, S Thrun, M Hebert, H Durrant-Whyte, Simultaneous localization, mapping and moving object tracking. Int. J. Robot. Res. 26, 889–916 (2007).
 20. Y Rathi, N Vaswani, A Tannenbaum, A Yezzi, Tracking deforming objects using particle filtering for geometric active contours. IEEE Trans. Pattern Anal. Mach. Intell. 29(8), 1470–1475 (2007).
 21. Y Li, H Ai, T Yamashita, S Lao, M Kawade, Tracking in low frame rate video: a cascade particle filter with discriminative observers of different life spans. IEEE Trans. Pattern Anal. Mach. Intell. 30(10), 1728–1740 (2008).
 22. MD Breitenstein, F Reichlin, B Leibe, E Koller-Meier, LV Gool, Robust tracking-by-detection using a detector confidence particle filter. IEEE Int. Conf. Comput. Vis, 1515–1522 (2009).
 23. AD Bimbo, F Dini, Particle filter-based visual tracking with a first order dynamic model and uncertainty adaptation. Comput. Vis. Image Underst. 115(6), 771–786 (2011).
 24. Z Ni, S Sunderrajan, A Rahimi, BS Manjunath, Distributed particle filter tracking with online multiple instance learning in a camera sensor network. 17th IEEE Int. Conf. Image Process, 37–40 (2010).
 25. C Ding, B Song, AA Morye, JA Farrell, AKR Chowdhury, Collaborative sensing in a distributed PTZ camera network. IEEE Trans. Image Process. 21(7), 3282–3295 (2012).
 26. AT Kamal, JA Farrell, AK Roy-Chowdhury, Consensus-based distributed estimation in camera networks. IEEE Int. Conf. Image Process, 1109–1112 (2012).
 27. H Medeiros, J Park, AC Kak, Distributed object tracking using a cluster-based Kalman filter in wireless camera networks. IEEE J. Sel. Top. Signal Process. 2(4), 448–463 (2008).
 28. D Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches (John Wiley & Sons, 2006).
 29. AT Kamal, C Ding, B Song, JA Farrell, AK Roy-Chowdhury, A generalized Kalman consensus filter for wide-area video networks. IEEE Conf. Decis. Control. Eur. Control, 7863–7869 (2011).
 30. AT Kamal, JA Farrell, AK Roy-Chowdhury, Information weighted consensus. IEEE Annu. Conf. Decis. Control (2012).
 31. VP Bhuvana, M Huemer, CS Regazzoni, Distributed object tracking based on square root cubature H-infinity information filter. IEEE Int. Conf. Inf. Fusion, 1–6 (2014).
 32. K Morioka, K Szilveszter, JH Lee, P Korondi, H Hashimoto, A cooperative object tracking system with fuzzy-based adaptive camera selection. Int. J. smart Sens. Intell. Syst. 3, 338–58 (2010).
 33. DB Yang, J Shin, AO Ercan, LJ Guibas, Sensor tasking for occupancy reasoning in a network of cameras. Stanf. Netw. Res. Center (2010).
 34. L Tessens, M Morbee, H Aghajan, W Philips, Camera selection for tracking in distributed smart camera networks. ACM Trans. Sensor Netw. 10, 1–33 (2014).
 35. A de San Bernabe, JR Martinez-de Dios, A Ollero, Entropy-aware cluster-based object tracking for camera wireless sensor networks. IEEE/RSJ Int. Conf. Intell. Robot. Syst, 3985–3992 (2012).
 36. E Shen, R Hornsey, in Proceedings of the 5th ACM/IEEE International Conference on Distributed Smart Cameras. Local image quality metric for a distributed smart camera network with overlapping FOVs, (2011), pp. 1–6.
 37. CE Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).
 38. I Arasaratnam, S Haykin, Cubature Kalman filters. IEEE Trans. Auto. Control. 54(6), 1254–1269 (2009).
 39. M Schranz, B Rinner, Resource-aware state estimation in visual sensor networks with dynamic clustering. 4th Int. Conf. Sensor Netw, 10 (2015).
 40. B Dieber, J Simonjan, L Esterle, B Rinner, G Nebehay, R Pflugfelder, GJ Fernandez, Ella: Middleware for multi-camera surveillance in heterogeneous visual sensor networks. ACM/IEEE Int. Conf. Distrib. Smart Cameras, 1–6 (2013).
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.