# A recursive kinematic random forest and alpha beta filter classifier for 2D radar tracks

## Abstract

In this work, we show that by using a recursive random forest together with an alpha beta filter classifier, it is possible to classify radar tracks from the tracks’ kinematic data. The kinematic data are from a 2D scanning radar without Doppler or height information. We use a random forest because this classifier implicitly handles the uncertainty in the position measurements. As stationary targets can have an apparently high speed because of the measurement uncertainty, we use an alpha beta filter classifier to separate stationary targets from moving targets. We show an overall classification rate of 82.6 % on simulated data and 79.7 % on real-world data. In addition to the confusion matrices, we also show recordings of real-world data.

### Keywords

Radar, Classification, Random forest, Alpha beta filter, Kinematic

## 1 Introduction

The increasing demand for protection and surveillance of the coastal areas requires modern coastal surveillance radars. These radars are designed such that small objects can be detected. Therefore, there is an increasing amount of information for the radar observer. Moreover, the number of false and unwanted objects increases as the demand for seeing small objects makes the radar more sensitive. Generally, the false objects can be avoided by using a reliable tracker. However, the tracker does not exclude unwanted objects. The difference between false and unwanted objects is that false objects do not originate from true objects but are mainly noise objects, whereas the unwanted objects originate from true objects but are unwanted in the surveillance image. These objects depend on the purpose of the radar; however, for coastal surveillance radars, the unwanted objects are normally birds, wakes from large ships, etc.

It has been shown in [1] that it is possible to classify tracks by using a recursive classifier where a Gaussian mixture model (GMM) is used to model the probability distribution function (PDF) of the target’s kinematic behavior. However, the classifier does not handle the uncertainty in the measurements from the radar. In [2], the position uncertainty is used as an input to the classifier. The classifier also uses a GMM to model the PDF of the kinematic behavior of the target. The problem with this is that it is very computationally expensive. To obtain an easier way to handle uncertainty, joint target tracking and classification can be used, as shown in [3, 4, 5]. The problem with joint target tracking and classification is that it is difficult to achieve a high degree of freedom in the filters to separate the classes. For example, a car driving 130 km/h on the highway is not likely to accelerate but more likely to decelerate. This is very hard to model with a tracking filter. A particle filter can be used, but this is computationally expensive.

In [6], the authors describe a method to classify trucks and cars from GPS measurements. The classifier consists of a support vector machine (SVM), and the features are primarily acceleration and deceleration. The classifier is non-recursive, which means that the complete length of the tracks is required. Moreover, the measurements from a GPS device are generally more accurate than the position measurements from a radar. In [7], a decision tree is used for a recursive classification of four different target classes. The data are from a radar with height information. The decision tree has the advantage that it implicitly handles the uncertainty to some degree, that is, features that do not separate the classes will not be used as much as features separating the classes. The disadvantage is that the classifier has a high variance of the classification results. In [8], the random forest classifier is introduced. The random forest is a bagging classifier [9] where multiple decision trees are used to reduce the variance of the classification results. For this reason, the random forest is selected in this work.

In this work, we introduce a classifier which uses position measurements to classify radar tracks from a 2D scanning radar. The classifier consists of an alpha beta filter [10] and a random forest classifier. The alpha beta filter classifies targets as stationary or moving, and the random forest classifies the moving targets. The classifier is recursive such that the classification results are updated for each scan of the radar. The classifier performance is shown by using simulated track data and real-world radar data.

In Section 2.1, we introduce the random forest classifier by describing the training of a decision tree and then explain how this tree is used in the random forest. In Section 2.2, we explain how we utilize the probability estimates from the random forest in a recursive framework. In Section 2.3, we introduce an alpha beta filter classifier, which classifies targets as either stationary or moving. This is introduced because stationary targets can have high apparent speeds, either because their position fluctuates due to measurement uncertainty or because the main scatter points are moving, e.g., on a wind turbine. In Section 2.4, we combine the random forest and the alpha beta filter as our proposed classifier. In Section 2.5, we describe which features we use in the random forest. The simulation study is shown in Section 3, and in Section 4, the real-world results are shown. We discuss the results in Section 5 and conclude the work in Section 6.

## 2 Method

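We define the set of the latest measurements as

\[\{Z_{n}\}_{k} = \{Z_{n}, \cdots, Z_{n-k}\} \qquad (1)\]

where \(Z_{n} = [x_{n}, y_{n}]^{T}\), \(x\) and \(y\) are the position coordinates in a Cartesian coordinate system with the origin at the location of the radar, \(n\) is the measurement number index, and \(k\) is the set size.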

### 2.1 Random forest

In this section, we introduce the random forest classifier [8, 11]. The random forest is a bagging algorithm, which means that the random forest consists of a number of weak classifiers [12], each of which has zero bias but a high variance around the true value. The weak classifiers are decision trees [9]. We start this section by describing how to grow a decision tree and then move on to the random forest.

A decision tree consists of a number of nodes, e.g., (\(N_{1} \cdots N_{3}\)), and a number of leaves, e.g., (\(N_{4} \cdots N_{7}\)). This is shown in Fig. 1. A node contains data from more than one class, whereas a leaf contains data from only one class, so no further splits are required. In every node, a decision must be made such that we go either left or right in the tree. The decision must always be true or false.

To train the tree, we start with a feature vector \(F\) of size \(N_{s} \times D\) where \(N_{s}\) is the number of samples and \(D\) is the number of features, i.e., dimensions in the feature vector. We now want to split the data such that we obtain the best separation of the classes, which requires finding the best feature to split on and the best value to split at. To explain the algorithm, we assume that there are only two classes, so that it forms a binary classification problem, and that the values of the feature belong to a finite sample space. This is done to make the explanation easier.

We define the set of samples in the parent node as \(s_{1}\) and the number of samples in the set as \(|s_{1}|\). Similarly, we define the set of samples in the children as \(s_{2}\) and \(s_{3}\) and the number of samples as \(|s_{2}|\) and \(|s_{3}|\). Further, we index the samples belonging to class \(\ell\) by the superscript \(\ell\) such as \(s_{1}^{\ell}\), where \(\ell \in \{1,2\}\). We can calculate the empirical entropy for the children as

From (4), we now have a measure of how good a split is and are able to optimize each split of the data such that we choose the best feature to split on and the best value of that feature. We continue to split the data until all data in a node are of the same class, i.e., the node becomes a leaf. To prevent overfitting, a decision tree must normally be pruned. However, an advantage of using a random forest is that it is not necessary to prune the decision trees.

The random forest is a bagging classifier [9]. This means that the random forest consists of a number of trees \(N_{t}\) where each tree is trained with a random subset of the training samples and a random subset of the features. We assume that the trees are statistically independent of each other.

A decision tree classifies the data by following a path through the nodes. At each node, the path is decided by the feature and feature value that made the best split in the training. The data to be classified follow the path until a leaf is met. The leaf has a unique class, and the data are classified as this class. Each tree in the forest is thus an individual classifier, and the classification of the random forest is a majority vote over the results of the individual decision trees.

where \(\psi_{i}\) is the number of votes for class \(i\).
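The split search described in this section can be sketched as follows. This is a minimal illustration only: it assumes the Shannon entropy and the usual information gain (parent entropy minus the size-weighted entropy of the children) as the split measure, and the function names `entropy`, `information_gain`, and `best_split` as well as the toy data are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Empirical entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children

def best_split(values, labels):
    """Try every observed value of one feature as a threshold.

    Returns (threshold, gain) for the split 'value <= threshold' with the
    largest information gain, or (None, 0.0) if no split improves purity.
    """
    best = (None, 0.0)
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue  # a split must send data to both children
        gain = information_gain(labels, left, right)
        if gain > best[1]:
            best = (t, gain)
    return best

# Toy example with a single feature (e.g., a speed) and two classes.
speeds = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
classes = [1, 1, 1, 2, 2, 2]
threshold, gain = best_split(speeds, classes)
```

In a full tree, the same search would be repeated over all features in each node, and the node would be split recursively until every leaf contains a single class.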

In the next section, we explain how we use the output (5) of the random forest to achieve a recursive update of the class probability given all the measurements.

### 2.2 Recursive update of the random forest probability

The random forest estimates the probability of class \(c_{i}\) as the number of votes for \(c_{i}\) divided by the total number of trees. By this definition, the resolution of the probability estimates is given by the number of trees in the random forest. To prevent a class from being assigned a zero probability, we modify the estimate in the following way:

where *γ* is a normalization constant such that \(\sum _{i} P(c_{i}|\{Z_{n}\}_{k})=1\). By this formula, the probability never reaches zero for any of the classes.

The random forest provides the probability given the latest \(k\) measurements, \(P(c_{i}|\{Z_{n}\}_{k})\). However, we want the probability given all measurements, that is, \(P(c_{i}|\{Z_{n}\})\), where \(\{Z_{n}\} = \{Z_{n}\}_{n}\). We have, however, not been able to find a simple way to recursively update \(P(c_{i}|\{Z_{n}\})\) based on the previous \(P(c_{i}|\{Z_{n-1}\})\) that works for all \(n\). Instead, we propose the following recursive function \(f(c_{i}|\{Z_{n}\})\), which is everywhere non-negative and sums to one. Thus, \(f(c_{i}|\{Z_{n}\})\) can be considered to be a probability mass function (PMF), which we will use as an approximation for the true \(P(c_{i}|\{Z_{n}\})\). In particular, we define

where \(w\) is a weighting factor, \(P(c_{i}|\{Z_{n}\}_{k})\) is given by (6), and \(\phi_{n}\) is the normalization constant such that \(\sum_{c_{i}} f(c_{i}|\{Z_{n}\}) = 1\). The introduction of the weighting by \(w\) is inspired by the weighted Bayesian classifier used in [14]. In particular, we choose \(w = 1/k\) since the features of the random forest are given by a set of measurements where only one out of \(k\) measurements is substituted at each update.
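A minimal sketch of such a recursive update is given below. It assumes that the update multiplies the previous value by the window probability raised to the power \(w\), in the spirit of a weighted Bayesian classifier, and that the zero-probability protection is a simple floor on the vote fractions. Both forms, the floor value, and the function names are assumptions made for illustration and may differ from the exact expressions (6) and (7).

```python
def forest_probabilities(votes, floor=1e-3):
    """Vote fractions with a small floor so that no class gets zero probability.

    The floor value and the use of max() are assumptions of this sketch.
    """
    n_trees = sum(votes)
    raw = [max(v / n_trees, floor) for v in votes]
    gamma = 1.0 / sum(raw)  # normalization constant
    return [gamma * p for p in raw]

def recursive_update(f_prev, p_window, k):
    """One update step: f_n is proportional to f_{n-1} * P(c_i | window)^w, w = 1/k."""
    w = 1.0 / k
    unnorm = [f * (p ** w) for f, p in zip(f_prev, p_window)]
    phi = 1.0 / sum(unnorm)  # normalization constant
    return [phi * u for u in unnorm]

# Three classes, uniform start, then three radar scans with 100 trees each.
n_classes = 3
f = [1.0 / n_classes] * n_classes
for votes in ([60, 30, 10], [70, 30, 0], [80, 20, 0]):
    f = recursive_update(f, forest_probabilities(votes), k=10)
```

With \(w = 1/k\), a single scan moves the accumulated value only slightly, which reflects that consecutive feature windows share all but one measurement.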

In the next section, we describe our alpha beta tracking filter. This filter is used to classify if a target is non-moving or moving. The reason for applying such a filter is to classify stationary targets, which have a high apparent speed due to measurement uncertainties.

### 2.3 Alpha beta filter

The alpha beta filter [10] operates on the position measurements \(Z_{n}\). The alpha beta filter tries to predict \(Z_{n}\) given the speed \(V_{n-1}\) at time \(n-1\) and the state \(X_{n-1}\) as

Here, \(\tau\) is the time between \(Z_{n-1}\) and \(Z_{n}\), the superscript \(-\) denotes the prediction before the measurement is used, and the superscript \(+\) denotes the estimate after the measurement is used. The filter assumes that the speed is constant between \(n-1\) and \(n\), that is, \(V_{n} = V_{n-1}\). The error can be calculated as

Here, \(\alpha\) and \(\beta\) are the constants in the alpha beta filter. To calculate the probability for \(Z_{n}\) given \(X_{n}^{-}\), \(\alpha\), and \(\beta\), we use a multivariate normal distribution

Here, \(\Sigma_{n}\) is the covariance of the position, and the subscript \(\alpha\beta\) emphasizes that this is the probability for the alpha beta filter.

The purpose of the alpha beta filter is to separate non-moving, i.e., stationary, targets from moving targets. We therefore define two filters. The stationary filter has the parameters \(\alpha = 0.1\) and \(\beta = 0.0\), which allows the position part of the state to move slightly but forces the speed to be constant at zero. The slight movement of the state is allowed because of the possibility of false-starting measurements. As the parameters \(\alpha\) and \(\beta\) are determined by the class \(c_{s}\), we use the notation \(P_{\alpha \beta }(Z_{n}|c_{s}, X_{n}^{-})\). Likewise, we define the moving alpha beta filter as \(P_{\alpha \beta }(Z_{n}|c_{m}, X_{n}^{-})\) with the parameters \(\alpha = 1.0\) and \(\beta = 1.0\), i.e., we hold the speed constant from update to update but allow both the position and the speed to change with the measured change. If we know \(\{Z_{n-1}\}\), which is the set of measurements up to \(n-1\), together with \(\alpha\) and \(\beta\), we can calculate \(X_{n}^{-}\), and we can therefore write \(P_{\alpha \beta }(Z_{n}|c_{i}, \{Z_{n-1}\})\) instead of \(P_{\alpha \beta }(Z_{n}|c_{i}, X_{n}^{-})\). For this work, we want the alpha beta filter to classify whether the target is stationary or non-stationary, and we therefore recursively update the probability of the alpha beta filter.

\(Z_{n} \Leftrightarrow Z_{n-1} \Leftrightarrow \{Z_{n-2}\}, \forall n\).^{1}
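The stationary versus moving classification described in this section can be sketched as below. The sketch assumes the textbook alpha beta update (position corrected by \(\alpha\) times the error, speed corrected by \(\beta/\tau\) times the error), an isotropic position covariance, equal prior probabilities for the two classes, and a recursive Bayes update of the class probabilities from the Gaussian likelihoods. The class `AlphaBeta`, the function `classify_stationary`, and the numeric values of `tau` and `sigma` are illustrative assumptions; only the two \((\alpha, \beta)\) pairs are taken from the text.

```python
import math

class AlphaBeta:
    """2D alpha beta filter that also reports the likelihood of each measurement."""

    def __init__(self, alpha, beta, z0):
        self.alpha, self.beta = alpha, beta
        self.x = list(z0)      # position state after the latest update
        self.v = [0.0, 0.0]    # speed state after the latest update

    def step(self, z, tau, sigma):
        """Return the Gaussian likelihood of z given the prediction, then update."""
        pred = [self.x[d] + tau * self.v[d] for d in range(2)]
        err = [z[d] - pred[d] for d in range(2)]
        # Isotropic covariance sigma^2 * I is an assumption of this sketch.
        sq = err[0] ** 2 + err[1] ** 2
        lik = math.exp(-sq / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
        self.x = [pred[d] + self.alpha * err[d] for d in range(2)]
        self.v = [self.v[d] + self.beta / tau * err[d] for d in range(2)]
        return max(lik, 1e-300)  # guard against numerical underflow to zero

def classify_stationary(track, tau=1.0, sigma=10.0):
    """Return (P(stationary), P(moving)) after processing the whole track."""
    stationary = AlphaBeta(0.1, 0.0, track[0])
    moving = AlphaBeta(1.0, 1.0, track[0])
    p_s, p_m = 0.5, 0.5  # equal priors (assumption)
    for z in track[1:]:
        p_s *= stationary.step(z, tau, sigma)
        p_m *= moving.step(z, tau, sigma)
        total = p_s + p_m
        p_s, p_m = p_s / total, p_m / total
    return p_s, p_m

# A jittering but stationary target and a target moving at 10 m/s along x.
jitter = [(0.0, 0.0)] + [(5.0, -5.0) if i % 2 == 0 else (-5.0, 5.0) for i in range(10)]
straight = [(10.0 * i, 0.0) for i in range(10)]
p_jitter = classify_stationary(jitter)
p_straight = classify_stationary(straight)
```

The jittering track has a high sample-to-sample speed, yet the stationary filter predicts it well, whereas the moving filter extrapolates the noise and is penalized by the likelihood.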

In the next section, we describe how we combine the random forest classifier and the alpha beta filter classifier into a single classifier.

### 2.4 Combining the alpha beta filter with random forest

We define \(c_{0}\) to be the stationary class and \(c_{1 \cdots n_{C}}\) to be the moving classes, where \(n_{C}\) is the total number of classes. For the alpha beta filter, we have the two classes \(c_{s}\) and \(c_{m}\) for stationary and non-stationary classes, respectively. We want the alpha beta filter classifier to have a larger weight on the classification result of stationary vs. moving than the random forest. We therefore use the recursively updated probability from (13). We do this as described in (14) and (15).

where \({\hat {\omega }}\) is a constant such that \(\sum _{i} P_{c}(c_{i}|\{Z_{n}\}) = 1\). By including the alpha beta filter in this manner, we ensure that the alpha beta filter decides whether a target is stationary, while it has no influence on how the different moving classes are ranked against each other.
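One way to obtain this behavior is sketched below. It assumes a multiplicative combination in which the stationary class is weighted by the alpha beta probability of \(c_{s}\) and every moving class is weighted by the alpha beta probability of \(c_{m}\), followed by a normalization. This form is an assumption chosen to match the stated properties and is not necessarily identical to (14) and (15); the function name `combine` and the numbers are hypothetical.

```python
def combine(f_forest, p_stationary, p_moving):
    """Combine random forest values with the alpha beta filter probabilities.

    f_forest[0] belongs to the stationary class c_0, the remaining entries
    belong to the moving classes. All moving classes are scaled by the same
    factor, so their mutual ratios are left unchanged.
    """
    unnorm = [p_stationary * f_forest[0]]
    unnorm += [p_moving * f for f in f_forest[1:]]
    omega = 1.0 / sum(unnorm)  # normalization constant
    return [omega * u for u in unnorm]

# The forest is undecided, but the alpha beta filter is sure the target moves.
p = combine([0.2, 0.5, 0.3], p_stationary=0.01, p_moving=0.99)
```

Because the same factor multiplies all moving classes, the alpha beta filter only shifts probability between the stationary class and the group of moving classes.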

In the next section, we will describe the features we use for the random forest feature vector, and we will also describe how these are derived from the position. We only utilize position-dependent features such as speed and acceleration.

### 2.5 Features

We set \(k\) in (1) to 10. The number of measurements used in the feature vector is a compromise between the time it takes to get the number of measurements required for a full feature vector and the amount of information contained in the feature vector. A larger \(k\) requires more measurements, i.e., more time before a classification result is made, whereas for a smaller \(k\), the first classification result comes earlier, albeit with a greater uncertainty due to the smaller amount of available information. The features and their descriptions can be seen in Table 1. Remember that we defined \(\{Z_{n}\}_{k}\) to be \(\{Z_{n} \cdots Z_{n-k}\}\). To make the notation easier, we index each measurement in \(\{Z_{n}\}_{k}\) by \(i\) such that \(i\) represents the \(i\)th element in the set of measurements \(\{Z_{n}\}_{k}\), that is, \(0 \leq i < k\). Likewise, we define the set of time stamps of the measurements as \(\{t_{n}\}_{k}\) with the individual measurement being observed at time \(t_{i}\). We start by calculating the vectorial distance between the measurements as

Table 1 The feature vector used. The number of measurements has been chosen to be \(k=10\)

| Feature | Feature description |
|---|---|
| std(·) | Empirical standard deviation of sample-to-sample distances |
| ⋮ | 2-point speed estimate |
| mean(·) | Empirical mean of the speed |
| std(·) | Empirical standard deviation of the speed |
| ⋮ | 2-point acceleration estimate |
| mean(·) | Empirical mean of the acceleration |
| std(·) | Empirical standard deviation of the acceleration |
| mean\((a_{i}^{\perp })\) | Empirical mean of the normal acceleration |
| std\((a_{i}^{\perp })\) | Empirical standard deviation of the normal acceleration |
| | Total distance moved |
| ⋮ | Distance to coastline |
| mean(·) | Empirical mean of the distance to coastline |

\(i < k\), and the 3-point acceleration estimate is

\(i < (k-1)\). The normal acceleration \(a_{i}^{\perp }\) is given by the product of the speed and angular velocity

We also use land/sea information. This can be extracted from the SWBD (SRTM Water Body Data) database [17]. The database is a set of polygons describing the coastline. Because of errors in the database, a hard threshold cannot be used for land and sea. We therefore propose to use the distance to the coastline \(d_{i}\) for each measurement as a feature. By using these polygons, it is possible to calculate the distance from a measurement to the coastline. However, the calculation becomes increasingly computationally expensive as the distance to the nearest coastline increases. We therefore assign a maximum distance \(\xi\) to the coastline from the target. If the target is farther away than \(\xi\), we assign \(\xi\) to the distance. The sign of the distance decides if the target is over land or sea. We set \(\xi = 700\) m to accommodate errors in the SWBD database.
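The kinematic features and the clamped coastline distance can be sketched as follows. The sketch assumes 2-point differences for the speed, a difference of consecutive speeds for the acceleration, the heading change per unit time as the angular velocity, and the sample standard deviation as the empirical standard deviation. The function names, the dictionary keys, and the symmetric clamping of the signed distance are illustrative assumptions; the polygon lookup that would produce the signed distance is not shown.

```python
import math
from statistics import mean, stdev

def kinematic_features(positions, times):
    """Summary statistics of distance, speed and acceleration for one window."""
    deltas = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    dts = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    dists = [math.hypot(dx, dy) for dx, dy in deltas]
    speeds = [d / dt for d, dt in zip(dists, dts)]
    accels = [(v1 - v0) / dt
              for v0, v1, dt in zip(speeds, speeds[1:], dts[1:])]
    headings = [math.atan2(dy, dx) for dx, dy in deltas]
    normal = []
    for h0, h1, v, dt in zip(headings, headings[1:], speeds[1:], dts[1:]):
        # Wrap the heading change to (-pi, pi] before dividing by the time step.
        turn = math.atan2(math.sin(h1 - h0), math.cos(h1 - h0))
        normal.append(v * turn / dt)  # speed times angular velocity
    return {
        "std_dist": stdev(dists),
        "mean_speed": mean(speeds), "std_speed": stdev(speeds),
        "mean_acc": mean(accels), "std_acc": stdev(accels),
        "mean_normal_acc": mean(normal), "std_normal_acc": stdev(normal),
        "total_dist": sum(dists),
    }

def clamped_coast_distance(signed_distance, xi=700.0):
    """Limit the signed distance to the coastline to the range [-xi, xi]."""
    return max(-xi, min(xi, signed_distance))

# Straight track at a constant 10 m/s, sampled once per second.
feats = kinematic_features([(0, 0), (10, 0), (20, 0), (30, 0)], [0, 1, 2, 3])
```

For the straight constant-speed track, the acceleration and normal acceleration features are zero, while a turning or accelerating target would populate them.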

In the next section, we show simulation results of the classifier. Real-world results follow in Section 4.

## 3 Simulation study

In the simulation study, we vary the number of measurements \(k\) from which the features are extracted. The size of the feature vector changes with \(k\); Table 1 shows the feature vector for \(k=10\). The data we use are simulated data from a controlled random walk. The controlled random walk consists of a three-state transition matrix which has a deceleration state, a steady state, and an acceleration state. Parameters for the maximum and minimum speed are incorporated, and these change the probabilities in the transition matrix if the speed is not within the permitted speed range. The data for the different targets are generated such that they have nearly the same support in speed, and the main difference is the acceleration support. The random walk creates positions \({p_{m}^{x}}\) and \({p_{m}^{y}}\) which are extrapolated from some smooth speeds \(\hat {v}_{m}^{x}\) and \(\hat {v}_{m}^{y}\) described later.

Here, \(\Delta t\) is the time between the updates for \(m\) and \(m-1\), and \({\Sigma _{m}^{x}}\) and \({\Sigma _{m}^{y}}\) are position uncertainties drawn from a distribution.

Here, \(\Sigma_{e}\) is the position covariance and \(\mathcal {N}\) denotes the normal distribution. The smooth speeds are the speeds \({v_{m}^{x}}\) and \({v_{m}^{y}}\) convolved with a 25-tap moving average filter \(h\). This is done to avoid quick changes in the speed.

Here, the subscript \(j\) denotes the dependence on the state \(j\) described in (32). The speeds are given as

Here, \(\mu_{x,j}\), \(\mu_{y,j}\), \(\sigma _{x,j}^{2}\), and \(\sigma _{y,j}^{2}\) are given by the function \(\phi_{j}(v(m-1), \Gamma)\). This is done because we want to control the maximum and minimum allowed speed. We define this function as

Here, \(\psi_{j}(1)\), \(\psi_{j}(2)\), and \(\psi_{j}(3)\) are the sets of parameters \(\{\mu _{y,j}, \mu _{x,j}, \sigma _{y,j}^{2}, \sigma _{x,j}^{2}\}\) used in (30), and \(\Gamma = \{\zeta^{\text{min}}, \zeta^{\text{max}}\}\). The state machine consists of three states: deceleration (\(d\)), constant (\(c\)), and acceleration (\(a\)) states, see Fig. 2. Further, the state machine is also controlled by the speed. We define the state transition probabilities as

The performance of the classifier vs. the number of measurements \(k\) can be seen in Fig. 6. Further, we show the performance of the classifier vs. the number of trees \(N_{t}\) used in the random forest, see Fig. 7. The confusion matrix of the classification results for the four classes can be seen in Table 2, where we have used \(k=10\) and \(N_{t}=100\).
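To make the data generation in this section more concrete, a strongly simplified sketch of a controlled random walk is given below. It uses a scalar speed along one axis instead of separate \(x\) and \(y\) speeds, forces the state at the speed limits instead of modifying the transition probabilities, and uses a moving average with a shorter window at the start of the track. All numeric values, the transition probabilities, and the names `TRANSITIONS`, `MEAN_ACC`, and `simulate_track` are illustrative assumptions and are not the parameters used for the results in this work.

```python
import random

STATES = ("d", "c", "a")                    # deceleration, constant, acceleration
MEAN_ACC = {"d": -0.5, "c": 0.0, "a": 0.5}  # m/s^2, illustrative values
TRANSITIONS = {                             # illustrative transition probabilities
    "d": (0.8, 0.2, 0.0),
    "c": (0.1, 0.8, 0.1),
    "a": (0.0, 0.2, 0.8),
}

def next_state(state, speed, v_min, v_max, rng):
    """Draw the next state; the speed limits override the transition matrix."""
    if speed <= v_min:
        return "a"
    if speed >= v_max:
        return "d"
    return rng.choices(STATES, weights=TRANSITIONS[state])[0]

def moving_average(values, taps=25):
    """Causal moving average; the window is shorter for the first samples."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - taps + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def simulate_track(n, dt=1.0, v_min=2.0, v_max=15.0, sigma_pos=5.0, seed=0):
    """Return (smoothed speeds, noisy 2D positions) for one simulated target."""
    rng = random.Random(seed)
    state, speed, speeds = "c", 0.5 * (v_min + v_max), []
    for _ in range(n):
        state = next_state(state, speed, v_min, v_max, rng)
        speed += rng.gauss(MEAN_ACC[state], 0.1) * dt
        speeds.append(speed)
    smooth = moving_average(speeds)
    x, track = 0.0, []
    for v in smooth:
        x += v * dt  # extrapolate the position from the smoothed speed
        track.append((x + rng.gauss(0.0, sigma_pos), rng.gauss(0.0, sigma_pos)))
    return smooth, track

smooth_speeds, positions = simulate_track(200)
```

Different target types could then be generated by changing the speed limits and the acceleration parameters per state.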

Table 2 The confusion matrix of the simulated data (values in %)

| Actual \ Predicted | Type 1 | Type 2 | Type 3 | Type 4 |
|---|---|---|---|---|
| Type 1 | 95.2 | 4.8 | 0.0 | 0.0 |
| Type 2 | 16.7 | 72.1 | 11.2 | 0.0 |
| Type 3 | 1.0 | 35.6 | 63.3 | 0.0 |
| Type 4 | 0.0 | 0.0 | 0.0 | 99.9 |
| Overall performance | 82.6 | | | |

## 4 Real-world results

The data used for this work consist of Automatic Identification System (AIS) data, which is a broadcast system used by large ships; Automatic Dependent Surveillance-Broadcast (ADS-B) data, which is a broadcast system used by commercial aircraft; GPS logs; and real-world radar data. The classes for this work are typical classes for coastal surveillance, e.g., large ships, birds, and small boats.

Table 3 The confusion matrix for real-world data (values in %)

| Actual \ Predicted | Birds | RIBs | Stationary sea targets | Large ships | Helicopters | Commercial aircraft |
|---|---|---|---|---|---|---|
| Birds | 67.9 | 9.2 | 0.0 | 21.0 | 1.9 | 0.0 |
| RIBs | 6.4 | 62.4 | 0.0 | 31.2 | 0.0 | 0.0 |
| Stationary sea targets | 0.5 | 0.0 | 99.5 | 0.0 | 0.0 | 0.0 |
| Large ships | 21.4 | 5.1 | 0.3 | 61.5 | 11.6 | 0.0 |
| Helicopters | 12.2 | 0.0 | 0.0 | 0.0 | 87.8 | 0.0 |
| Commercial aircraft | 0.8 | 0.0 | 0.0 | 0.0 | 0.0 | 99.2 |
| Overall performance | 79.7 | | | | | |

In the next section, we will discuss the results of the classifier.

## 5 Discussion

In Fig. 6, the performance of the classifier on the simulated data set is shown, where we vary the number of measurements \(k\) in (1) used to extract the features. The performance is calculated as the mean of the diagonal of the confusion matrix. It is clear that the more measurements (longer feature vector) are used, the better the classification results become. This is expected, as more information gives the classifier a better estimate of the class, and it is therefore more likely to classify correctly. The downside of increasing the number of measurements is that it takes longer from the time a track is first seen until the first class probability is available. For our results, the sampling rate varies between 0.333 and 1 Hz. For 10 measurements, this gives a maximum waiting time of 30 s, which we believe is acceptable for the application at hand.

In Fig. 7, the performance can be seen when varying the number of trees used in the random forest. The plot is made with \(k=10\). It can be seen that the performance does not improve beyond approximately 170 trees. Increasing the number of trees makes the random forest slower to train and makes the classifier more computationally expensive and memory demanding at run time, which matters because the classifier is intended to run in real time.

The performance for \(k=10\) and \(N_{t}=100\) can be seen in Table 2. It is clear that type 2 and type 3 have the most confusion between them. This is natural if we look at the speed PDFs and the acceleration PDFs in Figs. 4 and 5, respectively, as these are very similar. In general, the misclassifications lie to the left of the diagonal in the confusion matrix. This is because a target with a large allowed acceleration also exhibits smaller accelerations, which will therefore be classified as a lower class type.
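As a small worked check of the performance measure defined above, the mean of the diagonal can be computed directly from the values in Table 2. The function name `overall_performance` is only illustrative.

```python
def overall_performance(confusion):
    """Mean of the diagonal of a row-normalized confusion matrix (in %)."""
    n = len(confusion)
    return sum(confusion[i][i] for i in range(n)) / n

# Rows are the actual classes, columns the predicted classes (Table 2).
simulated = [
    [95.2, 4.8, 0.0, 0.0],
    [16.7, 72.1, 11.2, 0.0],
    [1.0, 35.6, 63.3, 0.0],
    [0.0, 0.0, 0.0, 99.9],
]
performance = overall_performance(simulated)
print(f"{performance:.1f} %")
```

The diagonal sums to 330.5, so the mean is 82.625, which matches the reported overall performance of 82.6 %.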

For the real-world scenarios, we use \(k=10\) and \(N_{t}=170\). As can be seen, the confusion matrix in Table 3 shows relatively good performance. Nearly all of the stationary sea targets and commercial aircraft are classified correctly. The helicopters are confused with birds. This can be because the helicopters can move as slowly as birds. There is some confusion between large ships, birds, and RIBs. All of these classes have kinematics which are close to each other.

In Fig. 8, one of the real coastal surveillance scenarios is shown. The scenario shows a RIB sailing out from a marina and zigzagging back again. The RIB is classified as a small fast boat. The reason that it is not classified as a Jet Ski/RIB is that it sails more like a fast boat, whereas a Jet Ski/RIB often makes turns, accelerates, and decelerates. The two slow-moving vessels to the north of the RIB are classified correctly. Some of the sea buoys are classified correctly as stationary targets. Only a few birds are classified correctly.

In Fig. 9, two wind farms can be seen, and nearly all of the wind turbines are classified as stationary, while a few are misclassified as small slow-moving boats. The classification of the commercial aircraft alternates between commercial aircraft and small aircraft; however, the target is primarily classified as a commercial aircraft. The small aircraft circling the two wind farms is classified correctly even though its measured speed is below stall speed. This can be due to the strong winds, in which case the real airspeed is much larger. The one sea vessel that is sailing between the wind turbines is misclassified as a bird, while the other sea vessels are classified as small slow boats, small fast boats, and helicopters. Unfortunately, nearly all the birds are misclassified either as unknown or as a helicopter. We believe this is because the training data do not contain any birds at that distance and speed (because of the wind). Further, the radar used to record this scenario is different from the radars used for the training data.

## 6 Conclusions

We have shown that it is possible to use a recursive approach to classify radar tracks from kinematic data. We have also shown that it is possible to use an alpha beta filter together with the random forest such that stationary targets are classified as stationary. The study used both simulated data, which are simulated to behave as real targets, and real-world data. We have shown both recorded scenarios and confusion matrices to give an overview of the performance.

## Notes

### Acknowledgements

This work was partially supported by the Engineering and Physical Sciences Research Council (EPSRC) Grant number EP/K014307/1 and the MOD University Defence Research Collaboration in Signal Processing.

### Competing interests

The authors declare that they have no competing interests.

### References

1. LW Jochumsen, M Pedersen, K Hansen, SH Jensen, J Østergaard, in *Radar Conference (Radar), 2014 International*. Recursive Bayesian classification of surveillance radar tracks based on kinematic with temporal dynamics and static features (IEEE, Lille, France, 2014), pp. 1–6.
2. LW Jochumsen, E Nielsen, J Østergaard, SH Jensen, M Pedersen, in *Radar Conference (RadarCon), 2015 IEEE*. Using position uncertainty in recursive automatic target classification of radar tracks (IEEE, Washington DC, USA, 2015), pp. 0168–0173.
3. DH Nguyen, JH Kay, BJ Orchard, RH Whiting, Classification and tracking of moving ground vehicles. Lincoln Lab. J. **13**(2), 275–308 (2002).
4. S Challa, GW Pulford, Joint target tracking and classification using radar and ESM sensors. IEEE Trans. Aerosp. Electron. Syst. **37**(3), 1039–1055 (2001).
5. D Angelova, L Mihaylova, Joint target tracking and classification with particle filtering and mixture Kalman filtering using kinematic radar information. Digital Signal Process. **16**(2), 180–204 (2006).
6. Z Sun, XJ Ban, Vehicle classification using GPS data. Transp. Res. Part C Emerg. Technol. **37**, 102–117 (2013).
7. M Garg, U Singh, in *National Conference on Research Trends in Computer Science and Technology (NCRTCST), IJCCT_Vol3Iss1/IJCCT_Paper_3*. C & R tree based air target classification using kinematics (IJECCE, Hyderabad, India, 2012).
8. L Breiman, Random forests. Mach. Learn. **45**(1), 5–32 (2001).
9. S Theodoridis, K Koutroumbas, *Pattern Recognition, Fourth Edition*, 4th edn. (Academic Press, 2008).
10. E Brookner, *Tracking and Kalman Filtering Made Easy* (Wiley-Interscience, 1998).
11. J Friedman, T Hastie, R Tibshirani, *The Elements of Statistical Learning*, vol. 1 (Springer, 2001).
12. CM Bishop, *Pattern Recognition and Machine Learning*, vol. 1 (Springer, New York, 2006).
13. G James, D Witten, T Hastie, *An Introduction to Statistical Learning: with Applications in R* (Taylor & Francis, 2014).
14. L Chen, S Wang, in *Proceedings of the 21st ACM International Conference on Information and Knowledge Management*. Automated feature weighting in naive bayes for high-dimensional data classification (ACM, 2012), pp. 1243–1252.
15. JA Lawton, RJ Jesionowski, P Zarchan, Comparison of four filtering options for a radar tracking problem. J. Guid. Control. Dyn. **21**(4), 618–623 (1998).
16. N Mohajerin, J Histon, R Dizaji, SL Waslander, in *Radar Conference, 2014 IEEE*. Feature extraction and radar track classification for detecting UAVs in civillian airspace (2014), pp. 0674–9.
17. SWBD. http://dds.cr.usgs.gov/srtm/version2_1/SWBD/. Accessed 4 Dec 2014.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.