1 Introduction

Moving point target detection is one of the most difficult and significant tasks in infrared target surveillance. Although many methods have been proposed for point target detection over the past few years, their performance is still unsatisfactory. There are two main reasons: infrared point targets carry little texture, color, or shape information, and the detection environment is usually very harsh.

Recently, low-rank and sparse matrices recovery (LSMR) theory was proposed and shown to be much more effective than many traditional methods in a range of fields [1,2,3,4,5,6,7,8,9,10,11]. It has also been applied to target detection and tracking to further improve performance. These applications fall into two kinds. The first detects targets in a single image [5,6,7,8], which copes with imaging backgrounds that change quickly because of rapid relative motion between the imaging sensor and the target. The second uses information from multiple frames to predict the target position more quickly and accurately [9,10,11]. However, the performance of both kinds can degrade rapidly when the signal-to-clutter ratio (SCR) is low, and the computation is often too time-consuming for real-time use.

In this paper, we exploit the advantages of the LSMR algorithm and propose a patch-image based LSMR method to detect moving point targets in infrared marine surveillance. Unlike previously proposed LSMR-based algorithms, our method uses both intra-frame and inter-frame information, and greatly reduces computation time while achieving good signal-to-clutter ratio gain (SCRG) and background suppression factor (BSF).

This paper is organized as follows. In Sect. 2, preliminary work, including the LSMR algorithm and its application to small target detection, is introduced briefly. In Sect. 3, the relationship between computational cost and image size is analyzed, and a patch-image based LSMR method is proposed and discussed in detail. In Sect. 4, experiments validate the performance of the proposed approach. The final section presents conclusions and future work.

2 LSMR Theory and Small Target Detection

2.1 Low-Rank and Sparse Matrices Recovery (LSMR) Algorithm

As a theory extending compressive sensing (CS) and sparse representation, LSMR models an ideal image or video as a low-rank matrix. Taking an image as an example, its rank is usually much smaller than its size:

$$ r \ll \min (m,n) $$
(1)

In (1), r is the rank of an ideal image, and m and n are the image’s width and height, respectively. In practice, the entries of the matrix are often corrupted by errors or noise, so the image or video is no longer a low-rank matrix. Consequently, in LSMR theory every image/video can be regarded as the sum of a low-rank matrix and a noise matrix:

$$ {\text{I}} = {\text{L}} + {\text{S}} $$
(2)

Here I is an observed image/video, L is the ideal image/video, which is low-rank, and S represents random noise, errors, or other irregular signals caused by various factors, which can be regarded as a sparse matrix. It has been proved that, under surprisingly broad conditions, L and S can be exactly recovered from I via Robust Principal Component Analysis (RPCA) by solving the following convex optimization problem:

$$ \mathop {\min }\limits_{{{\text{L}},\,{\text{S}}}} \left\| {\text{L}} \right\|_{ * } + \lambda \left\| {\text{S}} \right\|_{1} ,\quad {\text{subject to}}\quad {\text{I}} = {\text{L}} + {\text{S}} $$
(3)

Here, \( \left\| \cdot \right\|_{*} \) denotes the nuclear norm of a matrix (the sum of its singular values), \( \left\| \cdot \right\|_{1} \) denotes the \( \ell_{1} \) norm (the sum of the absolute values of its entries), and λ is a positive weighting parameter that balances the low-rank term against the sparse term and can be tuned for optimum results.
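To make the two norms in (3) concrete, the following minimal Python sketch evaluates the objective for a given pair (L, S). The function name `rpca_objective` is illustrative and not taken from the original work; NumPy is assumed to be available.

```python
import numpy as np


def rpca_objective(L, S, lam):
    """Evaluate ||L||_* + lam * ||S||_1, the objective of Eq. (3).

    The name and signature are illustrative assumptions.
    """
    L = np.asarray(L, dtype=np.float64)
    S = np.asarray(S, dtype=np.float64)
    # Nuclear norm: sum of the singular values of L.
    nuclear = np.linalg.norm(L, ord="nuc")
    # Entrywise l1 norm: sum of absolute values (not the induced matrix 1-norm).
    l1 = np.abs(S).sum()
    return nuclear + lam * l1


if __name__ == "__main__":
    L = np.diag([3.0, 4.0])                  # singular values 3 and 4
    S = np.array([[1.0, -2.0], [0.0, 0.0]])  # absolute values sum to 3
    print(rpca_objective(L, S, lam=0.5))     # 7 + 0.5 * 3
```

Note that `np.linalg.norm(S, 1)` would return the maximum absolute column sum rather than the entrywise sum, which is why the sparse term is computed explicitly.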

2.2 LSMR in Point Target Detection

Generally speaking, an infrared image (II) has three components: the background (B), the point targets (T), and various kinds of noise (N). The infrared image can therefore be expressed as follows:

$$ {\text{II}} = {\text{B}} + {\text{T}} + {\text{N}} $$
(4)

When detecting point targets in a single image, small targets have no concrete shape or texture and are commonly no larger than 10 × 10 pixels because of the long imaging distance. They can therefore be considered “sparse” with respect to the extensive background. In contrast, background patches are “low-rank”: they are approximately correlated with each other, even when the pixel distance between two patches is large. When detecting targets across multiple frames, the background is also “low-rank” because of the consistency between adjacent frames, and the target is “sparse” because it moves.

Thus point target detection can be regarded as a typical problem of recovering a sparse component from the infrared image in LSMR theory. The positive weighting parameter λ helps stabilize detection and suppress random noise. This formulation has been shown to be valid in [8], under the assumption that the random noise is i.i.d. and that its Frobenius norm is smaller than some σ (σ > 0):

$$ \left\| {{\text{II}} - {\text{B}} - {\text{T}}} \right\|_{\text{F}} \le \sigma $$
(5)

Nevertheless, recovering the low-rank and sparse matrices usually incurs a high computational cost because of the demanding convex optimization in RPCA, especially with multiple frames. Detection in a single frame, on the other hand, cannot always ensure detection accuracy.

3 Patch Based LSMR Algorithm

How can the detection speed be improved while keeping a high accuracy? To identify what drives the detection time, Fig. 1 shows the processing time of five randomly selected infrared marine images and their downscaled versions. The accelerated proximal gradient (APG) algorithm is used to solve the convex optimization problem in detection. Figure 1 shows that it is hard to meet the real-time requirements of most applications, even when detecting targets in a single frame. At the same time, the processing time is closely related to the image size: as the image shrinks, the processing time falls sharply. The processing time can therefore be reduced by reducing the size of the image to be processed.

Fig. 1. Processing time of different sized images

Because processing all frames of a video requires a very large amount of computation, we use image differencing to extract inter-frame information and divide the difference image into patches to further shrink the candidate region processed by LSMR. These measures keep the processing time within a reasonable range. Figure 2 is the schematic diagram of the proposed method, which consists of the following steps.

Fig. 2. Schematic diagram of the proposed method framework

Step 1: Frame difference

The frame difference narrows the candidate region at little computational cost by using information from adjacent frames. It not only extracts target position information from adjacent frames but also excludes stationary disturbances, leaving the candidate moving-target regions of interest. Let C be the candidate area containing the moving point target; it is obtained by (6).

$$ {\text{C}} = \left| {{\text{II}}({\text{t}}) - {\text{II}}({\text{t}} - 1)} \right| $$
(6)

Here, II(t) is the t-th frame, II(t − 1) is the preceding frame, and | · | denotes the element-wise absolute value.
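As an illustration of (6), a minimal NumPy sketch is given below. The function name `frame_difference` is an assumption made for this example. The frames are converted to floating point first, because subtracting unsigned 8-bit images directly would wrap around instead of producing negative values.

```python
import numpy as np


def frame_difference(curr, prev):
    """Absolute difference of two frames, as in Eq. (6).

    `curr` is II(t) and `prev` is II(t - 1); the name is illustrative.
    """
    curr = np.asarray(curr, dtype=np.float64)
    prev = np.asarray(prev, dtype=np.float64)
    if curr.shape != prev.shape:
        raise ValueError("frames must have the same shape")
    # Casting to float before subtracting avoids uint8 wrap-around.
    return np.abs(curr - prev)


if __name__ == "__main__":
    prev = np.array([[10, 200]], dtype=np.uint8)
    curr = np.array([[20, 100]], dtype=np.uint8)
    print(frame_difference(curr, prev))
```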

Step 2: Patch division and local thresholding

After obtaining the frame difference image C, we divide the image into partially overlapped patches, which further decreases the data to be processed at a time.

The patch size and the overlap both influence the detection capability. Since a point target is usually smaller than 10 × 10 pixels, the vertical and horizontal steps are set to P − 10, so that adjacent patches overlap by 10 pixels and no target is lost while redundant calculation is kept to a minimum. P is the patch width, which is discussed in the following section.
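The patch layout can be sketched as follows. The step P − 10 comes from the text; the handling of the image border (shifting the last patch back so that it ends at the border) is an assumption of this sketch, as the text does not specify it, and the function names are illustrative.

```python
import numpy as np


def patch_starts(length, patch, overlap=10):
    """Start indices along one axis for patches of width `patch`.

    The step is patch - overlap (P - 10 in the text). The final patch is
    shifted back to end at the border; this border rule is an assumption.
    """
    if patch <= overlap:
        raise ValueError("patch width must exceed the overlap")
    if patch >= length:
        return [0]
    step = patch - overlap
    starts = list(range(0, length - patch + 1, step))
    if starts[-1] + patch < length:
        starts.append(length - patch)
    return starts


def split_into_patches(image, patch, overlap=10):
    """Return a list of (row, col, view) tuples covering the whole image."""
    image = np.asarray(image)
    rows = patch_starts(image.shape[0], patch, overlap)
    cols = patch_starts(image.shape[1], patch, overlap)
    return [(r, c, image[r:r + patch, c:c + patch]) for r in rows for c in cols]


if __name__ == "__main__":
    frame = np.zeros((498, 696))
    patches = split_into_patches(frame, patch=280)
    print(len(patches), [(r, c) for r, c, _ in patches])
```

Because the patches are independent views of the difference image, each one can be thresholded and processed separately.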

Each patch is then processed by a local self-adaptive threshold to further narrow the detection region. The local self-adaptive threshold is given in (7).

$$ {\text{C}}(x,y) = \begin{cases} 1, & {\text{if }}{\text{C}}(x,y) \ge \alpha {\text{M}}_{\max } {\text{ and }}{\text{M}}_{\max } - {\text{M}}_{\min } > \beta \\ 0, & {\text{otherwise}} \end{cases} $$
(7)

where \( {\text{M}}_{\max } \) and \( {\text{M}}_{\min } \) are the maximum and minimum pixel values of the local region, respectively. \( \alpha \) is the threshold coefficient, set to 0.6, and \( \beta \) is the threshold for judging whether a target exists. Because a target is usually relatively bright within its local region, no target is expected in patches where the difference between \( {\text{M}}_{\max } \) and \( {\text{M}}_{\min } \) is smaller than \( \beta \). In this paper, we choose \( \beta \) = 40.
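A minimal sketch of the thresholding rule (7) applied to one patch is shown below. It reads (7) with α as the coefficient on the patch maximum (0.6) and β as the contrast threshold (40); this assignment of the two values is an interpretation of the text, and the function name is illustrative.

```python
import numpy as np


def local_threshold(patch, alpha=0.6, beta=40.0):
    """Binary mask for one patch following Eq. (7).

    alpha scales the patch maximum; beta is the minimum contrast
    (max - min) required for the patch to be kept at all. The default
    values follow one reading of the text and are assumptions.
    """
    patch = np.asarray(patch, dtype=np.float64)
    m_max = patch.max()
    m_min = patch.min()
    # Low-contrast patches are assumed to contain no target.
    if m_max - m_min <= beta:
        return np.zeros(patch.shape, dtype=np.uint8)
    # Keep only pixels close to the local maximum.
    return (patch >= alpha * m_max).astype(np.uint8)


if __name__ == "__main__":
    demo = np.zeros((8, 8))
    demo[3, 4] = 100.0   # bright candidate pixel
    demo[1, 1] = 50.0    # below 0.6 * 100, so rejected
    print(local_threshold(demo))
```

Patches whose mask is entirely zero can be skipped, so that only the remaining patches are passed to the more expensive recovery step.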

Step 3: LSMR and segmentation

Finally, we recover the target component T by applying LSMR to the remaining candidate region C, and then use a simple segmentation method to extract the targets. The APG algorithm is used to solve the convex optimization problem (3); it is reported to perform considerably better than other RPCA algorithms [6]. The algorithm is as follows:

figure a
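For readers without access to the algorithm listing, the following Python sketch shows the widely used APG scheme with continuation for RPCA. It solves a relaxed form of (3), namely minimizing μ‖L‖_* + μλ‖S‖_1 + ½‖I − L − S‖_F² while μ is decreased geometrically. It is a sketch under stated assumptions and may differ from the listing in the original figure: the initial value μ₀ = 0.99‖I‖₂, the decay factor 0.9, the floor 10⁻⁵ μ₀, the stopping rule, and the default λ = 1/√max(m, n) are common choices in the RPCA literature, not values taken from this paper.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage: sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca_apg(D, lam=None, eta=0.9, mu_floor_ratio=1e-5, max_iter=500, tol=1e-7):
    """Split D into low-rank L and sparse S with APG plus continuation.

    All default parameter values are assumptions (see the lead-in text).
    """
    D = np.asarray(D, dtype=np.float64)
    if lam is None:
        lam = 1.0 / np.sqrt(max(D.shape))
    L = np.zeros_like(D)
    L_prev = np.zeros_like(D)
    S = np.zeros_like(D)
    S_prev = np.zeros_like(D)
    t = t_prev = 1.0
    mu = 0.99 * np.linalg.norm(D, 2)      # spectral norm of D
    mu_floor = mu_floor_ratio * mu
    d_norm = max(np.linalg.norm(D), 1.0)  # Frobenius norm, for the stop test
    for _ in range(max_iter):
        # Momentum (extrapolation) step.
        w = (t_prev - 1.0) / t
        Y_L = L + w * (L - L_prev)
        Y_S = S + w * (S - S_prev)
        # Gradient step with step size 1/2 (Lipschitz constant 2).
        half_residual = 0.5 * (Y_L + Y_S - D)
        # Proximal step for the nuclear norm: shrink singular values.
        U, sig, Vt = np.linalg.svd(Y_L - half_residual, full_matrices=False)
        L_new = (U * soft_threshold(sig, 0.5 * mu)) @ Vt
        # Proximal step for the l1 norm: shrink entries.
        S_new = soft_threshold(Y_S - half_residual, 0.5 * lam * mu)
        change = np.linalg.norm(L_new - L) + np.linalg.norm(S_new - S)
        L_prev, L = L, L_new
        S_prev, S = S, S_new
        t_prev, t = t, 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        at_floor = mu <= mu_floor
        mu = max(eta * mu, mu_floor)      # continuation on mu
        if at_floor and change <= tol * d_norm:
            break
    return L, S


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_rank = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 40))
    sparse = np.zeros((40, 40))
    hits = rng.choice(sparse.size, size=80, replace=False)
    sparse.flat[hits] = 10.0 * rng.choice([-1.0, 1.0], size=80)
    L_hat, S_hat = rpca_apg(low_rank + sparse)
    print(np.linalg.norm(L_hat - low_rank) / np.linalg.norm(low_rank))
```

In the detection setting, the input would be a candidate patch and the recovered sparse component would be taken as the target image. The dominant cost per iteration is the singular value decomposition, which is consistent with the strong dependence of processing time on image size noted in Sect. 3.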

4 Experimental Results and Analysis

4.1 The Performance Comparison with Different Weight of Sparse Error Term (λ) and Patch Size

To verify the capability of the proposed method, experiments were run on a real, representative, cluttered marine surveillance video sequence of 48 frames. The resolution of the frames is 498 × 696, and the moving point targets occupy about 5 × 5 pixels. All experiments were implemented in Matlab on a PC with 2 GB of RAM and a 2.60 GHz Intel i5 processor. Average detection time (ADT), SCRG, and BSF are used for objective evaluation. SCRG and BSF are the same as the metrics presented in [12] and are defined by (8) and (9), where S is the target amplitude and C is the clutter standard deviation, measured in the original frame (subscript in) or the processed frame (subscript out).

$$ {\text{SCRG}} = \frac{{\left( {\text{S}}/{\text{C}} \right)_{\text{out}} }}{{\left( {\text{S}}/{\text{C}} \right)_{\text{in}} }} $$
(8)
$$ {\text{BSF}} = \frac{{{\text{C}}_{\text{in}} }}{{{\text{C}}_{\text{out}} }} $$
(9)
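Once S and C have been measured before and after processing, the two metrics (8) and (9) reduce to simple ratios. The sketch below takes those four scalars as inputs; how the target amplitude and clutter standard deviation are measured follows [12] and is not reproduced here, and the function names are illustrative.

```python
def scrg(s_in, c_in, s_out, c_out):
    """Signal-to-clutter ratio gain, Eq. (8): (S/C)_out / (S/C)_in."""
    if c_in <= 0 or c_out <= 0 or s_in <= 0:
        raise ValueError("amplitude and clutter values must be positive")
    return (s_out / c_out) / (s_in / c_in)


def bsf(c_in, c_out):
    """Background suppression factor, Eq. (9): C_in / C_out."""
    if c_out <= 0:
        raise ValueError("output clutter must be positive")
    return c_in / c_out


if __name__ == "__main__":
    # Illustrative numbers only, not results from the experiments.
    print(scrg(s_in=10.0, c_in=5.0, s_out=8.0, c_out=0.5))
    print(bsf(c_in=5.0, c_out=0.5))
```

Both functions reject a zero output clutter, which would occur if the processed frame were perfectly flat outside the target and would make the ratios undefined.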

The patch sizes range from 20 × 20 pixels to 490 × 490 pixels in increments of 10 pixels. The results averaged over all frames of the above video sequence are used for comparison.

Figure 3 compares the performance for different weights of the sparse error term (λ), an important parameter of the APG algorithm. For easier viewing, the ADT is scaled up by factors of 5, 10, and 20 in Fig. 3(b), (c), and (d), respectively. SCRG and BSF values of zero indicate that the targets were lost. Figure 3 also shows that, for each fixed λ, there is an optimum detection time that depends directly on the patch size: patches that are too small or too large take more time to process. In our experiment, the shortest detection time is obtained with a patch size of 280 × 280 pixels. SCRG and BSF roughly increase with patch size for fixed λ, and increase with λ for fixed patch size. Taking all of these aspects into consideration, we choose a patch size of 280 × 280 pixels with λ = 0.3 for the proposed method. Its ADT is 3.53 s, and its SCRG and BSF reach 107.0 and 641.1, respectively.

Fig. 3. Performance comparison of the proposed method with different weights of the sparse error term (λ). (a) Performance with different patch sizes (λ = 0.1); (b) Performance with different patch sizes (λ = 0.2); (c) Performance with different patch sizes (λ = 0.3); (d) Performance with different patch sizes (λ = 0.4).

4.2 Evaluation Comparison

The detection capability of the proposed method is also compared with that of the same algorithm applied to the non-partitioned image and with the IPI model [6].

Table 1 shows that the proposed method performs better than the other two methods under the same conditions. Compared to IPI, our method reduces the detection time by 97.4% while improving SCRG and BSF by 180.1% and 389.4%, respectively. Moreover, because the patches are independent in our method, the detection could be further accelerated by processing them in parallel.

Table 1. Performance comparison of point target detection methods

The results in the figure and table above show that the proposed method improves the detection capability, for the following reasons. First, frame differencing and local thresholding exclude a large number of non-target patches before the LSMR algorithm is applied, which greatly decreases its computational load. Since frame differencing and thresholding need much less computation than LSMR, the total computation time is greatly reduced. Second, LSMR remains effective for detection as long as the point targets are “sparse” in each patch. Third, the proposed method integrates frame differencing, local thresholding, image segmentation, and LSMR, combining the strengths of these methods to distinguish targets from a complex background.

5 Conclusion and Future Work

In this paper, we proposed a patch-image based LSMR method for moving point target detection in infrared coastal surveillance. To our knowledge, this is the first time that patches of frames have been processed independently by the LSMR algorithm, and the optimum patch size is analyzed using the ADT, SCRG, and BSF metrics. Experimental results show that the proposed method improves the detection speed while keeping a high detection capability.

However, the detection results of the proposed method can be affected by dynamic noise, which is also irregular. Future work should therefore focus on suppressing noise and on improving detection speed and accuracy. A comprehensive method that addresses both detection speed and accuracy is essential for practical industrial applications.