Moving Object Extraction in Infrared Video Sequences

Zhang, Jinli; Li, Min; He, Yujie

doi:10.1007/978-3-319-21963-9_45

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9218)

Included in the following conference series:

International Conference on Image and Graphics

Abstract

To extract moving objects from infrared video sequences, this paper presents a scheme based on sparse and low-rank decomposition. By transforming each frame of the infrared video sequence into a column and combining all columns into a new matrix, the problem of extracting moving objects in infrared video sequences is converted to a sparse and low-rank matrix decomposition problem. The resulting minimization problem, which involves the nuclear norm and the L₁ norm, can be efficiently solved by recently developed numerical methods. The effectiveness of the proposed scheme is illustrated on different infrared video sequences. The experiments show that, compared to the ALM algorithm, our algorithm has distinct advantages in extracting moving objects from infrared videos.

Foundation Item: The National Natural Science Foundation of China (61102170).

1 Introduction

With the prevalence of infrared cameras and infrared sensors, infrared video plays an increasingly important role in daily life and industrial production. For example, infrared videos of wild animal activity obtained from infrared video surveillance equipment have brought great convenience to wildlife researchers. When the brightness of the target does not differ distinctly from that of the background in an infrared video sequence, or when a particular application requires it, it is necessary to separate the moving object from the background. How to effectively extract the moving target in an infrared video is therefore a problem worth studying.

This paper aims at developing an effective scheme to extract the moving objects in infrared video sequences. Motivated by the regularization models proposed in [1, 2] for other applications, we take a similar regularization approach to extracting moving objects from the background.

2 Detailed Algorithm

In this section, we present the algorithm in detail. First, each frame of the infrared video sequence is reshaped into a column, and all columns are then combined into a new matrix $ D \in {\mathbb{R}}^{m \times n} $. Because the video content is composed of the background and moving objects, the matrix D can be represented as the sum of a background component and a moving-object component. If we denote the background component by $ A \in {\mathbb{R}}^{m \times n} $ and the moving-object component by $ E \in {\mathbb{R}}^{m \times n} $, then the matrix D can be expressed as

$$ D = A + E. $$

(1)

In video sequences, adjacent frames share most of their background information, especially in videos with a high frame rate. Thus, the columns of A are nearly identical, and A is a low-rank matrix. Because the moving objects in each frame are far smaller than the frame itself, the number of nonzero elements in E is far smaller than its total number of elements. Thus, E is a sparse matrix. Based on these observations, if we can accurately decompose the matrix D into the sum of a low-rank matrix and a sparse matrix, the moving objects can be extracted from the background.
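The construction of D and the two structural observations can be illustrated on a toy example. The sketch below is not taken from the paper: the helper names `frames_to_matrix` and `matrix_to_frames`, the frame size, and the synthetic scene (a fixed random background plus a small square that is added to, rather than occluding, the background) are all assumptions made for illustration.

```python
import numpy as np

def frames_to_matrix(frames):
    """Stack frames of shape (num_frames, height, width) as the columns of D."""
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]
    # Each frame is flattened row by row and becomes one column.
    return frames.reshape(num_frames, -1).T

def matrix_to_frames(M, height, width):
    """Inverse of frames_to_matrix: turn each column back into a frame."""
    return M.T.reshape(-1, height, width)

# Hypothetical toy scene: static background and a 3x3 object that
# shifts one pixel to the right in every frame.
rng = np.random.default_rng(0)
h, w, n = 24, 32, 30
background = rng.uniform(0.2, 0.8, size=(h, w))
static = np.repeat(background[None, :, :], n, axis=0)
moving = np.zeros_like(static)
for k in range(n):
    moving[k, 10:13, k:k + 3] = 1.0

D = frames_to_matrix(static + moving)   # observed matrix, Eq. (1)
A_true = frames_to_matrix(static)       # background component
E_true = frames_to_matrix(moving)       # moving-object component

print("D shape:", D.shape)
print("rank of background matrix:", np.linalg.matrix_rank(A_true))
print("fraction of nonzeros in E:", np.count_nonzero(E_true) / E_true.size)
```

In this idealized scene the background matrix has identical columns, so its rank is 1, and only about 1% of the entries of the object matrix are nonzero. Real infrared footage contains noise and illumination changes, so the background is only approximately low-rank.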

2.1 Notation

Before presenting the details of decomposing D into a low-rank matrix A and a sparse matrix E, we first define some notation to simplify the discussion. The L₁ norm and the Frobenius norm of a matrix $ X \in {\mathbb{R}}^{{n_{1} \times n_{2} }} $ are defined by:

$$ \left\| X \right\|_{1} = \sum\limits_{i = 1}^{{n_{1} }} {\sum\limits_{j = 1}^{{n_{2} }} {\left| {x_{i,j} } \right|} } \;{\text{and}}\;\left\| X \right\|_{F} = (\sum\limits_{i = 1}^{{n_{1} }} {\sum\limits_{j = 1}^{{n_{2} }} {\left| {x_{i,j} } \right|^{2} } } )^{1/2} , $$

(2)

respectively, where $ x_{i,j} $ is the $ (i,j) $-th element of X. Assuming that r is the rank of X, the singular value decomposition of X is defined by

$$ X = U\Sigma V^{T} ,\quad \Sigma = \operatorname{diag}(\{ \sigma_{i} \}_{1 \le i \le r} ). $$

(3)

Here, U and V are $ n_{1} \times r $ and $ n_{2} \times r $ matrices with orthonormal columns, respectively. The nuclear norm of X is defined as the sum of its singular values, i.e.

$$ \left\| X \right\|_{ * } = \sum\limits_{i = 1}^{r} {\left| {\sigma_{i} } \right|} . $$

(4)
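As a quick numerical illustration of definitions (2) and (4), the following NumPy sketch evaluates the three norms on a small matrix chosen here for illustration (it does not come from the paper). Its singular values are 4 and 3, so the L₁, Frobenius and nuclear norms are 7, 5 and 7, up to rounding.

```python
import numpy as np

# Hypothetical 3x2 example matrix with singular values 4 and 3.
X = np.array([[3.0, 0.0],
              [0.0, -4.0],
              [0.0, 0.0]])

# Entry-wise L1 norm of Eq. (2). Note that np.linalg.norm(X, 1) is the
# induced matrix 1-norm (maximum column sum), which is a different quantity.
l1_norm = np.abs(X).sum()

# Frobenius norm of Eq. (2).
fro_norm = np.sqrt((np.abs(X) ** 2).sum())

# Nuclear norm of Eq. (4): the sum of the singular values.
singular_values = np.linalg.svd(X, compute_uv=False)
nuclear_norm = singular_values.sum()

print(l1_norm, fro_norm, nuclear_norm)
```

NumPy also exposes the last two directly as `np.linalg.norm(X, "fro")` and `np.linalg.norm(X, "nuc")`.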

The shrinkage operator $ S_{\tau } :\,{\mathbb{R}} \to {\mathbb{R}} $ is defined by

$$ S_{\tau } (x) = \text{sgn} (x)\hbox{max} (\left| x \right| - \tau ,0). $$

(5)

Here $ \tau \ge 0 $. The operator $ S_{\tau } $ is extended to matrices by applying it element-wise.
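The shrinkage operator of (5) is a one-liner in NumPy. The function name `soft_shrink` and the sample vector are assumptions made for this sketch.

```python
import numpy as np

def soft_shrink(X, tau):
    """Element-wise shrinkage S_tau of Eq. (5); works on scalars, vectors and matrices."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

# Entries with magnitude at most tau are set to zero,
# all other entries are moved toward zero by tau.
x = np.array([-3.0, -0.5, 0.0, 0.4, 2.0])
shrunk = soft_shrink(x, 1.0)
print(shrunk)
```

With $ \tau = 1 $ the entries -3 and 2 become -2 and 1, and the three small entries become zero, which is how the operator produces sparse results.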

The singular value shrinkage operator $ D_{\tau } (X) $ is defined [3] by

$$ D_{\tau } (X) = US_{\tau } (\Sigma )V^{T} . $$

(6)

It is noted that $ S_{\tau } (X) $ and $ D_{\tau } (X) $ are the solutions of the following two minimization problems, respectively:

$$ \mathop {\hbox{min} }\limits_{Y} \tau \left\| Y \right\|_{1} + \frac{1}{2}\left\| {Y - X} \right\|_{F}^{2} ,\;\mathop {\hbox{min} }\limits_{Y} \tau \left\| Y \right\|_{ * } + \frac{1}{2}\left\| {Y - X} \right\|_{F}^{2} . $$

(7)
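The two closed-form solutions of (7) can be checked numerically. The sketch below, with function names and test data chosen here as assumptions, implements both operators and verifies on a random matrix that randomly perturbing the claimed minimizers never lowers the corresponding objective. This is a sanity check, not a proof.

```python
import numpy as np

def soft_shrink(X, tau):
    """Element-wise shrinkage S_tau of Eq. (5)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def sv_shrink(X, tau):
    """Singular value shrinkage D_tau of Eq. (6): shrink the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # U * s_shrunk scales the columns of U, i.e. U @ diag(s_shrunk).
    return (U * soft_shrink(s, tau)) @ Vt

def l1_objective(Y, X, tau):
    """First objective of Eq. (7)."""
    return tau * np.abs(Y).sum() + 0.5 * np.linalg.norm(Y - X, "fro") ** 2

def nuclear_objective(Y, X, tau):
    """Second objective of Eq. (7)."""
    return tau * np.linalg.norm(Y, "nuc") + 0.5 * np.linalg.norm(Y - X, "fro") ** 2

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))
tau = 0.7
Y_l1 = soft_shrink(X, tau)
Y_nuc = sv_shrink(X, tau)

# Random perturbations of the claimed minimizers should never do better.
l1_ok, nuc_ok = True, True
for _ in range(200):
    P = 0.1 * rng.standard_normal(X.shape)
    l1_ok &= bool(l1_objective(Y_l1, X, tau) <= l1_objective(Y_l1 + P, X, tau))
    nuc_ok &= bool(nuclear_objective(Y_nuc, X, tau) <= nuclear_objective(Y_nuc + P, X, tau))

print(l1_ok, nuc_ok)
```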

2.2 Sparse and Low-Rank Decomposing

In order to exactly extract the sparse matrix E and low-rank matrix A, we can solve the following minimization problem to estimate A and E:

$$ \mathop {\hbox{min} }\limits_{{A,E \in {\mathbb{R}}^{m \times n} }} \operatorname{rank}(A) + \lambda \left\| E \right\|_{0} \;{\text{s}}.{\text{t}}.\;D = A + E. $$

(8)

Here $ \lambda $ is a suitable regularization parameter, $ \operatorname{rank}( \cdot ) $ denotes the rank of a matrix, and $ \left\| \cdot \right\|_{0} $ denotes the pseudo-norm that counts the number of nonzero entries.

The minimization problem (8) is non-convex and, in general, very hard to solve. Following the approaches in [4, 5], we instead solve the following convex minimization problem to estimate A and E:

$$ \mathop {\hbox{min} }\limits_{{A,E \in {\mathbb{R}}^{m \times n} }} \left\| A \right\|_{ * } + \lambda \left\| E \right\|_{1} \;{\text{s}} . {\text{t}} .\;D = A + E. $$

(9)

Here $ \left\| \cdot \right\|_{1} $ is the element-wise sum of absolute values of a matrix, as defined in (2).

The minimization model (9) has been proposed in [1, 2] to extract low-dimensional structure from a data matrix. It can be viewed as a replacement for the Principal Component Analysis (PCA) method. This approach is termed Principal Component Pursuit (PCP), and it has been applied to background subtraction in video surveillance. In that application, the observed video matrix (array of image frames) is decomposed into a low-rank matrix (static background) and a sparse matrix (moving objects).

In our approach, we convert the minimization problem (9) into the following relaxed form, in which the constraint $ D = A + E $ is replaced by a quadratic penalty:

$$ \mathop {\hbox{min} }\limits_{{A,E \in {\mathbb{R}}^{m \times n} }} \left\| A \right\|_{ * } + \lambda \left\| E \right\|_{1} + \frac{1}{2\mu }\left\| {D - A - E} \right\|_{F}^{2} . $$

(10)

The value of $ \lambda $ is set as suggested in [1]:

$$ \lambda = \frac{1}{{\sqrt {\max (m,n)} }}. $$

(11)

Here m and n are the numbers of rows and columns of the matrix D, respectively.
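As a worked instance of (11), the sketch below evaluates $ \lambda $ for the matrix sizes implied by the experiments in Section 3, where each video contributes 30 frames of 240 × 320 or 200 × 256 pixels. The function name `pcp_lambda` is an assumption made here. Since m is the number of pixels per frame and n the number of frames, m is the larger dimension in all three cases.

```python
import math

def pcp_lambda(m, n):
    """Regularization parameter of Eq. (11)."""
    return 1.0 / math.sqrt(max(m, n))

# Frame sizes and frame count as reported in the experimental section.
frame_sizes = {"irw1": (240, 320), "irw2": (240, 320), "plane": (200, 256)}
num_frames = 30

lambdas = {}
for name, (height, width) in frame_sizes.items():
    m, n = height * width, num_frames   # D has one column per frame
    lambdas[name] = pcp_lambda(m, n)
    print(f"{name}: D is {m} x {n}, lambda = {lambdas[name]:.6f}")
```

The resulting values are roughly 0.0036 for the two 240 × 320 sequences and 0.0044 for the 200 × 256 sequence.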

In recent years, several efficient methods have been developed for solving L₁ norm related minimization problems. One of them is the accelerated proximal gradient (APG) method, which performs very well on L₁ norm and nuclear norm related minimization problems (e.g. [6–9]). Another promising approach is the alternating direction method of multipliers (ADMM), which can also solve such problems efficiently (e.g. [10–12]). In our approach, we use the APG method to solve the minimization problem (10).

The general APG method aims at solving the following minimization problem:

$$ \mathop {\hbox{min} }\limits_{X} \quad g(X) + f(X) $$

(12)

Here g is a non-smooth function and f is a smooth function whose gradient is Lipschitz continuous with constant $ L_{f} $. Algorithm 1 describes the specific scheme of APG.
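The listing of Algorithm 1 is not reproduced in this text. For orientation, the sketch below shows a commonly used APG iteration in the style of FISTA [6]: an extrapolation step, a gradient step on f, and a proximal step on g. It is an assumption that Algorithm 1 follows this pattern, and the names `apg`, `grad_f` and `prox_g` are hypothetical.

```python
import numpy as np

def apg(grad_f, prox_g, L_f, x0, num_iters=100):
    """Generic accelerated proximal gradient sketch for min_x g(x) + f(x).

    grad_f(x)        : gradient of the smooth part f
    prox_g(v, step)  : argmin_x g(x) + ||x - v||^2 / (2 * step)
    L_f              : Lipschitz constant of grad_f (an upper bound also works)
    """
    x_prev = x0.copy()
    x = x0.copy()
    t_prev, t = 1.0, 1.0
    for _ in range(num_iters):
        # Extrapolation using the two most recent iterates.
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)
        # Gradient step on f followed by the proximal step on g.
        v = y - grad_f(y) / L_f
        x_prev, x = x, prox_g(v, 1.0 / L_f)
        t_prev, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
    return x

# Small example: g(x) = tau * ||x||_1 and f(x) = 0.5 * ||x - b||^2,
# whose exact solution is the shrinkage S_tau(b) of Eq. (5).
b = np.array([3.0, -0.2, 1.5])
tau = 1.0
solution = apg(
    grad_f=lambda x: x - b,
    prox_g=lambda v, step: np.sign(v) * np.maximum(np.abs(v) - tau * step, 0.0),
    L_f=1.0,
    x0=np.zeros_like(b),
)
print(solution)
```

In this example the gradient of f has Lipschitz constant exactly 1, so every proximal step already lands on the minimizer. The example only checks that the pieces fit together and does not demonstrate the acceleration.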

The minimization problem (10), after its objective is multiplied by $ \mu $, takes the form (12) with

$$ \left\{ {\begin{array}{*{20}c} {X = (A,E)} \\ {g(X) = \mu \left\| A \right\|_{ * } + \lambda \mu \left\| E \right\|_{1} } \\ {f(X) = \frac{1}{2}\left\| {D - A - E} \right\|_{F}^{2} } \\ \end{array} } \right.. $$

(13)

When applying Algorithm 1 to solve (10), the minimization problem in Step 4 of Algorithm 1 becomes (noting that $ L_{f} = 2 $ in our case)

$$ \mathop {\hbox{min} }\limits_{A,E} \; \mu \left\| A \right\|_{ * } + \lambda \mu \left\| E \right\|_{1} + \left\| {A - G_{k}^{A} } \right\|_{F}^{2} + \left\| {E - G_{k}^{E} } \right\|_{F}^{2} . $$

(14)

Since A and E are separable in the above minimization, their solutions can be obtained separately by applying the singular value shrinkage operator to $ G_{k}^{A} $ and the soft shrinkage operator to $ G_{k}^{E} $, i.e. $ A_{k + 1} = D_{\mu /2} (G_{k}^{A} ) $ and $ E_{k + 1} = S_{\lambda \mu /2} (G_{k}^{E} ) $.
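The constants appearing above can be checked with a short derivation. It assumes that Step 4 of Algorithm 1 is the usual APG proximal step, $ \min_{X} g(X) + \frac{L_{f}}{2}\left\| X - G_{k} \right\|_{F}^{2} $ with $ G_{k} = Y_{k} - \nabla f(Y_{k})/L_{f} $, where $ Y_{k} $ denotes the extrapolated point (a symbol introduced here). For f in (13), both partial gradients are equal to $ A + E - D $, so for any two points $ X_{1} = (A_{1},E_{1}) $ and $ X_{2} = (A_{2},E_{2}) $ with differences $ \Delta A = A_{1} - A_{2} $ and $ \Delta E = E_{1} - E_{2} $:

$$ \left\| \nabla f(X_{1}) - \nabla f(X_{2}) \right\|_{F} = \sqrt{2}\left\| \Delta A + \Delta E \right\|_{F} \le \sqrt{2} \cdot \sqrt{2}\left( \left\| \Delta A \right\|_{F}^{2} + \left\| \Delta E \right\|_{F}^{2} \right)^{1/2} = 2\left\| X_{1} - X_{2} \right\|_{F} , $$

which gives $ L_{f} = 2 $. Substituting $ L_{f} = 2 $ into the proximal step yields (14). Dividing the A-part and the E-part of (14) by 2 puts each into the form of (7):

$$ \mathop {\hbox{min} }\limits_{A} \frac{\mu }{2}\left\| A \right\|_{ * } + \frac{1}{2}\left\| A - G_{k}^{A} \right\|_{F}^{2} ,\quad \mathop {\hbox{min} }\limits_{E} \frac{\lambda \mu }{2}\left\| E \right\|_{1} + \frac{1}{2}\left\| E - G_{k}^{E} \right\|_{F}^{2} , $$

so the thresholds are $ \tau = \mu /2 $ for the singular value shrinkage and $ \tau = \lambda \mu /2 $ for the soft shrinkage.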

The detailed algorithm for solving the minimization problem (10) is described in Algorithm 2.

After the low-rank matrix A and the sparse matrix E are obtained by Algorithm 2, each of their columns is reshaped back to the size of an original frame, which yields the background and moving-object sequences.
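The listing of Algorithm 2 is not reproduced in this text, so the following is only a minimal sketch of an APG iteration built from (11), (13) and (14). Several choices are assumptions made here and may differ from the authors' algorithm: $ \mu $ is kept fixed (no continuation), the extrapolation weights follow the FISTA scheme of [6], and the stopping rule compares successive iterates. The function names and the synthetic test data are hypothetical.

```python
import numpy as np

def soft_shrink(X, tau):
    """Element-wise shrinkage S_tau of Eq. (5)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def sv_shrink(X, tau):
    """Singular value shrinkage D_tau of Eq. (6)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_shrink(s, tau)) @ Vt

def objective(A, E, D, mu, lam):
    """Objective of Eq. (10)."""
    return (np.linalg.norm(A, "nuc") + lam * np.abs(E).sum()
            + np.linalg.norm(D - A - E, "fro") ** 2 / (2.0 * mu))

def apg_decompose(D, mu, lam=None, max_iter=1000, tol=1e-7):
    """APG sketch for Eq. (10) with fixed mu (assumption: no continuation)."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))              # Eq. (11)
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    A_prev, E_prev = A, E
    t_prev, t = 1.0, 1.0
    for _ in range(max_iter):
        beta = (t_prev - 1.0) / t
        Y_A = A + beta * (A - A_prev)               # extrapolated points
        Y_E = E + beta * (E - E_prev)
        step = 0.5 * (Y_A + Y_E - D)                # grad f / L_f with L_f = 2
        G_A, G_E = Y_A - step, Y_E - step
        A_prev, E_prev = A, E
        A = sv_shrink(G_A, mu / 2.0)                # A_{k+1} = D_{mu/2}(G_k^A)
        E = soft_shrink(G_E, lam * mu / 2.0)        # E_{k+1} = S_{lam*mu/2}(G_k^E)
        t_prev, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Hypothetical stopping rule: relative change of the iterates.
        change = np.sqrt(np.linalg.norm(A - A_prev) ** 2 + np.linalg.norm(E - E_prev) ** 2)
        if change <= tol * max(1.0, np.linalg.norm(D)):
            break
    return A, E

# Synthetic data: rank-1 "background" with identical columns plus 5% sparse "objects".
rng = np.random.default_rng(0)
m, n = 200, 30
A_true = rng.uniform(0.2, 0.8, size=(m, 1)) @ np.ones((1, n))
E_true = (rng.random((m, n)) < 0.05).astype(float)
D = A_true + E_true

mu = 0.01                                           # hypothetical penalty value
lam = 1.0 / np.sqrt(max(m, n))
A_hat, E_hat = apg_decompose(D, mu)

print("relative residual:", np.linalg.norm(D - A_hat - E_hat) / np.linalg.norm(D))
print("relative error of A:", np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true))
print("objective (10):", objective(A_hat, E_hat, D, mu, lam))
```

For real video, each column of the returned matrices would then be reshaped to the frame size to obtain the background and moving-object frames. A smaller $ \mu $ enforces $ D = A + E $ more tightly but typically slows convergence when $ \mu $ is held fixed.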

3 Experimental Results and Analysis

In this section, we evaluate the performance of the proposed method on three infrared video sequences, “irw1”, “irw2” and “plane”. Our algorithm is compared with the inexact augmented Lagrange multipliers (ALM) algorithm [11], which is known for its high efficiency in solving such minimization problems. For a fair comparison, in each algorithm the error tolerance $ \varepsilon $ is set to $ 1.0 \times 10^{ - 7} $ and the maximum number of iterations $ K $ is set to 1000. In the experiments, 30 frames of each infrared video sequence were input to the two algorithms. The frame sizes of the infrared videos “irw1”, “irw2” and “plane” are 240 × 320, 240 × 320 and 200 × 256, respectively. All experiments were performed on a desktop computer (CPU 2.30 GHz, RAM 3.25 GB) with MATLAB R2012b. Figures 1, 3 and 5 show the objects and background extracted from the three infrared videos by Algorithm 2. Figures 2, 4 and 6 show the objects and background extracted by the ALM algorithm. The performance of the two algorithms in terms of runtime, number of iterations and rank of the extracted low-rank matrix A is listed in Table 1.

Table 1. Comparison of the results of extracting objects from different infrared video sequences by the two algorithms.

From the above figures, it can be seen that the moving objects, whether large or small, fast or slow, are completely extracted by Algorithm 2. In Fig. 2, one foot of the man is not assigned to the moving-object component by the ALM algorithm. From Figs. 3 and 4, we find that, for the small object (the plane), the plane extracted by Algorithm 2 has clearer edges than that extracted by the ALM algorithm. From Figs. 5 and 6, it can be seen that, for the slowly moving man, part of the man's contour is not assigned to the object component by the ALM algorithm. As can be seen from Table 1, compared to the ALM algorithm, Algorithm 2 has the following distinct advantages: a lower rank of the recovered background, a shorter running time and fewer iterations to reach convergence. These advantages are important for the rapid analysis and processing of large amounts of infrared video data.

In order to verify the validity of the proposed algorithm for optical videos, Figs. 7 and 8 show the object and background extracted from an optical video “highway” by Algorithm 2 and the ALM algorithm, respectively.

From Figs. 7 and 8, it can be seen that both algorithms are still able to extract the moving objects in an optical video. The backgrounds extracted by the two algorithms show no obvious visual difference, but more car tracks, which belong to the background, appear in Fig. 8(b) than in Fig. 7(c).

4 Conclusions

In this paper, we presented a scheme to extract moving objects from infrared video sequences. We convert the problem of extracting the moving object from videos to a sparse and low-rank matrix decomposition problem. The resulting L₁ norm related minimization problem can be efficiently solved by many recently developed numerical methods. The effectiveness of the proposed algorithm is also validated on other types of video (e.g., optical videos). The experiments show that, compared to the ALM algorithm, our algorithm has distinct advantages in extracting moving objects from infrared and optical videos.

References

1. Candes, E.J., Li, X.D., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM 58(3), 1–37 (2011)
2. Zhou, Z., Li, X., Wright, J., Candes, E.J., Ma, Y.: Stable principal component pursuit. In: IEEE International Symposium on Information Theory (ISIT) (2010)
3. Cai, J.F., Candes, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
4. Chandrasekaran, V., Sanghavi, S., Parrilo, P.A., Willsky, A.S.: Sparse and low-rank matrix decompositions. In: IEEE 47th Annual Allerton Conference on Communication, Control, and Computing, pp. 962–967 (2009)
5. Wright, J., Ganesh, A., Rao, S., Peng, Y., Ma, Y.: Robust PCA: exact recovery of corrupted low-rank matrices via convex optimization. In: NIPS 2009, Whistler, BC, Canada (2009)
6. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
7. Shen, Z.W., Toh, K.C., Yun, S.: An accelerated proximal gradient algorithm for frame based image restoration via the balanced approach. SIAM J. Imaging Sci. 4(2), 573–596 (2011)
8. Toh, K.C., Yun, S.: An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Pac. J. Optim. 6, 615–640 (2010)
9. Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., Ma, Y.: Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. UIUC Technical report UILU-ENG-09-2214 (2009)
10. Yuan, X.M., Yang, J.F.: Sparse and low-rank matrix decomposition via alternating direction methods. Pac. J. Optim. 9(1), 167–180 (2013)
11. Lin, Z.C., Chen, M.M., Wu, L.Q., Ma, Y.: The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical report UILU-ENG-09-2215 (2010)
12. Tao, M., Yuan, X.M.: Recovering low-rank and sparse components of matrices from incomplete and noisy observations. SIAM J. Optim. 21(1), 57–81 (2011)

Author information

Authors and Affiliations

Xi’an Research Institute of Hi-Tech, Hongqin Town, Xi’an, Shaanxi Province, China
Jinli Zhang, Min Li & Yujie He
Department of Information Engineering, Engineering University of CAPF, Sanqiao Town, Xi’an, Shaanxi Province, China
Jinli Zhang

Corresponding author

Correspondence to Jinli Zhang .

Editor information

Editors and Affiliations

Department of Electronic Engineering, Tsinghua University, Beijing, China
Yu-Jin Zhang

About this paper

Cite this paper

Zhang, J., Li, M., He, Y. (2015). Moving Object Extraction in Infrared Video Sequences. In: Zhang, YJ. (eds) Image and Graphics. ICIG 2015. Lecture Notes in Computer Science(), vol 9218. Springer, Cham. https://doi.org/10.1007/978-3-319-21963-9_45

DOI: https://doi.org/10.1007/978-3-319-21963-9_45
Published: 04 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21962-2
Online ISBN: 978-3-319-21963-9
eBook Packages: Computer Science, Computer Science (R0)
