
1 Introduction

Face recognition (FR) has received extensive research attention during the last thirty years, and numerous FR methods have been developed [7, 8, 13, 15, 17, 24]. Classical FR algorithms, including principal component analysis (PCA) [19], linear discriminant analysis (LDA) [3] and Laplacianfaces [10], employ subspace learning to represent the intrinsic characteristics of faces. At the same time, many image features such as the scale-invariant feature transform (SIFT) [16], local binary patterns (LBP) [1], speeded-up robust features (SURF) [2] and the histogram of oriented gradients (HOG) [21] have been introduced into FR algorithms, with the final recognition result obtained directly from these feature representations. However, these feature descriptors are hand-crafted and typically require considerable prior knowledge, which limits further improvement of recognition performance.

Regression analysis based methods have also aroused broad interest in the face recognition community. For example, Naseem et al. proposed linear regression classification (LRC) [15], which reconstructs a query image as a linear combination of dictionary faces. Wright et al. proposed the sparse representation based classification algorithm (SRC) [22] for robust face recognition using a sparsity constraint. By representing a face image as a sparse linear combination of the dictionary faces, SRC assumes that the query image is mainly reconstructed by the training samples of the same class. However, when the number of training samples is limited, the sparsity constraint across classes may lead to misleading solutions. Zhang et al. [25] analyzed the principle of SRC and argued that collaborative representation is more important than the sparsity constraint. Based on ridge regression, they introduced the collaborative representation based classifier (CRC), which achieves better FR accuracy with lower complexity than SRC. Since then, many improved versions of CRC have been proposed to further boost FR performance. For example, Wang et al. [20] proposed relaxed collaborative representation (RCR), which considers locality constraints. Huang et al. [11] introduced the group sparse classifier (GSC), which incorporates class labels to boost FR performance. IRGSC [26] further extended the group sparse classifier with adaptive weight learning and achieved good performance in robust face recognition.

Recently, Yang et al. [23] proposed the nuclear norm based matrix regression (NMR) classification framework for occluded face recognition and achieved good recognition performance. However, NMR relies heavily on the completeness of the database: when the number of training samples is limited, NMR suffers from misleading coding coefficients from incorrect classes. More recently, the superposed linear representation based classification (SLRC) [9] model was proposed to further improve the robustness of CRC. SLRC decomposes the training samples of CRC into prototype and variation parts and encodes the test sample as a superposition of the prototype and variation dictionaries. SLRC simply assumes that the test image can be reconstructed from the class centroid of the corresponding class and the shared intra-class differences. However, when there are unknown illumination variations or occlusions in the test image, the SLRC model does not work effectively since it cannot reconstruct the image properly.

In order to address the limitations of NMR and SLRC, we propose a novel model called the nuclear norm based superposed collaborative representation classifier (NNSCRC). In our model, a query image is decomposed into a class centroid, a shared sample-to-centroid difference and a low-rank error image. The main contributions of this paper are outlined as follows:

  • We propose a new framework named the nuclear norm based superposed collaborative representation classifier for robust face recognition, in which a test face image is reconstructed as a superposition of a class centroid, an intra-class difference and a low-rank error. The new model addresses the misleading coding coefficients of incorrect classes when the dataset is undersampled, since it decomposes the image into a class centroid and a sample-to-centroid difference. The alternating direction method of multipliers (ADMM) is used to obtain the optimal solution of the proposed model.

  • By introducing a nuclear norm constraint, the low-rank part of the image, typically caused by occlusion or illumination variations, is separated from the dictionary reconstruction. Thus, the NNSCRC model is robust to occlusion and illumination variations.

  • The NNSCRC model is robust to the single sample per person (SSPP) face recognition problem. Specifically, when only one training image per class is available, we can borrow the intra-class variations from subjects outside the gallery, since these variations are usually similar across different subjects. The variations between the query image and the gallery images can then be represented properly by these borrowed intra-class variations, which improves the performance of SSPP face recognition.

  • Experimental results on the Extended Yale-B and AR databases show that the proposed NNSCRC model achieves better performance than state-of-the-art regression based methods under illumination variations, occlusion and undersampled face recognition.

The remainder of this paper is organized as follows: Sect. 2 reviews the related works. Section 3 introduces the proposed nuclear norm based superposed collaborative representation classifier (NNSCRC). In Sect. 4, we conduct experiments on two popular face databases and compare our model with the state-of-the-art regression based methods. Finally, Sect. 5 concludes this paper.

2 Related Works

In this section, we briefly review regression based methods and introduce the SLRC method, which is closely related to our model, in detail.

Regression based methods have long been a research hotspot in the face recognition community. Starting from SRC, which represents a query image as a sparse reconstruction over dictionary images, many regression based approaches such as CRC have been proposed in succession and have achieved good performance on face recognition tasks. Collaborative representation based methods hold that the \(l_2\)-norm constraint is more important than the \(l_1\)-norm constraint in the classifier. They use the training samples to reconstruct the test sample and expect that the training samples of the same class become the major components in the reconstruction. Although these regression based methods perform well on general face recognition, their generalization to illumination variations, occlusion and undersampled face recognition problems is still weak.
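To make the collaborative representation idea concrete, below is a minimal NumPy sketch of a CRC-style classifier (hypothetical names and an illustrative regularization value, not the exact procedure of [25]): the query is coded over the whole dictionary with a ridge penalty, and the label is assigned by the smallest class-wise reconstruction residual.

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.01):
    """Minimal CRC-style classifier: ridge coding + class-wise residuals.

    D      : (d, n) array whose columns are vectorized training faces
    labels : (n,)   array of class labels, one per column of D
    y      : (d,)   vectorized query face
    lam    : illustrative ridge penalty
    """
    n = D.shape[1]
    # Collaborative (l2-regularized) coding of the query over the whole dictionary
    x_hat = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    best_class, best_res = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        res = np.linalg.norm(y - D[:, mask] @ x_hat[mask])  # residual using class-c atoms only
        if res < best_res:
            best_class, best_res = c, res
    return best_class
```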

Recently, superposed linear representation based classification (SLRC) [9] was proposed to decompose the collaborative dictionary in a manner similar to the decomposed representation in LDA. Specifically, given a sample x from one of the classes in the training set, SLRC assumes that it can be naturally reconstructed in two parts:

$$\begin{aligned} x = c_{(x)} + (x - c_{(x)}) \end{aligned}$$
(1)

where \(c_{(x)}\) is the centroid of the corresponding class, and \(x - c_{(x)}\) is the intra-class difference from the sample to its class centroid. SLRC achieves promising performance when the test images have attributes similar to the training images. However, when there are unknown variations in the test image, such as illumination changes or occlusion, the SLRC model does not work properly since it cannot reconstruct these variations.
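As a concrete illustration of Eq. (1), the sketch below (hypothetical names) builds the class-centroid prototype dictionary and the sample-to-centroid variation dictionary from a labelled training set, following the description above rather than the original code of [9]:

```python
import numpy as np

def build_dictionaries(X, labels):
    """Build the prototype (class centroids) and variation (sample - centroid) dictionaries.

    X      : (d, n) array whose columns are vectorized training faces
    labels : (n,)   integer class label of each column
    """
    classes = np.unique(labels)
    # Prototype dictionary: one centroid column per class
    P = np.column_stack([X[:, labels == c].mean(axis=1) for c in classes])
    # Variation dictionary: one sample-to-centroid difference per training image
    centroid_of = {c: P[:, j] for j, c in enumerate(classes)}
    V = np.column_stack([X[:, i] - centroid_of[labels[i]] for i in range(X.shape[1])])
    return P, V
```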

Considering these limitations, we propose a novel framework that incorporates a nuclear norm constraint into superposed linear representation based classification, which not only makes use of the general variation information of the training samples, but also improves robustness to unknown illumination changes and occlusions. The proposed model is introduced in detail in the next section.

3 Nuclear Norm Based Superposed Collaborative Representation Classifier (NNSCRC)

Although CRC methods have achieved great success in face recognition, they still suffer from undersampled and occluded data. Firstly, when the training images are insufficient or unrepresentative, the test sample has to be reconstructed by samples from other classes, which usually generates misleading coding coefficients. Secondly, when there are illumination changes or occlusions in the test images, the reconstruction error is dominated by this noise, which also leads to erroneous results. To overcome these difficulties, we propose a novel robust face recognition framework called the nuclear norm based superposed collaborative representation classifier (NNSCRC). In this section, we introduce the NNSCRC model in detail and provide its optimization algorithm.

Fig. 1. In the proposed NNSCRC model, a test image is reconstructed as a linear superposition of the class centroid, the shared intra-class differences, and the low-rank error. (a) The original test image. (b) The class centroid image. (c) The shared intra-class differences image (shown in absolute value). (d) The low-rank error image (shown in absolute value).

3.1 NNSCRC Model

Inspired by NMR [23] and SLRC [9], we represent a test image as a superposition of three parts, i.e., the class centroids, the shared intra-class differences, and the low-rank error, as shown in Fig. 1. Specifically, given a test image Y, we assume that it can be reconstructed from these three parts, which can be formulated as:

$$\begin{aligned} \varvec{Y} = \mathcal {P}(\varvec{\alpha }) + \mathcal {V}(\varvec{\beta }) + \varvec{B}. \end{aligned}$$
(2)

where \( \mathcal {P}(\varvec{\alpha }) = \alpha _1\varvec{P}_1 + \alpha _2\varvec{P}_2 + \ldots + \alpha _k\varvec{P}_k\) and \( \mathcal {V}(\varvec{\beta }) = \beta _1\varvec{V}_1 + \beta _2\varvec{V}_2 + \ldots + \beta _n\varvec{V}_n\). Here \(\varvec{P}_i\) is the centroid of class i, \(\varvec{V}_j\) is the j-th intra-class variation image (a training sample minus its class centroid), and \(\alpha _i, \beta _j\) are the corresponding reconstruction coefficients. \(\varvec{B}\) is the low-rank error image. To obtain the optimal reconstruction coefficients \(\hat{\varvec{\alpha }}\) and \(\hat{\varvec{\beta }}\), we naturally construct the objective function as:

$$\begin{aligned} \begin{bmatrix} \hat{\varvec{\alpha }} \\ \hat{\varvec{\beta }} \end{bmatrix} = \arg \min \Vert \varvec{y} - [\varvec{P}, \varvec{V}] \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} - \varvec{b}\Vert _2^2 + \lambda _1\Vert \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix}\Vert _2^2 + \lambda _2 \Vert \varvec{B}\Vert _* , \end{aligned}$$
(3)

where \(\varvec{P} \in \mathbb {R}^{d \times k}\) is the prototype dictionary and \(\varvec{V} \in \mathbb {R}^{d \times n}\) is the variation dictionary, d is the dimension of a vectorized face image, k is the number of classes and n is the number of training images. \(\Vert \varvec{B}\Vert _*\) denotes the nuclear norm of the low-rank error \(\varvec{B}\), and \(\varvec{b}\) is the vectorization of the matrix \(\varvec{B}\). \(\varvec{\alpha }, \varvec{\beta }\) are the coefficient vectors to be determined, and \(\lambda _1, \lambda _2\) are the penalty parameters. The prototype dictionary \(\varvec{P}\) consists of the centroids of all classes, and the variation dictionary \(\varvec{V}\) consists of the intra-class differences from each training sample to its class centroid; both dictionaries are constructed as in [9]. For most collaborative representation based methods, undersampled training images usually lead to misleading coding coefficients: when the training images are insufficient, the difference between the test image and the corresponding class prototype has to be made up by images from other classes, so the major components of the reconstruction may come from the wrong class. By integrating the superposed linear representation classifier with a nuclear norm constraint, our model can address the problem of misleading coefficients and enhance robustness to illumination changes and occlusion. The reasons are as follows:

Firstly, we introduce a superposed linear representation into our model, which constructs a prototype dictionary \(\varvec{P}\) and a variation dictionary \(\varvec{V}\). When the dataset is undersampled, the shared variation dictionary \(\varvec{V}\) makes up the difference between the test image and the corresponding class prototype. The major components of the reconstructed test image are then the class centroid of the corresponding class, the intra-class variations from all classes, and the low-rank error, which enables our model to handle the misleading coefficients problem.

Secondly, since occlusion and illumination changes generally lead to a low-rank error image, we use a nuclear norm constrained matrix to characterize this structured noise (see Fig. 1(d)). When there are unknown occlusions or illumination changes in the test image, the nuclear norm constrained error term represents this kind of noise properly, which allows the NNSCRC model to work effectively.
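Before deriving the algorithm, note that the operators \(\mathcal {P}(\varvec{\alpha })\) and \(\mathcal {V}(\varvec{\beta })\) of Eq. (2) reduce to ordinary matrix-vector products once every image is vectorized, which is how Eq. (3) and the updates in the next subsection are written. A minimal sketch of this correspondence (toy sizes, hypothetical names):

```python
import numpy as np

rng = np.random.default_rng(0)
m1, m2, k = 32, 28, 5                      # toy image size and number of classes
P_imgs = rng.standard_normal((k, m1, m2))  # centroid images P_1 ... P_k
alpha = rng.standard_normal(k)

# Matrix form of Eq. (2): P(alpha) = alpha_1 * P_1 + ... + alpha_k * P_k
P_alpha_img = np.tensordot(alpha, P_imgs, axes=1)

# Vectorized form of Eq. (3): stack vec(P_i) as the columns of P, then P(alpha) = P @ alpha
P = P_imgs.reshape(k, m1 * m2).T           # (d, k) with d = m1 * m2
assert np.allclose(P @ alpha, P_alpha_img.reshape(-1))
```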

3.2 Algorithm of NNSCRC

We provide the solution of NNSCRC in this section. Since the nuclear norm term in Eq. (3) is non-smooth and coupled with the least-squares term, we do not solve it directly with traditional methods such as augmented Lagrange multipliers (ALM). Noticing that the problem satisfies the conditions of the alternating direction method of multipliers (ADMM) [4], we use ADMM to solve it. Specifically, we first introduce a matrix variable \(\varvec{C}\) and rewrite Eq. (3), which gives the objective function:

$$\begin{aligned} \begin{aligned} J(\varvec{\alpha }, \varvec{\beta }, \varvec{B}, \varvec{C})&= \min \limits _{\varvec{\alpha }, \varvec{\beta }, \varvec{B}, \varvec{C}} \Vert \varvec{y} - [\varvec{P}, \varvec{V}] \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} - \varvec{b}\Vert _2^2 + \lambda _1\Vert \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix}\Vert _2^2 \\&\ \ \ \ + \lambda _2 \Vert \varvec{C}\Vert _* ,\ \ \ \ \ \ \ \ \ s.t.\ \varvec{C} - \varvec{B} = \varvec{0}. \end{aligned} \end{aligned}$$
(4)

Denote

$$\begin{aligned} f(\varvec{\alpha }, \varvec{\beta }, \varvec{B}) = \Vert \varvec{y} - [\varvec{P}, \varvec{V}] \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} - \varvec{b}\Vert _2^2 + \lambda _1\Vert \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix}\Vert _2^2. \end{aligned}$$
(5)

Then the augmented Lagrangian of \(J(\varvec{\alpha }, \varvec{\beta }, \varvec{B}, \varvec{C})\) is

$$\begin{aligned} \begin{aligned} L_\rho (\varvec{\alpha }, \varvec{\beta }, \varvec{B}, \varvec{C})&= f(\varvec{\alpha }, \varvec{\beta }, \varvec{B}) + \lambda _2 \Vert \varvec{C}\Vert _* + tr(\varvec{Z}^T (\varvec{C}-\varvec{B})) + \frac{\rho }{2} \Vert \varvec{C}-\varvec{B}\Vert _F^2\\&=f(\varvec{\alpha }, \varvec{\beta }, \varvec{B}) + \lambda _2 \Vert \varvec{C}\Vert _* + \frac{\rho }{2} \Vert \varvec{C}-\varvec{B} + \frac{1}{\rho }\varvec{Z}\Vert _F^2 - \frac{1}{2\rho } \Vert \varvec{Z}\Vert _F^2. \end{aligned} \end{aligned}$$
(6)

where \(\rho > 0\) is the penalty parameter and \(\varvec{Z}\) is the dual variable (Lagrange multiplier). The optimal solution is obtained by iterating the following three steps.

Fix \(\varvec{Z}, \varvec{\alpha }, \varvec{\beta }, \varvec{B}\) to Solve \(\varvec{C}\). At the k-th iteration, when \(\varvec{Z}, \varvec{\alpha }, \varvec{\beta }, \varvec{B}\) are fixed, Eq. (6) can be rewritten as

$$\begin{aligned} J_1(\varvec{C}) = \arg \min \limits _{\varvec{C}} \lambda _2\Vert \varvec{C}\Vert _* + \frac{\rho }{2}\Vert \varvec{C}-\varvec{B}_k + \frac{1}{\rho }\varvec{Z}_k\Vert _F^2. \end{aligned}$$
(7)

Let \(\varvec{Q} = \varvec{B}_k - \frac{1}{\rho }\varvec{Z}_k \ \in \mathbb {R}^{m_1 \times m_2},\) where \(rank(\varvec{Q}) = r\). We apply singular value decomposition to \(\varvec{Q}\) as:

$$\begin{aligned} \varvec{Q} = \varvec{U}_{m_1 \times r} \varvec{\varSigma } \varvec{V}_{m_2 \times r}^T, \end{aligned}$$
(8)

where \(\varvec{\varSigma } = diag(\sigma _1, \sigma _2, ..., \sigma _r)\) and \(\sigma _1,\ \sigma _2,\ ...,\ \sigma _r\) are the positive singular values. \(\varvec{U}_{m_1 \times r}\) and \(\varvec{V}_{m_2 \times r}\) are the corresponding matrices with orthonormal columns. According to [5], the iterative solution \(\varvec{C}_{k+1}\) can be expressed as

$$\begin{aligned} \varvec{C}_{k+1} = \varvec{U}_{m_1 \times r}\, diag\big (\{\max (0,\sigma _j-\frac{\lambda _2}{\rho })\}_{1 \le j \le r}\big )\, \varvec{V}_{m_2 \times r}^T. \end{aligned}$$
(9)
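The C-update in Eq. (9) is the standard singular value thresholding (shrinkage) operator applied to \(\varvec{Q} = \varvec{B}_k - \frac{1}{\rho }\varvec{Z}_k\); a minimal NumPy sketch (hypothetical names):

```python
import numpy as np

def svt(Q, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_* evaluated at Q."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Inside the ADMM iteration (Eq. (9)): C = svt(B - Z / rho, lam2 / rho)
```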

Fix \(\varvec{Z}, \varvec{C}\) to Solve \(\varvec{\alpha }, \varvec{\beta }\) and \(\varvec{B}\). At the k-th iteration, when \(\varvec{Z}\) and \(\varvec{C}\) are fixed, Eq. (6) can be rewritten as

$$\begin{aligned} \begin{aligned} J_2(\varvec{\alpha }, \varvec{\beta }, \varvec{B})&= \min \limits _{\varvec{\alpha }, \varvec{\beta }, \varvec{B}}f(\varvec{\alpha }, \varvec{\beta }, \varvec{B}) + \frac{\rho }{2} \Vert \varvec{C}_{k+1}-\varvec{B} + \frac{1}{\rho }\varvec{Z}_k\Vert _F^2\\&=\min \limits _{\varvec{\alpha }, \varvec{\beta }, \varvec{B}} \Vert \varvec{y} - [\varvec{P}, \varvec{V}] \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} - \varvec{b}\Vert _2^2 + \lambda _1\Vert \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix}\Vert _2^2\\&\ \ \ \ + \frac{\rho }{2} \Vert \varvec{C}_{k+1}-\varvec{B} + \frac{1}{\rho }\varvec{Z}_k\Vert _F^2. \end{aligned} \end{aligned}$$
(10)

Define \(\varvec{H}_k = \varvec{C}_{k+1} + \frac{1}{\rho }\varvec{Z}_k \in \mathbb {R}^{m_1 \times m_2}\) and \(\varvec{h}_k = Vec\{\varvec{H}_k\} \in \mathbb {R}^{m_1 m_2\times 1}\). The optimal solution can be obtained by setting the derivatives of \(J_2(\varvec{\alpha }, \varvec{\beta }, \varvec{b})\) with respect to \(\varvec{\alpha }\), \(\varvec{\beta }\) and \(\varvec{b}\) to zero, respectively. Therefore, the solutions of \(\varvec{\alpha }\), \(\varvec{\beta }\) and \(\varvec{B}\) at the k-th iteration are

$$\begin{aligned} \varvec{\alpha }_{k+1} = (\varvec{P}^T\varvec{P} + 2\lambda _1 \varvec{I})^{-1}\varvec{P}^T(\varvec{y} - \varvec{b}_{k+1} - \varvec{V\beta }_k), \end{aligned}$$
(11)
$$\begin{aligned} \varvec{\beta }_{k+1} = (\varvec{V}^T\varvec{V} + 2\lambda _1 \varvec{I})^{-1}\varvec{V}^T(\varvec{y} - \varvec{b}_{k+1} - \varvec{P\alpha }_{k+1}), \end{aligned}$$
(12)
$$\begin{aligned} \varvec{b}_{k+1} = \frac{1}{2 + \rho }(2\varvec{y} - 2\varvec{P\alpha } - 2\varvec{V\beta } + \rho \varvec{h}_k ). \end{aligned}$$
(13)

Fix \(\varvec{\alpha }, \varvec{\beta }, \varvec{C}\) and \(\varvec{B}\) to Solve \(\varvec{Z}\). According to [4], the dual variable \(\varvec{Z}\) at iteration k is updated by

$$\begin{aligned} \varvec{Z}_{k+1} = \varvec{Z}_k + \rho (\varvec{C}_{k+1} - \varvec{B}_{k+1}). \end{aligned}$$
(14)

With the iterative updates above, we obtain the optimal solution of \(J(\varvec{\alpha }, \varvec{\beta }, \varvec{B}, \varvec{C})\) by alternating these steps until convergence. Finally, the optimal reconstruction coefficients are:

$$\begin{aligned} \varvec{\hat{\alpha }} = \varvec{\alpha }_{k+1},\ \ \hat{\varvec{\beta }} = \varvec{\beta }_{k+1}. \end{aligned}$$
(15)
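Putting the three steps together, the following is a minimal sketch of the full ADMM iteration of Eqs. (9) and (11)-(14) (hypothetical names; a fixed iteration count is used instead of a convergence test, and this is only an illustrative sketch of the updates derived above, not the authors' implementation):

```python
import numpy as np

def svt(Q, tau):
    """Singular value thresholding (Eq. (9))."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nnscrc_solve(P, V, Y, lam1=0.1, lam2=0.1, rho=1.0, n_iter=50):
    """ADMM sketch for Eq. (4).

    P : (d, k) prototype dictionary, V : (d, n) variation dictionary,
    Y : (m1, m2) test image with m1 * m2 == d.
    Returns the reconstruction coefficients alpha, beta and the error image B.
    """
    m1, m2 = Y.shape
    y = Y.reshape(-1)
    k, n = P.shape[1], V.shape[1]
    alpha, beta = np.zeros(k), np.zeros(n)
    B = np.zeros((m1, m2))
    Z = np.zeros((m1, m2))
    Gp = P.T @ P + 2 * lam1 * np.eye(k)     # ridge systems, fixed across iterations
    Gv = V.T @ V + 2 * lam1 * np.eye(n)
    for _ in range(n_iter):
        C = svt(B - Z / rho, lam2 / rho)                                   # Eq. (9)
        h = (C + Z / rho).reshape(-1)                                      # vec(H_k)
        b = (2 * y - 2 * P @ alpha - 2 * V @ beta + rho * h) / (2 + rho)   # Eq. (13)
        alpha = np.linalg.solve(Gp, P.T @ (y - b - V @ beta))              # Eq. (11)
        beta = np.linalg.solve(Gv, V.T @ (y - b - P @ alpha))              # Eq. (12)
        B = b.reshape(m1, m2)
        Z = Z + rho * (C - B)                                              # Eq. (14)
    return alpha, beta, B
```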

3.3 Classification Strategy of NNSCRC

Given a test image Y, we need to decide which class it belongs to. Using the NNSCRC algorithm, we obtain the reconstruction coefficients \(\varvec{\hat{\alpha }}\) and \(\varvec{\hat{\beta }}\), and we use the class-wise reconstruction residual as the classification criterion. Specifically, the residual of the test image Y with respect to class i is

$$\begin{aligned} r_i(\varvec{Y}) = \Vert \varvec{Y} - [\varvec{P}, \varvec{V}]\begin{bmatrix} \delta _i(\varvec{\hat{\alpha }}) \\ \varvec{\hat{\beta }} \end{bmatrix} - \varvec{B}\Vert _2,\ i = 1, ..., k. \end{aligned}$$
(16)

where \(\delta _i(\varvec{\hat{\alpha }}) \in \mathbb {R}^k\) is a vector whose only nonzero entries are the entries of \(\varvec{\hat{\alpha }}\) associated with class i. Note that when we calculate the residual, we use the intra-class variation matrix of all classes to reconstruct the test image Y, because these intra-class variations are often shareable across different subjects. This is also one of the reasons why our model is suitable for the SSPP task. From Eq. (16), we can see that the shared variations and the error image are separated from the original query image, which removes the influence of illumination changes and occlusions. Based on the reconstruction residual, we decide the class label by

$$\begin{aligned} class(\varvec{Y}) = \arg \min _i r_i(\varvec{Y}). \end{aligned}$$
(17)
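A sketch of the classification rule in Eqs. (16)-(17), assuming \(\varvec{\hat{\alpha }}\) has one entry per class (matching the columns of \(\varvec{P}\)) and that y and b are the vectorized test image and error (hypothetical names):

```python
import numpy as np

def nnscrc_classify(P, V, y, alpha_hat, beta_hat, b):
    """Assign the class with the smallest reconstruction residual (Eqs. (16)-(17))."""
    k = P.shape[1]
    shared = V @ beta_hat + b              # shared variation part plus low-rank error
    residuals = np.empty(k)
    for i in range(k):
        delta_i = np.zeros_like(alpha_hat)
        delta_i[i] = alpha_hat[i]          # delta_i(alpha_hat): keep only the class-i coefficient
        residuals[i] = np.linalg.norm(y - P @ delta_i - shared)
    return int(np.argmin(residuals))
```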

4 Experiments

In this section, we perform extensive experiments on two publicly available face datasets to demonstrate the effectiveness of NNSCRC. Section 4.1 gives the experimental settings. In Sect. 4.2, we evaluate NNSCRC for FR with different training set sizes under controlled conditions. Section 4.3 verifies the robustness of NNSCRC to illumination changes and occlusion. Section 4.4 compares our method with existing methods under real face disguise. Finally, Sect. 4.5 reports face recognition experiments with a single sample per person.

4.1 Experimental Settings

We use the Aleix Martinez and Robert Benavente (AR) dataset [14] and the Extended Yale B (ExYaleB) dataset [12] to test the effectiveness and robustness of the proposed model. The AR dataset contains over 4000 images of 126 individuals (70 men and 56 women). The faces in the AR dataset contain variations such as lighting conditions, expressions and occlusions. Some example face images from the AR database are shown in Fig. 2. For this dataset, we randomly select 100 subjects (50 men and 50 women) for our experiments. The Extended Yale B face dataset contains 38 subjects under 9 poses and 64 illumination conditions. The 64 samples of each subject acquired in a single pose are all frontal-view facial images. Figure 3 shows some facial images from the ExYaleB database. All face images marked with P00 are used in our experiments.

Fig. 2. Facial image samples in the AR database

Fig. 3. Facial image samples in the Extended Yale B face database

The proposed model is compared with state-of-the-art regression based representation methods, including NMR [23], WGSC [18], RCRC [6], RSRC [22], and IRGSC [26]. For NNSCRC, the penalty parameter \(\rho \) is set to 1, and the parameters \(\lambda _1\) and \(\lambda _2\) are both searched over \(\{0.01,\ 0.05,\ 0.1,\ 0.5,\ 1,\ 5,\ 10\}\) to obtain the best result. For all comparative methods, the related parameters are set to the values suggested by their authors.

4.2 Face Recognition with Different Sample Sizes

We first validate the performance of NNSCRC without occlusion on the ExYaleB database. To explore the effect of sample size on the results, we randomly split the dataset into two parts: one part is used as the dictionary, containing n (= 10, 20, 30, 40, 50) images per person, and the other part is used for testing. The results are shown in Fig. 4, which compares our method with the state-of-the-art method IRGSC. The two most classical regression based face recognition methods, CRC and SRC, are also included for comparison.

Fig. 4. Face recognition with different sample sizes on the ExYaleB database

From Fig. 4, we can see that the performance of all methods improves as the sample size increases. Although the test faces suffer from illumination variations, for all sample sizes our NNSCRC model outperforms SRC and CRC by over five percentage points, which shows that our model is more robust to illumination variations than the original collaborative representation methods. IRGSC achieves higher accuracy than SRC and CRC because it uses the reconstruction residuals to obtain feature weights, which reduces the influence of pixel errors. However, there are still variations between the training and test images that influence reconstruction and classification, and these variations cannot easily be removed by the adaptive weights in IRGSC. In comparison, our model achieves higher accuracy than IRGSC for all sample sizes. The main reason is that our model can reconstruct these variations using the variation dictionary constructed from all classes; the nuclear norm constraint further handles the illumination variations, which makes NNSCRC perform better than IRGSC.

4.3 Face Recognition with Occlusion

To validate the robustness of the proposed NNSCRC model to occlusion, we conduct two types of experiments on the ExYaleB dataset: a random block occlusion experiment and a random face occlusion experiment.

Random Block Occlusion. We select 20 samples per subject from the ExYaleB dataset for training and 20 for testing. Similar to the setting in IRGSC, for each test image we randomly select a location and replace 10–60% of the pixels with a black block. Figure 5 shows examples with different percentages of occlusion. The recognition rates of the different methods are shown in Table 1. For all levels of block occlusion, our method achieves the best performance among the state-of-the-art regression based methods. Note that for \(60\%\) occlusion, our method still achieves an \(80.3\%\) recognition rate, which is \(7.7\%\) higher than IRGSC. NMR performs worse than IRGSC because it simply ignores the general variations, which also affect the reconstruction error. By considering both the general variations and the low-rank error, the proposed model achieves better performance than the other methods.

Fig. 5. Samples with different percentages of pixel corruption (0%–60%)

Table 1. Recognition accuracy of different methods versus different percentage of block occlusion

Random Face Occlusion. In this experiment, we replace 10–50% of the pixels of each test image with other face images. As shown in Fig. 6, both the location of the occlusion and the occluding face images are randomly selected. Table 2 lists the recognition accuracy of the different methods. As can be seen, our method still achieves better performance than the other methods. The recognition rate of our model is slightly lower than in the random block occlusion experiment because face occlusion is not strictly low rank. Still, our model outperforms IRGSC by about \(2\%\) under large-percentage face occlusion, which indicates the effectiveness of NNSCRC in addressing occlusions.

Fig. 6. Samples with different percentages of face occlusion (0%–50%)

Table 2. Recognition accuracy of different methods versus different percentage of face occlusion

4.4 Face Recognition with Real Disguise

To evaluate the robustness of our model to realistic disguise, we further conduct experiments on the AR dataset. As shown in Fig. 2, the AR database contains samples with sunglasses or scarves, which reflect real FR conditions in practical applications. This kind of occlusion is irregular and thus poses a large challenge for FR tasks. In our experiment, the face images of the 100 selected subjects are separated into 2 sessions according to the shooting time. For each person, we select 3 images from session 1 without illumination changes or occlusion as training samples. 1200 face images are used for testing, divided into 4 groups: 300 images with illumination changes and sunglasses from session 1, 300 images with illumination changes and scarves from session 1, and the same two groups from session 2.

The results of the competing methods are listed in Table 3. Clearly, the NNSCRC method achieves the best result in all 4 groups compared with WGSC, RCRC, RSRC, and NMR. WGSC has the worst performance because it regresses the query images only with the training samples and fails to consider the influence of occlusion. RCRC tries to address occlusion and indeed achieves better performance than WGSC. Note that our model outperforms NMR by around 14%, which indicates that introducing a superposed linear collaborative representation into the NMR model effectively enhances the robustness of face recognition.

Table 3. Recognition rates (%) of different methods on AR database

4.5 Face Recognition with Single Sample per Person

We further conduct experiments on the ExYaleB dataset to evaluate the robustness of our model in single sample per person (SSPP) face recognition. 20 subjects in ExYaleB are used for the SSPP test and the remaining subjects are used to construct the intra-class variations. We use the first image of each of these 20 subjects as the gallery and select 30 images per subject as the probe set. The results are shown in Table 4. As can be seen, the recognition rate of NNSCRC is 9.9% and 3.9% higher than that of NMR and IRGSC, respectively. Although NMR and IRGSC can handle the differences between query and gallery images to some extent, both suffer from misleading coding coefficients of incorrect classes when there is only one sample per subject. Different from these methods, our model borrows the intra-class variations from subjects outside the gallery set, since these variations are usually similar across different subjects, which demonstrates that our model is well suited to the SSPP face recognition task.

Table 4. SSPP FR accuracy of different methods on ExYaleB database
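As a brief illustration of how the SSPP dictionaries are assembled (random toy data, hypothetical names): the prototype dictionary consists of the single gallery images themselves, while the variation dictionary is built from the sample-to-centroid differences of generic subjects outside the gallery.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024                                    # toy feature dimension

# Gallery: one image per enrolled subject, so the prototypes are the gallery images themselves
gallery = rng.standard_normal((d, 20))      # 20 SSPP subjects
P = gallery

# Generic subjects outside the gallery, several images each, provide the shared variations
generic = rng.standard_normal((18, d, 30))  # 18 generic subjects, 30 images each
V = np.column_stack([imgs[:, j] - imgs.mean(axis=1)
                     for imgs in generic for j in range(imgs.shape[1])])
print(P.shape, V.shape)                     # (1024, 20) (1024, 540)
```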

5 Conclusion

In this paper, we presented the NNSCRC model for robust face recognition. In the proposed framework, a superposed collaborative representation is adopted to obtain a robust reconstruction of face images. By representing a face image as a superposition of a class centroid, a shared sample-to-centroid difference and a low-rank error, our method addresses the misleading coding coefficients of incorrect classes when the dataset is undersampled. In particular, when only a single sample per class is available, the proposed model still achieves promising performance by acquiring the intra-class variation basis from generic subjects outside the gallery. Furthermore, our model is robust to occlusion and illumination changes thanks to the nuclear norm constraint. Experiments on the well-known Extended Yale-B and AR databases show the superiority of our model compared with state-of-the-art regression based face recognition methods.