1 Introduction

Recognising faces promptly is something people do automatically in their everyday lives [12]. Despite the apparent lack of effort involved, the amount of processing people actually perform makes this a hard task for a computer. Part of the difficulty is that the problem is multifaceted: one has to deal not only with pose variation, but also with image resolution and light variation.

With the computational power and low cost of current computers, interest in the automatic processing of images and videos has increased, with practical applications in the most varied areas of human knowledge. Existing systems for facial recognition usually operate in one of two ways [12]:

  1. Face verification or authentication, which consists in a one-to-one comparison where some face in a database is compared to the face to which authentication is required; or

  2. Face identification or recognition, which consists in a one-to-many comparison where some image is matched against all images in a database, so as to determine the identity of the consulted face.

In uncontrolled environments, where there is typically a significant amount of pose and light variation, along with constant movement, automatic facial recognition systems still face many challenges that can significantly affect their performance. Such challenges are discussed in existing literature reviews in the field (e.g. [7, 19, 21, 22]). These, however, usually focus on aspects such as pose variation and image resolution, amongst others, either in video or in static pictures, with very little reference, if any, to light variation in video recordings.

As a feature, light variation plays a paramount role in the task of identifying the person or object under examination and, more specifically, the task of face recognition, for changes in the light level may affect other features, such as perceived colour and shape (by adding shadows to the image, for instance). As a result, it comes as no surprise that light variation may drastically affect the performance of facial recognition systems [1].

Hence, to help fill this gap, in this article we present a systematic literature review we carried out on this subject. Our main goal was to build a broad picture of the current state of the art in techniques used to approach the problem of light variation in video, when applied to automatic face recognition, by seeking and comparing work that deals either directly or indirectly with this problem. Our aim is to contribute to future work in this area by saving research time, which can then be directed to the search for solutions to other, more specific issues.

To do so, we start out by describing the procedure we followed in our review, from the initial exploratory searches to the analysis of the final results. We also describe the steps in defining the review protocol and the decisions we had to make along the way. Executed during the month of May 2015, the review was conducted over the IEEE and ACM databases, resulting in a total of 104 articles, of which only 15 remained after the selection/rejection criteria were applied.

As it turned out, the problem of light variation is reported as one of the greatest challenges in the area of automatic facial recognition. Interestingly, although over 60 % of the articles do recognise light variation to be an important feature, not all of them approach the problem in a direct way. Some, in fact, limit themselves to acknowledging its importance, but with no practical proposal for its solution.

Still, many of them try to approach the problem directly. From these, we have set up a list of explored techniques, along with statistics about their popularity amongst the reported studies. Viola-Jones and Principal Component Analysis turned out to be the most popular methods, from a total of seven different approaches we found.

Through this systematic review, we could determine that the problem of light variation is one of great concern for researchers in the area, and that it is still an open question. With this effort, we hope to help other researchers in the field to approach this problem not only by summing up the major techniques their peers are currently using, but also by presenting them in the form of a reproducible systematic review, thereby allowing meaningful comparisons with other existing or forthcoming related work.

The rest of this article is organised as follows. Section 2 describes the steps followed in this systematic review, from the initial exploratory search to the conduct of the review itself. Results are then presented in Sect. 3, whereas in Sect. 4 we discuss the main findings, pointing out some challenges in the field. Finally, Sect. 5 concludes this work, also presenting some limitations of the current research.

2 Materials and Methods

According to Mian et al. [15], a systematic review is a scientific methodology that goes beyond a simple overview of the state of the art on some subject. It is a piece of research in its own right, capable of identifying, selecting and producing data related to some specific topic, which can also be used to identify gaps in the current state of knowledge. Unlike a simple literature review, then, a systematic review consists in a logical sequence of processes that can be reproduced by other researchers. Figure 1 illustrates the steps taken in our review. In what follows, we describe these processes in the order in which we went through them.

Fig. 1. Process flow in the systematic review.

2.1 Exploratory Search and Review Protocol

The first step we took in this research was to identify which databases would be searched for research articles and which terms should build the search string for these databases. The chosen databases were IEEE Xplore and ACM Portal, given their size and focus on the area of Computer Science.

The databases were then searched using some terms related to the field. From this initial search, we could define the review protocol, whose main goal is to detail how the systematic review will be conducted, serving as a guide throughout the review process. As such, the protocol covers, amongst other things, research goals and keywords used in the search, along with criteria for including articles in and excluding them from the research corpus. In our research, the review protocol comprises:

  • Research goal: To determine the state of the art in the area of automatic image processing, more specifically the problem of facial recognition, focusing on uncontrolled environments with light variation.

  • Research question: “What are the main techniques used for facial recognition in uncontrolled environments with light variation?”.

  • Sources: ACM Portal and IEEE Xplore.

  • Keywords: Image processing, face recognition, face detection, face tracking, multiple object recognition, low resolution, surveillance camera, illumination invariance, pose invariant, person recognition.

  • Criteria for inclusion: Articles should be considered relevant if they (a) describe techniques for facial recognition; (b) define image processing and facial recognition concepts; (c) are related to facial recognition in real-life uncontrolled environments with light variation; and (d) deal with facial recognition using low-resolution cameras.

  • Criteria for exclusion: Articles should be excluded from the retrieved set if they (a) are not related to facial recognition; (b) use facial recognition techniques, but in controlled environments where, for example, high-resolution cameras are used or lighting is good; (c) present shallow evaluations, without giving details about the methods and techniques investigated; (d) use 3D models for facial recognition; (e) did not undergo peer review; or (f) do not deal with the problem of light variation.

  • Criteria for primary study quality: As the quality criterion for the investigated primary studies, we defined the assessment of the techniques reported in the article and their relevance to this review’s goal.

  • Selection of primary studies: Primary studies were queried in the databases using search strings built from the keywords defined in this protocol. These strings should be used in the search engines of the databases defined in this protocol, during the execution of the review.

  • Assessment of the primary studies’ quality: We selected only articles that satisfied one or more criteria for inclusion. If an article satisfies criteria for both inclusion and exclusion, it must be read in its entirety to decide on its inclusion or exclusion.

  • Information extraction strategy: Articles should be first analysed by reading their titles, abstracts and, whenever necessary, conclusions.

  • Results summarisation: From the selected articles we extracted the reported techniques, variables and final results. With this information it would be possible to determine the main advantages of the applied techniques, as well as to identify existing gaps in the current state of the art.
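The screening rule stated in the protocol can be written down as a small decision function. This is only an illustrative sketch: the names `Decision` and `triage` are hypothetical, and the branch for articles that satisfy no inclusion criterion assumes they are rejected, which the protocol implies but does not state explicitly.

```python
from enum import Enum


class Decision(Enum):
    INCLUDE = "include"
    EXCLUDE = "exclude"
    READ_FULL_TEXT = "read full text"


def triage(inclusion_hits: set[str], exclusion_hits: set[str]) -> Decision:
    """Apply the protocol's screening rule to one article.

    inclusion_hits / exclusion_hits hold the labels ("a", "b", ...) of the
    criteria the article was judged to satisfy after reading its title,
    abstract and, if needed, conclusions.
    """
    if inclusion_hits and exclusion_hits:
        # Conflicting evidence: the protocol asks for a full read.
        return Decision.READ_FULL_TEXT
    if inclusion_hits:
        # At least one inclusion criterion and no exclusion criterion.
        return Decision.INCLUDE
    # No inclusion criterion satisfied (assumed here to mean rejection).
    return Decision.EXCLUDE
```

For instance, an article on low-resolution face recognition that also relies on 3D models would be flagged for a full read rather than decided from its abstract alone.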

2.2 Review Conduction and Data Extraction

The review itself consisted in searching the databases using the search strings built from the keywords defined in the review protocol. Searches were made in May 2015 and, for each database, we used a different search string, better suited to that database’s search engine. The strings used in each database were:

  • ACM Portal: ((face recognition OR face detection OR face tracking OR person recognition) AND (low resolution OR surveillance camera) AND (illumination invariance OR pose invariant))

  • IEEE Xplore: (“face recognition” OR “face detection”) AND (“low resolution” OR “surveillance camera” OR “illumination invariance” OR “pose invariant”)
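Both strings follow the same pattern: groups of alternative keywords joined by OR, with the groups themselves joined by AND. A minimal sketch of how such strings could be assembled from keyword groups is given below; the helper names `or_group` and `build_query` are hypothetical, and straight quotes are used in place of the typographic ones shown above.

```python
def or_group(terms, quote=True):
    """Join terms with OR inside parentheses, optionally quoting each phrase."""
    rendered = [f'"{t}"' if quote else t for t in terms]
    return "(" + " OR ".join(rendered) + ")"


def build_query(groups, quote=True):
    """AND together several OR-groups of keywords."""
    return " AND ".join(or_group(g, quote) for g in groups)


ieee_groups = [
    ["face recognition", "face detection"],
    ["low resolution", "surveillance camera",
     "illumination invariance", "pose invariant"],
]
acm_groups = [
    ["face recognition", "face detection", "face tracking", "person recognition"],
    ["low resolution", "surveillance camera"],
    ["illumination invariance", "pose invariant"],
]

ieee_query = build_query(ieee_groups)
# The ACM string is unquoted and wrapped in one extra pair of parentheses.
acm_query = "(" + build_query(acm_groups, quote=False) + ")"
```

Keeping the keyword groups as data makes it easier to rerun the same search on another database with a different query syntax.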

The search in ACM returned 77 articles, of which only 10 satisfied the criteria for inclusion. Over IEEE the search retrieved a total of 27 articles, with one duplicated (in comparison to ACM) and six satisfying the criteria for inclusion. When reading the articles, another one was excluded for reporting a literature review (i.e. a secondary study). As a result, we ended up with 15 articles considered relevant to our research. These 15 articles were then read in their entirety, in the search for information such as the research contributions, applied techniques, approached variables and results found.
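The selection arithmetic above can be checked in a few lines. The variable names are illustrative, and the sketch assumes that the duplicated IEEE article is not one of the six IEEE articles that satisfied the inclusion criteria.

```python
acm_retrieved, acm_included = 77, 10
ieee_retrieved, ieee_included = 27, 6
secondary_studies_removed = 1  # literature review found during full reading

# Articles returned by both searches, before any screening.
total_retrieved = acm_retrieved + ieee_retrieved

# Articles kept after screening and after dropping the secondary study.
final_corpus = acm_included + ieee_included - secondary_studies_removed

print(total_retrieved, final_corpus)
```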

3 Results

From the extracted data, we could notice that the problem of light variation is real and relevant to the process of facial recognition, for it is present in all of the articles found in our research. Figure 2 illustrates this result, along with the main variables found in the 15 included articles.

Fig. 2. Main variables found in the research.

Articles that passed the criteria for inclusion may be divided into two sets: those reporting research that deals directly with the problem of light variation and those that deal with it indirectly. In the first group, we find the research by Arandjelovic and Cipolla [2], who deal with the role colours play in automatic facial recognition. In their work, the authors claim that light variation is the most challenging aspect of this task, for different lighting may drastically change the performance of automatic systems. The way they found to overcome this problem was to exclude non-informative regions of the image, that is, pixels whose luminosity was lower than 3 % or higher than 97 % of the maximum luminosity in the image, thereby removing those pixels with greater light variation. For face detection, the authors used Principal Component Analysis and the Viola-Jones algorithm.
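The luminosity-based exclusion just described can be illustrated with a short sketch. It is not the implementation used in [2]: the function name `informative_mask`, the list-of-lists image representation and the handling of an all-black image are assumptions made for the example.

```python
def informative_mask(gray, low=0.03, high=0.97):
    """Return a boolean mask that is True where a pixel is kept.

    gray is a 2-D list of non-negative luminosity values. Pixels darker than
    `low` times, or brighter than `high` times, the image's maximum
    luminosity are marked False (excluded).
    """
    peak = max(max(row) for row in gray)
    if peak == 0:
        # Completely black image: no pixel is treated as informative.
        return [[False for _ in row] for row in gray]
    lower, upper = low * peak, high * peak
    return [[lower <= value <= upper for value in row] for row in gray]


image = [
    [0, 5, 120],
    [200, 250, 128],
]
mask = informative_mask(image)
```

With a maximum luminosity of 250, the thresholds are 7.5 and 242.5, so the values 0, 5 and 250 are discarded and the remaining three pixels are kept.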

Another approach tried by the same authors was to propose an algorithm for “reillumination” [4] that takes two image sequences as input and outputs a synthetic sequence, with the same poses as the first sequence, combined with the illumination of the second. Using Viola-Jones to detect faces and the proposed “reillumination” algorithm, the authors improved facial recognition accuracy. This algorithm was also tested along with Probabilistic Principal Component Analysis (PPCA), in another article by the same authors [6]. In yet another work [3], they present two approaches based on generative models (illumination cones and 3D morphable model) for light variation in images, also proposing the use of Gamma Intensity Correction (GIC) to compensate for brightness variation, along with a normalization of the light subspace to mitigate the problem of light variation.
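As a rough illustration of the idea behind gamma-based brightness compensation, the sketch below applies a plain power-law mapping to an 8-bit grey-scale image. It is a generic gamma transform, not the GIC procedure of [3]: the function name `gamma_correct` is hypothetical, and the gamma value is passed in by hand rather than estimated from a reference image.

```python
def gamma_correct(gray, gamma, max_value=255):
    """Apply the power-law mapping out = max * (in / max) ** (1 / gamma).

    gray is a 2-D list of intensities in [0, max_value]. With gamma > 1 dark
    regions are brightened; with gamma < 1 they are darkened.
    """
    return [
        [round(max_value * (value / max_value) ** (1.0 / gamma)) for value in row]
        for row in gray
    ]


dark_face = [[16, 64], [128, 255]]
brightened = gamma_correct(dark_face, gamma=2.0)
```

Because the mapping is monotonic and fixes the end points 0 and `max_value`, it redistributes intensities without changing their ordering.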

Alternatively, Arandjelovic, Hammoud and Cipolla [5] approach the difficulty facial recognition algorithms have in dealing with pose contrast and light variation by relying on thermal images. These images have the desirable characteristic of being almost insensitive to light variation. On the other hand, there is a loss of facial information that could be relevant to the recognition process. The authors therefore propose the fusion of visual and thermal images which, according to their tests, raised facial recognition accuracy to 97 % in a test set with great light variation.

Following a similar path, Heo, Savvides and Vijayakumar [11] also propose the use of thermal images and correlation filters, especially for images with light variation and low resolution. Results showed that correlation filters improved face recognition performance in low-resolution images, both visual and thermal. When comparing the two, face recognition with thermal images outperformed recognition with visual ones under light variation and facial expression variation, despite the loss of relevant facial information that thermal images present when compared to visual ones.

A different approach to mitigating the light variation problem is proposed by Wang and Li [17]. In their work, they focus on eliminating the effect that uneven light has on faces. To do so, they propose a new way to classify illumination orientation, and then compensate for or eliminate its variation. Since this procedure reduces the quality of the image, they also propose an image fusion rule. Throughout their work, different techniques were tested, such as the Lambert illumination model and PCA, amongst others.

Also relying on methods to compensate for light variation, Bicego et al. [9] propose to use not only nose, eyes and mouth as characteristic areas for facial recognition, but also the space around them. This approach conceptually differs from others, which rely on specific parts of the face only. With a mean error rate around 10.93 %, the authors claim this to be a feasible alternative, compared to other state-of-the-art methods. Finally, to mitigate the problem of light and pose variation, occlusion and low resolution, Mishra and Subban [16] examine a fusion strategy based on skin shade segmentation and highlight for face detection. The authors work in the YCbCr orthogonal colour space, since the overlap between skin and non-skin regions is small there. The reported results were good, with the additional advantage of this being a simple, low-complexity approach.
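To make the YCbCr idea more concrete, the sketch below converts an RGB pixel to YCbCr and classifies it as skin from its chrominance alone. It is not the method of [16]: the conversion uses the standard full-range BT.601 (JFIF) weights, while the Cb and Cr bounds are commonly quoted illustrative values assumed for this example, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (ITU-R BT.601 / JFIF weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Assumed chrominance bounds, taken from common practice rather than from [16].
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(r, g, b):
    """Classify a pixel as skin using only its chrominance (Cb, Cr)."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])
```

Since the luma component Y is ignored, a skin-toned pixel and a darker version of it can fall in the same chrominance region, which is one reason this colour space is attractive under changing light.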

Amongst the articles that deal with the problem of light variation indirectly, that is, those that usually limit themselves to presenting the problem as a challenge or to showing that their methods are robust in such conditions, we find the work by Bedagkar-Gala and Shah [8], who deal with the problem of reidentification based on facial characteristics and colours, in a multi-camera environment. In this case, the authors point out that light variation is an important variable that should not be set aside, for even regions with greater variation contribute to improving their model. Still in the realm of reidentification, Bak et al. [10] approach it in a camera network environment which, by itself, implies a good deal of light variation, along with pose variation and occlusion. The authors show that, by normalising the covariance matrix, they managed to absorb rotation and light variations.

Louis, Plataniotis and Ro [14], in turn, propose a face detector that combines results from two classifiers. The authors try out different algorithms such as Viola-Jones, AdaBoost and GentleBoost, amongst others. The system was tested by artificially adding noise of several kinds, one of them being light variation (from -100 % to +100 %). Even under such noise, their detector proved capable of dealing with great light variation, outperforming Lienhart’s detector. Louis and Plataniotis [13] also integrate two types of Local Binary Pattern (LBP) characteristics for the same task. Here, they used a Circular LBP, which targets the pixels in the image, and a Histogram LBP, in which whole regions are the target. According to their results, LBP proved capable of dealing with the problem of light variation.
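One reason LBP features tolerate light changes is that each code depends only on whether a neighbour is at least as bright as the centre pixel, not on the absolute intensities. The sketch below computes the basic 8-neighbour LBP code of a 3x3 patch; it is the textbook operator rather than the Circular or Histogram variants used in [13], and the function name and bit ordering are choices made for this example.

```python
def lbp_code(patch):
    """Compute the basic 8-neighbour LBP code of a 3x3 patch (list of lists)."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if patch[row][col] >= centre:
            code |= 1 << bit
    return code


patch = [
    [90, 120, 60],
    [70, 100, 130],
    [110, 40, 150],
]
# The same patch under a uniform brightness increase.
brighter = [[value + 50 for value in row] for row in patch]
```

Adding the same offset to every pixel leaves all comparisons with the centre unchanged, so `lbp_code(patch)` and `lbp_code(brighter)` give the same code.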

Alternatively, Wang et al. [18] introduce an incremental learning approach for video-based facial recognition. Through their “visual words” algorithm, sequences of images of faces are clustered according to their descriptors. Representative images are then extracted from each cluster. With these images’ descriptors, the algorithm builds a “code book”. A voting algorithm is then used to determine the face identity in the video. The authors claim that their approach is robust even under changes in pose, light and facial expression. The algorithms used were AdaBoost, CamShift and Viola-Jones, amongst others.
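The final voting step can be pictured as follows. The exact voting scheme of [18] is not detailed here, so this sketch assumes the simplest variant, a majority vote over per-frame identity predictions; the function name `vote_identity` and the example labels are hypothetical.

```python
from collections import Counter


def vote_identity(frame_predictions):
    """Return the identity predicted most often across the frames of a video.

    frame_predictions is a sequence with one predicted label per frame. On a
    tie, the label that reached the top count first in the sequence wins,
    following Counter.most_common ordering.
    """
    if not frame_predictions:
        raise ValueError("no frame-level predictions to vote on")
    counts = Counter(frame_predictions)
    identity, _ = counts.most_common(1)[0]
    return identity


predictions = ["anna", "ben", "anna", "carl", "anna"]
winner = vote_identity(predictions)
```

Voting over many frames means that a few frames misclassified because of poor lighting or an unusual pose need not change the final decision.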

Finally, Wang, Miao and Zhang [20] propose a framework for the extraction of low-resolution facial images in video sequences under light variation, which uses a skin detection algorithm in each frame of the sequence to help with the face detection task. Although they mention the problem of light variation, the authors did not try to solve or mitigate it.

4 Discussion

As the results illustrate, the problem of light variation is still an open one, severely affecting the automatic recognition of faces. Given its complexity, however, not all of the retrieved articles deal with this question directly, despite the fact that all of them take this problem to be relevant. As a workaround, some researchers constrain their work to thermal images only, which are not sensitive to light variation. The drawback of this approach, however, is the loss of facial information, such as colour, which might be relevant to the performance of the recogniser.

Fig. 3. Strategies adopted in the retrieved articles (and number of articles applying them).

Another approach taken by researchers is to try to mitigate the problem by running a “reillumination” algorithm, in order to normalise the light in the image. In our review, the main strategies we found to tackle, mitigate, or even work around this problem are:

  • Use of thermal images;

  • Exclusion of “non-informative” pixels, that is, those with greater light variation;

  • Light compensation and/or generalisation;

  • Use of “reillumination” algorithms;

  • Use of colours to improve the performance of grey-scale based algorithms;

  • Use of methods for skin detection in images.

Figure 3 shows the number of articles adopting each of the strategies, while Fig. 4 presents the main algorithms used for facial recognition. As can be seen, the preferred technique for facial recognition is Principal Component Analysis (PCA) and its variations, followed by the Viola-Jones algorithm. Besides being very popular in the area, both seem to outperform other state-of-the-art techniques. Along with these, other algorithms are also found, and new ones are proposed.

Fig. 4. Main algorithms applied to automatic facial recognition (and number of articles using them).

5 Conclusion

In this article we presented the results of a systematic review we carried out in the field of automatic face recognition, in an attempt to identify how the sub-problem of light variation is being approached by current research. Besides presenting a snapshot of the current state of the art in the field, results showed the main techniques and algorithms used either to tackle this problem or to avoid it (or even to diminish its influence).

We understand this work to be useful for researchers who are willing to enter the field, or for students who want to learn more about it. As a limitation of our work, we cite the focus on two databases only (IEEE and ACM). However useful and trustworthy these are, other databases exist, which might be searched for a broader view on the subject.