1 Introduction

Skin detection is an important problem in color image processing that has been studied extensively over the years. It is a useful technique for the detection, segmentation and tracking of human skin in images or video streams. Interest in skin detection algorithms derives from their applicability to a wide range of tasks, such as gesture recognition, video surveillance, human-computer interaction, ego-vision systems, human activity recognition [4, 6, 21], hand gesture detection and tracking [7, 8, 23, 31], blocking of nude images and videos [5, 26], feature extraction for content-based image retrieval [20], and age estimation [22].

Skin detection is a process that extracts candidate skin pixels from an image. In most cases, it is performed with pixel-based techniques: a pixel is classified as skin or non-skin independently of its neighbors, using only its color information. Region-based skin segmentation methods, in contrast, make use of extra information, for example the spatial arrangement or texture of the pixels found in the skin detection step, to determine the boundaries of human skin regions [16, 17]. Therefore, a good pixel-based method for skin detection can reduce the computational cost of the subsequent segmentation step and can also improve its results. The main issue is to achieve satisfactory skin detection under uncontrolled lighting conditions, since many applications, for example ego-vision systems, require the detection of human skin regions both indoors and outdoors, under high or low illumination. Most approaches use specific color spaces to de-correlate the chromatic components from luminance, since the chromatic components are less sensitive to lighting conditions [9, 12]. However, some studies [10, 14] have shown that the luminance component plays an important role in skin detection and so should not be discarded.

In the present work, an explicit skin cluster method in the YCbCr colour space is proposed. The method takes into account the illumination conditions of the examined image and tries to minimize both false positives and false negatives. It also proves computationally efficient enough for real-time applications.

The rest of the paper is organized as follows: in Sect. 2, a description of the related work for skin detection is presented; Sect. 3 describes the proposed approach; in Sect. 4, some results and comparative evaluations performed on two publicly available databases are reported; finally, in Sect. 5 conclusions are drawn.

2 Related Work

Several recent surveys describe the various skin detection approaches [13, 18, 30]. The approaches proposed for skin colour detection include linear classifiers [9, 11, 12, 19, 25, 27], Bayesian [10, 15] or Gaussian classifiers [11, 29], and artificial neural networks [1, 2, 28].

It has been demonstrated that the human skin colour can be modelled in many colour spaces [3, 24].

In [19], heuristic rules in the RGB colour space are used to detect skin pixels; these rules depend on the image illumination. Other methods adopt linear or non-linear transformations of the RGB colour space to generate other colour spaces. In particular, the colour spaces that separate the luminance and chrominance components are the most commonly used in skin colour detection. This is the case for HSV and YCbCr, which are a non-linear and a linear transformation of the RGB colour space, respectively. In the HSV colour space, Hue (H) and Saturation (S) are the chrominance components, while Value (V) is the luminance component. Most of the methods that work in this colour space ignore the luminance component, since it does not prove to be discriminant [12, 25]; however, some methods that include the luminance in the skin detection process have also been presented [27]. In the YCbCr colour space, the chrominance components Cb and Cr are obtained by subtracting the luminance component Y from blue and from red, respectively. Here too, some approaches ignore the luminance component [9], while others take it into account [14]. However, it has been demonstrated that skin colour depends non-linearly on the luminance component in different colour spaces, so the luminance component should be included in the skin detection process [11, 14].
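As an illustration of how Cb and Cr arise as scaled blue-minus-luminance and red-minus-luminance differences, the following minimal sketch converts a single 8-bit RGB pixel to YCbCr. The text does not state which YCbCr variant is used; the sketch assumes the full-range, JFIF-style definition with ITU-R BT.601 luma weights, and the function name `rgb_to_ycbcr` is hypothetical.

```python
# Assumed luma weights (ITU-R BT.601); the paper does not specify the variant.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (assumed JFIF-style)."""
    y = KR * r + KG * g + KB * b
    # Chrominance = colour channel minus luminance, rescaled and offset by 128.
    cb = 128.0 + 0.5 * (b - y) / (1.0 - KB)
    cr = 128.0 + 0.5 * (r - y) / (1.0 - KR)
    return y, cb, cr


if __name__ == "__main__":
    # A neutral grey has no colour difference, so Cb and Cr sit at the offset.
    print(rgb_to_ycbcr(100, 100, 100))
```

For a grey pixel both differences vanish, so Cb and Cr equal the offset 128; values are left unrounded and unclipped here, which a real pipeline would normally handle.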

Explicit cluster methods are based on the definition of colour rules, in particular on the definition of a colour range for skin pixels [9, 25], or on the definition of a shape for the skin pixel distribution (e.g., a rectangle or an ellipse). In [14], two different skin cluster models that take into account the luminance component have been proposed. In the first model, the skin clusters are determined by two central curves, one for the YCr and one for the YCb subspace, and by their spreads in the respective subspaces. In the second model, a single skin cluster is represented by an ellipse in a transformed CbCr subspace.

In this paper, a new approach that works in the YCbCr colour space is proposed. In particular, a dynamic cluster is computed for each of the YCb and YCr subspaces, taking into account the illumination conditions of the examined image.

3 The Proposed Approach

As already shown in [14], the distribution of skin pixels in the YCb and YCr subspaces has a trapezoidal shape (see Fig. 1a), unlike the distribution of skin and non-skin pixels (see Fig. 1b).

Fig. 1. Cr and Cb components as a function of the Y component for a specific image: (a) distribution of skin pixels; (b) distribution of skin and non-skin pixels.

Moreover, we experimentally observed that the size and shape of these trapezia change depending on the lighting conditions. In particular, we have observed that:

  • for images in high illumination conditions, the bases of the two trapezia in the YCb and YCr subspaces representing the skin colour clusters are larger than those associated with the skin colour clusters in low illumination conditions;

  • the positions of the vertices of the trapezia change according to the illumination conditions of the examined image;

  • for skin pixels, the minimum value of Cr (in the following \( Cr_{min} \)) and the maximum value of Cb (in the following \( Cb_{max} \)) are practically fixed at the values 133 and 128, respectively, as reported in [9], while the maximum value of Cr and the minimum value of Cb change strongly with the illumination conditions;

  • for a skin pixel, the values of the Cr and Cb generally satisfy the following conditions:

$$ \begin{aligned} & 133 \le Cr \le 183 \\ & 77 \le Cb \le 128 \end{aligned} $$
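The fixed chrominance ranges listed above can be checked per pixel as in the following minimal sketch. It only encodes the static bounds from the last observation, not the dynamic trapezia; the constant and function names are hypothetical.

```python
# Static chrominance bounds taken from the observations above.
CR_RANGE = (133, 183)
CB_RANGE = (77, 128)


def in_fixed_skin_range(cb, cr):
    """Return True if (Cb, Cr) lies inside the fixed skin ranges."""
    return (CR_RANGE[0] <= cr <= CR_RANGE[1]
            and CB_RANGE[0] <= cb <= CB_RANGE[1])


if __name__ == "__main__":
    print(in_fixed_skin_range(105, 155))  # chrominance inside both ranges
    print(in_fixed_skin_range(128, 128))  # neutral grey: Cr below 133
```

Because this test ignores Y, it accepts the same chrominance box at every luminance level, which is what the dynamic trapezia are meant to refine.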

With reference to Fig. 2, the vertices A and D of the longer base of the trapezium in the YCr skin subspace are given by \( (Y_{min}, Cr_{min}) \) and \( (Y_{max}, Cr_{min}) \), where \( Y_{min} = 0 \), \( Y_{max} = 255 \) and \( Cr_{min} = 133 \). Likewise, the vertices E and H of the longer base of the trapezium in the YCb skin subspace are given by \( (Y_{min}, Cb_{max}) \) and \( (Y_{max}, Cb_{max}) \), with \( Cb_{max} = 128 \). The vertices B and C of the shorter base of the trapezium in the YCr skin subspace are set to \( (Y_{0}, Cr_{max}) \) and \( (Y_{1}, Cr_{max}) \). Considering the histogram of the pixels with Cr values in the range [133, 183], \( Cr_{max} \) is set to the maximum value of Cr associated with at least 10% of the image pixels. Then \( Y_{0} \) and \( Y_{1} \) are set to the 5th and the 95th percentile of the Y component, respectively, computed over all the pixels of the image with \( Cr = Cr_{max} \). The same process is applied to find the vertices F and G of the shorter base of the trapezium in the YCb skin subspace, with coordinates \( (Y_{2}, Cb_{min}) \) and \( (Y_{3}, Cb_{min}) \), respectively.
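The estimation of the vertices B and C described above can be sketched as follows. This is one possible reading, not the authors' implementation, and it rests on several assumptions: the 10% threshold is taken relative to the number of pixels with Cr in [133, 183]; if no histogram bin reaches the threshold, the most populated bin is used; and percentiles are linearly interpolated. The names `percentile` and `estimate_cr_vertices` are hypothetical.

```python
from collections import Counter


def percentile(values, q):
    """Linearly interpolated q-th percentile (0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def estimate_cr_vertices(pixels, cr_lo=133, cr_hi=183, frac=0.10):
    """pixels: iterable of integer (Y, Cr) pairs. Returns (Cr_max, Y0, Y1)."""
    pixels = list(pixels)
    # Histogram of Cr restricted to the fixed skin range.
    hist = Counter(cr for _, cr in pixels if cr_lo <= cr <= cr_hi)
    if not hist:
        return None  # no candidate skin pixels in this image
    # ASSUMPTION: threshold is relative to the in-range pixel count.
    threshold = frac * sum(hist.values())
    qualifying = [cr for cr, n in hist.items() if n >= threshold]
    # ASSUMPTION: fall back to the most populated bin if none qualifies.
    cr_max = max(qualifying) if qualifying else max(hist, key=hist.get)
    # Y0 and Y1: 5th and 95th percentiles of Y over pixels with Cr == Cr_max.
    ys = [y for y, cr in pixels if cr == cr_max]
    return cr_max, percentile(ys, 5), percentile(ys, 95)
```

Under the same assumptions, the vertices F and G would follow by mirroring the procedure on (Y, Cb) pairs and taking the minimum qualifying Cb value instead of the maximum.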

Fig. 2. Graphical representation of \( Y_{min} \), \( Y_{max} \), \( Y_{0} \), \( Y_{1} \), \( Y_{2} \), \( Y_{3} \), \( Cr_{max} \), \( Cr_{min} \), \( Cb_{max} \) and \( Cb_{min} \).

Given a Y value, a point on the upper border of the trapezium in the YCr subspace has coordinates \( (Y, T_{Cr}(Y)) \), while a point on the lower border of the trapezium in the YCb subspace has coordinates \( (Y, T_{Cb}(Y)) \). \( T_{Cr}(Y) \) and \( T_{Cb}(Y) \) are given by:

$$ T_{Cr} \left( Y \right) = \left\{ {\begin{array}{*{20}l} {Cr_{min} + d_{Cr} \frac{{Y - Y_{min} }}{{Y_{0} - Y_{min} }}} \hfill & {Y \in [Y_{min} ,Y_{0} ]} \hfill \\ {Cr_{max} } \hfill & {Y \in [Y_{0} ,Y_{1} ]} \hfill \\ {Cr_{max} - d_{Cr} \frac{{Y - Y_{1} }}{{Y_{max} - Y_{1} }}} \hfill & {Y \in [Y_{1} ,Y_{max} ]} \hfill \\ \end{array} } \right. $$

where \( d_{Cr} = Cr_{max} - Cr_{min} \)

$$ T_{Cb} (Y) = \left\{ {\begin{array}{*{20}l} {Cb_{max} - d_{Cb} \frac{{Y - Y_{min} }}{{Y_{2} - Y_{min} }}} \hfill & {Y \in [Y_{min} ,Y_{2} ]} \hfill \\ {Cb_{min} } \hfill & {Y \in [Y_{2} ,Y_{3} ]} \hfill \\ {Cb_{min} + d_{Cb} \frac{{Y - Y_{3} }}{{Y_{max} - Y_{3} }}} \hfill & {Y \in [Y_{3} ,Y_{max} ]} \hfill \\ \end{array} } \right. $$

where \( d_{Cb} = Cb_{max} - Cb_{min} \)
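The two piecewise-linear borders defined above can be evaluated as in the following minimal sketch. The function names `t_cr` and `t_cb` and the vertex values used in the example are hypothetical; the defaults \( Y_{min} = 0 \), \( Y_{max} = 255 \), \( Cr_{min} = 133 \) and \( Cb_{max} = 128 \) come from the text.

```python
def t_cr(y, y0, y1, cr_max, cr_min=133.0, y_min=0.0, y_max=255.0):
    """Upper border of the YCr trapezium at luminance y."""
    d_cr = cr_max - cr_min
    # Branch order avoids division by zero when y0 == y_min or y1 == y_max,
    # as long as y stays within [y_min, y_max].
    if y < y0:
        return cr_min + d_cr * (y - y_min) / (y0 - y_min)
    if y <= y1:
        return cr_max
    return cr_max - d_cr * (y - y1) / (y_max - y1)


def t_cb(y, y2, y3, cb_min, cb_max=128.0, y_min=0.0, y_max=255.0):
    """Lower border of the YCb trapezium at luminance y."""
    d_cb = cb_max - cb_min
    if y < y2:
        return cb_max - d_cb * (y - y_min) / (y2 - y_min)
    if y <= y3:
        return cb_min
    return cb_min + d_cb * (y - y3) / (y_max - y3)


if __name__ == "__main__":
    # Hypothetical vertices: Y0=50, Y1=200, Cr_max=173; Y2=60, Y3=180, Cb_min=88.
    for y in (0, 25, 100, 255):
        print(y, t_cr(y, 50, 200, 173), t_cb(y, 60, 180, 88))
```

At the end points of the luminance axis the borders return to \( Cr_{min} \) and \( Cb_{max} \), so the trapezia close on their longer bases as in Fig. 2.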

Finally, we classify a pixel as a skin pixel if it satisfies both of the following conditions:

$$ \begin{array}{*{20}c} {Cr\left( Y \right) \in \left[ {Cr_{min} ,T_{Cr} \left( Y \right)} \right]} \\ {AND} \\ {Cb(Y) \in \left[ {T_{Cb} (Y),Cb_{max} } \right]} \\ \end{array} $$
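The two-condition rule above can be applied to a whole image as in the following minimal sketch. The borders \( T_{Cr} \) and \( T_{Cb} \) are passed in as callables of Y, so any implementation of the piecewise definitions can be plugged in. The function name `skin_mask` and the (Y, Cb, Cr) tuple layout are assumptions made for illustration.

```python
def skin_mask(image, t_cr, t_cb, cr_min=133.0, cb_max=128.0):
    """Classify each pixel of a 2-D list of (Y, Cb, Cr) tuples.

    t_cr and t_cb are callables mapping Y to the upper YCr border and
    the lower YCb border, respectively. Returns a 2-D list of booleans.
    """
    return [
        [
            # Skin only if BOTH chrominance conditions hold at this Y.
            cr_min <= cr <= t_cr(y) and t_cb(y) <= cb <= cb_max
            for (y, cb, cr) in row
        ]
        for row in image
    ]


if __name__ == "__main__":
    # Hypothetical stand-in borders, constant over two luminance bands.
    upper = lambda y: 173.0 if 50 <= y <= 200 else 140.0
    lower = lambda y: 88.0 if 60 <= y <= 180 else 120.0
    print(skin_mask([[(100, 105, 155), (100, 105, 180)]], upper, lower))
```

Because each pixel is tested independently, the rule remains a pixel-based method in the sense of Sect. 1 and is easy to vectorize or parallelize.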

4 Results and Comparison

The proposed approach has been compared with the method described in [9], which also works in the YCbCr colour space but with a fixed colour range, and with the method presented in [14], considering both formulations of its skin cluster model. The approach has been tested on the Hand Gesture Recognition (HGR) database [17], containing 1,558 skin images of human hand and arm postures taken under different lighting conditions, and on the Compaq database [15], a large database that consists of 4,675 colour images containing skin in unconstrained illumination and background conditions. Some qualitative results for our approach and for the compared methods are shown in Figs. 3 and 4 for selected images of the HGR and Compaq databases. A qualitative analysis shows that the method proposed in [9] generally obtains good results, but in some cases produces many false positives (see row 3 in Fig. 3 and rows 1 and 3 in Fig. 4); in particular, pixels belonging to the eye and mouth regions or to the background are often wrongly detected as skin pixels. The method in [14], in its YCbCr formulation, also produces many false positives, particularly under high or low illumination conditions (see all the results in Fig. 3 and rows 2 and 5 in Fig. 4); its skin detection performance improves in the formulation based on the transformed CbCr subspace, but in some cases many false negatives are found (see row 1 in Fig. 3 and rows 1 and 4 in Fig. 4).

Fig. 3. Qualitative analysis of skin detection results on the HGR database: (a) the input image; (b) the ground truth; (c) Chai, Ngan [9]; (d) Hsu et al. in the YCbCr space [14]; (e) Hsu et al. in the CbCr subspace [14]; and (f) the proposed method.

Fig. 4. Qualitative analysis of skin detection results on the Compaq database: (a) the input image; (b) the ground truth; (c) Chai, Ngan [9]; (d) Hsu et al. in the YCbCr space [14]; (e) Hsu et al. in the CbCr subspace [14]; and (f) the proposed method.

Moreover, quantitative results in terms of F-measure are reported in Table 1, for all the analysed approaches. The proposed approach outperforms the other methods in terms of F-measure.

Table 1. F-measure for the skin detection approaches on the Compaq and HGR databases.
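The text does not spell out how the F-measure is computed. The following minimal sketch assumes the standard balanced F-measure (F1), i.e. the harmonic mean of pixel-wise precision and recall against the ground-truth mask; the function name `f_measure` is hypothetical.

```python
def f_measure(predicted, truth):
    """Balanced F-measure of two equal-length sequences of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0  # convention chosen here for the degenerate case
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Flattened toy masks: 2 true positives, 1 false positive, 1 false negative.
    print(f_measure([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

A single score of this kind penalizes false positives and false negatives together, which matches the stated aim of minimizing both.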

Finally, the computational cost of the proposed approach has been estimated on a PC equipped with an Intel Xeon E5-2623 at 3 GHz and 16 GB of RAM. For an image of size 320 × 480, the execution time is 8 ms on average.

5 Conclusion

We have presented a new approach for skin detection in the YCbCr color space. The method shows some robustness to variations in illumination conditions, because the skin cluster range in the YCbCr color space is defined dynamically, taking into account the luminance component. In particular, two clusters are found, one in the YCb subspace and one in the YCr subspace.

The performance of the method has been tested on two publicly available databases, producing satisfactory results both qualitatively and in terms of quantitative performance evaluation parameters such as the F-measure. The results of the comparative analysis are promising. Compared with methods based on fixed cluster ranges, the proposed method also provides adequate results on images acquired in low or high illumination conditions.