1 Introduction

Automatic 3D shape recognition has attracted growing interest in the pattern recognition field in recent years. 3D data have become widely used, driven by improvements in 3D acquisition devices and by increased computing power. As a result, the quality and resolution of 3D meshes have improved. In addition, 3D data overcome problems often encountered with 2D data: 2D data require invariance under perspective transformations, whereas 3D surfaces require invariance only under Euclidean transformations. However, one of the major problems of 3D surfaces is the lack of a canonical parameterization, which makes the matching procedure between 3D objects difficult. To overcome this limitation as far as possible, many works propose to extract from the 3D surface a description that is invariant to the initial parameterization. In the literature, 3D shape descriptions can be classified into two main categories: global methods and local ones.

Several global 3D surface descriptions have been proposed in the literature. In this category, we can mention the cord histogram method proposed by Paquet et al. [1], which consists of extracting statistical characteristics from the cords of the 3D object. Osada et al. [2] proposed the shape distribution method as a global description of 3D surfaces; it is obtained as the probability distribution of a 3D shape function.

In the second category, a local 3D representation is extracted from the 3D object. In this context, many local descriptors are based on curvature, such as the Gaussian curvature used by Shw-wei et al. [3] to describe 3D faces. Ganguly et al. [4] proposed an analysis based on two pairs of curvatures: the first pair is composed of the mean and maximum curvatures, and the second of the minimum and Gaussian curvatures. Bannour et al. [19] presented a 3D surface description by a set of invariant points obtained from a set of uniform levels of the curvature values. Another kind of local method represents the 3D surface by geodesic level curves constructed around a feature point. The works in [5,6,7] describe the 3D surface by a set of geodesic level curves generated from a single reference point, which is known as the unipolar representation. Other works use several reference points in order to reduce the instability caused by errors in the extraction of the reference point. Ghorbel et al. [8] proposed the bipolar representation, which consists of the level curves of the superposition of two geodesic potentials generated from two reference points. Jribi et al. [9] extended this representation to the three polar one, based on the superposition of three geodesic potentials generated from three reference points instead of two.

The majority of these description methods require a registration step in order to estimate the transformation between two shapes and to align them. In the literature, registration methods between 3D shapes can be classified into two major categories. The first type relies on local geometry to construct valid mapping hypotheses; it includes the registration methods based on the Hough transform and hash tables [15,16,17,18]. The second type performs the mapping with iterative algorithms; here we can mention the work of Bes et al. [12], who used an iterative technique to extract the matched points. In this paper, we propose a 3D face recognition technique based on two stages. The first one consists of an invariant 3D face description. The second one is an alignment of the 3D surfaces by a novel robust version of ICP [12].

This paper is structured as follows. The second section briefly recalls the proposed representation. The implementation steps of this representation on 3D faces are described in section three. The similarity metric used to compare two shapes and the novel robust version of ICP are detailed in the fourth section. Finally, in the last section, we test the accuracy of our representation for the identification scenario on a part of the BU-3DFE database of 3D faces.

2 Brief Recall of the Proposed Representation

In this paper, we describe 3D surfaces by an accurate and finite set of points that is invariant under the geometrical transformations of the M(3) group of translations and rotations. This description was proposed by Rihani et al. [14]. It is obtained in two steps: (i) the construction of the three polar representation proposed by Jribi et al. [9]; (ii) a geometric arc-length reparametrization of each level curve of the three polar representation. The two steps are described in the rest of this section.

In the rest of the section, we consider a 3D object as a 2D differential manifold denoted by S.

2.1 The Construction of the Three Polar Representation

Let \(U_{r}\) denote the function that assigns to each point p of S the length of the geodesic curve joining it to the point r. The three polar representation consists of the superposition of three geodesic potentials generated from three reference points. Let \(p_{1}\), \(p_{2}\), \(p_{3}\) denote three reference points of S, \(U_{p_{1}}\), \(U_{p_{2}}\), \(U_{p_{3}}\) their corresponding geodesic potentials, and \(U_{s}=U_{p_{1}}+U_{p_{2}}+U_{p_{3}}\) the sum of these three potentials. The three polar representation, denoted by \(M^{k}(S)\), is the set of k level curves, where each level curve \(C^{\lambda _{i}}\) is composed of the points whose sum of geodesic potentials \(U_{s}\) equals \(\lambda _{i}\). It can be formulated as follows:

$$\begin{aligned} M^{k}(S)=\{ C^{\lambda _{i}}\}_{i=1..k} \end{aligned}$$
(1)

where

$$\begin{aligned} C^{\lambda _{i}}=\{ p \in S, U_{s}(p)=\lambda _{i}\} \end{aligned}$$
(2)

2.2 Geometric Arc-Length Reparametrization

After the construction of the three polar representation, the 3D surface S is represented by a collection of level curves \(\{C^{\lambda _{i}}\}\). A parametrization \(C^{\lambda _{i}}(t)\) of a curve is a 1-periodic function of a continuous parameter t defined by:

$$\begin{aligned} C^{\lambda _{i}}: [0,1]&\rightarrow \mathbb {R}^{3} \\&t\mapsto [x(t),y(t),z(t)]^{t} \nonumber \end{aligned}$$
(3)

It is well known that the same curve \(C^{\lambda _{i}}\) admits many parametrizations. This is due to the dependence of the parametrization on the position and orientation of the curve and on the speed at which it is traversed. To overcome this problem, we use a \(\mathbb {G}\)-invariant reparametrization of each curve of the three polar representation, where \(\mathbb {G}\) is a group of geometrical transformations applied to the curve.

In our context, \(\mathbb {G}\) corresponds to the M(3) group formed by the rotations and translations of \(\mathbb {R}^{3}\). This group of transformations preserves the length of the curve; however, the speed at which the curve is traversed affects its parametrization. Therefore, we carry out an arc-length reparametrization of each 3D curve \(C^{\lambda _{i}}\) so that it is traversed at constant speed. The arc-length reparametrization is defined as follows:

$$\begin{aligned} S(t)=\frac{1}{L}\int ^{t}_{0}\sqrt{x'(u)^{2}+y'(u)^{2}+z'(u)^{2}}\, du,\quad t\in [0,1] \end{aligned}$$
(4)

Here, L denotes the length of the level curve \(C^{\lambda _{i}}\).
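
In the discrete setting, the reparametrization of Eq. (4) can be approximated by resampling a polyline at points equally spaced in normalized arc length. The following is a minimal sketch under stated assumptions, not the authors' implementation: the curve is assumed to be a closed polyline given as a list of 3D points, and the function name `resample_by_arc_length` and its parameters are hypothetical.

```python
import math


def resample_by_arc_length(points, n_samples):
    """Resample a closed 3D polyline at n_samples points equally spaced in arc length.

    `points` is a list of (x, y, z) tuples; the curve is closed by joining
    the last point back to the first one (assumption of this sketch).
    """
    closed = list(points) + [points[0]]
    # Cumulative arc length at each vertex of the closed polyline.
    cum = [0.0]
    for a, b in zip(closed, closed[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("degenerate curve of zero length")

    samples = []
    seg = 0
    for k in range(n_samples):
        # Target arc length for the normalized parameter s = k / n_samples.
        target = total * k / n_samples
        # Advance to the segment that contains the target arc length.
        while cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        u = 0.0 if seg_len == 0.0 else (target - cum[seg]) / seg_len
        a, b = closed[seg], closed[seg + 1]
        # Linear interpolation inside the segment.
        samples.append(tuple(a[i] + u * (b[i] - a[i]) for i in range(3)))
    return samples
```

Because only distances between consecutive points are used, the resampled positions along the curve do not depend on how densely the input polyline was sampled, and rotating or translating the input simply rotates or translates the output.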

3 Application of the Proposed Representation to 3D Face Meshes

Since 3D faces have attracted growing interest for identity determination, especially after the many terrorist acts that have occurred around the world, we implement this representation on this type of data. In practice, the 3D surface corresponds to a discrete mesh. We start with the construction of the three polar representation on 3D faces. As mentioned before, the three polar representation is based on three reference points. In our case, the outer corners of the eyes and the nose tip are used as reference points. For their automatic extraction, we use the approach based on a curvature analysis of 3D faces proposed by Szeptycki et al. [21]. Then, for each reference point, we compute its geodesic potential. In the discrete case, computing the geodesic potential generated from a reference point amounts to computing the lengths of the geodesic curves between the reference point and the other points of the 3D face. Here, we use the fast marching algorithm [13] to compute the geodesic path between each pair of points. The three polar representation is composed of a set of discrete level curves. Each level curve of value \(\lambda \) is represented by the set of vertices whose sum of the three geodesic potentials belongs to \([\lambda -\epsilon , \lambda +\epsilon ]\). It can be formulated as follows:

$$\begin{aligned} C^{\lambda }=\{P \in S, \lambda -\epsilon \le U_{s}(P)\le \lambda +\epsilon \} \end{aligned}$$
(5)

where \(\epsilon \) is a positive real value chosen according to the resolution of the mesh to avoid intersections between successive level curves.
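
The discrete construction of Eq. (5) can be sketched as follows. This is an illustration under stated assumptions rather than the method used in the paper: the geodesic potential is approximated here by shortest paths along mesh edges (Dijkstra's algorithm) as a simple stand-in for fast marching [13], and the names `geodesic_potential` and `three_polar_levels` are hypothetical.

```python
import heapq
import math


def geodesic_potential(vertices, edges, source):
    """Approximate the geodesic potential of `source` by shortest edge paths.

    Assumption: Dijkstra on the edge graph replaces fast marching, so the
    distances are upper bounds on the true geodesic lengths.
    """
    adj = {i: [] for i in range(len(vertices))}
    for a, b in edges:
        w = math.dist(vertices[a], vertices[b])
        adj[a].append((b, w))
        adj[b].append((a, w))
    dist = [math.inf] * len(vertices)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for n, w in adj[v]:
            if d + w < dist[n]:
                dist[n] = d + w
                heapq.heappush(heap, (d + w, n))
    return dist


def three_polar_levels(vertices, edges, refs, lambdas, eps):
    """Group vertex indices by level: lam - eps <= U_s(v) <= lam + eps."""
    potentials = [geodesic_potential(vertices, edges, r) for r in refs]
    # U_s is the sum of the three geodesic potentials at each vertex.
    u_s = [sum(p[v] for p in potentials) for v in range(len(vertices))]
    return {
        lam: [v for v, u in enumerate(u_s) if lam - eps <= u <= lam + eps]
        for lam in lambdas
    }
```

With `eps` smaller than half the spacing between consecutive values in `lambdas`, no vertex is assigned to two levels, which mirrors the role of \(\epsilon \) described above.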

After the construction of the geodesic level curves of the three polar representation, we approximate these curves by B-spline functions. Finally, we apply the arc-length reparametrization to each level curve of the three polar representation. The obtained points are equidistant along each curve and invariant under the M(3) group of translations and rotations. Each point is identified by its level number and its position in that level. The 3D face can thus be described, for N levels of the three polar representation, by:

$$\begin{aligned} {\hat{M}}^{N}(S)=\{p_{ij}^{S} \in \mathbb {R}^{3}\}, i\in [1..N], j\in [1..L] \end{aligned}$$
(6)

where N is the number of geodesic level curves of the three polar representation and L is the number of points per level. Fig. 1 summarizes all the steps of the proposed representation applied to 3D faces.

Fig. 1. The steps of the proposed approach applied to a 3D face. (a, b): The extraction of the three polar level curve. (c): Approximation of this level curve with the B-spline function. (d, e): The arc-length reparametrization of this level curve.

4 3D Face Comparison

4.1 Hausdorff Shape Distance

In this work, we use the well-known Hausdorff shape distance introduced by Ghorbel et al. [10, 11] for the recognition task between 3D shapes. All the possible parametrizations of a surface are defined on a domain G, which can be the plane \(\mathbb {R}^{2}\) if the surface is open or the sphere \(\mathbb {S}^{2}\) if it is closed. Let \(S_{1}\) and \(S_{2}\) be two 3D surface pieces diffeomorphic to G, on which the M(3) group of geometrical transformations acts. The Hausdorff shape distance between \(S_{1}\) and \(S_{2}\) can be defined by:

$$\begin{aligned} \triangle (S_{1},S_{2})=max(\rho (S_{1},S_{2}),\rho (S_{2},S_{1})) \end{aligned}$$
(7)

where:

$$\begin{aligned} \rho (S_{1},S_{2})=\sup _{g_{1}\in M(3)} \inf _{g_{2}\in M(3)} \Vert g_{1}S_{1}-g_{2}S_{2}\Vert _{L^{2}}^{2} \end{aligned}$$
(8)

Since the M(3) displacement group preserves this norm, the Hausdorff shape distance can be written as the following quantity:

$$\begin{aligned} \triangle (S_{1},S_{2})=\inf _{h\in M(3)} \Vert S_{1}-hS_{2}\Vert _{L^{2}}^{2} \end{aligned}$$
(9)

The transformation between two shapes must be estimated in order to compute the correct value of the Hausdorff shape distance. In our context, we use a novel robust version of the Iterative Closest Point algorithm to estimate the optimal transformation between faces. In this work, each 3D face is characterized by its 3D descriptor, i.e. by the finite set of points obtained after the reparametrization of the three polar geodesic level curves.

4.2 Proposed Robust Version of ICP

In this work, we are interested in the problem of 3D face recognition. In this context, we generally need an elementary process of fine alignment, which consists of minimizing the global deviation between the surfaces in order to compute the right distance value. The major problem with this type of surface is the uncontrolled effect of facial expressions. Therefore, we propose here a robust version of the iterative closest point (ICP) algorithm adapted to this context. The ICP algorithm takes as input two 3D surfaces represented by their point clouds. ICP is based on three main steps: (i) the two sets of points are matched; (ii) the optimal rigid transformation is estimated; (iii) the estimated transformation is applied to one of the sets of points. The main contributions of the proposed version of ICP lie in its first two steps.

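Here, a 3D face is represented by a set of discrete points corresponding to the proposed descriptor. As in Eq. (6), the descriptor of a 3D face \(S_{1}\) is formulated by:

$$\begin{aligned} {\hat{M}}^{N}(S_{1})=\{p_{ij}^{S_{1}} \in \mathbb {R}^{3}\}, i\in [1..N], j\in [1..L] \end{aligned}$$
(10)

where N is the number of level curves of the three polar representation and L is the number of points per level.

Let us consider two surfaces \(S_{1}\) and \(S_{2}\). Their respective descriptors \({\hat{M}}^{N}(S_{1})\) and \({\hat{M}}^{N}(S_{2})\) are defined by:

$$\begin{aligned} {\hat{M}}^{N}(S_{1})&=\{p_{ij}^{S_{1}} \in \mathbb {R}^{3}\}, i\in [1..N], j\in [1..L] \\ {\hat{M}}^{N}(S_{2})&=\{p_{ij}^{S_{2}} \in \mathbb {R}^{3}\}, i\in [1..N], j\in [1..L] \end{aligned}$$
(11)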

First Step: Pairwise Points Matching. Bes et al. [12] determined that the matching step accounts for 95% of the ICP run time. This shows that the efficiency of ICP depends on the correspondence step. In our approach, the 3D surface is represented by a set of discrete points, indexed by their level number and their position in that level. The first contribution of the proposed robust version of ICP derives directly from the three polar representation: the matching is obtained automatically, since each point \(p_{ij}^{S_{1}}\) is matched to the point \(p_{ij}^{S_{2}}\) of the second face. On the other hand, a correct correspondence requires a unique way to obtain the starting point on each level curve. We therefore use the plane passing through the nose tip and the first level of the three polar representation (which corresponds to an invariant point) to detect the starting point of each level curve: the intersection between this plane and each level curve of the three polar representation gives the starting point of that curve.

Second Step: Transformation Estimation. The second step of ICP consists of estimating the rigid transformation between the descriptors of \(S_{1}\) and \(S_{2}\), which we denote by \(\hat{T}\). The ICP algorithm is an iterative procedure that minimizes the Mean Square Error (MSE). In practice, the rigid transformation is the solution of the least-squares problem defined by:

$$\begin{aligned} \hat{T}=\underset{T}{{\text {argmin}}}\sum _{i}\sum _{j}e_{ij}^{2} \end{aligned}$$
(12)

where \(e_{ij}\) is the distance between the point \(p_{ij}^{S_{1}}\) of \(S_{1}\) and its corresponding point \(p_{ij}^{S_{2}}\) of \(S_{2}\).

$$\begin{aligned} e_{ij}^{2}=\Vert p_{ij}^{S_{2}}-T(p_{ij}^{S_{1}})\Vert ^{2} \end{aligned}$$
(13)

Our approach is applied to 3D faces with different facial expressions. Since the rigid matching process is sensitive to 3D shape deformations, this shape variation should be taken into account. In the present work, we propose to automatically associate different weights with the different points representing the 3D surface, so that the points that are less influenced by facial expressions contribute most to the estimation step. To distinguish these points, we study, over all the surfaces, the variation \(V_{ij}^{k}\) of each point \(p_{ij}^{S_{k}}\) of the surface \(S_{k}\) with respect to the centroid of that surface, denoted by \(C_{S_{k}}\). This variation corresponds to the distance between \(p_{ij}^{S_{k}}\) and \(C_{S_{k}}\). It can be defined by:

$$\begin{aligned} V_{ij}^{k}=d(p_{ij}^{S_{k}}, C_{S_{k}}) \end{aligned}$$
(14)

The weight value \(W_{ij}^{S_{k}}\) given to each point \(p_{ij}^{S_{k}}\) should reflect the quality of the matching: the more static the point is, the greater its weight should be. Therefore, the weight \(W_{ij}\) for two corresponding points \(p_{ij}^{S_{1}}\) and \(p_{ij}^{S_{2}}\) of the two surfaces \(S_{1}\) and \(S_{2}\) can be formulated by:

$$\begin{aligned} W_{ij}=\frac{V_{max}-(V_{ij}^{S_{2}}-V_{ij}^{S_{1}})}{V_{max}} \end{aligned}$$
(15)

where \(V_{max}\) is presented by:

$$\begin{aligned} V_{max}= \max _{k}(\max _{ij}(V_{ij}^{k})),\quad k\in [1..H],\ i\in [1..N],\ j\in [1..L] \end{aligned}$$
(16)

where H is the number of the used 3D surfaces.

Equation (15) shows that when the variation between two corresponding points tends to \(V_{max}\), the weight \(W_{ij}\) of \(p_{ij}\) tends to zero.
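
The weighting scheme of Eqs. (14) to (16) can be sketched as follows. This is only an illustration under stated assumptions: each descriptor is assumed to be flattened into a list of 3D points in the same (i, j) order for every face, the difference of variations is taken in absolute value and the result is clipped at zero so that the weights stay in [0, 1] (Eq. (15) as printed uses a signed difference), and all function names are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def variations(descriptor):
    """V_ij for one face: distance of each descriptor point to the face centroid."""
    c = centroid(descriptor)
    return [math.dist(p, c) for p in descriptor]


def max_variation(descriptors):
    """V_max: largest variation over all faces and all points."""
    return max(max(variations(d)) for d in descriptors)


def matching_weights(desc_1, desc_2, v_max):
    """W_ij for two faces whose points are already matched by index."""
    v1, v2 = variations(desc_1), variations(desc_2)
    # Assumption: absolute difference and clipping keep the weights in [0, 1].
    return [max(0.0, (v_max - abs(b - a)) / v_max) for a, b in zip(v1, v2)]
```

A point whose distance to the centroid is the same on both faces receives a weight of 1, and the weight decreases linearly as the two distances diverge.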

Thus, the new transformation estimate is the solution of the weighted least-squares problem defined by:

$$\begin{aligned} \hat{T}=\underset{T}{{\text {argmin}}}\sum _{i}\sum _{j}W_{ij}^{2}e_{ij}^{2} \end{aligned}$$
(17)

Since T is a rigid transformation, it can be decomposed into a rotation R and a translation t, i.e. \(T(p)=Rp+t\). The problem can therefore be written as follows:

$$\begin{aligned} (\hat{R},\hat{t})=\underset{R,t}{{\text {argmin}}}\sum _{i}\sum _{j}W_{ij}^{2}\Vert p_{ij}^{S_{2}}-(Rp_{ij}^{S_{1}}+t)\Vert ^{2} \end{aligned}$$
(18)

The translation between the two sets of points is defined by:

$$\begin{aligned} \hat{t}=C_{S_{2}}-RC_{S_{1}} \end{aligned}$$
(19)

where \(C_{S_{2}}\) and \(C_{S_{1}}\) are respectively the centroids of \(S_{2}\) and \(S_{1}\).

Once the rotation R is determined, the translation can be derived. Therefore, we first need to estimate the rotation R. We express each set of points in a frame centered at its own centroid: \(p_{C_{ij}}^{S_{1}}=p_{ij}^{S_{1}}-C_{S_{1}}\) and \(p_{C_{ij}}^{S_{2}}=p_{ij}^{S_{2}}-C_{S_{2}}\). The optimal rotation is then written as follows:

$$\begin{aligned} \hat{R}=\underset{R}{{\text {argmin}}}\sum _{i} \sum _{j} W_{ij}^{2}\Vert p_{C_{ij}}^{S_{2}}-R(p_{C_{ij}}^{S_{1}})\Vert ^{2} \end{aligned}$$
(20)

Fig. 2. The CMC curves of the proposed approach for the three scenarios: All vs All, Expression vs Expression and Neuter vs Expression.
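
The minimization in Eq. (20) is a weighted orthogonal Procrustes problem. The text does not state how it is solved; one standard closed-form route, given here only as a sketch, uses the singular value decomposition of the weighted cross-covariance matrix of the centered points. The symbols \(H\), \(U\), \(\Sigma \) and \(V\) are introduced for illustration and do not appear elsewhere in the paper.

$$\begin{aligned} H&=\sum _{i}\sum _{j}W_{ij}^{2}\, p_{C_{ij}}^{S_{1}}\,(p_{C_{ij}}^{S_{2}})^{t},\qquad H=U\Sigma V^{t} \\ \hat{R}&=V\,\mathrm {diag}(1,1,\det (VU^{t}))\,U^{t} \end{aligned}$$

Expanding the squared norm in Eq. (20) shows that minimizing it is equivalent to maximizing \(\mathrm {tr}(RH)\), which is achieved by the expression above. The diagonal correction enforces \(\det \hat{R}=1\), so that the result is a rotation rather than a reflection. Note also that, for the joint problem over rotation and translation, the optimal translation is obtained with centroids weighted by \(W_{ij}^{2}\); these reduce to the ordinary centroids when the weights are uniform.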

5 Experiments and Discussion

Here, we perform experiments based on the novel version of ICP applied to the reparametrized level curves for the identification scenario. For the experiments, we used a part of the BU-3DFE database [20]. This subset is composed of 700 faces, corresponding to the first magnitude level of the six facial expressions and the neutral face of all 100 subjects of the database. We run the experiments with three protocols: (i) All vs All, which compares each face of the database to all the others; (ii) Expression vs Expression, which compares each expression of the database to all the other expressions; (iii) Neuter vs Expression, which compares each 3D neutral face with the 3D faces with expression. Figure 2 shows the Cumulative Matching Curves of the proposed 3D representation under these three protocols. The obtained rank-one recognition rates are 96.48% for the All vs All protocol, 88.53% for Expression vs Expression and 98.65% for Neuter vs Expression.
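
For readers unfamiliar with the evaluation, a Cumulative Matching Curve can be computed from a matrix of probe-to-gallery distances as sketched below. This is a generic illustration, not the evaluation code used for the reported rates; the function name `cmc_curve` and the data layout are assumptions. In an All vs All setting, the probe itself would have to be excluded from its own gallery before calling it.

```python
def cmc_curve(distances, probe_ids, gallery_ids, max_rank):
    """Return the identification rate at ranks 1..max_rank.

    distances[p][g] is the distance between probe p and gallery face g;
    probe_ids and gallery_ids hold the subject identity of each face.
    """
    hits = [0] * max_rank
    for p, row in enumerate(distances):
        # Gallery indices sorted from the closest to the farthest face.
        order = sorted(range(len(row)), key=row.__getitem__)
        # 1-based rank of the first gallery face with the probe's identity.
        rank = next(
            (r for r, g in enumerate(order, start=1)
             if gallery_ids[g] == probe_ids[p]),
            None,
        )
        if rank is not None and rank <= max_rank:
            # A match at rank r counts for every rank >= r.
            for k in range(rank - 1, max_rank):
                hits[k] += 1
    return [h / len(distances) for h in hits]
```

The first entry of the returned list is the rank-one recognition rate; the curve is non-decreasing by construction.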

6 Conclusion

We introduced in this work a new approach for the recognition of 3D faces. This approach consists of a novel robust version of the ICP algorithm, which is based on the three polar representation proposed in [9] and is adapted to the variation of facial expressions on 3D faces. The rates obtained for the three protocols of the identification scenario show the performance of the proposed framework.

In future work, we plan to evaluate the proposed approach on the standard FRGC v2 database of 3D faces. We also intend to compare the proposed ICP with other ICP variants.