1 Introduction

Retrieving 3D shape features from polarisation cues acquired from a single point of view is a concept introduced by Koshikawa (1979) to constrain the surface normals of objects made of dielectric materials. With the growing impact of Computer Vision on 3D scanning techniques, the so-called Shape from Polarisation (SfP) problem has become one of the most physically grounded approaches in the Shape from X family. It relies on the physical property that unpolarised light becomes partially polarised once it reflects off an object. The acquisition process consists of taking images of a static object under the same illumination conditions with a linear polariser placed in front of the camera, each image differing from the others only by the known angle through which the polariser has been rotated from its initial position. What makes the SfP approach very appealing, at least in theory, is that polarised image formation is both albedo and lighting independent. However, despite the theoretical and practical progress achieved over almost 40 years of research, SfP still provides only limited constraints on the surface, preventing full shape recovery (due to the periodicity of the polarisation information). Indeed, most existing SfP approaches combine additional cues coming from multiple views (Atkinson and Hancock 2006a, 2007a; Rahmann and Canterakis 2001), shading (Ngo et al. 2015; Drbohlav and Sara 2001; Morel et al. 2006; Huynh et al. 2013; Atkinson and Hancock 2007b; Smith et al. 2016) and, recently, RGBD data (Kadambi et al. 2015) in order to enrich the 3D shape information and fully reconstruct the surface under observation.

Another important limitation of SfP is its mathematical formulation. Instead of being expressed within a single framework, it is fragmented into several steps derived from physical models that describe the behaviour of light as it propagates between media of differing refractive indices (Martinez-Herrero et al. 2009). This makes SfP less straightforward to understand and less practical to solve than other Shape from X approaches.

Contribution In this work, we extend the differential formulation presented in Mecca et al. (2017), where a linear PDE was derived for retrieving isocontours of the object under observation for specific polarisation angles. Specifically, our contribution consists of:

  • deriving a more general formulation of the PDE presented in Mecca et al. (2017) for arbitrary polarisation angles

  • characterizing the novel PDEs for diffuse and specular reflection

  • computing the level-set of an object having a mixture of diffuse and specular reflections

  • merging the new formulation with two-light Photometric Stereo data for 3D shape reconstruction.

2 Previous Works

Polarimetric cues have been used in Computer Vision for a number of different tasks, mostly related to the difficulty of dealing with specular highlights. In addition to using polarised image formation to separate diffuse from specular reflection (Nayar et al. 1993; Umeyama and Godin 2004; Wang et al. 2016; Wolff and Boult 1991), the potential of polarimetric approaches has been demonstrated by their use in determining the normal orientation of glossy surfaces (Koshikawa 1979; Wolff and Boult 1991; Wolff 1994a, b; Taamazyan et al. 2016). For the same reason, particular attention has been given to the SfP problem in attempts to recover the shape of transparent objects (Miyazaki and Ikeuchi 2005; Saito et al. 1999; Miyazaki et al. 2004, 2003; Miyazaki and Ikeuchi 2007; Chen et al. 2007). However, polarimetric analysis also provides geometrical information for dielectric surfaces, which has allowed SfP to be adopted to reconstruct diffusely reflecting surfaces as well (Atkinson and Hancock 2006).

In any case, the surface features provided by the polarisation reflectance model are limited, and this results in ambiguous shape recovery. To disambiguate polarisation normals, most approaches enhance SfP with supplementary cues so that the overall methodology becomes well-posed. Several works use multiple views of the object together with polarisation imaging (Atkinson and Hancock 2005; Miyazaki et al. 2012). Other approaches are closer to what we propose here, merging shading cues with polarisation imaging. For example, Drbohlav and Sara (2001) employed polarisation imaging to recover the zenith angle of the surface normal, adding integrability constraints to reduce the ambiguity of uncalibrated photometric stereo to the concave/convex case. Morel et al. (2006) used a lighting system consisting of a diffuse dome with a ring of numerous LEDs, providing uniform, unpolarised light onto the object to be digitised. The object is placed inside the dome and the light reflected by its surface is analysed by the camera and a liquid crystal polariser. The ring of LEDs is split into four parts that can be electrically controlled independently. Huynh et al. (2013) proposed an iterative method where diffuse polarisation modelling is considered together with two additional constraints, namely surface integrability and the material dispersion equation, using a hyperspectral imaging system. They adopted the preliminary disambiguation proposed by Zhu and Shi (2006), extended by the use of fast marching and patch stitching. Atkinson and Hancock (2007b) disambiguated the polarisation normal using shading information from three distant light sources placed in a strategic configuration, such that the angles subtended by the camera and the light sources from the object are equal and the distances between the object and the light sources are equal as well. Ngo et al. (2015) proposed a very interesting approach based on the ratio of Lambertian irradiance equations (with uniform light direction) and polarisation image formation equations to compute the surface orientation and the refractive index using at least 3 light sources. By increasing the number of input images (i.e. light sources), they eventually extended the approach to the uncalibrated case, also computing the light directions.

Lastly, Smith et al. (2016) proposed a SfP approach aided by shading information which is provided by a distant light source with orthographic viewing geometry. Besides estimating the shape up to a global concave/convex ambiguity, an important limitation for real applications is the assumption of uniform albedo.

The differential formulation for the Shape from Polarisation problem presented in Mecca et al. (2017) was the first to exploit the monocular nature of this technique, where polarimetric images of a static object are taken from a stationary point of view. The non-linear redundant spatial information was simplified by considering image ratios, leading to homogeneous linear PDEs that describe the geometry of the object via its level-sets. Since then, other approaches based on image ratios have been presented (Tozza et al. 2017; Yu et al. 2017), assuming diffuse reflection only and considering two-light uncalibrated photometric stereo data and smoothness/convexity priors, respectively, for computing the 3D shape.

In this work, we extend Mecca et al. (2017) by describing SfP through a differential formulation, where a simple mathematical formulation is capable of describing the geometry of the object through its level-sets in the same fashion for both diffuse and specular reflections. This brings a twofold advantage: it circumvents the expected ambiguity of the SfP problem by extracting the level-sets of the surface, and it allows such a PDE to be embedded into a differential formulation modelling the Photometric Stereo problem using image ratios (Mecca et al. 2016; Mecca and Falcone 2013; Smith and Fang 2016; Chandraker et al. 2013).

Indeed, our approach considers shading information coming from a minimum of two point light sources of known position. The ratio of the respective irradiance equations leads to an albedo-invariant PDE that ambiguously describes the surface (Mecca et al. 2014). Combining the two equations into a differential system makes the problem solvable.

Experimental validation is provided on synthetic and real-world tests. In particular, the real tests have been performed using very general lighting setups (e.g. desk light, light bulb) and, most importantly, taking into account challenging materials (such as glass) to prove the robustness of our differential approach.

3 Shape from Polarization: A Differential Approach

In this section, we describe the mathematical derivation of a novel framework for the SfP problem resulting in a homogeneous linear partial differential equation. To do so, before recalling the theoretical principles of polarised imaging, we first consider an important aspect of the surface normal parameterisation as a function of the depth.

3.1 Camera Parameterisation

The pinhole camera model is an important aspect to consider when dealing with image-based shape reconstruction algorithms. When perspective viewing geometry comes into play, several works have proposed different parameterisations (Bruckstein 1988; Tankus et al. 2003; Prados and Faugeras 2005; Papadhimitri and Favaro 2013). However, to mathematically describe the SfP, we do not use any specific parameterisation of the camera, since we only rely on a general property of how the surface normal deforms under perspective viewing geometry. For this purpose, let us denote by \({{\varvec{\chi }}}(\mathbf {x})\in \varSigma \), \(z(\mathbf {x})\) and \(\nabla z(\mathbf {x})=(z_x(\mathbf {x}),z_y(\mathbf {x}))\) the point belonging to the surface, the depth and the gradient of the surface at pixel \(\mathbf {x}=(x,y)\), respectively. Then, the first two components of the non-unit normal vector to the surface \({\overline{\mathbf{n}}}(\mathbf {x})=(\overline{n}^1(\mathbf {x}),\overline{n}^2(\mathbf {x}), \overline{n}^3(\mathbf {x}))\) are proportional to \(\nabla z(\mathbf {x})\) up to a factor depending on the focal length f.

This means that, if we take the unit surface normal \(\mathbf {n}=\frac{\overline{\mathbf {n}}}{\Vert \overline{\mathbf {n}}\Vert }\) into account, we have

$$\begin{aligned} n^1(\mathbf {x}) = g(f)\frac{z_x(\mathbf {x})}{\Vert \overline{\mathbf {n}}\Vert } \quad \text {and} \quad n^2(\mathbf {x}) = g(f)\frac{z_y(\mathbf {x})}{\Vert \overline{\mathbf {n}}\Vert }. \end{aligned}$$
(1)

For completeness, let us mention that for orthographic viewing geometry, (1) still holds with \(g(f)=1\). In what follows, we use this fact to derive our new model independently of the camera viewing geometry.
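
As a concrete illustration, the following minimal sketch (ours, not from the paper) computes unit normals whose first two components follow (1) from a discrete depth map. The value of g(f) and the sign convention of the third component are assumptions of this example.

```python
import numpy as np

def unit_normals(z, g_f=1.0):
    """Unit surface normals from a depth map z, following Eq. (1).

    g_f is the factor g(f) of the text (g(f) = 1 for orthographic viewing).
    The sign of the third component depends on the chosen depth-axis
    convention; here it is fixed to -1 as an assumption of this sketch.
    """
    z_y, z_x = np.gradient(z)                     # numerical depth gradient
    n_bar = np.stack([g_f * z_x, g_f * z_y, -np.ones_like(z)], axis=-1)
    return n_bar / np.linalg.norm(n_bar, axis=-1, keepdims=True)
```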

3.2 Polarization Imaging

When a linear polariser is placed in front of the camera, the intensity of the light acquired by the sensor depends on the rotation angle of the polariser \(\theta _{pol}\) and on the material of the object. With the aim of handling mixed polarisation from diffuse and specular materials, we consider a linear combination of both types of polarisation. In particular, since specular polarisation differs from the diffuse one by a \(\frac{\pi }{2}\)-phase shift, we consider the following image formation model

$$\begin{aligned} \begin{aligned} I(\theta _{pol})&= \alpha _{diff}\bigg (\frac{I_{max}+I_{min}}{2}+\frac{I_{max}-I_{min}}{2}\cos (2\theta _{pol}-2\overline{\theta })\bigg )\\&\quad +\alpha _{spec}\bigg (\frac{I_{max}+I_{min}}{2}\\&\quad +\frac{I_{max}-I_{min}}{2}\cos \Big (2\theta _{pol}-2\Big (\overline{\theta }+\frac{\pi }{2}\Big )\Big )\bigg )\\ \end{aligned} \end{aligned}$$
(2)

that after some algebra simplifies as

$$\begin{aligned} \begin{aligned} I(\theta _{pol})&= (\alpha _{diff}+\alpha _{spec})\frac{I_{max}+I_{min}}{2}\\&+ (\alpha _{diff}-\alpha _{spec})\frac{I_{max}-I_{min}}{2}\cos (2\theta _{pol}-2\overline{\theta }) \end{aligned} \end{aligned}$$
(3)

where \(\alpha _{diff}\) and \(\alpha _{spec}\) are the percentages of diffuse and specular polarisation, respectively, and \(\overline{\theta }\) is the phase angle, i.e. the angle to which the linear polariser has to be rotated in order to obtain the highest intensity \(I_{max}=I(\theta _{pol}=\overline{\theta })\). Conversely, \(I_{min}\) is the minimum intensity obtainable while rotating the polariser. To simplify the notation, in the following we write \(I_+=\frac{I_{max}+I_{min}}{2}\) and \(I_-=\frac{I_{max}-I_{min}}{2}\).
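
A minimal sketch of the mixed image formation model (3) is given below; the function and argument names are ours, and the expression is evaluated exactly as written in the text.

```python
import numpy as np

def polarisation_image(theta_pol, I_max, I_min, theta_bar,
                       alpha_diff=1.0, alpha_spec=0.0):
    """Mixed diffuse/specular polarisation image formation, Eq. (3).

    All angles are in radians; I_max, I_min and theta_bar may be arrays
    (one value per pixel). alpha_diff and alpha_spec are the fractions of
    diffuse and specular polarisation.
    """
    I_plus = 0.5 * (I_max + I_min)
    I_minus = 0.5 * (I_max - I_min)
    return ((alpha_diff + alpha_spec) * I_plus
            + (alpha_diff - alpha_spec) * I_minus
              * np.cos(2.0 * theta_pol - 2.0 * theta_bar))
```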

To give an overview of our method, it is important to mention that the idea of considering mixed polarisation (i.e. both diffuse and specular polarisation) comes from the fact that our level-set based formulation is particularly suitable for identifying regions of the object having different types of polarisation. In fact, the \(\frac{\pi }{2}\)-phase shift between diffuse and specular polarisation carries over to the level-set characterisation. We will show how the level-sets of diffuse and specular polarised pixels are orthogonal, allowing visual recognition of the different polarisation effects (see Fig. 7).

In order to derive the mathematical model based on level-sets, without loss of generality we first consider the purely diffuse polarisation image formation model (i.e. \(\alpha _{spec}=0\)). After that, the specular case will be easily derived as an extension, and experimental tests will be shown to prove the concept.

Now, to introduce the depth parameter \(z(\mathbf {x})\) into the polarisation image formation model (3), we parameterise the normalised surface normal with spherical coordinates as

$$\begin{aligned} \mathbf {n}(\mathbf {x}) = \frac{{\overline{\mathbf{n}}}(\mathbf {x})}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert }=(\cos (\theta )\sin (\phi ), \sin (\theta )\sin (\phi ),\cos (\phi )) \end{aligned}$$
(4)

where \(\theta \in [0,2\pi ]\) is the azimuth angle and \(\phi \in \big [0,\frac{\pi }{2}\big ]\) is the zenith angle. Let us recall that the phase angle \(\overline{\theta }\) in (3) contains geometrical information about the shape, since \(\theta =\overline{\theta }\) or \(\theta =\overline{\theta }+\pi \), which summarises the ambiguity of the SfP problem.

With the aim of deriving a differential formulation of the SfP, we introduce the depth parameters \(z_x(\mathbf {x})\) and \(z_y(\mathbf {x})\) into the image formation model (3) by substituting (1) into the first two coordinates of (4), obtaining respectively the following equalities

$$\begin{aligned} \cos (\theta ) = g(f)\frac{z_x}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert \sin (\phi )} \end{aligned}$$
(5)

and

$$\begin{aligned} \sin (\theta ) = g(f)\frac{z_y}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert \sin (\phi )}. \end{aligned}$$
(6)

3.3 The Differential Model for the Shape from Polarisation

Let us rearrange the image formation model (3) for diffuse polarisation (i.e. \(\alpha _{spec}=0\)) using the following trigonometric identity

$$\begin{aligned} \cos (2\theta _{pol}-2\overline{\theta }) = \cos (2\theta _{pol})\cos (2\overline{\theta })+\sin (2\theta _{pol})\sin (2\overline{\theta }) \end{aligned}$$
(7)

and the double-angle formulas, which lead to the following equality

$$\begin{aligned} \begin{aligned} \cos (2\theta _{pol}-2\overline{\theta })&= \cos (2 \theta _{pol})\big (2\cos ^2(\overline{\theta })-1\big )\\&\quad + 2\sin (2\theta _{pol})\sin (\overline{\theta })\cos (\overline{\theta }). \end{aligned} \end{aligned}$$
(8)

By substituting (8) into the image formation (3) we get

$$\begin{aligned} \begin{aligned} I(\theta _{pol})&= I_++I_- \Big (\cos (2 \theta _{pol})\big (2\cos ^2(\overline{\theta })-1\big )\\&\quad + 2\sin (2\theta _{pol})\sin (\overline{\theta })\cos (\overline{\theta })\Big ). \end{aligned} \end{aligned}$$
(9)

The key step in describing the previous polarisation image formation as a partial differential equation is to consider the consistency of the shape information carried by the angle \(\overline{\theta }\) with the surface normal. In other terms, after substituting (5) and (6) into (9), we obtain the following partial differential equation

$$\begin{aligned} \begin{aligned} I(\theta _{pol})&= I_++I_- \bigg (\cos (2\theta _{pol})\bigg (2g^2(f)\frac{z^2_x}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert ^2\sin ^2(\phi )} -1\bigg ) \\&\quad +2\sin (2\theta _{pol})g^2(f)\frac{z_xz_y}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert ^2\sin ^2(\phi )} \bigg ) \end{aligned} \end{aligned}$$
(10)

that after some algebra becomes

$$\begin{aligned} \begin{aligned}&I(\theta _{pol}) - I_++I_-\cos (2\theta _{pol}) \\&\quad =I_-\Big (\cos (2\theta _{pol})z_x+\sin (2\theta _{pol})z_y\Big )\frac{2g^2(f)z_x}{\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert ^2\sin ^2(\phi )}. \end{aligned} \end{aligned}$$
(11)

With the aim of eliminating the non-linear part depending on the focal length f, the zenith angle \(\phi \) and the norm \(\Vert {\overline{\mathbf{n}}}(\mathbf {x})\Vert \), we consider the ratio of the previous equation evaluated at two polariser angles \(\theta _{pol_1}\) and \(\theta _{pol_2}\), obtaining

$$\begin{aligned}&\frac{I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})}{I(\theta _{pol_2}) - I_++I_-\cos (2\theta _{pol_2})} \nonumber \\&\quad =\frac{\cos (2\theta _{pol_1})z_x+\sin (2\theta _{pol_1})z_y}{\cos (2\theta _{pol_2})z_x+\sin (2\theta _{pol_2})z_y} \end{aligned}$$
(12)

that leads to the following homogeneous linear PDE

$$\begin{aligned}&\Big (\big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\cos (2\theta _{pol_2})\nonumber \\&\qquad -\big (I(\theta _{pol_2})- I_++I_-\cos (2\theta _{pol_2})\big )\cos (2\theta _{pol_1})\Big ) z_x \nonumber \\&\qquad +\Big (\big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\sin (2\theta _{pol_2}) \nonumber \\&\qquad -\big (I(\theta _{pol_2})- I_++I_-\cos (2\theta _{pol_2})\big )\sin (2\theta _{pol_1})\Big ) z_y=0 \nonumber \\ \end{aligned}$$
(13)

which in the following we refer to as

$$\begin{aligned} \mathbf {b}_{pol}(\mathbf {x})\cdot \nabla z(\mathbf {x}) = 0, \end{aligned}$$
(14)

where the components of the bi-dimensional vector field \(\mathbf {b}_{pol}(\mathbf {x})=(b_{pol}^1(\mathbf {x}),b_{pol}^2(\mathbf {x}))\) are

$$\begin{aligned} b_{pol}^1= & {} \big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\cos (2\theta _{pol_2}) \nonumber \\&-\big (I(\theta _{pol_2}) - I_++I_-\cos (2\theta _{pol_2})\big )\cos (2\theta _{pol_1}) \end{aligned}$$
(15)

and

$$\begin{aligned} b_{pol}^2= & {} \big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\sin (2\theta _{pol_2}) \nonumber \\&-\big (I(\theta _{pol_2}) - I_++I_-\cos (2\theta _{pol_2})\big )\sin (2\theta _{pol_1}). \end{aligned}$$
(16)

Let us point out that the previous equation can be substantially simplified by taking \(\theta _{pol_1}=0\) and \(\theta _{pol_2}=\frac{\pi }{4}\), yielding the following equation

$$\begin{aligned} \big (-I_{\frac{\pi }{4}}+I_+\big )z_x+\big (I_0-I_++I_-\big )z_y=0. \end{aligned}$$
(17)

As a first remark, we notice that (14) is invariant with respect to lighting and albedo. Most importantly, it describes the geometry of the surface through its isocontours, circumventing the ambiguity of the SfP problem.
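
For illustration, the level-set field of (15)-(16), together with its special case (17), can be assembled per pixel as in the following sketch; the function and variable names are ours.

```python
import numpy as np

def level_set_field(I1, I2, I_plus, I_minus, theta1, theta2):
    """Diffuse level-set vector field b_pol of Eqs. (15)-(16).

    I1, I2 are the images taken at polariser angles theta1, theta2 (radians);
    I_plus and I_minus are the per-pixel sinusoid parameters of Eq. (3).
    """
    A = I1 - I_plus + I_minus * np.cos(2.0 * theta1)
    B = I2 - I_plus + I_minus * np.cos(2.0 * theta2)
    b1 = A * np.cos(2.0 * theta2) - B * np.cos(2.0 * theta1)
    b2 = A * np.sin(2.0 * theta2) - B * np.sin(2.0 * theta1)
    return b1, b2

# Special case theta1 = 0, theta2 = pi/4, reproducing Eq. (17):
#   b1 = I_plus - I_pi4,   b2 = I_0 - I_plus + I_minus.
```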

In the next section we show how the new differential approach (13) also extends to specular reflection with minimal changes due to the shift of the phase angle.

3.4 Specular Polarisation

As mentioned before, for specular polarisation a phase-angle shift of \(\frac{\pi }{2}\) has to be taken into account. This is another reason why the level-set approach we present here is very suitable for parameterising the Shape from Polarisation problem. Indeed, since the phase angle \(\overline{\theta }\) represents the azimuth angle up to a \(\pi \)-periodic ambiguity, the bi-dimensional vector field describing the level-set at a specular pixel \(\mathbf {x}^{spec}\) is orthogonal to that in (14), that is

$$\begin{aligned} \mathbf {b}^{spec}_{pol}(\mathbf {x}^{spec})=\mathbf {b}_{pol}^\bot (\mathbf {x}^{spec}) \propto (-b^2_{pol}(\mathbf {x}^{spec}),b^1_{pol}(\mathbf {x}^{spec})). \end{aligned}$$
(18)

This means that the linear PDE describing the level-set in the specular regions is as follows

$$\begin{aligned}&-\Big (\big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\sin (2\theta _{pol_2})\nonumber \\&\qquad -\big (I(\theta _{pol_2})- I_++I_-\cos (2\theta _{pol_2})\big )\sin (2\theta _{pol_1})\Big )z_x\nonumber \\&\qquad +\Big (\big (I(\theta _{pol_1}) - I_++I_-\cos (2\theta _{pol_1})\big )\cos (2\theta _{pol_2})\nonumber \\&\qquad -\big (I(\theta _{pol_2})- I_++I_-\cos (2\theta _{pol_2})\big )\cos (2\theta _{pol_1})\Big ) z_y=0.\nonumber \\ \end{aligned}$$
(19)

From the experimental perspective, the proposed level-set approach is even more appealing, since it can easily be used to distinguish diffuse regions from specular ones, see Fig. 6.

Note that the \(\frac{\pi }{2}\) phase difference between diffuse and specular polarisation translates to the two sinusoids being exactly out of phase (the sinusoid has a factor of 2 in its argument). Hence, even if a part of the surface is simultaneously partially diffuse and specular, the overall reflected light will have a phase that is exactly equal to either the diffuse or the specular one (except for the corner case of the two components exactly cancelling out at some point, which then exhibits zero polarisation). Thus, it is valid to partition objects into diffuse and specular zones, and this is verified experimentally, see Fig. 6. Note, however, that the degree of polarisation \(\rho \) will be smaller than predicted by assuming purely diffuse or purely specular reflection, due to the partial cancellation of the two sinusoids. Our approach directly circumvents this issue, in contrast to methods that aim to use \(\rho \) directly (e.g. Taamazyan et al. 2016). Finally, note that even objects that are traditionally thought of as highly specular (e.g. glass) can have “diffuse polarisation regions”. These regions correspond to viewing vectors very different from the half vector, so that the specularly reflected light simply does not reach the camera (Blinn 1977) (Fig. 1).

Fig. 1
figure 1

Real data photometric stereo pairs

In the next section we describe how (17) elegantly fits into a well-posed differential system of hyperbolic PDEs when additional shading information is provided to reconstruct the 3D shape.

4 Enhancing SfP with Two-Lights Photometric Stereo

In this section we describe how to exploit the new differential formulation for the SfP (17) within a single framework where shading information is provided.

In order to take advantage of the fact that SfP is albedo independent, we consider a fully calibrated Photometric Stereo (SfPS) approach sharing the same feature, basing the derivation on the ratio of irradiance equations (Davis and Soderblom 1984). Unlike polarisation theory, in order to parameterise the SfPS approach we require some additional information to be known: the camera parameters, the lighting and the type of reflection (i.e. diffuse or specular).

Let us consider the approach proposed by Mecca et al. (2014), which uses the camera model introduced by Papadhimitri and Favaro (2013), where the outgoing normal to the surface \(\overline{\mathbf {n}}\) is parameterised as follows

$$\begin{aligned} \overline{\mathbf {n}}(\mathbf {x})=\frac{1}{f}\!\big (f\nabla z(\mathbf {x}),\!-f\!-\!z(\mathbf {x})-\mathbf {x} \!\cdot \! \nabla z(\mathbf {x})\big ) \end{aligned}$$
(20)

where z is the surface depth and f is the focal length. The irradiance equation takes into account the normalised normal vector \(\mathbf {n}\) as follows

$$\begin{aligned} \small I_i(\mathbf {x})=\rho (\mathbf {x})a_i(\mathbf {x},z)\left( \mathbf {n}(\mathbf {x})\cdot \mathbf {h}_i(\mathbf {l}_i(\mathbf {x},z), \mathbf {v}(\mathbf {x},z))\right) ^{\frac{1}{c(\mathbf {x})}} \end{aligned}$$
(21)

where \(i=1,2\).

Although (21) is not physically based, it describes the mixture of reflections with a single equation instead of considering a linear combination of diffuse and specular reflection (Torrance and Sparrow 1967). Indeed, the single reflection lobe changes size dynamically, depending on the shininess parameter \(c(\mathbf {x})\) and on the half vector between the light \(\mathbf {l}_i(\mathbf {x},z)\) and the viewer direction \(\mathbf {v}(\mathbf {x},z)\), defined as follows

$$\begin{aligned} \begin{aligned} \mathbf {h}_i(\mathbf {x},z)&=(h_i^1(\mathbf {x},z),h_i^2(\mathbf {x},z),h_i^3(\mathbf {x},z))\\&=\overline{\mathbf {l}}_i(\mathbf {x},z)+\min \bigg \{1,\frac{|1-c(\mathbf {x})|}{\varepsilon }\bigg \}\overline{\mathbf {v}}(\mathbf {x},z) \end{aligned} \end{aligned}$$
(22)

where \(\varepsilon \) (set to 0.01 in the experiments) defines a transition phase that blends diffuse and specular reflection.

We assume that the light spreads according to a point light source model located at \(\mathbf {P}_i(\mathbf {x})\), with attenuation \(a_i(\mathbf {x},z)\), as follows

$$\begin{aligned} \mathbf {l}_i(\mathbf {x},z)={{\varvec{\chi }}}(\mathbf {x})-\mathbf {P}_i(\mathbf {x}) \end{aligned}$$
(23)

and

$$\begin{aligned} a_i(x,y,z)=\frac{\phi _i (\mathbf {l}_i(\mathbf {x},z)\cdot \mathbf {p}_i )^\nu }{|\overline{\mathbf {l}}_i(x,y,z)|^{2}} \end{aligned}$$
(24)

where \(\mathbf {p}_i\) is the main direction of illumination, which we assume equal to (0, 0, 1), \(\nu \) is the coefficient of radial attenuation and \(\phi _i\) is the intensity of the \(i{\mathrm{th}}\) point light source. For our experiments, we consider a fully calibrated setup, so all these quantities are assumed to be known.
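
The lighting quantities of (22)-(24) can be evaluated per pixel as in the sketch below. The normalisation of the light and viewing directions (unit vectors, camera at the origin of the coordinate system) is an assumption of this example, as are the function and variable names.

```python
import numpy as np

def point_light_terms(chi, P_i, phi_i, nu, c, eps=0.01):
    """Light direction (23), attenuation (24) and half vector (22) for source i.

    chi : (..., 3) surface points, P_i : (3,) light position,
    phi_i : source intensity, nu : radial attenuation coefficient,
    c : per-pixel shininess, eps : transition width of Eq. (22).
    The main illumination direction p_i is (0, 0, 1) as in the text.
    """
    p_i = np.array([0.0, 0.0, 1.0])
    l_i = chi - P_i                                         # Eq. (23)
    dist = np.linalg.norm(l_i, axis=-1, keepdims=True)
    l_hat = l_i / dist                                      # unit light direction
    a_i = phi_i * (l_hat @ p_i) ** nu / dist[..., 0] ** 2   # Eq. (24)
    v_hat = -chi / np.linalg.norm(chi, axis=-1, keepdims=True)  # towards the camera
    w = np.minimum(1.0, np.abs(1.0 - c) / eps)[..., None]
    h_i = l_hat + w * v_hat                                 # Eq. (22)
    return l_i, a_i, h_i
```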

Finally, the unknown albedo \(\rho (\mathbf {x})\) cancels out by considering the ratio \(\frac{I_1(\mathbf {x})}{I_2(\mathbf {x})}\), which yields the following quasi-linear PDE

$$\begin{aligned} \mathbf {b}_{ps}(\mathbf {x},z) \cdot \nabla z(\mathbf {x}) = s_{ps}(\mathbf {x},z) \end{aligned}$$
(25)

where, by dropping the dependency on \(\mathbf {x}\) and z, we have

$$\begin{aligned} \mathbf {b}_{ps}= & {} \Big ( \left( \phi _2 a_2 I_1\right) ^{c}\Big (fh_2^1-xh_2^3\Big )-\left( \phi _1 a_1 I_2\right) ^{c}\Big (fh_1^1-xh_1^3\Big ),\nonumber \\&\left( \phi _2 a_2 I_1\right) ^{c}\Big (fh_2^2-yh_2^3\Big )- \left( \phi _1 a_1 I_2\right) ^{c} \Big (fh_1^2-yh_1^3\Big )\Big )\nonumber \\ \end{aligned}$$
(26)
$$\begin{aligned} s_{ps}= & {} (f+z)\Big ( \left( \phi _2 a_2 I_1 \right) ^{c}h_2^3- \left( \phi _1 a_1 I_2\right) ^{c}h_1^3\Big ). \end{aligned}$$
(27)

In the next part we show that the system of PDEs consisting of (13) and (25), elegantly written as

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {b}_{pol}(\mathbf {x}) \cdot \nabla z(\mathbf {x}) = 0 \\ \mathbf {b}_{ps}(\mathbf {x},z) \cdot \nabla z(\mathbf {x}) = s_{ps}(\mathbf {x},z) \end{array}\right. } \end{aligned}$$
(28)

contains the minimum number of equations needed to unify the SfP and SfPS problems under a single differential framework.

4.1 Numerical Approach to the Shape from Polarisation and Photometric Stereo

For computing the sinusoid parameters (\(I_+\), \(I_-\), \(\overline{\theta }\)), we avoid the standard procedure that uses only the images \(I_0\), \(I_{\frac{\pi }{4}}\) and \(I_{\frac{\pi }{2}}\) (Wolff 1994a), since it performs poorly in practice: the differences between polarisation images are very small and hence sensitive to noise.

Fig. 2
figure 2

Photometric stereo pair images. Top: Lambertian; bottom: Cook–Torrance specular

Fig. 3
figure 3

Top row: Bimba level sets for diffuse (left) and specular (right). The ground truth is shown in green, the calculated level sets in red. Bottom row: respective error maps and reconstructions. The mean errors are \(8.7^\circ \) and \(2.4^\circ \) for the diffuse and specular cases respectively (Color figure online)

With the aim of maximising robustness to noise, we capture several images \(I_1 \ldots I_n\) at polarisation angles \(\theta _1 \ldots \theta _n\). Re-arranging (3) as

$$\begin{aligned} I(\theta _{pol})=I_+ + I_- \cos (2\theta _{pol}) \cos (2\overline{\theta }) \!+\! I_- \sin (2\theta _{pol}) \sin (2\overline{\theta }), \end{aligned}$$
(29)

we obtain the following (over-constrained) linear system

$$\begin{aligned} \begin{bmatrix} 1&\quad \cos (2\theta _1)&\quad \sin (2\theta _1) \\ \vdots \\ 1&\quad \cos (2\theta _n)&\quad \sin (2\theta _n) \end{bmatrix} \mathbf {X} = \begin{bmatrix} I(\theta _1) \\ \vdots \\ I(\theta _n) \end{bmatrix}. \end{aligned}$$
(30)

with \(\mathbf {X}={[X^1,X^2,X^3 ]}^{t}\!={[I_+, I_- \cos (2\overline{\theta }), I_- \sin (2\overline{\theta }) ]}^{t}\). We solve (30) using \(L_1\) relaxation (Candès et al. 2008) and compute \( I_{-} = \Vert (X^2,X^3)\Vert _2\) and \(\overline{\theta }=\frac{1}{2}\text {atan2}(X^3,X^2)\). Finally, \(I_0\) and \(I_{\frac{\pi }{4}}\) are recalculated using (3) with the robustly estimated sinusoidal parameters and used to find the level-set with (17).
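
The per-pixel fit of (30) can be sketched as follows; for simplicity this sketch uses ordinary least squares rather than the \(L_1\) relaxation of Candès et al. (2008), and the function and variable names are ours.

```python
import numpy as np

def fit_polarisation_sinusoid(images, angles):
    """Per-pixel fit of the sinusoid of Eqs. (29)-(30).

    images : (n, H, W) stack of polarisation images I(theta_1..theta_n)
    angles : (n,) polariser angles in radians
    Returns I_plus, I_minus and the phase angle theta_bar per pixel.
    """
    n, H, W = images.shape
    A = np.stack([np.ones(n), np.cos(2 * angles), np.sin(2 * angles)], axis=1)
    rhs = images.reshape(n, -1)                      # one column per pixel
    X, *_ = np.linalg.lstsq(A, rhs, rcond=None)      # X has shape (3, H*W)
    I_plus = X[0].reshape(H, W)
    I_minus = np.hypot(X[1], X[2]).reshape(H, W)     # sqrt(X2^2 + X3^2)
    theta_bar = (0.5 * np.arctan2(X[2], X[1])).reshape(H, W)
    return I_plus, I_minus, theta_bar
```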

Fig. 4
figure 4

Upper row: real data degree of polarisation \(\rho \). The scale is such that black is \(\rho =0\) and white is \(\rho >0.25\) for all datasets except the elephant, which is more specular, hence exhibits higher polarisation, and thus has white corresponding to \(\rho =1\). Note that the owl statue and the face statue exhibit very little polarisation compared to the specular cup, the ball and the elephant. In addition, note that \(\rho \) is higher at object boundaries, where the zenith angle \(\phi \) is higher. Bottom row: real data level-sets overlaid on \(I_0\). It is assumed that the ball, cup and elephant are purely specular and the owl and head are purely diffuse

The polarisation equation (17) is stacked together with Eq. (25), giving the following variational problem

$$\begin{aligned} \underset{z}{\min } \left\| \begin{bmatrix} \mathbf {b}_{ps} \\ \mathbf {b}_{pol} \end{bmatrix} \cdot \nabla z - \begin{bmatrix} s_{ps} \\ 0 \end{bmatrix}\right\| _{L^2} + \lambda \left\| z - z_0\right\| _{L^2} \end{aligned}$$
(31)

where \(\lambda =10^{-5}\) in the experiments.

Note that the term \( \lambda \left\| z - z_0\right\| _{L^2}\) is a zero-order Tikhonov regulariser that constrains the mean depth and ensures that the differential problem has a unique solution.

Equation (31) is discretised with finite differences and solved with simple least squares. Furthermore, since \(\mathbf {b}_{ps}\) and \(s_{ps}\) (but not \(\mathbf {b}_{pol}\)) implicitly depend on z, the solution of Eq. (31) is embedded in an iterative process that recalculates all relevant quantities (a, \(\mathbf {h}\), \(\mathbf {b}_{ps}\) and \(s_{ps}\)) using the current depth estimates, in a similar manner to Mecca et al. (2016). The optimisation is initialised with a flat plane at the mean distance.
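
A minimal sketch of one linear solve of (31) on a regular pixel grid is shown below, using forward differences and a sparse least-squares solver; boundary handling and the outer iteration over z are deliberately simplified, and all names are ours.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def solve_depth(b_ps, s_ps, b_pol, z0, lam=1e-5):
    """One linear solve of the variational problem (31).

    b_ps : (H, W, 2) photometric-stereo field, s_ps : (H, W) right-hand side,
    b_pol : (H, W, 2) polarisation level-set field (zero right-hand side),
    z0 : (H, W) current depth estimate (Tikhonov anchor).
    In the full method this solve is repeated, re-evaluating a, h, b_ps and
    s_ps from the updated depth; b_pol does not depend on z.
    """
    H, W = z0.shape
    N = H * W
    I = sp.eye(N, format="csr")
    Dx = sp.eye(N, N, k=1, format="csr") - I        # forward difference in x
    Dy = sp.eye(N, N, k=W, format="csr") - I        # forward difference in y

    def directional(b):                             # rows of  b . grad z
        return sp.diags(b[..., 0].ravel()) @ Dx + sp.diags(b[..., 1].ravel()) @ Dy

    A = sp.vstack([directional(b_ps),
                   directional(b_pol),
                   np.sqrt(lam) * I])
    rhs = np.concatenate([s_ps.ravel(), np.zeros(N), np.sqrt(lam) * z0.ravel()])
    return lsqr(A, rhs)[0].reshape(H, W)
```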

5 Experimental Results

The proposed approach was evaluated with a range of synthetic and real data sets. In addition, we calculated the level-sets of the head dataset from Kadambi et al. (2015). The algorithm was implemented in MATLAB with a total running time of about two minutes (for the seven 3-MPixel images) on a laptop with a quad-core i7, 2.6 GHz CPU.

It is worth mentioning that this experimental section is conceived to validate the theoretical concepts of Sect. 3, where isocontours of the surface are described through the proposed differential model (17). Furthermore, 3D reconstructions using SfP and SfPS as described in Sect. 4 are provided as a proof of concept using the minimal amount of data, i.e. without aiming at reconstructing highly accurate shapes. In fact, any outliers to the photometric stereo assumption (mostly shadows and saturated pixels) are expected to cause artifacts to propagate in a region around them.

Fig. 5
figure 5

Several views of the reconstructions obtained by fusing the photometric stereo pairs of Fig. 1 and the isocontours of Fig. 4

Fig. 6
figure 6

Soup plate (top) and billiard ball (bottom). These two objects have known geometry and thus serve as quantitative benchmarks for estimating the level-set accuracy. The central part of the plate is relatively flat and hence is (manually) segmented out in order to keep only the comparison with perfectly circular contours

Fig. 7
figure 7

Level-sets for the two objects of Fig. 6 assuming diffuse polarisation. Note how the contours follow concentric circles except in the regions where specular reflection dominates: these include the specular highlights of the plate and the reflections of the light sources at the top left, bottom left and top right of the ball. This transition from diffuse to specular can be easily understood from a phase plot (Fig. 8)

Fig. 8
figure 8

Normalised orientation plot for the plate (top) and ball (bottom) experiments of Fig. 6. We plot \(\frac{\cos (\theta )+1}{2}\) so as to get a one-to-one mapping between \(\theta \in [-\pi ,\pi )\) (the orientation of the level-set, which encapsulates the two-way ambiguity of the azimuth of the normal) and (0, 1] that is also continuous with respect to the circular nature of \(\theta \). The discontinuous phase changes corresponding to transitions between diffuse and specular regions are clearly visible, and the automatic segmentation using Dollár and Zitnick (2013), Arbelaez et al. (2009) is indicated in green. Note that the centre of the ball is badly lit and “nearly flat”, hence it experiences minimal polarisation with a not very meaningful associated phase. Finally, note that the number of regions that need to be manually labelled as diffuse or specular is quite low (5 for the ball, 10 for the plate) compared to doing the classification on a pixel basis, which would be completely impractical

Fig. 9
figure 9

Corrected contours, with MAE of \(5.2^\circ \) for the plate and \(2.8^\circ \) for the ball (by comparison to perfect circles aligned with the ball)

Fig. 10
figure 10

Inaccurate polarisation angles experiment (1): a systematic \(10^\circ \) shift is introduced to the polarisation angles, causing a corresponding rotation of the contours by a similar amount. As expected, the MAE also exhibits a similar increase, reaching \(14.1^\circ \) and \(14.4^\circ \) respectively (from \(5.2^\circ \) and \(2.8^\circ \), see Fig. 9)

Fig. 11
figure 11

Inaccurate polarisation angles experiment (2): random noise is introduced to the polarisation angles to model uncertainty in the rotation of the polariser. We consider 6 datasets, namely the billiard ball and the plate with 3 (the minimum possible), 5 and 7 polarisation angles. We also consider 4 Gaussian noise levels, i.e. \(0^\circ \), \(5^\circ \), \(10^\circ \) and \(20^\circ \) of standard deviation for the polarisation angle error. Not surprisingly, the higher the uncertainty of the angles, the higher the error, except for the 3-angle plate experiment, which essentially fails at any noise level. An interesting observation is that for the ball experiment, the 5-angle case performs better than the 7-angle one under heavy noise levels. This signifies that the redundancy of the additional images is only useful if the polariser is calibrated properly; otherwise they act as outliers and reduce performance. Of course, in reality the polarisation angles will have uncertainties closer to \(5^\circ \), which does not seem to cause significant performance drops

Fig. 12
figure 12

Detailed examination of the polarisation sequence sine waves for 2 points on the transition between diffuse and specular regions: diffuse (middle) versus specular (bottom). Notice that the \(180^\circ \) sine phase change corresponds to a \(90^\circ \) rotation of the contours. In addition, \(\rho \) is significantly higher in the specular region

Fig. 13
figure 13

On the top, for each pair, one of the input polarised images. On the bottom, the level-sets of the objects

5.1 Synthetic Data

First of all, we generated \(800 \times 600\) pixel synthetic data using the “bimba” model from the AIM@Shape Repository. The data were rendered with realistic effects, including non-uniform albedo, perspective viewing geometry and near point light sources. We generated a Lambertian and a specular dataset, the latter rendered with the Cook–Torrance BRDF (see Fig. 2).

We rendered the minimum required polarisation images \(I_0\), \(I_{\frac{\pi }{4}}\) and \(I_{\frac{\pi }{2}}\) generated from (3), assuming an index of refraction \(\mu =1.6\). Finally, to make the experiments realistic, we limited the precision of the data to 3 decimal digits and added 0.5% Gaussian noise.

The obtained level sets are evaluated quantitatively by calculating, at each pixel, the angle with respect to the level set computed from the ground truth. The mean angular error (MAE) is used as an overall metric for comparing datasets. For reconstructing the specular dataset, we used \(c=0.25\) (see (21)). The results are shown in Fig. 3. We note that the specular dataset outperforms the diffuse one, due to the much stronger polarisation effects for specular materials.
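
The MAE can be computed as in the following sketch; folding the per-pixel angle into \([0^\circ , 90^\circ ]\) to account for the sign ambiguity of the level-set direction is an assumption of this example.

```python
import numpy as np

def mean_angular_error(b_est, b_gt):
    """Mean angular error (degrees) between two (H, W, 2) level-set fields.

    The level-set direction is only defined up to sign (the pi-ambiguity of
    the azimuth), so the per-pixel angle is folded into [0, 90] degrees.
    """
    dot = np.abs(np.sum(b_est * b_gt, axis=-1))
    norms = np.linalg.norm(b_est, axis=-1) * np.linalg.norm(b_gt, axis=-1)
    ang = np.degrees(np.arccos(np.clip(dot / np.maximum(norms, 1e-12), 0.0, 1.0)))
    return ang.mean()
```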

It has to be noted that this kind of synthetic data does not model many interesting effects arising in real data, such as self-reflections and semi-transparency (e.g. Fig. 13), and so its usefulness lies in understanding robustness to noise for diffuse versus specular reflection, as well as the overall geometry. Thus, as diffuse regions with low elevation angle (i.e. almost flat) are quite inaccurate even on synthetic data, such regions are not expected to perform well on real data either (see Fig. 12, where the diffuse, almost flat region has a polarisation sine wave of very small amplitude).

5.2 Real Data

The setup we used for acquiring data suitable for SfP and SfPS consists of a FLIR FL3-U3-32S2C-CS camera with a maximum resolution of 2080 \(\times \) 1552, mounting a TECHSPEC 8 mm UC series fixed focal length lens, white OSRAM Platinum Dragon high-power LEDs and a linear polariser mounted on a rotary mount with post. We captured images at polariser angles \([-90, -60, -30, 0, 30, 60, 90]^\circ \), which is well above the minimum requirement of three images. To test the robustness of our approach, experiments with 3 and 5 images are discussed in Sect. 5.2.1. The obtained isocontours and reconstructions are shown in Figs. 4 and 5 respectively. We note that the real data experiments confirm that the level sets are computed more accurately on specular than on diffuse materials, which is a clear advantage of the SfP approach compared to SfPS.

In addition, we present a quantitative evaluation of the level-set accuracy using two objects of known geometry, namely a soup plate and a billiard ball (Fig. 6).

We segment diffuse from specular regions using the readily available Structured Forests edge detector (Dollár and Zitnick 2013) and the hierarchical segmentation of Arbelaez et al. (2009) with default parameters on the normalised polarisation orientation image (see Fig. 8). The \(180^\circ \) phase difference between diffuse and specular ensures that the regions are easily segmented automatically, as even for nearby points on the two sides of the boundary between regions, the polarisation sine waves are significantly different (see Fig. 12). Finally, a single manual region labelling is required and the rest are propagated using the segmentation boundaries.

The complete pipeline is then presented: from “raw” contours assuming uniform diffuse reflection (Eq. (17)) in Fig. 7, to phase map segmentation in Fig. 8, to the final level-set correction for specular regions and quantitative evaluation using the mean angular error (MAE) in Fig. 9.

Finally, another two datasets containing a marble statue, a glass bottle and a glass bowl are presented in Fig. 13. Most of the inaccuracies are in badly illuminated regions, such as the top of the bottle. Notice that the strong reflection in the middle of the bowl does not reduce the quality of the level-sets. That is because the level-set equation is invariant to the type of illumination; any kind of specular reflection exhibits the same polarisation phase.

5.2.1 Polarization Angle Uncertainty

In this section we consider the effect of errors in the polarisation angles on the orientation of the level sets. In addition, we examine the effect of using different numbers of polarisation images. The experiments are performed using the real datasets of the plate and the billiard ball (see Fig. 6), so as to be able to obtain a quantitative evaluation. We note two distinct cases, namely a systematic shift of the angles (Fig. 10) and random “noise” (Fig. 11). The first case corresponds to uncertainty in the zero point of the polariser (something that has to be calibrated) and leads to a systematic shift of the phases and thus rotations of the contours. The second case models inaccurate rotation of the polariser by simply adding random Gaussian noise to the polarisation angles. The level-set orientation is quite robust to low levels of angular uncertainty and performs poorly only under unreasonable levels of noise (it is unreasonable to expect a \(20^\circ \) error on the polarisation angles). Another more surprising conclusion is that more polarisation images can lead to reduced quality under high angular uncertainty; this means that the additional images act as outliers and cause a systematic error in the level-set calculation (Figs. 12, 13).

6 Conclusion and Perspective

In this work we presented a new differential approach to Shape from Polarisation leading to a linear PDE that describes the level-sets of the object under observation. By combining surface-depth-related parameters with the polarisation image formation model, we derived a homogeneous linear PDE that describes the geometry of the surface through its isocontours.

This approach allows equivalent formulations for diffuse and specular light reflection, since the shift of the phase corresponds to the same shift of the vector field describing the level-set. From the experimental point of view, this new model allows diffuse regions to be distinguished from specular regions very easily, and the shape to then be characterised entirely through its level-sets. A fully automatic procedure for separating diffuse from specular reflection is an interesting direction for future work. This could be accomplished with some sort of global consistency optimisation (such as graph-cut), which could also take into account the fact that pixels with a high degree of polarisation are necessarily specular; for diffuse polarisation, \(\rho \) peaks around 0.4 (Atkinson and Hancock 2005).

With the aim of providing full 3D shape recovery, we added shading cues using an accurate Photometric Stereo model describing shading information from a point light source with perspective deformation and diffuse/specular reflection. We showed that the new Shape from Polarisation differential formulation merges very elegantly into a system of hyperbolic PDEs which is albedo independent and well-posed when at least two light sources are considered.

By introducing this level-set characterisation for the Shape from Polarisation problem, several extensions become possible, merging this problem with others such as multi-view or Shape from Defocus.