Abstract
Plenoptic cameras have a complex optical geometry combining a main lens, a microlens array, and an image sensor to capture the radiance of the light rays in the scene in both its spatial and directional dimensions. As in conventional cameras, changing the zoom and focus settings originates different parameters to describe the camera, and consequently a new calibration is needed. Current calibration procedures for these cameras require the acquisition of a dataset with a calibration pattern for the specific zoom and focus settings. Complementarily, standard plenoptic cameras (SPCs) provide metadata parameters with the acquired images that are not considered in the calibration procedures. In this work, we establish the relationships between the camera model parameters of an SPC obtained by calibration and the metadata parameters provided by the camera manufacturer. These relationships are used to obtain an estimate of the camera model parameters for a given zoom and focus setting without having to acquire a calibration dataset. Experiments show that the parameters estimated by acquiring a calibration dataset and applying a calibration procedure are similar to the parameters obtained from the metadata.
1 Introduction
Plenoptic cameras are able to discriminate the contribution of each of the light rays that emanate from a given point in the scene. In a conventional camera, the contributions of the several rays are not distinguishable since they are collected on the same pixel. This discrimination in plenoptic cameras is possible due to the positioning of a microlens array between the main lens and the image sensor.
Plenoptic cameras sample the lightfield [8, 9], which is a 4D slice of the plenoptic function [1]. There are several optical setups that are able to acquire the lightfield, such as camera arrays [15]. Here, we focus on compact and portable setups like the lenticular-array-based plenoptic cameras, and more specifically on the SPC [13], which has a higher directional resolution and produces images with lower spatial resolution [7] when compared to the focused plenoptic camera (FPC) introduced by Lumsdaine and Georgiev [10, 14].
The camera models proposed for the SPC [3, 4] are approximations of the real setup that consider the main lens as a thin lens and the microlenses as pinholes. More complex models could be used to describe the real setup. The SPC manufacturer provides metadata regarding the camera optical settings that help describe the camera. Namely, the metadata provided includes the main lens focal length, which is considered in [3, 4] to model the refraction of the rays by the main lens. On the other hand, the metadata also includes the distance at which a point is always in focus by the microlenses. Nonetheless, the assumption of pinhole-like microlenses does not allow this additional information to be incorporated directly into the camera models [3, 4].
The calibration procedures for SPCs [3, 4] do not consider the information provided by the camera manufacturer as metadata and therefore rely completely on the acquisition of a dataset with a calibration pattern for the specific zoom and focus settings to estimate the camera model parameters. Thus, in this work, we identify the relationships among the optical parameters provided as metadata, as well as the relationships between these optical parameters and the entries of the camera model [4], for different zoom and focus settings of the camera. The relationships obtained are used to represent the camera model parameters based on the metadata parameters for a specific zoom and focus setting without having to acquire a new calibration dataset.
In terms of structure, one presents in Sect. 2 a brief review of the camera models proposed for the SPC. In Sect. 3, the camera model [4] that describes the SPC by a \(5 \times 5\) matrix mapping rays in the image space to rays in the object space is summarized. In Sect. 4, one identifies the relationships among the parameters provided as metadata, and the relationships between the camera model entries and the metadata provided with the raw images. The results of estimating the camera model based on the metadata for a given zoom and focus setting are presented in Sect. 5. The major conclusions are presented in Sect. 6.
Notation: The notation followed throughout this work is the following: non-italic letters correspond to functions, italic letters correspond to scalars, lower case bold letters correspond to vectors, and upper case bold letters correspond to matrices.
2 Related Work
SPCs allow defining several types of images by reorganizing the pixels captured by the camera in the 2D raw image (Fig. 1a) [13]. The raw image displays the images obtained by each microlens in the microlens array (Fig. 1b). There is another arrangement of pixels that is commonly used in SPCs: the viewpoint or sub-aperture images. These images are obtained by selecting, for each microlens, the pixel at the same position relative to the microlens center [13]. The microlens and viewpoint images exhibit different features due to the position of the microlens array on the focal plane of the main lens (Fig. 2). Thus, for these cameras, there are mainly two calibration procedures: one based on viewpoint images [4] and another based on microlens images [3]. Both consider camera models in which the main lens is modeled as a thin lens and the microlenses as pinholes.
The calibration based on viewpoint images [4] considers corner points as features and assumes a decoding process that transforms the hexagonal tiling of the microlenses into a rectangular tiling (Fig. 1). This is done by interpolating the pixels of adjacent microlenses to get the missing ray information [5]. So, in fact, this calibration procedure calibrates a virtual SPC. There is an evolution of this work [16] that considers a better initialization for the camera model parameters. One of the disadvantages pointed out for this procedure is that viewpoint images are created before a camera model is estimated.
On the other hand, the work of Bok et al. [3] allows calibrating an SPC directly from raw images using line features. This procedure requires that line features appear in the microlens images, which cannot be ensured when the calibration pattern is near the world focal plane of the main lens [3]. In this region, the microlens images exhibit only very small deviations in intensity values, since their projections correspond to the same point in the scene [12] (Fig. 2).
The calibration procedures [3, 4] assume that no information is known and therefore each of the parameters must be estimated by acquiring a dataset with a calibration pattern for specific zoom and focus settings. An SPC provides, with the acquired images, metadata with information on the optical settings. Monteiro et al. [12] identified a relationship between the zoom and focus steps provided in the metadata and the world focal plane of the main lens, but did not pursue this line of research. Here, we go a step further and identify the relationships of the metadata parameters among themselves and with the camera model parameters [4]. These relationships allow obtaining a representation of the camera model for arbitrary zoom and focus settings based on the parameters provided by the manufacturer as metadata of the acquired images, without acquiring a calibration dataset.
3 Standard Plenoptic Camera Model
Let us consider a plenoptic camera that acquires a lightfield in the image space \(L\left( \mathbf {\Phi }\right) \) with the plane \(\varOmega \) in focus, i.e. with the world focal plane of the main lens corresponding to the plane \(\varOmega \) (Fig. 2). The rays of the lightfield in the image space \(\mathbf {\Phi } = \left[ i,j,k,l\right] ^T\) are mapped to the rays of the lightfield in the object space \(\mathbf {\Psi } = \left[ s,t,u,v\right] ^T\) by a \(5 \times 5\) matrix proposed by Dansereau et al. [4], the lightfield intrinsics matrix (LFIM) \(\mathbf {H}\):

$$\tilde{\mathbf {\Psi }} = \mathbf {H} \, \tilde{\mathbf {\Phi }}$$
(1)
where \(\tilde{\left( \cdot \right) }\) denotes the vector \(\left( \cdot \right) \) in homogeneous coordinates. The rays in the image space are parameterized by pixel \(\left( i,j\right) \) and microlens \(\left( k,l\right) \) indices, while the rays in the object space are parameterized on a plane \(\varPi \) by a position \(\left( s,t\right) \) and a direction \(\left( u,v\right) \) in metric units [12]. Removing the redundancies of the LFIM with the translational components of the extrinsic parameters [2, 4], one defines a LFIM with 8 free intrinsic parameters

$$\mathbf {H} = \begin{bmatrix} h_{si} & 0 & 0 & 0 & 0 \\ 0 & h_{tj} & 0 & 0 & 0 \\ h_{ui} & 0 & h_{uk} & 0 & h_{u} \\ 0 & h_{vj} & 0 & h_{vl} & h_{v} \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
(2)
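The action of the LFIM on a ray can be sketched in a few lines of Python. The numerical entries of H below are purely illustrative (they do not come from a calibrated camera); only the structure of the mapping, in homogeneous coordinates, follows the text above.

```python
import numpy as np

# Illustrative (not calibrated) LFIM. Rays are Phi = [i, j, k, l]^T in the
# image space (pixel and microlens indices) and Psi = [s, t, u, v]^T in the
# object space (a position on the plane Pi and a direction, in metric units).
H = np.array([
    [1e-4, 0.0,  0.0,  0.0,  0.0],   # h_si
    [0.0,  1e-4, 0.0,  0.0,  0.0],   # h_tj
    [2e-3, 0.0,  1e-3, 0.0, -0.2],   # h_ui, h_uk, h_u
    [0.0,  2e-3, 0.0,  1e-3, -0.2],  # h_vj, h_vl, h_v
    [0.0,  0.0,  0.0,  0.0,  1.0],
])

def image_to_object_ray(phi, H):
    """Map an image-space ray (i, j, k, l) to an object-space ray (s, t, u, v)
    by applying the LFIM in homogeneous coordinates."""
    phi_h = np.append(np.asarray(phi, dtype=float), 1.0)  # homogeneous ray
    psi_h = H @ phi_h
    return psi_h[:4]

psi = image_to_object_ray([3, 5, 120, 140], H)
```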
This matrix does not provide a direct connection with the common intrinsic parameters defined within a pinhole projection matrix. The closest connection to the pinhole projection matrix is the one provided by Marto et al. [11] for the representation of a camera array composed of identical co-planar cameras. In this setup, the LFIM can be represented as

$$\mathbf {H} = \begin{bmatrix} \mathrm {diag}\left( h_{si}, h_{tj}\right) & \mathbf {0}_{2 \times 3} \\ \begin{bmatrix} h_{ui} & 0 \\ 0 & h_{vj} \\ 0 & 0 \end{bmatrix} & \mathbf {K}^{-1} \end{bmatrix}$$
(3)
where \(\mathbf {0}_{n \times m}\) is the \(n \times m\) null matrix, \(\left[ h_{si},h_{tj}\right] ^T\) corresponds to the baseline between consecutive cameras, and \(\mathbf {K}\) corresponds to the intrinsics matrix of the cameras in the camera array, defined using the entries of the LFIM (2).
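To make the camera-array view concrete, the baseline and a pinhole-like intrinsics matrix can be extracted from the LFIM entries. This is a sketch under an assumed sign convention (the exact form of K in [11] may differ); it uses the quantities named later in Sect. 4 (1/h_uk and 1/h_vl as focal terms, [h_u/h_uk, h_v/h_vl] as the principal point), and the numerical LFIM values are illustrative, not calibrated.

```python
import numpy as np

def camera_array_view(H):
    """Split an 8-parameter LFIM into the baseline [h_si, h_tj] and a
    pinhole-like intrinsics matrix K (sign convention assumed)."""
    h_si, h_tj = H[0, 0], H[1, 1]
    h_uk, h_vl = H[2, 2], H[3, 3]
    h_u, h_v = H[2, 4], H[3, 4]
    baseline = np.array([h_si, h_tj])
    K = np.array([
        [1.0 / h_uk, 0.0,        -h_u / h_uk],
        [0.0,        1.0 / h_vl, -h_v / h_vl],
        [0.0,        0.0,         1.0],
    ])
    return baseline, K

# Illustrative (not calibrated) LFIM values.
H = np.array([
    [1e-4, 0.0,  0.0,  0.0,  0.0],
    [0.0,  1e-4, 0.0,  0.0,  0.0],
    [2e-3, 0.0,  1e-3, 0.0, -0.2],
    [0.0,  2e-3, 0.0,  1e-3, -0.2],
    [0.0,  0.0,  0.0,  0.0,  1.0],
])
baseline, K = camera_array_view(H)
```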
The LFIM introduced by Dansereau et al. [4] describes a virtual plenoptic camera whose microlenses define a rectangular tiling (Fig. 1c) instead of the actual hexagonal tiling of a plenoptic camera (Fig. 1b). The rectangular tiling is a result of a decoding process [4] that corrects the misalignment between the image sensor and the microlens array, and removes the hexagonal sampling by interpolating the missing microlenses information from the pixels of the neighbouring microlenses [5].
4 Calibration on a Range of Zoom and Focus Levels
The metadata parameters (meta-parameters), provided by the camera manufacturer with the acquired images, are retrieved from the camera hardware. Here, we focus on the information that refers to the image sensor, main lens, and microlens array. More specifically, we consider the meta-parameters that change with the zoom and focus settings of the camera, i.e. with the main lens world focal plane [12].
4.1 Camera Metadata Parameters
In [12], the influence of two meta-parameters in the definition of the main lens world focal plane was analyzed. Monteiro et al. [12] identified that the world focal plane is mainly determined by a combination of the zoom and focus steps (Fig. 4b). Nonetheless, there are more parameters in the metadata of the acquired images that can determine the main lens world focal plane and that were not analyzed in [12]. For example, the main lens focal length, which can be associated with changes in the zoom level, or the infinity lambda, which can be associated with the focus settings of the microlenses. Namely, the infinity lambda corresponds to the distance in front of the microlens array that is in focus at infinity. However, the microlens optical settings are fixed; the optical settings are changed by modifying the main lens or the complex of lenses that composes the main lens. Thus, the infinity lambda describes the combined optical setup of the microlenses and the main lens. In fact, plotting the focal length, infinity lambda, and target object depth (Fig. 4c), one finds a behavior similar to the one depicted in Fig. 4b. This shows that the world focal plane can also be defined by a combination of the focal length and infinity lambda parameters.
In order to identify and analyze the camera parameters that depend on the zoom and focus settings, we follow the same experimental approach defined in [12] and compute the Pearson correlation coefficient among the different meta-parameters [6]. In this experimental analysis, one identifies five parameters that vary with the main lens world focal plane: zoom step (zoom-stepper motor position), focus step (focus-stepper motor position), focal length, infinity lambda, and f-number. The first two parameters represent, up to an affine transformation, optical parameter information. Namely, the zoom step is related with the focal length of the main lens (Fig. 5a) (correlation of \(93.16\%\)), and the focus step, for a fixed zoom, is related with the infinity lambda parameter (Fig. 5c) (correlation of \(99.54\%\)). On the other hand, the f-number is not used in the definition of the intrinsic parameters of a camera; it is normally described as the ratio f/D, where f is the focal length and D is the diameter of the entrance pupil. This reduces the relevant metadata parameters to two: the focal length and the infinity lambda.
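The correlation analysis above can be reproduced in a few lines. The zoom-step and focal-length values below are hypothetical, chosen only to illustrate a near-affine meta-parameter pair; they are not read from a camera.

```python
import numpy as np

# Hypothetical meta-parameter sequences: zoom-stepper motor positions and the
# corresponding main lens focal lengths (mm).
zoom_step = np.array([0.0, 200.0, 400.0, 600.0, 800.0, 1000.0])
focal_length_mm = np.array([6.4, 7.3, 8.9, 10.8, 13.0, 15.2])

def pearson(x, y):
    """Pearson correlation coefficient: cov(x, y) / (std(x) * std(y))."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Close to 1 for a near-affine relationship between the two sequences.
r = pearson(zoom_step, focal_length_mm)
```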
4.2 Metadata Parameters Vs. LFIM
The LFIM depends on the optical settings of the camera. Let us now evaluate how the focal length and infinity lambda are related with the parameters of the LFIM described in Sect. 3. The derivation of Dansereau et al. [4] indicates how the LFIM parameters change with the focal length included in the image metadata. However, the assumption of microlenses as pinholes does not allow introducing the concept of focus at infinity as a parameter of the LFIM. Thus, one wants to provide a relationship between the LFIM parameters and the camera parameters provided in the image metadata.
In order to evaluate these relationships, one needs multiple calibration datasets acquired under different zoom and focus settings. The datasets [12] were collected using a \(1^\mathrm{st}\) generation Lytro camera and are summarized in Table 1. For establishing the relationships, we use 10 poses randomly selected from the acquired calibration pattern poses to estimate the LFIM [4] and repeat this procedure 15 times to get the mean and standard deviation values. Representing the entries of the LFIM and computing their Pearson correlation coefficients [6] against the focal length and infinity lambda, we found that the entries \(h_{si}\) and \(h_{tj}\), which are related to the baseline, exhibit an affine relationship with the focal length (Fig. 6a–b), with correlation coefficients of \(99.97\%\) and \(99.98\%\), respectively. The entries \(h_{uk}\) and \(h_{vl}\), which are related with the scale factors, exhibit a nonlinear relationship with the focal length (Fig. 6c–d), with correlation coefficients of \(84.94\%\) and \(84.75\%\), respectively. The remaining entries do not exhibit a correlation with any of the metadata parameters provided.
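The repeated random-subset estimation described above can be sketched as follows. The `calibrate` argument is a hypothetical stand-in for the calibration procedure of [4] (the full pipeline is outside the scope of this sketch), so a toy function with noisy outputs is used in its place.

```python
import numpy as np

rng = np.random.default_rng(0)

def subset_statistics(calibrate, poses, n_poses=10, n_trials=15):
    """Run `calibrate` on random subsets of the calibration pattern poses and
    return the mean and standard deviation of the estimated parameters."""
    estimates = []
    for _ in range(n_trials):
        idx = rng.choice(len(poses), size=n_poses, replace=False)
        estimates.append(calibrate([poses[i] for i in idx]))
    estimates = np.asarray(estimates)
    return estimates.mean(axis=0), estimates.std(axis=0)

# Toy stand-in for the calibration of [4]: returns two noisy "LFIM entries".
poses = list(range(30))
def fake_calibrate(subset):
    return np.array([1e-4, 1e-3]) + rng.normal(0.0, 1e-6, size=2)

mean, std = subset_statistics(fake_calibrate, poses)
```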
If we consider the entries of the intrinsics matrix \(\mathbf {K}\) (3), \(1/h_{uk}\) and \(1/h_{vl}\) exhibit an affine relationship with the focal length (Fig. 7a–b), with correlation coefficients of \(99.82\%\) and \(99.81\%\), respectively. On the other hand, the ratios \(h_{ui}/h_{uk}\) and \(h_{vj}/h_{vl}\) have an affine relationship with the infinity lambda (Fig. 7c–d), with correlation coefficients of \(99.55\%\) and \(99.83\%\), respectively. The principal point \(\left[ h_{u}/h_{uk},h_{v}/h_{vl}\right] ^T\) continues not to have any relationship with the metadata parameters. The transformation to a pinhole-like representation thus simplifies the relationships with the parameters provided by the manufacturer in the metadata of the acquired images.
In summary, denoting the focal length by \(f\) (see sample values in Table 1, column 4), the infinity lambda by \(\lambda _\infty \) (sample values in Table 1, column 5), and the principal point by \(\left[ c_u,c_v\right] ^T\), one has

$$h_{si} = \mathrm {a}\left( f\right) , \quad h_{tj} = \mathrm {b}\left( f\right) , \quad \frac{1}{h_{uk}} = \mathrm {c}\left( f\right) , \quad \frac{1}{h_{vl}} = \mathrm {d}\left( f\right) , \quad \frac{h_{ui}}{h_{uk}} = \mathrm {e}\left( \lambda _\infty \right) , \quad \frac{h_{vj}}{h_{vl}} = \mathrm {g}\left( \lambda _\infty \right) , \quad \left[ \frac{h_{u}}{h_{uk}}, \frac{h_{v}}{h_{vl}}\right] ^T = \left[ c_u,c_v\right] ^T$$
(4)
where \(\mathrm {a}\left( f\right) \), \(\mathrm {b}\left( f\right) \), \(\mathrm {c}\left( f\right) \), \(\mathrm {d}\left( f\right) \), \(\mathrm {e}\left( \lambda _\infty \right) \), and \(\mathrm {g}\left( \lambda _\infty \right) \) are the affine mappings identified earlier. In the next section, we detail the procedure followed to estimate the affine mappings and show numerical results for the datasets [12].
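The affine mappings of Eq. (4) can be estimated by ordinary least squares on per-setting calibration results. The sketch below fits one such mapping; the focal-length and \(h_{si}\) values are hypothetical and stand in for the mean calibrated entries of the datasets.

```python
import numpy as np

def fit_affine(x, y):
    """Least-squares fit of y ≈ m*x + b, returning the affine mapping."""
    m, b = np.polyfit(x, y, 1)
    return lambda v: m * v + b

# Hypothetical per-setting values: focal length f (mm) against the LFIM entry
# h_si, standing in for the mapping a(f) of Eq. (4); numbers are illustrative.
f = np.array([6.4, 8.9, 10.8, 13.0])
h_si = np.array([1.10e-4, 1.42e-4, 1.66e-4, 1.94e-4])
a = fit_affine(f, h_si)

# Predict h_si for a new zoom setting without acquiring a calibration dataset.
h_si_new = a(9.5)
```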
5 Experimental Results
In this section, we use the relationships established between the LFIM entries and the metadata parameters (Sect. 4) to obtain a representation of the parameters that describe the camera for a specific zoom and focus setting.
The relationships \(\mathrm {a}\left( f\right) \), \(\mathrm {b}\left( f\right) \), \(\mathrm {c}\left( f\right) \), \(\mathrm {d}\left( f\right) \), \(\mathrm {e}\left( \lambda _\infty \right) \), and \(\mathrm {g}\left( \lambda _\infty \right) \) in Eq. (4) are estimated using the datasets in Table 1, except Dataset B. As in Sect. 4, one considered for each dataset 10 poses randomly selected from the acquired calibration pattern poses to estimate the camera model parameters [4] and repeated this procedure 15 times to get the mean values. The parameters of the affine mappings obtained using the mean values of the LFIM entries are summarized in Table 2.
Dataset B is not included in the previous analysis so that it can be used to evaluate the accuracy of the camera representation (4) based on the focal length and infinity lambda meta-parameters. The LFIM entries are obtained by applying the affine mappings identified in Table 2. These entries are compared with the mean values obtained by repeating 15 times the calibration procedure [4] using 10 randomly selected poses of Dataset B, and are summarized in Table 3. The principal point \(\left[ c_u,c_v\right] ^T\) is assumed to be the center of the viewpoint image since no relationship was found with the metadata parameters. Table 3 shows that the entries obtained from the calibration are similar to the ones obtained from the metadata. Namely, the maximum deviation is \(7.8\%\) and occurs for the ratio \(h_{ui}/h_{uk}\).
Additionally, one considered a set of 10 randomly selected images to evaluate the re-projection, ray re-projection [4], and reconstruction errors using the LFIM obtained from the calibration procedure [4] and from the metadata provided with the acquired images using the representation (4). The errors are summarized in Table 4. This table gives a more practical view of the difference between the two approaches. The errors presented are significant, but it is important to note that the extrinsic parameters are not tuned for the LFIM. The re-projection and ray re-projection errors are similar, being greater for the metadata-based LFIM by 0.34 pixels and 0.14 mm, respectively. On the other hand, the reconstruction error for the metadata-based estimation is significantly greater than the one obtained from calibration [4], but still lower than 65 mm. However, note that the LFIM representation using the focal length and the infinity lambda is based on a statistical analysis between the metadata parameters provided by the camera manufacturer and parameters estimated by a calibration procedure, which are affected by noise.
6 Conclusions
The different zoom and focus settings of the camera change the LFIM \(\mathbf {H}\) used to describe the camera, so we proposed a representation based on the metadata parameters provided with the acquired images. We found that the main lens world focal plane can be determined by the focal length and infinity lambda parameters. This allows estimating the LFIM entries without requiring the acquisition of a calibration dataset for each specific zoom and focus setting.
References
Adelson, E.H., Bergen, J.R.: The plenoptic function and the elements of early vision. In: Computational Models of Visual Processing, pp. 3–20. Vision and Modeling Group, Media Laboratory, Massachusetts Institute of Technology (1991)
Birklbauer, C., Bimber, O.: Panorama light-field imaging. Comput. Graph. Forum 33(2), 43–52 (2014)
Bok, Y., Jeon, H.G., Kweon, I.S.: Geometric calibration of micro-lens-based light field cameras using line features. IEEE Trans. Pattern Anal. Mach. Intell. 39(2), 287–300 (2017)
Dansereau, D.G., Pizarro, O., Williams, S.B.: Decoding, calibration and rectification for Lenselet-based plenoptic cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1027–1034 (2013)
David, P., Le Pendu, M., Guillemot, C.: White Lenslet image guided demosaicing for plenoptic cameras. In: 19th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6. IEEE (2017)
Fisher, R.A.: Statistical methods for research workers. In: Kotz, S., Johnson, N.L. (eds.) Breakthroughs in Statistics. SSS, pp. 66–70. Springer, Heidelberg (1992). https://doi.org/10.1007/978-1-4612-4380-9_6
Georgiev, T., Zheng, K.C., Curless, B., Salesin, D., Nayar, S., Intwala, C.: Spatio-angular resolution tradeoffs in integral photography. In: Rendering Techniques, pp. 263–272 (2006)
Gortler, S.J., Grzeszczuk, R., Szeliski, R., Cohen, M.F.: The Lumigraph. In: Proceedings of the International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), vol. 96, pp. 43–54. ACM (1996)
Levoy, M., Hanrahan, P.: Light field rendering. In: Proceedings of the International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), vol. 96, pp. 31–42. ACM (1996)
Lumsdaine, A., Georgiev, T.: The focused plenoptic camera. In: Proceedings of the International Conference on Computational Photography (ICCP), pp. 1–8. IEEE (2009)
Marto, S.G., Monteiro, N.B., Barreto, J.P., Gaspar, J.A.: Structure from plenoptic imaging. In: Proceedings of the Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), pp. 338–343. IEEE (2017)
Monteiro, N.B., Marto, S., Barreto, J.P., Gaspar, J.: Depth range accuracy for plenoptic cameras. Comput. Vis. Image Underst. 168, 104–117 (2018)
Ng, R.: Digital light field photography. Ph.D. thesis, Stanford University (2006)
Perwass, C., Wietzke, L.: Single lens 3D-camera with extended depth-of-field. In: Proceedings of SPIE, Human Vision and Electronic Imaging XVII, vol. 8291, p. 829108. International Society for Optics and Photonics (2012)
Wilburn, B., et al.: High performance imaging using large camera arrays. In: Transactions on Graphics (TOG), vol. 24, pp. 765–776. ACM (2005)
Zhang, Q., Zhang, C., Ling, J., Wang, Q., Yu, J.: A generic multi-projection-center model and calibration method for light field cameras. IEEE Trans. Pattern Anal. Mach. Intell. (2018). https://doi.org/10.1109/TPAMI.2018.2864617
Funding
This work was supported by the Portuguese Foundation for Science and Technology (FCT) projects [UID/EEA/50009/2019] and [PD/BD/105778/2014], the RBCog-Lab [PINFRA/22084/2016], and the E.U. Portugal 2020 / Project ELEVAR / I&D co-promotion 17924.
© 2019 Springer Nature Switzerland AG
Monteiro, N.B., Gaspar, J.A. (2019). Standard Plenoptic Camera Calibration for a Range of Zoom and Focus Levels. In: Morales, A., Fierrez, J., Sánchez, J., Ribeiro, B. (eds) Pattern Recognition and Image Analysis. IbPRIA 2019. Lecture Notes in Computer Science(), vol 11868. Springer, Cham. https://doi.org/10.1007/978-3-030-31321-0_27
Print ISBN: 978-3-030-31320-3
Online ISBN: 978-3-030-31321-0