1 Introduction and Related Work

Video-based eye-tracking devices estimate the gaze direction from the user's pupil center and corneal reflection. Gaze estimation uncertainty, often referred to as the accuracy of an eye-tracker, is determined by the offset between the measured gaze direction and the actual gaze direction. Today, many eye-tracker manufacturers claim an accuracy of \(0.5^{\circ }\) of visual angle. For a user sitting in front of a desktop monitor at a typical distance of about \(60\,\text {cm}\), \(0.5^{\circ }\) of visual angle corresponds to about \(0.5\,\text {cm}\) on screen. Accordingly, such eye-trackers promise to provide measured gaze positions that are off the actual gaze positions by no more than \(0.5\,\text {cm}\). To achieve this accuracy for an individual user, the user has to perform a calibration procedure before using the eye-tracker. This is necessary because the parameters influencing the geometrical components of the gaze direction calculation, such as eyeball radius, eyeball shape, or glasses, differ between users; in addition, iris color and texture, head pose, viewing angle, and lighting conditions influence the gaze direction calculation [HJ10]. Typically, the calibration procedure requires the user to look at a number of predefined calibration points covering the screen (or the area on the screen where gaze data will be of interest). Common numbers are 5 or 9 [BHND14].

Various contributions have proposed methods to achieve a good calibration: repeating calibration until the offset between calibration point positions and corresponding estimated gaze positions falls below, e.g., \(0.5^{\circ }\) [Tat07], local recalibration for selected calibration areas [Jac90], participant-controlled calibration [HMAvdW11], or an extra step taking 80–120 s to complete which is used for post-calibration regression [BHND14]. However, the cited contributions address the calibration process as a separate part, required initially but also during operation, as the user's behavior, in particular changes of the user's head pose, influences calibration. Hence, gaze estimation might deteriorate if head movements are allowed. To keep gaze data quality high, recalibration at suitable intervals is required; however, conducted like the initial calibration, it would be annoying for the user [HH02]. [LKRK16] show that a clever spreading of the recalibration points can reduce their number to only two. However, it seems more compelling to utilize targets provided on the screen during the working task for unobtrusive recalibration. If the system knows the position of such targets, they might be utilized to relate gaze position and target position, and calibration could be improved at runtime without disturbing the user. Monitoring interactions such as clicks on a target or salient objects [LYS+11] in the user's field of view could provide this kind of information, but without any guarantees in terms of accuracy. Using a target the user did not focus on at all could not only fail to improve the current calibration, but even deteriorate it. For further research on methods of obtaining such recalibration points automatically, it is important to know how accurately those points have to be determined to avoid a degradation of the initial calibration.
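To make the visual-angle arithmetic above concrete, the following is a minimal Python sketch of the conversion between visual angle and on-screen distance; the function names are ours, and the default viewing distance of \(60\,\text {cm}\) follows the example above.

```python
import math

def visual_angle_to_screen_cm(angle_deg: float, distance_cm: float = 60.0) -> float:
    """On-screen distance (cm) subtended by a visual angle at a given viewing distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

def screen_cm_to_visual_angle(offset_cm: float, distance_cm: float = 60.0) -> float:
    """Visual angle (degrees) subtended by an on-screen distance."""
    return math.degrees(2 * math.atan(offset_cm / (2 * distance_cm)))

print(visual_angle_to_screen_cm(0.5))  # ~0.52 cm at 60 cm viewing distance
print(screen_cm_to_visual_angle(0.5))  # ~0.48 degrees
```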

Fig. 1. Corner points of the 16-point grid (green) and the four central points (orange) utilized for recalibration. (Color figure online)

2 User Study and Results

We conducted a user study with 16 participants (aged 21–32; no glasses, four with contact lenses; one female) who went through a 9-point calibration procedure using an infrared-based eye-tracker [BMvdC+16]. After this initial calibration, the participants performed a calibration validation step by looking sequentially at the 16 points of an evenly spaced grid on the screen.
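As an illustration, a minimal Python sketch of such an evenly spaced \(4 \times 4\) validation grid follows; the screen resolution and margin are assumptions, as the exact pixel coordinates are not reported here.

```python
def validation_grid(width=1920, height=1080, rows=4, cols=4, margin=0.1):
    """Evenly spaced grid of (x, y) pixel targets, inset by a relative margin.
    Resolution and margin are assumed values for illustration only."""
    xs = [width  * (margin + (1 - 2 * margin) * c / (cols - 1)) for c in range(cols)]
    ys = [height * (margin + (1 - 2 * margin) * r / (rows - 1)) for r in range(rows)]
    return [(x, y) for y in ys for x in xs]

points = validation_grid()
assert len(points) == 16  # the 16 validation targets
```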

Fig. 2. Shifted recalibration points (orange) generated by randomly applying an offset to their original locations. (Color figure online)

Fig. 3. Points used for evaluation of the accuracy degradation caused by the simulated recalibration point relocation. These five points were chosen as evaluation points because the participants fixated them after fixating the recalibration points in the 16-point calibration validation step.

Fig. 4. Correlation between the offset (in pixels) and the resulting error induced to the estimated gaze position on screen (in pixels).

Table 1. Induced localization errors for the recalibration points (in pixels, first column, and degrees of visual angle, second column) and their impact on the gaze estimation uncertainty (accuracy), given as mean, standard deviation, and median (in degrees of visual angle).

Figure 1 shows the corner points of the 16-point grid in green and the four center points of the grid used for recalibration in orange. For recalibration, we used the corner points of the original 9-point calibration in combination with these four recalibration points. The eye-tracker we used allows recording and replaying the full video data of experiments. Using this feature, we reran the calibration on the recorded video data while systematically introducing offsets to the four recalibration points (Fig. 2), mimicking their erroneous localization. We tested the result of this new calibration on the remaining five points of the 16-point grid (Fig. 3). The results are presented in Table 1, showing the gaze estimation uncertainty, or accuracy, as mean, standard deviation, and median for the different induced offsets. Due to the location of the evaluation points at the bottom of the screen, the mean gaze estimation uncertainty is high compared to typical results for the whole screen. Of major importance, however, are the relative differences between the result with no offset (baseline) and the results with an offset \( > 0\). Figure 4 shows a linear correlation between the localization error for the recalibration points (offset) and the resulting error of the gaze point estimation. Overall, the relation between the error in localizing the recalibration points and the resulting gaze estimation error is promising, as only large offsets significantly deteriorate the gaze position estimation. Even localizing with an erroneous offset of as much as 50 pixels (\(1.21^{\circ }\)) deteriorates gaze estimation by less than \(0.5^{\circ }\). The fact that the recalibration is error-tolerant to a certain degree shows that techniques such as saliency detection might be suitable for the localization of recalibration points.
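The perturbation step can be sketched as follows in Python: each recalibration point is shifted by a fixed offset magnitude in a uniformly random direction, mimicking erroneous localization as in Fig. 2. The pixel coordinates below are hypothetical.

```python
import math
import random

def shift_points(points, offset_px):
    """Shift each recalibration point by a fixed offset magnitude (pixels)
    in a random direction, simulating erroneous point localization."""
    shifted = []
    for (x, y) in points:
        theta = random.uniform(0, 2 * math.pi)
        shifted.append((x + offset_px * math.cos(theta),
                        y + offset_px * math.sin(theta)))
    return shifted

# e.g., perturb four central recalibration points by 50 px (~1.21 deg);
# the coordinates are placeholders, not the study's actual point locations.
recalib = [(760, 440), (1160, 440), (760, 640), (1160, 640)]
print(shift_points(recalib, offset_px=50))
```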

3 Conclusion and Future Work

The results of our study indicate which requirements a method for automatically providing recalibration points has to fulfill. The next step will be to gather potential points for recalibration automatically. As mentioned in the introduction, monitoring interactions like clicks on a target or salient objects in the user's field of view could provide this kind of information, but without any guarantees in terms of accuracy. For example, under Microsoft Windows 8.0, stationary targets like desktop icons cover \(1.5\,\text {cm} \times 3\,\text {cm}\) ("small size"), the CLOSE button of a window covers \(0.5\,\text {cm} \times 1.0\,\text {cm}\), and the MINIMIZE button covers \(0.5\,\text {cm} \times 0.6\,\text {cm}\). Assuming that the user focuses on the center of a button or icon in order to make sure to hit it with the pointer, it is reasonable to use the target center as a recalibration point. The worst case occurs when the user actually focuses one of the target's corner points; the maximum offset between estimated and actual gaze position is then half the target's diagonal. Hence, detecting that the user looked at the MINIMIZE button (as this item was clicked) introduces a maximum offset of \(0.5 \sqrt{0.5^2 + 0.6^2}\,\text {cm} \approx 0.4\,\text {cm}\), i.e., about \(0.4^{\circ }\) given the approximation of \(1\,\text {cm} \approx 1^{\circ }\) at \(60\,\text {cm}\) viewing distance. For the desktop icon, the maximum offset is \(0.5\sqrt{1.5^2+3^2}\,\text {cm} \approx 1.7\,\text {cm}\), i.e., about \(1.7^{\circ }\). Overall, it seems likely that these methods can be used to obtain points for recalibration; they will need to be put to the test to examine the real-world implications of recalibration at runtime.
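The worst-case offset computation above can be reproduced with a short Python sketch, using the rough equivalence of about \(1\,\text {cm}\) per degree of visual angle at \(60\,\text {cm}\) viewing distance; the helper name is ours.

```python
import math

def max_center_offset_deg(width_cm, height_cm, cm_per_deg=1.0):
    """Worst-case offset (degrees) between a target's center and one of its
    corners, using the approximation of ~1 cm per degree at 60 cm distance."""
    half_diagonal_cm = 0.5 * math.hypot(width_cm, height_cm)
    return half_diagonal_cm / cm_per_deg

print(max_center_offset_deg(0.5, 0.6))  # MINIMIZE button: ~0.4 deg
print(max_center_offset_deg(1.5, 3.0))  # desktop icon:    ~1.7 deg
```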