
1 Introduction

The rapid growth in the volume of information has made human-computer interaction more frequent and more complex. Eye movement-based interaction has attracted increasing attention because of its high bandwidth, continuous input and naturalness. As early as the early 1990s, the development of eye position input systems built on eye tracking had already drawn considerable interest [1]. Improvements in the accuracy and resolution of eye tracking technology have made eye movement-based input feasible. In particular, with the development of real-time eye tracking measurement, eye movement-based interaction has become a useful mode of human-computer interaction [2, 3]. Many studies have confirmed that eye movement-based interaction achieves faster targeting than traditional interaction methods such as the mouse [2,3,4]. Moreover, because it relies on eye movement alone, people with limb impairments can also use it easily, which gives research on eye movement-based interaction high social value.

The essence of eye movement-based interaction is to record and recognize patterns of human eye movement with a device and to use specific patterns as input signals that control specific tasks. In human-computer interaction, blinks, gazes and saccades are usually used as input signals.

At present, the application and research of eye movement-based interaction focus mainly on blink input [5, 6] and gaze input [7,8,9]. These two input modes place high demands on the spatiotemporal characteristics of the user interface and are often accompanied by usability problems such as low efficiency, narrow bandwidth and frequent misoperation. As a result, research on eye movement interaction has stagnated and has failed to move from the lab to the market. A gaze gesture, by contrast, is defined by the direction or amplitude of saccades, so its requirements on the spatiotemporal characteristics of the user interface are very low, it is less prone to the Midas touch problem, and it offers high bandwidth, high efficiency and a high degree of customization. However, gaze gesture input also imposes a notably high cognitive load. If this shortcoming can be overcome and the advantages of gaze gesture input are applied sensibly in design, it will bring high application value and release the potential of eye movement-based interaction. Table 1 shows the differences between blink input, gaze input and gaze gesture input.

Table 1. Comparison of three input mechanisms for eye movement-based interaction

However, research on gaze gesture input is comparatively scarce and focuses mainly on simple gaze gesture input symbols, with task completion performance as almost the only research indicator. No study has examined the appropriate input area size for gaze gestures, and none has restricted the experimental conditions for gaze gesture input to the eye-friendly zone; ergonomics has been ignored in the pursuit of input performance. Moreover, few studies have examined the effect of feedforward type on gaze gesture input. In short, many factors affect the performance of gaze gesture input, but many previous studies are too narrow to have broad practical significance.

This paper focuses on the spatiotemporal characteristics of gaze gesture input. The effects of input area size, feedforward type and input gesture shape on the performance of gaze gesture input and on users’ subjective satisfaction were explored together. The findings on gaze gesture input were then summarized in terms of input continuity, implicit interaction and real-time feedback.

2 Related Work

Eye movement-based interaction takes vision as the input channel and has the potential to replace traditional mouse-like pointing devices (such as the mouse, stylus and finger touch screen). Because eye movements and mouse-like pointing devices are used differently, the spatiotemporal characteristics of eye movement-based interaction differ significantly from those of traditional interaction. The temporal characteristic refers to the time threshold (dwell time, blink time, etc.) required to trigger an interface instruction using eye movement data (including gaze, blink and gaze gesture). The spatial characteristics refer to the spatial accuracy needed to select interactive objects on the interface with eye movements (including the size of the interactive objects, the distance between objects and the arrangement of objects).

In the study of the spatiotemporal characteristics of eye movement-based interaction, Feng et al. (2006) found that, for human-computer interfaces based on gaze input, a horizontal arrangement of objects was significantly better than a vertical arrangement [10]. Zhu et al. (2014) observed eye movement data in a touch screen interaction system and found that the user’s visual focus during operation was concentrated mainly on the positions of the object and the target, with only a small amount of attention paid to the process in between, so there are enough data and time for eye movement behavior to serve as an input mechanism [11].

In addition to spatiotemporal characteristics, the study of feedback can also help to improve the usability of eye movement-based interaction. Feng et al. (2004) found that introducing visual feedback can improve the efficiency of searching for, locating and activating target objects on the interface, whereas visually displaying the borders of interactive objects has no significant impact on user performance [12]. Zhu (2014) found that the usability of a touch screen interactive system improved significantly when eye movement-based assisted operation was introduced. Moreover, if an appropriate interference indication is introduced into the system, it helps users to establish target objects faster [11].

The above studies mainly focus on gaze input. Gaze gesture input, also known as saccade input, is of great research value because it is fast, places low demands on the spatiotemporal characteristics of the interface and is resistant to misoperation. Firstly, the peak speed of a saccade is as high as 400°–600° per second, which means that a gaze gesture can cover a viewing angle of 1° to 40° within 30–120 ms, much faster than the 300–500 ms needed for a standard gaze input. Secondly, gaze gesture input does not require specific interactive controls or elements on the interface, and it is robust with respect to timing because it does not depend on an accurate response time; interface design therefore becomes easier and faster. Thirdly, since gaze gesture input is sequence-based and does not require a precise starting point or ending point, it is inherently less susceptible to the Midas touch problem than blinking and gazing [12].

There are relatively few studies on the spatiotemporal characteristics of gaze gesture input, and no decisive conclusion has been reached. Xin et al. (2015) studied the efficiency of gaze gestures but analyzed only the input performance of subjectively defined short-range and long-range gaze gestures [13]. Møllenbach et al. found that short-range gaze gesture trajectories were triggered faster than long-range ones, and that horizontal trajectories were faster than vertical trajectories [14]. However, this experiment has some weaknesses. It was carried out on a rectangular screen, and in order to keep the gaze gesture distance equal in the horizontal and vertical directions, the trigger areas in the two directions were given different sizes, which may affect the validity of the results. This concern is supported to some extent by the study of Heikkilä et al. (2012), in which a similar experiment reached the opposite conclusion: gaze gestures moved faster in the vertical direction than in the horizontal direction [15]. Heikkilä et al. also found no significant difference in time between short-range and long-range gaze gestures when closing the eyes was used to end the eye movement-based control. They suggested that in their experimental design users only needed to move their eyes in the right direction without precise eye movement control, which may explain why their conclusion differs from that of Møllenbach [16].

In fact, the gaze gesture input mechanism relies on saccades, which are extremely fast, with peak speeds of 400°/s–600°/s. Therefore, if gaze gestures of the same type do not differ greatly in length, their performance should not vary much. In ergonomics, the rotating human eye has a comfort zone. Within the comfort zone, the muscle burden is small, the eye does not fatigue easily and eye movement is fast; outside it, the opposite holds. Previous researchers often neglected this point when studying the performance of gaze gestures. We can hypothesize that the significant performance difference between long-range and short-range gaze gestures in some experiments arose not simply from the difference in gesture length, but because the long-range gestures exceeded the comfort zone while the short-range gestures did not. As shown in Fig. 1, eye rotation has a rotational comfort zone. The optimum vertical region is 25° upward plus 30° downward, 55° in total. The optimum horizontal region is 15° to the left plus 15° to the right, 30° in total.

Fig. 1. Rotational comfort zones in eye rotation (a. horizontal view; b. vertical view)
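The comfort-zone limits quoted above can be turned into a simple check of whether a planned input area stays inside the zone. The following is a minimal sketch rather than part of the original study: the function name `fits_comfort_zone`, the rectangular model of the input area and the sign convention (positive x to the right, positive y upward, angles measured from the straight-ahead line of sight) are all assumptions made for illustration.

```python
# Comfort-zone limits in degrees, taken from the text above.
COMFORT_LIMITS = {"up": 25.0, "down": 30.0, "left": 15.0, "right": 15.0}


def fits_comfort_zone(width_deg, height_deg, center_x_deg=0.0, center_y_deg=0.0):
    """Return True if a rectangular input area lies inside the comfort zone.

    Angles are measured from the straight-ahead line of sight. Positive x is
    to the right and positive y is up (a convention assumed for this sketch).
    """
    left_edge = center_x_deg - width_deg / 2
    right_edge = center_x_deg + width_deg / 2
    bottom_edge = center_y_deg - height_deg / 2
    top_edge = center_y_deg + height_deg / 2
    return (
        -COMFORT_LIMITS["left"] <= left_edge
        and right_edge <= COMFORT_LIMITS["right"]
        and -COMFORT_LIMITS["down"] <= bottom_edge
        and top_edge <= COMFORT_LIMITS["up"]
    )


if __name__ == "__main__":
    print(fits_comfort_zone(25, 25))  # centered 25 degree area
    print(fits_comfort_zone(40, 40))  # centered 40 degree area
```

Under these assumptions, a centered area can be at most 30° wide, because the horizontal limit of 15° on each side is the tighter constraint.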

Gaze gesture input has not yet been widely applied because of its obvious disadvantages. Designing gaze gestures as input symbols is a difficult research topic. If a gaze gesture is too simple, it easily overlaps with unconscious eye movements and leads to misoperation; if it is too complex, it increases the user’s learning cost, memory burden and cognitive load, which runs contrary to the original intention of natural interaction. Previous studies have mostly focused on simple gaze gestures (single-step long gaze gestures), which are suitable for simple human-computer interaction tasks. Although complex gaze gestures (multi-step or curved gaze gestures) are harder for users to learn and cause more input fatigue, their higher bandwidth makes them better suited to complex human-computer interaction tasks. Istance et al. designed a series of composite gaze gestures for the game World of Warcraft. Their experiments showed that composite gaze gestures allow more accurate interaction than single gaze gestures but occupy a large share of cognitive resources [17]. Feedforward (guidance given before the user performs the operation), as a special form of feedback, can reduce the user’s learning burden and operating burden, so it is worth considering.

In summary, this article conducted research on the following four aspects:

(1) Gaze gesture input area size was taken as an independent variable to study its influence on input performance.

(2) Input gesture type was taken as an independent variable to study its influence on input performance.

(3) Feedforward type was taken as an independent variable to study its influence on input performance.

(4) A multi-factor analysis of the above three factors and input performance was conducted to investigate whether there is a cross-effect (interaction effect).

3 Experiment

3.1 Participants

Twenty participants (10 males and 10 females) were recruited as volunteers, aged 20 to 25 years (mean = 22.2 years, SD = 1.67 years). All participants had visual acuity or corrected visual acuity of 5.0 or above, and the lens power of those who wore corrective lenses was within 200 degrees, which ensured that their lenses were not thick enough to interfere with detection by the eye tracker. All participants passed the eye tracker calibration test, with a calibration accuracy better than 0.5° in both the X and Y directions.

3.2 Device

The eye tracker used in the experiment was the German SMI iView RED non-contact eye tracker, with a sampling frequency of up to 500 Hz, a tracking resolution of 0.1° and a gaze positioning accuracy of 0.5°–1°. The dedicated display for the eye tracker was a DELL 22-inch display. The physical screen was 475 mm wide and 298 mm high, with a resolution of 1680 × 1050 pixels and an aspect ratio of 16:10. The eye tracker was controlled with iView X software, and the analysis software was BeGaze. The experimental staff included an operator, a recorder and a host.

3.3 Selection of Tasks

Firstly, the experimental levels of the input area size variable were determined. The geometry of the human viewing angle is shown in Fig. 2, where α is the horizontal viewing angle, β is the vertical viewing angle, L is the length of the horizontal viewing range, W is the length of the vertical viewing range, and o is the center point of the screen. When the line of sight is perpendicular to the screen at o, H is the line-of-sight distance from the eye to the screen.

Fig. 2. View angle size and target size diagram

The relationship between the viewing angle, the viewing range and the line-of-sight distance is given by formula 1 and formula 2.

$$ \alpha = 2\arctan\frac{L}{2H} $$
(1)
$$ L = 2H\tan\frac{\alpha}{2} $$
(2)
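The two relations can be evaluated numerically to convert between a viewing angle and the on-screen length it spans. The sketch below is illustrative only: the function names are hypothetical, the viewing distance of 600 mm and the screen height of 298 mm are the values reported elsewhere in this paper, and the listed angles are examples rather than the exact contents of Table 2. The inverse relation is written with the tangent, which is the mathematical inverse of formula 1.

```python
import math


def view_angle_deg(length_mm, distance_mm):
    """Formula 1: viewing angle (degrees) subtended by a length at a distance."""
    return math.degrees(2 * math.atan(length_mm / (2 * distance_mm)))


def view_length_mm(angle_deg, distance_mm):
    """Inverse of formula 1: on-screen length (mm) spanned by a viewing angle."""
    return 2 * distance_mm * math.tan(math.radians(angle_deg) / 2)


if __name__ == "__main__":
    H = 600.0  # viewing distance in mm, as reported in the procedure
    for angle in (25, 10, 5, 1):  # illustrative angles, not the Table 2 values
        print(f"{angle:>2} deg -> {view_length_mm(angle, H):.1f} mm")
    # Largest viewing angle that fits the 298 mm screen height.
    print(f"max angle: {view_angle_deg(298.0, H):.1f} deg")
```

With these inputs, a 25° input area spans about 266 mm, and the 298 mm screen height corresponds to a maximum viewing angle of about 27.9°.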

Based on the physical size and pixel resolution of the screen used to present the experimental materials, 10 levels were defined, as shown in Table 2 (the data have been rounded). The maximum viewing angle is determined by the maximum height of the experimental screen, 298 mm. Starting from 25°, the other levels were obtained by decreasing the viewing angle in steps of 5°; below 5°, the step was reduced to 1°. All viewing angles, for both left-right and up-down rotation, lie within the comfort zone of eye rotation, as detailed in Table 2.

Table 2. Ten levels of the input area

Secondly, the experimental levels of the input gesture type variable were determined. Previous studies were largely detached from practice: most examined single-step gaze gestures and did not depend on specific application scenarios. The levels of the input gesture type variable in this experiment were derived from the actions in the consensus set obtained in a previous experiment, from which two representative inputs, the square and the circle, were selected.

Finally, the experimental levels of the feedforward type variable were determined. Considering the attention allocation mechanism of human eye movement, three levels were designed: line-like feedforward, no feedforward and point-like feedforward.

In summary, the experiment had three independent variables with 10, 2 and 3 levels respectively, giving 10 × 2 × 3 = 60 input tasks. To simplify the experimental procedure, the tasks were divided into 6 groups, as shown in Table 3.

Table 3. Details of 60 tasks in 6 groups
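The full factorial design can be enumerated programmatically, which is a convenient way to confirm the task count. This is a sketch under stated assumptions: the size levels are represented by placeholder indices 1 to 10 rather than the actual angles, and grouping the tasks by shape and feedforward type (six groups of ten sizes) is an assumption suggested by the later figure captions, not something taken from the table itself. The names `SHAPES`, `FEEDFORWARD`, `SIZE_LEVELS` and `build_task_groups` are hypothetical.

```python
from itertools import product

SHAPES = ("square", "circle")
FEEDFORWARD = ("line-like", "none", "point-like")
# Placeholder indices for the ten input-area size levels (not the real angles).
SIZE_LEVELS = tuple(range(1, 11))


def build_task_groups():
    """Return a dict mapping (shape, feedforward) to its list of ten tasks."""
    groups = {}
    for shape, feedforward in product(SHAPES, FEEDFORWARD):
        groups[(shape, feedforward)] = [
            (shape, feedforward, size) for size in SIZE_LEVELS
        ]
    return groups


if __name__ == "__main__":
    groups = build_task_groups()
    total = sum(len(tasks) for tasks in groups.values())
    print(len(groups), "groups,", total, "tasks")
```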

According to the experimental tasks, 60 stimulus materials were developed. Each stimulus had a grey background with a white input area; the feedforward cue was located within the white area, and the inner margin was kept between 0.5° and 1°.

3.4 Procedure

Firstly, the participants adjusted themselves to a relatively comfortable sitting posture, with their line of sight perpendicular to the center of the display and their heads aligned and fixed. The distance between their eyes and the screen was measured to be about 600 mm. The experimental scene is shown in Fig. 3.

Fig. 3. Experimental scene

The experiment was introduced to the participants before it officially started. A training session was provided before each group of tasks began. A rest of three minutes was provided between groups, and a rest of 20 s was provided within each group.

After all the experiments were completed, participants filled in a questionnaire to rate their overall satisfaction with each task by responding to the statement “I feel satisfied with the overall performance of this gaze gesture input”. Satisfaction refers to whether users feel that completing the task with gaze gesture input is easy, accurate and fast. The questionnaire used a 10-point Likert scale, with 10 representing “very satisfied” and 1 representing “very dissatisfied”. A 10-point Likert scale offers a wider rating range, makes participants more willing to give a score and improves the discrimination of the results [18], which suits an evaluation with as many as 60 samples. The whole experiment lasted about 60 min.

4 Results

The experiment produced 1200 input trajectories in total (20 participants × 60 tasks). Three participants showed obvious errors in the trajectories of individual tasks. There were two types of error: systematic errors, in which the device did not record the input trajectory accurately, and attention lapses, in which the user was distracted for a long time and the input trajectory deviated markedly from the rest of the data. After the data of these three participants were excluded, 1020 valid input trajectories remained (17 participants × 60 tasks).

4.1 Qualitative Analysis of Input Trajectory

Rule 1: Drawing a square with gaze gesture input is most accurate when the feedforward is point-like.

As shown in Fig. 4, the feedforward type made a very marked difference to the square trajectories drawn by the participants. Point-like feedforward gave the best overall performance and no feedforward the worst. The poor performance without feedforward showed in two ways: first, the four key points were positioned inaccurately; second, many additional fixations occurred during scanning.

Fig. 4. All trajectories of the square

Rule 2: The fourth step of drawing a square shows an offset.

When drawing a square, the fourth step of the trajectory was generally inclined by about 8° rather than vertical. The second step also showed some slope, but less obviously than the fourth step. The first and third steps did not show this behavior.
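The inclination of a trajectory step can be quantified from the coordinates of its start and end fixation points. The sketch below is one plausible way to do so and is not taken from the paper's analysis pipeline: the function name `deviation_from_vertical_deg` and the use of screen coordinates in pixels are assumptions, and the sample coordinates are invented for illustration.

```python
import math


def deviation_from_vertical_deg(start, end):
    """Angle in degrees (0 to 90) between the segment start->end and vertical.

    `start` and `end` are (x, y) points in any consistent unit, for example
    screen pixels. A perfectly vertical step returns 0.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    if dx == 0 and dy == 0:
        raise ValueError("start and end must be different points")
    return math.degrees(math.atan2(abs(dx), abs(dy)))


if __name__ == "__main__":
    # Hypothetical fourth step that drifts 14 px sideways over 100 px of height.
    print(round(deviation_from_vertical_deg((0, 100), (14, 0)), 1))
```

A step that drifts 14 units sideways over 100 units of height is inclined by roughly 8°, which is the order of magnitude reported for the fourth step.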

Analysis of the set of trajectory diagrams reveals two cases, b and c in Fig. 5. In case b, the saccade end point of the third step overshot the target position, and the participant added a fixation point in the fourth step to correct the figure. In case c, the saccade end point of the third step fell short of the target position, and the participant added a fixation point for correction, but the corrected point often overshot the target. As a result, most of the final drawings show an inclined fourth step.

Fig. 5. An illustration of the offset in the fourth step of drawing a square

In conclusion, it can be inferred that the positioning accuracy of horizontal saccades is lower than that of vertical saccades, and that the positioning accuracy of left-to-right saccades is better than that of right-to-left saccades.

Rule 3: Point-like feedforward has a negative effect on the accuracy of drawing a circle.

The feedforward type had a marked impact on the final circular trajectory. Line-like feedforward performed best and no feedforward worst. With line-like feedforward, the fixation points of the circular trajectory were evenly distributed along the circular border, whereas the figure drawn without feedforward deviated considerably from a circle. With point-like feedforward, the result resembled a diamond. After the experiment, many participants stated that they had needed to allocate extra attention to avoid simply connecting the points with straight lines, as shown in Fig. 6.

Fig. 6. All trajectories of the circle

Rule 4: Participants automatically add key points to correct the drawing.

As noted in Rule 3, point-like feedforward was counterproductive in guiding the user to draw an accurate circle. Looking at each participant’s drawing process, as shown in Fig. 7, we find that most participants unconsciously completed each segment with one saccade, but also subconsciously added a fixation point between two key points to correct their drawing.

Fig. 7. An illustration that participants automatically add fixation points to correct their drawing

Rule 5: The accuracy of symbol drawing decreases significantly when the input area is smaller than 5°.

The experimental results showed that when the input area fell below 5°, the accuracy of the drawn symbols decreased significantly, and squares were drawn more accurately than circles. When the input area dropped to 3°, it was usually very difficult to recognize the intended figure from the trajectory (as shown in Fig. 8c and d). A small number of skilled participants nevertheless drew accurately with gaze gestures in small areas (as shown in Fig. 8c and d). According to the participants’ think-aloud reports, the input experience was extremely poor when the input area was smaller than 3°. First, a great deal of attention was needed to control involuntary eye jitter. Second, without feedback, participants felt frustrated when the drawing was inaccurate.

Fig. 8. Trajectory with too small input area

4.2 Quantitative Analysis of Input Trajectory

The 1020 valid samples collected by the eye tracking equipment were analyzed quantitatively for input performance. Table 4 shows the mean input time for the 60 input tasks (accurate to the millisecond).

Table 4. Mean input time of the 60 gaze gesture input tasks

Figure 9 visualizes the data in the table above. As the input area became smaller, the input time decreased, that is, overall input performance improved; square input performed better overall than circular input; and line-like feedforward performed best. Analyzing each factor on its own, we found that all three factors had a significant effect on input time (all three p values were less than 0.001).

Fig. 9. Mean input time of six groups of tasks with different input area sizes

However, the eye movement speed (the mean eye movement angle per second) continued to decline as the input area became smaller, as shown in Fig. 10. Eye movement speed reflects the degree of fatigue to a certain extent: the faster the eye moves, the less effort the user spends concentrating on the task, and the lower the user’s fatigue. When the input area was larger than 15°, the user’s eye movement speed was high and no obvious fatigue was caused.

Fig. 10. Eye movement-based speed
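The speed measure, the mean eye movement angle per second, can be approximated by dividing the angular path length of a gesture by its input time. The paper does not state how the path length was obtained, so the sketch below makes a loud assumption: it uses the ideal path of the symbol, with the side of the square or the diameter of the circle equal to the input area size. The function names and the sample input times are hypothetical.

```python
import math


def gesture_path_deg(shape, size_deg):
    """Ideal angular path length of a gesture (assumed, not measured).

    A square is taken as four sides of `size_deg`; a circle is taken as the
    circumference of a circle with diameter `size_deg`.
    """
    if shape == "square":
        return 4 * size_deg
    if shape == "circle":
        return math.pi * size_deg
    raise ValueError(f"unknown shape: {shape}")


def mean_speed_deg_per_s(shape, size_deg, input_time_ms):
    """Mean eye movement speed in degrees per second for one gesture."""
    if input_time_ms <= 0:
        raise ValueError("input_time_ms must be positive")
    return gesture_path_deg(shape, size_deg) / (input_time_ms / 1000.0)


if __name__ == "__main__":
    # Hypothetical input times, for illustration only.
    print(mean_speed_deg_per_s("square", 20, 2000))
    print(mean_speed_deg_per_s("square", 2, 1000))
```

The example illustrates the pattern described above: a smaller area can be completed in less time and still yield a lower speed in degrees per second, because the path shrinks faster than the input time.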

In addition, the figure suggests that each group of data followed clear trends and that cross-effects were present. To verify this, the data were analyzed with a multi-factor univariate analysis of variance. The results are shown in Fig. 11.

Fig. 11. Multivariate cross-effects of input time (a. size & feedforward; b. size & shape; c. feedforward & shape)

4.3 Change Law of the Number of Fixation Points

The variation of the number of fixation points was studied. Table 5 shows the statistics of the average number of fixation points for 60 input tasks.

Table 5. The average number of fixation points for 60 input tasks

The data in the table above are visualized in Fig. 12. As the input area became smaller, the total number of fixation points decreased; square input generally used fewer fixation points than circular input, but this difference became inconspicuous once the input area fell below 5°. Analyzing each factor on its own, we found that all three factors had a significant effect on the number of fixation points (all three p values were less than 0.05).

Fig. 12. The mean number of fixation points for six tasks with different input sizes

In addition, the figure suggests that each group of data followed clear trends and that cross-effects were present. To verify this, the data were analyzed with a multi-factor univariate analysis of variance. The results are shown in Fig. 13.

Fig. 13. Multivariate cross-effects of fixation points (a. size & feedforward; b. size & shape; c. feedforward & shape)

The results of multivariate cross-analysis are as follows:

Firstly, there was no significant cross-effect between feedforward and input area size (p = 0.974 > 0.05). Whatever the feedforward type, the number of fixation points followed the same trend, decreasing as the input area became smaller.

Secondly, there was a significant cross-effect between shape and input area size (p < 0.001). When the input area was larger than 5°, square input used fewer fixation points than circular input; once the input area fell below 5°, the input symbol had little effect on the number of fixation points.

Thirdly, there was no significant cross-effect between feedforward and shape (p = 0.405 > 0.05). Both line-like and point-like feedforward reduced the number of fixation points used in input, regardless of the input shape.

Finally, the descriptive statistics and cross-analysis showed that the three factors affected input time and the number of fixation points in very similar ways. A Pearson correlation analysis of these two measures showed that they were positively correlated, with a correlation coefficient of 0.0651 (p < 0.001); that is, input time was positively correlated with the number of fixation points, and the more fixation points an input used, the longer it took.
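For readers who want to reproduce this kind of check on their own data, the Pearson correlation coefficient between input time and fixation count can be computed directly from its definition. The sketch below uses only the standard library. The function name `pearson_r` and the sample values are illustrative assumptions and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up fixation counts and input times (ms), for illustration only.
    fixations = [4, 5, 6, 8, 9]
    times_ms = [900, 1100, 1150, 1600, 1700]
    print(round(pearson_r(fixations, times_ms), 3))
```

A significance test for the coefficient is not included here; a statistics package would normally be used for the p value.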

4.4 Subjective Satisfaction Evaluation

The data collected from the 17 users were summarized to obtain the comparison of mean satisfaction shown in Fig. 14. Overall satisfaction was higher when the input area was between 10° and 20°. Once the input area fell below 5°, satisfaction decreased rapidly. Overall satisfaction with square input was higher than with circular input, but this difference was not obvious when the input area was smaller than 5°. Point-like feedforward improved satisfaction with square input but reduced satisfaction with circular input. Analyzing each factor on its own, we found that all three factors had a significant effect on satisfaction (all three p values were less than 0.001).

Fig. 14. Mean satisfaction of six tasks with different input sizes

In addition, the figure suggests that each group of data followed clear trends and that cross-effects were present. To verify this, the data were analyzed with a multi-factor univariate analysis of variance. The results are shown in Fig. 15.

Fig. 15. Multivariate cross-effects of satisfaction (a. size & feedforward; b. size & shape; c. feedforward & shape)

Firstly, feedforward and input area size had a significant cross-effect (p = 0.009 < 0.05). When the input area was larger than 10°, satisfaction with feedforward was much higher than without it, and satisfaction with point-like feedforward was higher than with line-like feedforward. Once the input area fell below 10°, the presence of feedforward had little effect on input satisfaction.

Secondly, there was no significant cross-effect between shape and input area size (p = 0.15 > 0.05). Whether the input was a square or a circle, satisfaction followed the same trend as the input area size changed.

Thirdly, feedforward and shape had a significant cross-effect (p < 0.001). With line-like feedforward, satisfaction differed little between circle and square; with no feedforward, satisfaction with the square was clearly higher than with the circle; with point-like feedforward, satisfaction with the square was much higher than with the circle. In general, feedforward improves user satisfaction with gaze gesture input, but different feedforward types help different shapes to different degrees. For example, point-like feedforward greatly helps square input but offers very limited help for circle input.

5 Conclusion

This paper focused on the spatiotemporal characteristics of gaze gesture input. The conclusions of the study are as follows:

In terms of real-time feedback:

(1) Feedforward is very important: it reduces the number of fixation points and improves performance.

(2) Straight lines are drawn more accurately with point-like feedforward, while curves are drawn more accurately with line-like feedforward.

(3) When the user’s gaze deviates from the target, or when the points in point-like feedforward are too few to describe the features of the gaze gesture, the user automatically corrects the input by adding fixation points.

In terms of implicit interaction:

(1) Performing gaze gestures easily leads to visual fatigue, which tends to cause fuzzy, low-accuracy input.

(2) When the input area is smaller than 5°, the drawing accuracy of gaze gestures degrades and satisfaction is very low.

(3) When the feedforward information is insufficient, the user’s input becomes ambiguous.

In terms of input continuity:

(1) In gaze gesture interaction, the positioning accuracy of horizontal saccades is lower than that of vertical saccades, and the positioning accuracy of left-to-right saccades is better than that of right-to-left saccades.

(2) When the input area is larger than 5°, the performance and satisfaction of square input are generally better than those of circular input.

(3) As the input area size decreases, the number of fixation points decreases significantly and the performance of gaze gesture interaction improves significantly. However, the eye movement speed (the mean eye movement angle per second) continues to decline. Eye movement speed reflects the degree of fatigue to a certain extent: the faster the eye moves, the less effort the user spends concentrating on the task, and the lower the user’s fatigue. When the input area is larger than 15°, the user’s eye movement speed is high and no obvious fatigue is caused. Satisfaction is higher when the gaze gesture interaction area is 10°–28°.

Based on this research on the spatiotemporal characteristics of gaze gesture interaction, the following reference points for gaze gesture interaction design are obtained:

(1) With regard to the accuracy of gaze gestures, vertical saccades are more accurate than horizontal saccades, and the positioning accuracy of left-to-right saccades is better than that of right-to-left saccades.

(2) Gestures consisting only of straight lines should be adopted as far as possible in gaze gesture design, and feedforward with sufficient information should be used to guide users.

(3) When gestures containing curves have to be drawn, line-like feedforward or point-like feedforward with sufficient information should be used to reduce the user’s cognitive burden.

(4) It is not advisable to set the gaze gesture interaction area smaller than 5°, and gaze gesture interaction tasks should be designed between 10° and 28° as far as possible.
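The size and feedforward recommendations above can be encoded as a small design-time check. This is only a sketch of how a designer might apply them: the function names and the returned labels are hypothetical, and sizes between 5° and 10° or above 28° are labeled as having no explicit guidance because the reference points do not rule on them.

```python
def assess_input_area(size_deg):
    """Classify an input-area size (degrees of viewing angle) per the guidelines."""
    if size_deg <= 0:
        raise ValueError("size_deg must be positive")
    if size_deg < 5:
        return "not advisable"
    if 10 <= size_deg <= 28:
        return "recommended"
    # The reference points give no verdict for 5-10 degrees or above 28 degrees.
    return "no explicit guidance"


def preferred_feedforward(has_curves):
    """Feedforward type that the conclusions associate with better accuracy."""
    return "line-like" if has_curves else "point-like"


if __name__ == "__main__":
    for size in (3, 7, 15, 28):
        print(size, assess_input_area(size))
    print(preferred_feedforward(has_curves=True))
```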

Interviews with the participants revealed one reason why gaze gesture input performance was affected: once users became aware that they were using gaze gestures for input, they tensed up. In pursuing input accuracy, the user’s eye fatigue increased sharply and the psychological pressure doubled, which in turn affected the performance of continuous gaze gesture input.

Based on the experimental results, the relationship between the degree of eye load and the distribution of attention is roughly as shown in Fig. 16. The Y-axis represents load, including physiological load and psychological load. When the user is making unconscious eye movements, the load is very low and is not affected by the complexity of the eye movement. When the user is in the subconscious eye movement area, the load is affected by the complexity of the eye movement. The X-axis represents attention resources. When users realize that they are using gaze gestures for input, they begin to allocate attention resources, and the allocation is affected by the complexity of the eye movement. The unconscious eye movement area, which is the user’s natural eye movement area, is usually used as the data source for implicit eye movement input in human-computer interaction. The subconscious eye movement area is usually used as the data source for explicit input in human-computer interaction.

Fig. 16. Relationship between attention distribution of gaze gesture input and load

Not all eye movements are suitable as input instructions. The blue line in Fig. 16 represents the appropriate input state. In specific applications, the principle of attention distribution should be followed and the complexity of input symbols should be considered carefully.

Much work remains to be done in research on gaze gesture input and eye movement interaction interfaces, and the work in this paper is only a small step. Inspired by the results of this paper, several clear directions are worthy of further research:

(1) Input performance can be studied when the input area is larger than 30° (outside the comfort zone of eye rotation).

(2) It is worth exploring whether a feedforward type with better performance exists for input symbols such as circles.

(3) A “complexity” metric for eye movements can be defined, together with the range of complexity that is appropriate for explicit eye movement input.