1 Introduction

Many systems have been studied for enhancing a viewer’s experience of and interest in digital content. Dive into the Movie [1] is a system that embeds viewers’ faces into characters in a movie, so they can feel as if they were a character in the movie. This system succeeds in amplifying the viewer’s interest and viewing experience. On the other hand, 3DTV and head mounted displays (HMDs) have the potential to enhance the degree of immersion offered by digital content. However, these systems are difficult to generate content for and have demanding hardware requirements. We thus aim to enhance the viewer’s experience of and interest in existing digital content without him or her having to expend time and effort.

Here, a gaze point is regarded as accurately representing the viewer’s interest. In addition, a person’s visual field has two parts: central and peripheral vision. Central vision clearly perceives the gaze object, while peripheral vision perceives the periphery of the field of view vaguely [2]. Accordingly, we can expect that the viewing experience can be enhanced by presenting digital content in a way that takes into account the properties of central and peripheral vision. Okatani et al. [3] developed a gaze-reactive display that changes a viewer’s impression of a photograph by presenting a blur effect on it depending on the gaze point detected by an eye tracking device. However, their method requires preparing blurred images for each part of the target photograph in advance, which also requires time and effort. In addition, this method is difficult to apply to video clips because it requires a blurred image to be generated for each position in each frame. Moreover, the resolution of the gaze point is limited.

In this study, we focused on the characteristics of central vision, peripheral vision, and gaze and devised a method to enhance viewers’ experience of digital content by dynamically superimposing a simple blur effect on it depending on the gaze point.

To realize this system, we expanded the versatility of the gaze-reactive display [3] by monitoring a viewer’s gaze point and by superimposing blur effects surrounding the gaze point on the digital content in real time by using OpenGL Shading Language (GLSL). Here, the parts on which the user’s eyes are focused remain clear, while the parts not focused on become more blurred the farther away they are from the gaze point (Fig. 1). In our method, users do not need to prepare blurred images in advance and can enjoy not only still images but also videos and video games with blurred effects. By applying blur effects, we aim to expand and amplify impressions such as presence or immersion.

Fig. 1. Examples of adding blur effect by our method.

We implemented a prototype system that superimposes blurring using a Gaussian filter on arbitrary moving images and experimentally evaluated its usefulness. We also experimentally clarified for which types of video the proposed method is effective.

2 Related Work

Many researchers have attempted to amplify interest in content by superimposing effects on it. As mentioned above, the gaze-reactive display of Okatani et al. [3] dynamically changed the depth of the focus point of a photograph depending on the viewer’s gaze point detected by an eye tracking system and its depth information. Changing the depth of focus enhances the user’s experience. However, up to 256 blurred images had to be generated in advance for the user to experience content on their gaze-reactive display.

Other systems amplify interest, stereoscopic effect, and sense of reality by changing the blurring degree (depth of field) of the virtual world depending on the gaze point [4, 5]. However, these systems only focus on the virtual world and lack versatility because they need the data of the cameras used for rendering the content. In contrast, our method expands the viewing experience by superimposing simple blur effects on arbitrary video content. Furthermore, Okatani et al. [3] reported that viewers notice the artificiality when there is a slight delay between the movement of the gaze point and the movement of the blurred point of the content. To address this problem, we use GLSL, which can process images with minimal delay.

Hirai et al. proposed the VRMixer system [6], which increases the fun provided by video contents by projecting real user images into them by using a depth camera. Kagawa et al. [7] developed a method that enables users to add emotion illustrations like hearts, called “intuitive emoji comment,” to video content and showed its usefulness in an evaluation. However, these methods superimpose the effects directly on the video content, which impairs its visibility. Our method keeps visibility as high as possible by reproducing the blurring condition in the viewer’s peripheral visual field on the display.

Hata et al. [8] revealed that the gaze point unconsciously moves from low-resolution areas to high-resolution areas in an image. They developed a gaze control system that changes the resolution of each part of content in an image. They also found that the time required for visual guidance depends on the intensity of blurring. Their method effectively uses the features of central and peripheral vision. This knowledge can also be used for our method, and blurring is thought to be able to enhance the viewers’ ability to concentrate.

3 Prototype System

Our work is intended to easily enhance the experience of digital content such as the feeling of realism and immersion by adding a blur effect to the content surrounding the central visual field.

Because peripheral vision perceives things only faintly and processes visual information unconsciously, we focus on the difference between central vision and peripheral vision and develop a new method for viewing digital content. When a user views digital content by using our method, the central visual field, whose position is detected by the eye-tracking system, remains clear, and the peripheral visual field becomes more blurred the farther it is from the center. This effect emphasizes the centrally viewed part of the content. Figure 2 shows example images of the system in action. In the figure, the blur effect is superimposed on the light blue part and not on the clear elliptical part.

Fig. 2. Images of our method that blurs the peripherally viewed part of the content.

We use a Gaussian filter for this process. An image is processed by increasing the weight of the filter with distance from the gaze point acquired by the eye-tracking device, so that the blurring is stronger farther from the gaze point (Fig. 3). In particular, the weight is increased in four stages to change the blur level depending on the distance from the gaze point.

Fig. 3. Blur level is changed depending on the distance from the gaze point.
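The four-stage scheme can be illustrated with a minimal Python sketch. It is not the prototype's GLSL code: the function name `blur_level`, the semi-axis parameters, and the choice that each ring is one semi-axis wide are all assumptions made for illustration, since the text does not give the ring boundaries.

```python
import math

def blur_level(px, py, gaze_x, gaze_y, semi_x, semi_y, levels=4):
    """Return 0 inside the clear central ellipse and 1..levels for the rings outside it."""
    # Normalised elliptical distance: 1.0 lies exactly on the edge of the clear ellipse.
    d = math.hypot((px - gaze_x) / semi_x, (py - gaze_y) / semi_y)
    if d <= 1.0:
        return 0
    # Assumption: every further unit of elliptical distance moves one stage outward.
    return min(levels, math.ceil(d - 1.0))

# Hypothetical gaze point (960, 540) and central ellipse of 300 x 200 pixels.
for point in [(960, 540), (1400, 540), (1900, 1000)]:
    print(point, blur_level(point[0], point[1], 960, 540, 300, 200))
```

Dividing each axis by its own semi-axis is what makes the regions elliptical rather than circular; a fragment shader could evaluate the same expression per pixel.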

The shapes of the regions with different blur levels must be elliptical to match the human visual field, so the prototype system has to give a weight to each point of the image by using a filter [3]. Here, we let \( \sigma \) be the weight of the filter, \( (x, y) \) be arbitrary coordinates of the video contents, and \( c \) be the RGB value before the Gaussian filter is superimposed on the target pixel.

When the length of one side is \( k \) [pixels], the range to which the filter is applied is a square of \( k \times k \) pixels centered on the target pixel, and the RGB value \( C \) after superimposing the Gaussian filter on the target pixel is derived from Eqs. (1) and (2).

$$ f(x,y) = \frac{1}{2\pi\sigma^{2}} \exp\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right) $$
(1)
$$ C = \sum_{y=1}^{k} \sum_{x=1}^{k} c \cdot f(x,y) $$
(2)

The blur can be increased by increasing the value of \( \sigma \) in Eq. (1). Here, our method manages the color information by using the RGB value. If the value exceeds a certain threshold, the brightness of the pixel is decreased and the image turns black. Therefore, we set \( \sigma \) such that it does not exceed a predefined upper limit. In addition, \( k \) [pixels], which is the length of one side of the filter, increases the farther it is from the gaze point.
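As a rough illustration of Eqs. (1) and (2), the following Python sketch builds a \( k \times k \) kernel and applies it to one channel of one pixel. Several points are assumptions rather than details from the prototype: the kernel is sampled on a grid centered on the target pixel, \( k \) is taken to be odd, the weights are normalized to sum to 1 (the text does not state this; normalization is one common way to keep brightness constant), image edges are clamped, and the values in `filter_params` are hypothetical.

```python
import math

def gaussian_kernel(k, sigma):
    """k x k weights from Eq. (1), sampled on a grid centred on the target pixel (k odd)."""
    half = (k - 1) / 2.0
    weights = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
                / (2.0 * math.pi * sigma ** 2)
                for x in range(k)] for y in range(k)]
    # Assumption: normalise so the weights sum to 1 and a flat area keeps its brightness.
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def blur_pixel(channel, px, py, k, sigma):
    """Weighted sum of Eq. (2) for one colour channel of the pixel at (px, py)."""
    kernel = gaussian_kernel(k, sigma)
    height, width = len(channel), len(channel[0])
    half = k // 2
    acc = 0.0
    for j in range(k):
        for i in range(k):
            # Clamp sample coordinates at the image border (an assumption).
            sx = min(max(px + i - half, 0), width - 1)
            sy = min(max(py + j - half, 0), height - 1)
            acc += channel[sy][sx] * kernel[j][i]
    return acc

def filter_params(level, sigma_step=1.5, sigma_max=6.0):
    """Hypothetical mapping from a blur level (1..4) to (k, sigma)."""
    sigma = min(level * sigma_step, sigma_max)  # sigma never exceeds the upper limit
    k = 2 * level + 1                           # odd side length that grows with the level
    return k, sigma
```

For example, `blur_pixel(channel, 10, 10, *filter_params(2))` would blur the pixel at (10, 10) with a 5 × 5 kernel. A per-pixel loop like this is far too slow for video in Python; it only mirrors the arithmetic that a GPU shader would perform in parallel.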

By superimposing such effects, we reproduce a “human” field of view that is close to reality on the display. The user is expected to be able to obtain a stereoscopic effect and feeling of immersion.

3.1 Implementation Method

We implemented our system by using Processing and GLSL. Our system consists of an eye-tracking module and a content presentation module. The eye-tracking module monitors the gaze point of a user by using Tobii EyeX and transmits the detected gaze point data to the content presentation module. The content presentation module superimposes the blur effect on the currently presented content using GLSL on the basis of the gaze point data. GLSL is a shading language that runs on the graphics processing unit (GPU), so it can process images at high speed. As a result, our method minimizes the delay in updating the video information and reproduces the human visual field with high accuracy.

Incidentally, there is a phenomenon called saccade, in which the gaze point moves frequently because of minute eye movements that the eyes always perform involuntarily. If the system reflects these movements when drawing effects, the effects become an obstacle to viewing the content. Therefore, when the variation of the gaze position is less than a certain threshold, the gaze information is not updated. Since the human visual field is elliptical [3], the range of the effect is changed in an elliptical shape in accordance with the distance from the gaze point (Fig. 4).
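The threshold rule for suppressing small gaze movements can be sketched as follows. The class name `GazeFilter`, the use of Euclidean distance in pixels, and the threshold of 20 pixels are assumptions for illustration; the text does not specify how the variation is measured or which threshold was used.

```python
import math

class GazeFilter:
    """Keeps the last accepted gaze point and ignores movements below a threshold."""

    def __init__(self, threshold_px):
        self.threshold_px = threshold_px
        self.current = None  # no gaze sample accepted yet

    def update(self, x, y):
        """Feed one raw gaze sample and return the gaze point used for drawing."""
        if self.current is None:
            self.current = (x, y)
        else:
            cx, cy = self.current
            # Only move the blur centre when the gaze has shifted far enough.
            if math.hypot(x - cx, y - cy) >= self.threshold_px:
                self.current = (x, y)
        return self.current

gaze = GazeFilter(threshold_px=20)  # hypothetical threshold
for sample in [(100, 100), (105, 103), (200, 100)]:
    print(sample, "->", gaze.update(*sample))
```

The second sample moves less than the threshold, so the blur center stays where it was; the third sample is accepted as a new gaze point.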

Fig. 4. Examples of effect superimposition.

Figure 4 shows (a) original images, (b) original images with blur effects, and (c) original images with blur effects and boundary information where the weight changes to clarify the boundary of the blur. In (c), the central ellipse is not superimposed with the effect, but the farther out the ellipse is, the stronger the blurring becomes.

4 Preliminary Experiments

To clarify how the viewing experience of still images and videos is enhanced by superimposing blurring, we investigated the difference in subjective impressions between viewing digital content with and without our method. Here, we also investigated physiological impressions of effects such as visibility and discomfort.

4.1 Preliminary Experiment on Image Content

First, we clarified whether the viewing experience of contents is enhanced by superimposing blur using a Gaussian filter on the image content in the peripheral visual field. Images for the experiment were selected subjectively by us. The images were classified into video-game content, landscapes, paintings, illustrations, and geometric patterns. We selected 30 images for evaluation and asked ten university students aged 18–21 years old to participate in the experiment.

The experiment procedure was as follows. First, we asked each participant to sit in front of the display and adjust the Tobii EyeX. The distance at which Tobii EyeX succeeded in adjusting the gaze point was set as the distance between the display and the user. The position of the display was adjusted so that the viewing angle of the center field of view was 10° above and below and 15° to the left and right.
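The relation between these viewing angles and the on-screen size of the central region follows from simple trigonometry, as in the sketch below. The viewing distance and pixel density in the usage lines are assumed values chosen only for illustration; the text reports neither.

```python
import math

def central_ellipse_px(distance_mm, px_per_mm, half_angle_h_deg=15.0, half_angle_v_deg=10.0):
    """Semi-axes (in pixels) of the region covering the given half-angles at a viewing distance."""
    # Half-extent on the screen = distance * tan(half-angle), then converted to pixels.
    semi_x = distance_mm * math.tan(math.radians(half_angle_h_deg)) * px_per_mm
    semi_y = distance_mm * math.tan(math.radians(half_angle_v_deg)) * px_per_mm
    return semi_x, semi_y

# Assumed values: 600 mm viewing distance and about 3.2 pixels per mm.
semi_x, semi_y = central_ellipse_px(600.0, 3.2)
print(round(semi_x), round(semi_y))
```

Because the horizontal half-angle is larger than the vertical one, the clear region is wider than it is tall, which matches the elliptical shape used by the prototype.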

Next, we informed the participants that they were going to view image contents from now on and told them to look at the whole image. Here, we taught them how to turn the blur on and off. They viewed the 30 original images on a 27-inch display (Fig. 5) and turned the blur on and off arbitrarily while viewing images. After the viewing, we verbally asked them which images gave them different impressions when the blur was used.

Fig. 5. Scene of experiment.

As a result of this experiment, we found that our method was effective on the video-game content and landscape images but not on the paintings or geometric patterns. The reason for this is that the Gaussian filter does not work well in a single-color area.

4.2 Preliminary Experiment on Video Content

Since the prototype system worked effectively on some types of image content, we thought that the impressions of the user might change even for video content. Therefore, we prepared 12 videos and conducted an experiment by preparing an impression evaluation questionnaire to find out whether the impression is enhanced when the blur is superimposed on the peripheral visual field.

In this experiment, we prepared two types of video clip: video games and actual landscape videos. The video content lasted from 30 s to 2 min 30 s. As an experimental procedure, Tobii EyeX was adjusted as in Subsect. 4.1.

Next, we asked four participants (male and female university students aged 18 to 21 and divided into two groups of two) to view 12 videos on a 27-inch display. Then we asked them to answer the questionnaire after viewing each video. After they answered the questionnaire, we asked them to press the enter key to go to the next video. Here, one group viewed the videos with the blur, and the other group viewed videos without it. In addition, both experimental groups viewed the same videos in the same order.

The questionnaire consisted of eight items about emotions (pleasure, excitement, relief, disgust, surprise, frustration, fear, and interest) selected from Plutchik’s Wheel of Emotions, eight items about psychological impressions that we expected the blurring to give (stereoscopic effect, immersion, presence, tension, exhilarating feeling, feeling of freedom, sense of stagnation, and feeling of stir), and eight items about physiological impressions (discomfort, visibility, concentration, flickering, botheration, blur, motion sickness, and feeling strange). We asked the participants to score all items on a 7-level Likert scale.

The results showed that the score for each type of item increased for both types of video content. However, our system did not work well in videos where the similarity of each frame is very high, which coincides with the results of the preliminary experiment performed on image content.

Also, the emotions and impressions of participants changed depending on the length of the video contents, which suggests that videos of the same length should be used to test the system.

5 Experiment

We redesigned the experimental test on the basis of the results described in Section 4 in order to clarify the effectiveness and usefulness of our method for videos and clarify how it enhances the viewing experience.

5.1 Experimental Content

We prepared videos of retro video games (Retro-Game), new video games (New-Game), and actual landscapes (Landscape). We prepared four videos for each category. We omitted videos with few image changes on the basis of the results in Subsect. 4.2. Each video lasted 1 min 30 s. The viewing environment of the video content was the same as that described in Subsect. 4.2.

The participants were eight male and female university students aged 18–22 years old and divided into two groups of four.

The questionnaire included seven items about emotions (relief, pleasure, disgust, excitement, surprise, frustration, and interest), four items about psychological impressions (stereoscopic effect, immersion, presence, and tension), and three items about physiological impressions (comfort, visibility, and concentration).

The items were scored on a five-level Likert scale with a maximum value of 2 and a minimum value of –2. Taking comfort as an example, students answered whether they felt [2] comfortable, [1] somewhat comfortable, [0] neither comfortable nor uncomfortable, [–1] somewhat uncomfortable, or [–2] uncomfortable. Emotion items were selected from Plutchik’s Wheel of Emotions, and psychological and physiological impression items were selected on the basis of the experimental results in Subsect. 4.2. The experiment procedure was the same as that of the experiment in Subsect. 4.2.
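Averaging such scores per video category, viewing condition, and item could be done as in the following sketch. The function name `mean_scores` and the tuple layout are assumptions, and the sample responses are invented purely to show the data shape; they are not experimental data.

```python
from collections import defaultdict
from statistics import mean

VALID_SCORES = (-2, -1, 0, 1, 2)  # five-level Likert scale

def mean_scores(responses):
    """Average scores grouped by (category, condition, item).

    responses: iterable of (category, condition, item, score) tuples.
    """
    buckets = defaultdict(list)
    for category, condition, item, score in responses:
        if score not in VALID_SCORES:
            raise ValueError(f"score out of range: {score}")
        buckets[(category, condition, item)].append(score)
    return {key: mean(values) for key, values in buckets.items()}

# Invented sample responses, for illustration only.
sample = [
    ("Landscape", "blur", "immersion", 1),
    ("Landscape", "blur", "immersion", 2),
    ("Landscape", "no blur", "immersion", 0),
    ("Landscape", "no blur", "immersion", -1),
]
for key, value in mean_scores(sample).items():
    print(key, value)
```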

5.2 Experimental Results

Figures 6, 7, and 8 show the average values of emotions, psychological impressions, and physiological impressions evaluated by participants for each of the three video categories.

Fig. 6. Results for emotion items.

Fig. 7. Results for psychological impression items.

Fig. 8. Results for physiological impression items.

Figure 6 shows that disgust, surprise, and frustration were scored negatively in all categories. Also, the scores for disgust and frustration were higher for videos with the blur than for those without it. Evaluation values of pleasure were positive in all categories and increased with the blur in Retro-Game and New-Game.

Figure 7 shows videos without blur were scored negatively in each category for every item. On the other hand, videos with blur had more positive values than videos without it for almost all items. Immersion was scored positively in all categories.

Figure 8 shows that videos without blur were scored as having higher comfort and visibility than videos with it. These results indicate that our current method sometimes gives viewers negative physiological impressions.

5.3 Considerations

The experimental results showed that the participants’ impressions of the video content were changed by superimposing the blurring effect on the peripherally viewed area. In particular, almost all psychological impression items were scored higher when blur was superimposed than when it was not. These results clarify that our system is useful in all three categories.

In Fig. 8, landscape videos with blur scored high for concentration. This result coincides with those of Hata et al. [8]. In other words, our method achieves visual guidance because the centrally viewed area remains high resolution and the peripherally viewed area has low resolution. As a result, the gaze point is focused in this central visual field, which may increase the degree of concentration. Therefore, our method can be said to increase the degree of concentration for viewers of landscape videos.

On the other hand, since videos with blur scored lower for comfort and visibility in all the categories, our method seems to decrease the visibility of content. In this experiment, we set the range of the weight of the Gaussian filter to a range where the brightness of the image does not decrease. However, in the future, it will be necessary to verify the range of weights that do not impair visibility while enhancing the viewing experience. Also, none of the participants said that they noticed a delay, so our method seems to meet the requirement for no delays.

As well as videos of video games and landscapes, we aim to further narrow down the videos for which our method is useful by conducting experiments on animations, special effects videos, etc.

The questionnaire on physiological impressions showed that the degree of concentration was higher when participants viewed landscape videos with the blur superimposed. Therefore, as an additional experiment, we investigate the relationship between the degree of concentration and the gaze point by logging the gaze point while videos are viewed using this prototype system.

5.4 Additional Experiments on Concentration and Gaze Point

To clarify whether or not the gaze point moves differently depending on the presence or absence of the blur, we prepared three new videos, one for each category (Retro-Game, New-Game, and Landscape). In addition, we implemented a system that logs the viewer’s gaze point in order to analyze the viewer’s behavior.

Participants were eight university students aged 18–21, divided into two groups of four. The viewing environment and procedure were almost the same as those in Subsects. 4.2 and 5.1, but because there was no questionnaire to answer, participants took a break for 10 s after viewing a video and then moved on to the next video. The measured gaze point logs are shown as heat maps for each video with and without the blur (Fig. 9).

Fig. 9. Heat maps showing gaze point log.
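A heat map of this kind can be produced by binning logged gaze points into a grid, as in the sketch below. The grid size, the function names, and the `dispersion` measure (mean distance from the centroid) are assumptions added for illustration; the text does not describe how the heat maps were generated or how dispersion was judged.

```python
import math

def gaze_heatmap(points, width, height, cols, rows):
    """Count gaze samples per cell of a rows x cols grid covering the screen."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Samples outside the screen (e.g. tracking loss) are skipped.
        if 0 <= x < width and 0 <= y < height:
            col = int(x * cols / width)
            row = int(y * rows / height)
            grid[row][col] += 1
    return grid

def dispersion(points):
    """Mean distance (in pixels) of the gaze samples from their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)

# Hypothetical log on a 1920 x 1080 screen, binned into a 4 x 3 grid.
log = [(960, 540), (970, 530), (100, 100), (1900, 1000)]
for row in gaze_heatmap(log, 1920, 1080, cols=4, rows=3):
    print(row)
```

Comparing a single number such as `dispersion` between the two viewing conditions would complement a visual comparison of heat maps.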

The results show no large differences between Retro-Game and New-Game videos with and without blur. However, viewers tended to concentrate more on the center of the Landscape video without the blur (2) than that with it (1). In some cases, the gaze points tended to be slightly dispersed.

According to the visualized gaze point log data obtained in the additional experiments and the questionnaire results, the degree of concentration is increased by using this prototype system when viewing actual landscape videos, and the gaze points tend to be somewhat dispersed. In other words, the increased degree of concentration led to viewers paying attention to a wider range of the video. However, this result may not be reliable since the videos used in the additional experiment consisted of only one video per category and differed from the videos used in the main experiment. Therefore, we aim to improve the reliability by repeating the experiment of measuring the gaze point log and increasing the number of videos for each category used. Furthermore, we aim to measure the gaze point log when other types of videos are viewed and clarify the degree of concentration and its relationship with other psychological and physiological impressions.

6 Conclusion

We investigated whether a viewer’s experience of video content is enhanced by superimposing blurring using a Gaussian filter on the peripherally viewed area. In addition, we evaluated the method’s effectiveness in a user-based evaluation experiment and clarified its usefulness and problems.

In this work, blurring effects were superimposed on video content. In the near future, we will clarify whether the viewing experience is enhanced when blur is superimposed on the screen while the viewer is actually playing a video game. Moreover, we will investigate the threshold of the Gaussian filter at which the blur is not noticed and attempt to develop a new system that does not impair visibility while enhancing the viewing experience. Furthermore, since the results in Subsect. 4.1 showed that our method is useful for some types of still images, we expect that it can be usefully applied to e-books.

This system adopts a Gaussian filter as a blurring effect and superimposes it on the video content, but its usefulness is low for video content that does not change much. Therefore, we are planning to implement a new effect for decreasing saturation and brightness in areas away from the center of the acquired gaze point data. Sensitive perception of light flicker is a characteristic of peripheral vision [3]. Michael et al. [9] investigated the effect of color perception of the user in an environment where the background and the object color change as a result of movement of the gaze point and in an immutable environment. By changing the color of the background and the object, they showed that the method extended the region of a color perceivable by a person and improved viewers’ ability to discriminate objects. Therefore, decreasing saturation and brightness is expected to enhance the viewing experience. Also, their system seemed to work well for videos with few changes. Finally, we plan to analyze our system in more detail by increasing the number of research participants and acquiring more data.