1 Introduction

Information has become an important factor in making our everyday lives efficient and comfortable, because access to it augments human behavior. The spread of mobile devices has made accessing digital information easier. In the latter half of the 1990s, mobile phones became widespread in Japan, and in the 2000s, mobile phone penetration exceeded 50% [12]. In addition, in 1999, NTT Docomo started a service called “i-mode” that allowed mobile phones to connect to the Internet. At that point, the mobile phone began to play an essential role in accessing digital information in the real world. After the Apple iPhone appeared in 2007, mobile phones became more like generic digital devices than just voice phones. As a result, people now have access to information through a mobile phone anytime and anywhere.

To access the information provided by mobile devices, people usually need to activate the device and open a target application. This procedure is currently standard; however, it is changing with the progress of augmented reality (AR) technologies. In particular, the concept of Mixed Reality (MR), which Microsoft HoloLens (footnote 1) uses, seamlessly incorporates digital objects into people’s real lives. If wearable smart glasses of the kind that Google Glass (footnote 2) tried to offer become widely available, people will be able to utilize digital content more easily simply by wearing the glasses. In addition, contact-lens-type AR devices are currently being developed [7]. With the progress of such devices, people will be able to interact with digital content in their everyday lives more seamlessly. Current mobile phones may change to other forms and play different roles. Moreover, the mobile phone may disappear from our casual daily activities in the near future, just as e-mail is being replaced with casual social communication tools like Slack (footnote 3).

In this paper, we investigate an interaction method for accessing digital information more ambiently in an age when smart glasses are becoming widely used. With such devices, people can interact with digital information without holding a device, so they can acquire information more naturally and with less effort. We discuss accessing information by implementing and evaluating interaction methods in Ambient Bot. In information systems before the AR era, information was typically shown on a display, and interaction was artificial, relying on devices such as a keyboard and mouse that are not typically used in everyday physical space. In contrast, Ambient Bot is designed to let users access information more naturally by using eye contact to replicate human nonverbal communication. An approach that utilizes nonverbal communication may fill the gap between the artificial digital world and the real physical world. If the digital world feels more realistic, information technology can spread more widely. As a result, an information system will help people navigate easily and acquire more useful knowledge through natural interaction.

The structure of this paper is as follows. In Sect. 2, we give an overview of Ambient Bot. Section 3 introduces the basic design of two kinds of interaction methods (the pull-based and push-based interaction methods) for accessing information. In Sect. 4, we present the design of the push-based interaction method and its evaluation in Ambient Bot. In Sect. 5, we investigate the feasibility of using Ambient Bot to provide serendipitous information. In Sect. 6, we present other work related to this study, and finally, we summarize this paper in Sect. 7.

2 An Overview of Ambient Bot

Ambient Bot offers an agent that exists ambiently around a user and uses AR technologies to provide information that the user wants to know without interfering with his/her current activities in everyday life. The aim of Ambient Bot is to offer implicit and low-cost interaction that makes it easy to access necessary information even while users are going about their everyday lives, for example, when they are in public places such as stations or walking down the street.

Ambient Bot allows users to access information when needed by simply making eye contact with virtual creatures that are always floating around them. Eye contact does not support complicated interaction for accessing information, but it can initiate implicit and natural interaction without an explicit action such as operating a controller device. To provide information to the user, a creature shows a content article in a translucent window in the real world and reads it aloud, as shown in Fig. 1. Ambient Bot requires people to wear a lightweight head-mounted display (HMD) that shows the real world through a camera attached to the HMD, so that they can see a virtual creature in the real world through augmented reality technologies. While there is no eye contact, the creature automatically moves to a position that does not interfere with the user’s view, so he or she is not strongly conscious of the existence of the creature. In the current version of Ambient Bot, we chose a floating creature similar to a jellyfish, as shown in Fig. 1. Since animated characters are popular in animations and games, especially among young people, this approach is not unnatural for them as a means of accessing information.

Fig. 1. A screenshot of Ambient Bot presenting news

The original Ambient Bot described in [2, 5] supports only the pull-based interaction method that will be defined in the next section.

3 Designing the Interaction Modality

In this section, we describe the interaction modalities that are used in recent digital services. Then, we define the push-based and pull-based interaction methods and show how the methods are used in Ambient Bot by presenting a scenario.

3.1 Interaction Modality

Digital services need to offer interaction methods for accessing them, and their designers need to choose appropriate input and output modalities for those interaction methods. For example, in a modern GUI environment, a user manually operates a mouse and a keyboard as inputs to a service, and the service presents information as output through a display device. With the progress of information technologies, modalities are diversifying, and we can now consider other options. In particular, new input modalities such as tangible devices, sounds, gestures, and eye gaze have appeared. In contrast, the output modalities have not changed significantly, and mainly visual and audio methods are used for presenting information. Although tactile and olfactory outputs have appeared to enhance people’s user experiences, we do not consider those approaches in this paper because those technologies are not yet mature enough to be used for general information access.

Table 1 shows the categorization of input and output modalities. Current digital services use modalities belonging to these categories. For example, the standard interaction of a smartphone mainly uses the touch - vision modality. In recent years, voice input has become popular because speech recognition technologies have progressed rapidly. As a result, smart speakers such as Google Home (footnote 4) and Amazon Echo (footnote 5) have appeared and come into wide use. Smart speakers typically use the voice - audio modality. This discussion is also necessary for AR technologies, which have become popular recently. In this paper, we discuss how Ambient Bot can support the push-based interaction method. Because Ambient Bot adopts AR technologies to access information, investigating the design space of input and output modalities in Ambient Bot offers useful insights for choosing appropriate modalities in future AR environments.

Table 1. Input and output modalities

3.2 Pull-Based and Push-Based Interaction Methods

We consider two methods for people to access information: the pull-based interaction method and the push-based interaction method. In the pull-based interaction method, a user actively accesses information that he/she wants to know. In the push-based interaction method, information is actively provided to a user, and the user accesses the information passively. The push-based interaction method includes not only notifications but also information received by accident. For example, people may acquire knowledge about new furniture via advertisements when they walk down the street. Such interaction is also classified as push-based interaction because people acquire the information passively from the outside world, and they may not expect to receive the information.

These two methods may be used separately or combined. For example, using a weather forecast application is close to a pure pull-based interaction because a user usually has a clear intent to access the information. On the other hand, an SNS (Social Networking Service) has features of both pull-based and push-based interactions. When watching the timeline of an SNS, the purpose of accessing the SNS information is ambiguous; however, a user is willing to acquire various pieces of information published by followers of the SNS. As these examples show, information access cannot simply be divided into pull-based and push-based interaction methods; rather, it forms a spectrum that depends on how concretely a user knows what information he/she needs to receive.

Recently, push-based notifications on smartphones were studied [3] and an ambient notification method using an eyeglass device was reported [4]. In this paper, our focus is to investigate the design space to use interaction modalities in the push-based and pull-based interaction methods in future AR environments.

3.3 Scenario Demonstration

This section introduces a scenario that explains the pull-based and push-based interaction methods using Ambient Bot. Figure 2 shows how the two interaction methods are used in Ambient Bot.

Fig. 2. The pull-based and push-based interaction method in Ambient Bot

“Satoshi is 27 years old. He works at an IT company in Tokyo. His parents’ house is in Kyoto, and now he lives by himself.

One day in October, he woke up at 6 am as usual. He went to the washroom to wash his face and put on a pair of glasses. His glasses are smart glasses, a tool that provides various pieces of information and that became an explosive hit several years ago. Smartphones are still very popular, as most people use smart glasses and smartphones in combination. Although smart glasses are excellent as a display, their input methods are poor. For this reason, a smartphone is used for work requiring detailed input. When he put on the smart glasses, multiple virtual creatures floated in the real world. This application is called Ambient Bot, and he can receive several notifications, such as news and weather forecasts, through these creatures. He made eye contact with the creature that conveys the weather forecast. The creature said, “Today’s temperature is 16°. It will be a bit cooler than yesterday.” He thought, “Then, I’m going to wear my sweater.”

He went to his workplace by train. He had made it a habit to watch the news on the train every day. On this day, there was a feature about venture companies, and he thought that working at a venture company would be a tough job. However, he thought that a venture company might not be a bad place to work if there were time to go fishing on the weekends.

Satoshi arrived at work and started working. While he was concentrating on his work, Ambient Bot was not used. During his work that morning, he discovered a bug in the system he had developed the week before, and he worked on fixing it. He could not finish the bug fix before it was time for his lunch break. When he had finished his lunch and was drinking a coffee, an Ambient Bot creature suddenly came into his sight. It was a creature that provided information that someone had recommended. That creature gave a summary of a company manager’s interview. The manager said “When you are confronted with a problem, you don’t notice your surroundings”. Satoshi realized that he was in that situation right now. After he returned to work, he discovered that the bug was not a problem in his own development but was caused by another department.

The problem was solved, and he left the office. While he was going home as usual, an Ambient Bot creature came into his sight. The creature told him, “You are going to the gym at 20 o’clock today.” He had been busy with work, so he had forgotten that he had decided to go to the gym that day. Because the next station was the transfer station to the line that goes toward the gym, he changed trains at that station and went to the gym.”

4 Design and Implementation of the Push-Based Interaction Method in Ambient Bot

In this section, we show the design and implementation of the push-based interaction method in Ambient Bot. We also investigate its feasibility through a user study. As described in Sect. 2, the pull-based interaction method in Ambient Bot is explained and evaluated in detail in [2].

4.1 Design for the Push-Based Interaction Method in Ambient Bot

This section investigates how Ambient Bot provides information along the axes of visual and auditory cues. Since Ambient Bot aims to provide information naturally without increasing a user’s cognitive load, we need to find proper ambient information delivery methods for the AR environment. Therefore, the visual cue that notifies the user of ambient information is delivered by deliberately placing a virtual creature in the user’s view instead of explicitly displaying the information. Auditory cues, on the other hand, need to be designed so that they remain ambient when a user hears the sound.

When Ambient Bot wants to provide information, a virtual creature appears in the real space, but it is positioned almost at the edge of the user’s view; thus, the creature’s appearance does not consume too much of his/her attention (footnote 6). When he/she makes eye contact with the creature, the creature provides the information. The creature then moves out of his/her sight and disappears after providing the information. Based on this basic interaction design, we investigate the design space for presenting visual and auditory cues.
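The basic push-based flow described above can be read as a small state machine. The following Python fragment is only an illustrative sketch under that reading, not the HoloLens implementation; the names `CreatureState` and `PushCreature` and the method names are hypothetical and do not come from the prototype.

```python
from enum import Enum, auto


class CreatureState(Enum):
    HIDDEN = auto()        # no content pending; the creature is out of sight
    AT_VIEW_EDGE = auto()  # content pending; the creature waits near the edge of the view
    PRESENTING = auto()    # eye contact was made; the creature is providing the content


class PushCreature:
    """Hypothetical model of one creature's push-based life cycle."""

    def __init__(self):
        self.state = CreatureState.HIDDEN
        self.pending = None

    def content_arrived(self, content):
        # New content makes the creature appear at the edge of the user's view.
        if self.state is CreatureState.HIDDEN:
            self.pending = content
            self.state = CreatureState.AT_VIEW_EDGE

    def eye_contact(self):
        # Eye contact is the only event that starts the presentation.
        if self.state is CreatureState.AT_VIEW_EDGE:
            self.state = CreatureState.PRESENTING
            return self.pending
        return None

    def presentation_finished(self):
        # After providing the information, the creature leaves the user's sight.
        if self.state is CreatureState.PRESENTING:
            self.pending = None
            self.state = CreatureState.HIDDEN
```

In this sketch, eye contact with a creature that has no pending content returns `None`, which mirrors the idea that the creature stays unobtrusive until the user chooses to attend to it.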

4.2 A Prototype System

The prototype system, shown in Fig. 3, was implemented by enhancing the previous version of Ambient Bot [2, 5] on Microsoft HoloLens (see footnote 1), a platform that offers MR user experiences. Microsoft HoloLens offers an HMD that superimposes virtual information onto the real world. For the visual cues, we compare two design choices for indicating that Ambient Bot has information to share: the first is whether or not to show a notification icon above the creature, and the second is whether to indicate what type of content the creature provides to the user.

Fig. 3. Overview of the push-based interaction

We also prepared six sounds for the auditory cues. Three are traditional, inorganic, electronic sounds, and the other three are sounds that the creature might seem to generate naturally. Sound 1 is a comical sound that gradually rises in pitch; Sound 2 is a short electronic metallophone sound on the note C; Sound 3 is also a short electronic metallophone sound, but it is a chord composed of C, E and G; Sound 4 consists of three short marimba sounds; Sound 5 resembles the friction sound of paper; and Sound 6 consists of two short electronic sounds that rise sharply in pitch. A user hears these sounds when the creature appears in the real space. The user can also configure the device to receive information without the auditory cue. The content is read in the creature’s voice, and the user can choose whether or not to display the content in the message window.

In addition, we added a gesture recognition function to this version of Ambient Bot. When a user accesses information through Ambient Bot on his/her own initiative, making eye contact with a creature is a good approach because the user intends to access the information. However, when the information is actively presented to a user, the user may be doing another task. It might therefore be uncomfortable for the user to be interrupted in order to watch the creature and determine whether the information is currently relevant to him/her. Once the user makes eye contact with the creature, it continues to read the content article aloud until the end. Thus, we added a function to cancel speaking in the middle of reading. We decided to use Myo (footnote 7), which is a wearable gesture recognition device. This device makes it possible to use a user’s hand movements as input to Ambient Bot; gestures such as wave in, wave out, double tap, and fist are currently available. An important point in the basic design philosophy of Ambient Bot is that the interaction should be ambient and natural, so that a user can use Ambient Bot in various public spaces without annoyance. A user can stop the creature’s reading with a double tap gesture with his/her fingers. In addition, the user can cancel receiving the information with a fist gesture if he/she feels that the information is not of interest to him/her.
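The mapping from gestures to actions described above can be sketched as follows. This is a hedged illustration only: the gesture names are plain strings chosen for this example rather than identifiers from the Myo SDK, and the `Presentation` class and `apply_gesture` function are hypothetical. The sketch also assumes that a fist gesture stops the reading as well as dismissing the content.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    reading: bool = True     # the creature is reading the article aloud
    dismissed: bool = False  # the user declined the content


def apply_gesture(presentation, gesture):
    """Apply one recognized gesture; gestures without a mapping are ignored."""
    if gesture == "double_tap":
        # Double tap stops the creature's reading in the middle.
        presentation.reading = False
    elif gesture == "fist":
        # Fist cancels receiving the content altogether (assumed to stop reading too).
        presentation.reading = False
        presentation.dismissed = True
    return presentation
```

Keeping the mapping in one small function makes it easy to assign other gestures, such as wave in or wave out, to further actions without touching the presentation logic.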

4.3 User Study

A user study was conducted with nine participants (8 males and 1 female; average age: 22.7). Figure 4 shows screenshots of how Ambient Bot works in each step. In this user study, participants chose their preferred display modes for three typical content types: newsflashes, e-mails and the user’s schedule. For each type of content, the participants selected the sound cue, whether or not a notification icon is presented, and whether or not a message window is presented. The message window presents the content, and the content may also be read in the creature’s voice.
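The configuration space that each participant chose from can be written down as a small data structure. The sketch below is an assumption about how such settings might be represented, not the prototype's actual code; `CueConfig`, `CONTENT_TYPES` and `default_profile` are hypothetical names, and the default values are arbitrary placeholders rather than results from the study.

```python
from dataclasses import dataclass
from typing import Optional

# The three content types used in the user study.
CONTENT_TYPES = ("newsflash", "e-mail", "schedule")


@dataclass(frozen=True)
class CueConfig:
    sound: Optional[int]   # 1-6 selects one of the six prepared sounds; None disables the cue
    icon: bool             # show the notification icon above the creature
    message_window: bool   # show the content in the message window

    def __post_init__(self):
        if self.sound is not None and not 1 <= self.sound <= 6:
            raise ValueError("sound must be None or an integer from 1 to 6")


def default_profile():
    """Return one configuration per content type (placeholder values, not study data)."""
    return {kind: CueConfig(sound=1, icon=True, message_window=True)
            for kind in CONTENT_TYPES}
```

With six sounds plus the silent option, two icon settings and two window settings, each content type has 7 × 2 × 2 = 28 possible configurations in this representation.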

Fig. 4. Screenshots showing how the push-based Ambient Bot works

In this user study, we interviewed the participants and asked them why they chose their configurations. Table 2 shows the configurations selected by the participants. Regarding sound cues, eight participants chose to use a sound cue before information is presented; only participant D selected no sound cue, in the case of newsflashes. All participants claimed that the choice of sound cue depends on individual taste, and most participants did not use a treble sound to indicate that there is interesting content. Since all participants chose to use auditory cues, presenting a creature in their sight is apparently not enough of a cue to indicate that there is interesting content for them. However, participant D answered “I do not need to configure a sound for newsflash because the content may not be necessary for me.” In addition, participant D also said, “In the AR environment, using electronic sound cues is unnatural. I felt the sound from my mobile phone is natural because I implicitly understood that the sound is in cyber space, but the AR environment is closer to the real world, so I’d like to be notified with a more natural sound that exists seamlessly in reality.” In this user study, the participants felt inconvenienced because the auditory cues that we prepared were mostly electronic sounds.

Table 2. The results of the push-based interaction method

The message window setting depended on the type of content. Some participants always turned the window on, but others changed whether to display the message window according to the content. The reason most often given for turning the message window on was that participants might miss content delivered by voice only, and the reason for turning it off was that, for that content, it did not matter whether they missed it or not.

For e-mail content, all participants turned on the message window. According to the interviews, they did not want to miss the e-mail content. In addition, several participants said that they do not need a function to read e-mails aloud because e-mails are usually read with the eyes, not listened to. The interviews show that the participants felt that, when traditional content is shown to a user, the modality used in existing services should be kept even in novel services.

The icon was turned on by all participants except participant C. Several participants said, “The icon was turned on in order that Ambient Bot informed what information would be presented.” Participant C chose to display the icon only when the message window was not displayed. He commented in the interview “When the message window comes out, I can read the message quickly; thus, I can fully understand what content it is. In the case of only voice, the icon is displayed soon to see what content it is.” Hence, we understood through the interviews that all participants wanted to know at the beginning what type of content Ambient Bot wanted to present.

4.4 Analyzing the Results of the User Study

In order to investigate the methods for providing a user with interesting content, we conducted a user study using visual and auditory cues. As a result of the user study, both the sound and the icon were used in most cases for existing content. Participants preferred to use icons because they could identify what type of content Ambient Bot has before obtaining the actual content. The icon was also adopted because it does not disturb the participants’ views. The results indicate that combining the sound and the icon is effective at making users aware that there is content available. A user can predict what content is available by listening to an auditory cue before receiving the actual content. Similarly, a user can determine what types of content are available by looking at the icon.

The results of the interviews suggested that the modality of information cues should be designed according to the existing services’ modalities. For example, participants B, C, and E said, “Reading e-mails via voice is unnatural.” Because e-mails are usually read by a user’s eye, it is rare to listen to them by voice, so they felt that was strange. Therefore, the modality to offer information cues should keep the traditional style as much as possible. In the case of newsflashes, participants A and C wanted the content to be presented only via a voice. In particular, participant C commented “Hearing newsflash is like listening to a radio.” Because newsflashes are originally delivered by voice, we consider listening to that content by sound only as natural for most participants.

Although the message window has the advantage of letting a user grasp the content reliably and quickly, a message window in the AR environment may obstruct the user’s view, which makes the user uncomfortable. For example, participants A and B pointed out “The message window may interfere with my daily activities.” We therefore need to design carefully how a message window is presented. On the other hand, participants A, F, G and I commented “The auditory cue does not interfere with everyday life, although there is a risk of missing informed content”; thus, the designer needs to pay attention to this tradeoff.

Finally, we asked participants about interaction using gestures. All participants said “The interaction method was good.” In particular, participant B reported “When using the pull-based interaction method in Ambient Bot, I felt that the system controls my eye sight because I need to keep my eye sight on virtual creatures, but in this case, I could autonomously control the system by myself, so I like the push-based interaction method better.” In addition, participants E and G answered, “Using gestures is better than using gazes when being notified of content.” Participants also suggested that interaction via gestures using Myo was easier for them than eye-contact-based interaction. In this case, Ambient Bot may not need to use creatures in the push-based interaction method. On the other hand, in the interviews, participants B and E commented “I preferred that a creature bows after the creature finishes speaking.” We surmise that these participants felt that the creature behaved like a human. Such human-like behavior motivates a user to use Ambient Bot, so in the future, we need to compare notification with only gestures and no creatures against notification with creatures and no gestures.

5 Evoking Serendipity Through the Push-Based Interaction Method in Ambient Bot

In addition to providing interesting information, as described in the previous sections, we believe that Ambient Bot can trigger serendipity through the push-based interaction method. People can come up with new ideas and learn lessons by receiving various stimuli from the outside world. Information that provides such stimuli is usually called serendipitous information [10].

We conducted another user study to investigate whether the push-based interaction method in Ambient Bot can be used to deliver serendipitous information to users. Therefore, another type of content was chosen for this user study. As an example of promoting serendipity, we chose Nietzsche’s words as the serendipitous information, and the participants configured the sound type, the message window, and the notification icon just like in the previous user study.

In the interviews, in addition to asking why participants chose their configurations, we asked three questions answered on a seven-point Likert scale: “Have you experienced serendipity?”, “Can Ambient Bot randomly offer information to increase serendipity?” and “Can an Ambient Bot that takes into account personal preferences raise serendipity?” In this version of Ambient Bot, a user can provide several types of input using Myo. Therefore, gestures can tell Ambient Bot whether or not the content it provided matched the user’s preferences.

Table 3 shows the results of the user study. In contrast to the previous user study, three participants chose not to display the icon. No clear tendency was observed in the preferences for the sound and the message window. The serendipity experienced by individuals varied; however, most participants answered positively that there is a possibility of raising their serendipity. For example, participant D said “I think that the score is 7, but it depends on the personality of a user. People who have curiosity on a variety of issues will come across the serendipity whether there is this system or not. People who are hard to come across the serendipity think that the serendipity will not happen much even with this system.”

Table 3. The results of the push-based interaction method for delivering serendipity

In this case, the number of participants who chose not to use the icon and the sound to be notified about new content increased in comparison with the previous test. This result suggests that participants may intentionally reduce notification factors when receiving serendipitous content, compared with the content types examined in the previous section. Participant H said “Since serendipitous information is not a notification, I need neither icons nor sounds. It is appropriate for me to have the information by chance when I have a break.” Participant I commented “I felt that I should not be strongly aware that the information is presented because it is not necessary to strongly pay attention.” Likewise, participant G said, “I did not like to attach the icon to the content because I do not mind missing the content.” Therefore, we recognized that there are some cases in which it is inappropriate to use sounds and icons for content that a user does not mind missing.

The serendipity offered by Ambient Bot was favorably accepted by most participants. They thought serendipity would increase more for the push-based notifications that are based on a user’s personal preferences. However, participant I commented “I do not need virtual creatures that deliver information that I’m not interested in.” On the other hand, participants B and D said “I think that it is not good for my serendipity that the scope of the information is narrowed down from the personalization to a user’s preferences. Incorporating surprises through the unexpected information leads to more serendipity.” The information source for increasing serendipity strongly depends on a user’s personality. For example, Facebook allows us to choose various information sources according to our preferences, and the feeds that appear in our timelines become serendipitous information for us. Thus, it may be a better approach that a user chooses the categories of information and then information in those categories is randomly shown.
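The approach suggested above, in which a user chooses categories and items within them are then shown at random, can be sketched in a few lines of Python. This is an illustration of the idea only; the function name `pick_serendipitous_item`, the dictionary layout and the category names in the usage note are hypothetical and were not part of the user study.

```python
import random


def pick_serendipitous_item(items_by_category, chosen_categories, rng=random):
    """Pick one item uniformly at random from the categories the user chose.

    items_by_category maps a category name to a list of content items.
    Returns None when the chosen categories contain no items.
    """
    candidates = [item
                  for category in chosen_categories
                  for item in items_by_category.get(category, [])]
    if not candidates:
        return None
    return rng.choice(candidates)
```

For example, calling `pick_serendipitous_item({"philosophy": ["quote A", "quote B"], "sports": ["score"]}, ["philosophy"])` returns one of the two philosophy items and never the sports item, so the user controls the scope while the specific item remains a surprise.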

6 Related Work

HoloLens, Microsoft’s HMD, has recently attracted people to develop new services in a variety of fields. Microsoft HoloLens is able to deliver an MR user experience [1], allowing people to interact with virtual objects and entities within real-world settings. MR enables designers to develop new types of advanced services that incorporate virtuality into the real world. The software platform that accompanies the Microsoft HoloLens hardware makes it easy to develop MR applications without requiring advanced skills. Various visions of possible novel services have already been presented. Incorporating virtuality in the real space offers opportunities for developing new types of advanced services [8, 9]. Practical examples have been shown in various fields such as education (footnote 8), games (footnote 9), medical care (footnote 10), the space industry (footnote 11) and the manufacturing industry (footnote 12).

In past research, a concept named Pervasive Ambient Mirrors reflects people’s current situations to influence their behavior [11]. In [6], slow technologies enable daily objects to ambiently represent information that is currently useful for a user. These past approaches allow a user to receive information ambiently, but the interaction does not enrich our lives beyond mere functionality. For example, strengthening sociality in technologies is desirable, especially for Japanese young adults who live in a collectivist society [14]. Many of them want to attach intimate accessories to their personal technological products, such as mobile phones. In addition, a user sometimes does not notice information because of its abstract representation [6]. We need to investigate an alternative approach that offers information ambiently but delivers it to a user in a more social and intimate manner [9]. Such an approach will become important for making our society truly mindful amid the rapid progress of future information technologies like the Internet of Things (IoT) and advanced artificial intelligence, because these technologies will make our daily lives more and more efficient, but a sense of fulfillment in our lives may be lost.

Recently, particularly for Japanese young adults, the boundary between fictionality and their usual daily real life has become more and more ambiguous [9, 14]. In their pop-culture-based lifestyles, fictionality is already becoming an alternative reality, and they like to enjoy social relationships with virtual creatures in the hybrid world [13]. For example, a jellyfish is natural and intimate for them because it is a popular agent in one of the most popular Japanese animations that represents our near-future high-tech society. Thus, technologies used in animations and video games are plausible for them if they appear in our present daily life.

7 Conclusion

In this paper, we discussed the push-based interaction method in Ambient Bot, in which a user interacts with a virtual creature via eye contact. By using eye contact, we determined that the preferred content delivery methods are comparable to existing methods of accessing applications. We focused on the fact that information access can be divided into the pull-based and the push-based interaction methods and discussed the design space of each method in Ambient Bot.

The results of using the push-based interaction method in Ambient Bot indicated that a notification sound or icon should be presented first to notify a user that there is useful information before the actual content is provided. However, the approach has a limitation because listening for a sound cue or watching a notification icon adds to a user’s cognitive load, even if only slightly. In our approach, we also tried to use Myo to notify a user that there is useful content, and the approach was highly appreciated by participants in the user study. The results of the interviews in the user study also indicated that frequently making eye contact annoyed users. In addition, we showed the feasibility of using a push-based interaction method to offer serendipitous information to a user.