1 Introduction

We react to the environments we inhabit, and these environments can have an immense impact on us. Physical environments are not the only environments we interact with. Progress in VR technology since 2010 has demonstrated the potency of virtual reality (VR) to make us feel present in virtual environments (VEs). The sense of presence that we feel in VEs is sometimes so strong that we react to events in VEs as we would react to them in physical reality [30]. Many authors argue that one of the crucial determinants of feeling present in a certain environment is the ability of that environment, physical or virtual, to support our actions [15, 31, 38]. We use our whole bodies and engage them in more or less subtle movements to perform many of our actions, whether we interact with the environment or engage in talking, walking, or even breathing. The importance of understanding interaction as a whole body activity was recognized by the third wave in human-computer interaction (HCI) research, which emphasized an embodied approach to the design of interactive systems. Dourish [9] defines embodied interaction as an approach to interaction design that emphasizes the integrity of our minds and bodies engaged in actions with environments, in a process through which meaning and understanding are generated. In line with this idea, Slater and Sanchez-Vives [31] argued that even perception is “a whole body action”. A “whole body interaction” is defined as “The integrated capture and processing of human signals from physical, physiological, cognitive and emotional sources to generate feedback to those sources for interaction in a digital environment” [12].

We focus here on the idea that our thinking and learning processes originate in our bodies as much as in our brains (a concept from embodied cognition). In particular, we focus on the subtle body movements involved in breathing, and on mapping abdominal movement to changes in the virtual environment (VE). Given the connection between cognition and the body, we ask: if VEs feel “real” and are perceived as real, physical environments, how can these environments change us through embodied interaction design? And how can an embodied interaction design that employs the user’s subtle breathing movements facilitate these changes? These overarching questions motivated the research presented here. We speculate that the ways in which our environments can shape us depend on the environments’ properties. In this paper, we investigate how the type of interaction mapping supported by an affect estimation of previously recorded sounds enhances the affective properties of the VE. For our test-bed we employed Pulse Breath Water (PBW), an immersive virtual environment presented on a head-mounted display (HMD; Oculus Rift DK2). The interaction between a user and the environment is enabled through the user’s breathing patterns, which generate changes in the virtual environment.

PBW has been publicly shown as an art installation; we gained insights from the audience that motivated the research presented here. By undertaking an embodied interaction design research approach and mixed methods, we investigated how different mappings (metaphoric, and “reverse”) between the user’s breathing patterns and the system response (reflected by the changes in audio and visual components) can be used to influence the user’s affective state, user’s engagement and overall user experience.

2 Background

2.1 Embodied Interaction and Embodied Cognition in Interactive Systems

The shift in HCI paradigms that emphasized embodiment followed the shift in cognitive research. For a long time it was understood that “thinking” happens in our heads only; however, the emerging body of research on embodied cognition argues that bodily interactions with the environment are the basis for cognitive processes [37]. In other words, to understand the world around us, we use not only our brains, but our bodies too.

In particular, research on embodied interaction draws an understanding of interaction processes from situated bodies, minds, and the environment. Dourish [9] used the term “embodied interaction” to explain the embodied nature of an interaction in physical and social contexts. We interact with others and physical objects in the environment, and through these “embodied actions” we make meanings [9]. Grounded in phenomenology, Robertson [26] focuses on Merleau-Ponty’s teachings, and speculates that in the centre of embodied interaction is a living body, our tool to experience the surrounding world. Similarly, the design of interactive systems should follow the interaction principles we establish with the world around us [9].

Similarly to Dourish, Kirsh [18] uses the lens of embodied cognition to emphasize the potential for designing interactive systems. His premise is that when we manipulate objects, we shape our concepts and beliefs through the actions that employ those objects in the environment. Kirsh sees objects as extensions of our bodies that we use to “think” with. Following this idea, we aim to understand how to design interaction with objects in a VE in such a way that this interaction changes the user’s feelings, states, concepts, and beliefs. We look at the complex VE as a tool, the complexity of which allows for more interaction design opportunities. The potential of such tools is immense, and would allow for new research directions in embodied cognition and VR.

2.2 Unconventional Interfaces

Conventional vs Unconventional Interfaces. The definition of unconventional interfaces depends on the user’s familiarity with interaction interfaces. According to Kruijff [19], some of the characteristics of unconventional interfaces are: alternative input/output compared to hand-held control and audio-visual feedback, using either new technology or existing technology in a new way compared to conventional interfaces, use of interfaces in artistic works compared to common, everyday usages, and using “magical”/unnatural metaphors compared to well-known metaphors.

As different tasks require fundamentally different interfaces (an interface for a FitBit vs a flying interface in a VR simulator), Beckhaus and Kruijff [4, p. 72] distinguish between interfaces used for “experimental application” and interfaces built with the goal of successful task accomplishment (productive application). While in productive applications the main concern is usability, in experimental applications the fun factor and aesthetics are some of the preferred values [4, p. 183]. Here, we focus on the experimental application of a breath-controlled interface. Given that there is no particular goal set for a user to achieve, other than to explore the interaction, we argue that breathing as an interaction modality will contribute to a higher engagement of the user in the VE, which will, we predict, enhance the user’s affective reactions to the VE.

Breath-Controlled Interfaces. Respiration computer interfaces (RCI) are easy to use, accommodating to different body shapes, and preferred over button interfaces [2]. The two main categories of application for breath-controlled devices are assistive technologies for impaired individuals, and interfaces used for creative expression. Assistive technologies often take the form of breath-controlled joysticks and mice [13, 20, 29]. Breath-controlled interfaces in creative applications can be found in a number of video games [32, 34] and in artworks that employ an audience’s breath in a wide range of creative outputs [23, 28].

To our knowledge, the number of VR projects that have employed breathing input is limited. Waterworth et al. [36] built a VE for exploring the relationship between emotion and presence through a multimodal interaction paradigm. The input modalities employed were the user’s balance for movement (leaning forward/backwards, right/left), and breath for vertical navigation. As the authors reported, initial trials confirmed the ease of using a breath interface for natural interaction on a vertical axis in a VE. This project was inspired by Char Davies’ pioneering work Osmose, an immersive VE presented on an HMD, in which the user’s breathing and balance were assessed via a vest [8]. This highly influential artwork refers to the phenomenological teachings of Heidegger and Merleau-Ponty in Davies’ attempts to bring together divorced minds and bodies. In this work we can recognize traces of the ideas of embodied interaction and cognition that would be publicly presented years after the completion of Davies’ work.

2.3 Affect and Sound Estimation

Dimensional approach to affect estimation in sound and music: Affect estimation in sound and music is still under discussion in Music Information Retrieval (MIR) research. In their state-of-the-art paper on affect estimation in sound and music, Eerola and Vuoskoski [10] discuss four affect models: discrete, dimensional, miscellaneous, and music-specific. We undertake a dimensional approach to emotion that is focused on defining continuous dimensions that can represent and differentiate affective states. Ekkekakis [11] presented different dimensional models of affect in human subjects; however, in our research we focus on the 2-dimensional circumplex model by Posner et al. [25]. The circumplex model has two axes: a horizontal axis that represents the valence dimension (unpleasant-pleasant) and a vertical axis that represents the arousal or activation dimension (activation-deactivation), assessed here using the Affect Grid by Russell et al. [27]. We built PBW upon our previous research on affective estimation of soundscape recordings [5, 14, 35]. Specifically, we use the affect estimation model of Fan et al. [14] in our system design. In this 2-dimensional model of affect in audio, the two axes are pleasantness (equivalent to valence in human subject research) and eventfulness (equivalent to arousal).

Audio stimuli and Affective states: Previous research in the domain of audio stimuli and affective states showed that varying the pleasantness of a sound can affect the user’s ratings of arousal. Bradley and Lang showed that highly pleasant and highly unpleasant sounds had higher arousal ratings, while the most memorable sounds are those with high arousal [6]. Asutay and Västfjäll [3] researched the relationship between emotional reactions, as described through activation (arousal)-valence scales, and the characteristics of sounds that are tone and noise complexes. Activation was related to the tonal content and sharpness, whereas valence was associated with perceived loudness, roughness, and naturalness. Similarly, Tajadura-Jiménez et al. [33] researched the determinants of sound properties (physical, physiological, and spatial) with regard to evoked affective responses and revealed the effect that the intensity of sound, amplitude and frequency modulation, and the type of sound (natural or artificial) have on reported arousal. One of the relevant findings is that fast heartbeat sounds lead to increased reported arousal and can also enhance the affective response to a visual stimulus when visuals and sounds are presented to the user at the same time. We tackle this idea by matching changes in affective audio to changes in the visuals. While affect estimation in audio is a well-researched area, there is, to our knowledge, no research on the application of affect estimation of audio systems in VEs. Moreover, we research how changes in the audio and visual components of the VE triggered through embodied interaction influence the user’s affect.

3 Pulse Breath Water, an Immersive Virtual Environment with Affect Estimation in Sound

Pulse Breath Water (PBW) is an immersive virtual environment (VE) presented to a user through an HMD and manipulated by the pulse of a participant’s breath, provoking and challenging the interaction between a user and the substantial element of the VE: water. The user rises in the VE when breathing in and slowly sinks (underwater) when breathing out. The interaction design follows the idea of “metaphoric” mapping (discussed in more detail in Sect. 3.2). The audio is generated in real-time by mapping the eventfulness of the chosen audio samples to the frequency of the user’s breathing.

Our design approach relied on autobiographical design [24] and an iterative research-through-design process [39]. The collaboration between authors coming from the HCI and generative audio fields required iterative design sessions during which a variety of mappings between a user’s breathing frequency, visuals, and audio were discussed and implemented.

3.1 Interaction Scenario

In the design process, we reduced visual impact following the concepts of ambiguity and abstraction. We decided to employ the user’s breathing in an interaction with the VE, in a simple manner that empowered users to react to the system’s decisions in the VE through their breathing patterns. Users were comfortably seated, and we gave no instructions to the users prior to immersion; rather, we left it up to them to explore the VE. The system recognizes subtle differences in breathing patterns, and reacts to changes in breathing patterns by changing the audio quality and visual characteristics (e.g., the waves become calmer as the breathing slows down). The system has its own behaviour that changes in response to the incoming breathing patterns of a user. This process could be understood as a negotiation between a user and the system, in a play that prioritizes the user’s decisions over the system’s.

3.2 System Description

The overall system outline is represented in Fig. 1. Two breathing sensors (Thought Technology [1]) attached to the user’s abdominal and chest area stream breathing waveform data to the M+M middleware. M+M sends this data to a Max/MSP patch. The reactive agent generates the audio output using an audio corpus (a set of pre-recorded audio samples). The reactive agent selects samples from its corpus by mapping the frequency of the user’s abdominal breathing to the eventfulness of the audio samples. All audio samples were previously labelled with a two-dimensional vector (average eventfulness and average pleasantness) using the affect estimation model proposed by Fan et al. [14]. The reactive agent sends an online affect estimation of the audio output to Unity 3D, along with the breathing data, via OSC messages. This data generates visual changes in the VE presented to the user via the HMD. The user listens to the audio environment through circumaural noise-cancelling headphones.

Fig. 1. The system architecture

Audio: Figure 2 shows the average affect values of each audio sample in PBW’s audio corpus. Each dot represents one audio sample. We created the audio corpus by recording two-, three-, four-, and five-voice chords with quartal harmony on the piano. Then, we used pitch shift and time stretch to generate more sounds. In particular, we used these methods to generate an audio corpus that is located around neutral valence and neutral to low arousal in the affective space. Next, we analyze the user’s abdominal breathing patterns using the wavelet transform of the breathing data. In our implementation, the wavelet transform has 24 bands. We map these bands onto the range between the lowest and the highest arousal (eventfulness) values in our audio corpus. The reactive agent uses the band with the highest power to choose an audio sample. Hence, we map the frequency of the user’s breathing to the eventfulness of the audio. At any point, four audio samples are played together to ensure that the affective state of the overall audio centres around neutral arousal. The design decision to position the audio corpus in this area of the affect grid arose from the authors’ aesthetic tendencies. Our goal was to lead our users towards relaxing states by introducing audio low in arousal (in the audio vocabulary of affect: eventfulness), and staying at the neutral to positive end of the valence axis (pleasantness).
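The selection step described above can be sketched roughly as follows. This is a minimal illustration rather than the Max/MSP implementation used in PBW: the `Sample` type, the function names, the linear band-to-eventfulness mapping, the nearest-neighbour choice of sample, and the assumption that a higher band index corresponds to faster breathing are all hypothetical. The wavelet transform itself is not shown; the sketch starts from the 24 band powers.

```python
from dataclasses import dataclass

NUM_BANDS = 24  # number of wavelet bands reported in the text


@dataclass(frozen=True)
class Sample:
    """One corpus entry with its pre-computed affect labels (hypothetical type)."""
    name: str
    eventfulness: float  # average eventfulness (arousal)
    pleasantness: float  # average pleasantness (valence)


def dominant_band(band_powers):
    """Return the index of the wavelet band with the highest power."""
    if len(band_powers) != NUM_BANDS:
        raise ValueError(f"expected {NUM_BANDS} band powers")
    return max(range(NUM_BANDS), key=lambda i: band_powers[i])


def band_to_eventfulness(band, ev_min, ev_max):
    """Linearly map a band index onto [ev_min, ev_max].

    Assumption: band 0 is the slowest breathing rate and band 23 the fastest.
    """
    t = band / (NUM_BANDS - 1)
    return ev_min + t * (ev_max - ev_min)


def select_sample(corpus, band_powers):
    """Pick the sample whose eventfulness is closest to the mapped target."""
    ev_min = min(s.eventfulness for s in corpus)
    ev_max = max(s.eventfulness for s in corpus)
    target = band_to_eventfulness(dominant_band(band_powers), ev_min, ev_max)
    return min(corpus, key=lambda s: abs(s.eventfulness - target))
```

Under these assumptions, slow breathing (power concentrated in the low bands) selects the least eventful samples in the corpus, and fast breathing selects the most eventful ones. How the four simultaneously playing samples are combined is not specified in enough detail here to sketch.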

Fig. 2. The audio corpus of PBW mapped to the 2-dimensional space defined by the arousal and valence axes

Visual: The virtual environment, built in Unity 3D, comprises a scene that combines interactive audio (generated independently via a Max/MSP patch) and the 3D element of a body of water, an ocean (see Fig. 3). The aesthetics of the scene is intentionally left minimal, displaying the ocean and the sky in a range of gray-scale shades over time (see Fig. 4). Below the main level of the ocean, we positioned an additional ocean surface in blue colour, to emphasize the surreality of the scene. A fog that encompasses the ocean in the distance adds to the ambiguity of the scene. We decided to implement these elements in order to maintain a neutral atmosphere dictated by the neutral valence and arousal levels of the accompanying audio environment. This was based on the authors’ judgment and several design iterations with informal user testing. The main design principle in designing this environment was ambiguity, intended to evoke engagement and provoke thought. As Gaver et al. [16] argue, ambiguity in HCI and in the design of interactive artifacts is desirable for the thought-provoking and engaging characteristics that it adds to the design. In PBW, we aimed at employing an “ambiguity of relationship” [16] that engages users to project their own values and experiences in the process of meaning-making. While meaning-making is not the focus of the presented research, we find it to be a crucial component in creating the experience of the whole scene, adding to the affective potency of the environment.

Fig. 3. Screen shots of the environment: left: calm ocean; center: aroused ocean; right: under the water

Mappings: Breathing frequency as well as the eventfulness (arousal) and pleasantness (valence) levels of the audio environment are sent from the Max/MSP patch to the game engine Unity 3D. In Unity, the value of eventfulness is mapped to the waves of the ocean. More highly aroused states result in a more disturbed ocean surface and waves. The colour of the sky progresses from grey (at the beginning of the experience) to pitch black (at the end of a session) over the span of eight minutes. A participant’s breathing data controls the elevation of the user in the VE: when the user breathes in, their position in the environment is elevated so they can rise above the ocean surface. Similarly, when the participant exhales, they sink.
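As a rough illustration of the two visual mappings above, the following sketch computes a wave amplitude from the incoming eventfulness value and a sky grey level from the elapsed session time. It is not the Unity implementation: the function names, the linear interpolation, the `max_amplitude` parameter, and the starting grey level of 0.5 are assumptions made for the example. Only the eight-minute duration and the grey-to-black progression come from the text.

```python
SESSION_SECONDS = 8 * 60  # the sky darkens over eight minutes (from the text)
START_GREY = 0.5          # assumed initial grey level, where 0.0 is black and 1.0 is white


def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def wave_amplitude(eventfulness, ev_min, ev_max, max_amplitude=1.0):
    """Scale ocean wave amplitude with the eventfulness of the current audio.

    Assumption: a linear mapping from [ev_min, ev_max] to [0, max_amplitude].
    """
    if ev_max <= ev_min:
        raise ValueError("ev_max must be greater than ev_min")
    t = clamp01((eventfulness - ev_min) / (ev_max - ev_min))
    return t * max_amplitude


def sky_grey(elapsed_seconds):
    """Grey level of the sky, fading linearly from START_GREY to black."""
    t = clamp01(elapsed_seconds / SESSION_SECONDS)
    return START_GREY * (1.0 - t)
```

With these assumptions, `sky_grey(240)` returns half of the starting grey level, and any eventfulness value outside the corpus range is clamped rather than extrapolated.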

Fig. 4. Screen shots of the sky’s colour progression: left: sky colour at minute 1; middle: sky colour at minute 5; right: sky colour at minute 8

3.3 PBW as an Art Installation

PBW was premiered as an art installation in two collective exhibitions: Scores + Traces at One Art Space in NYC, USA (March 10–12th, 2016), and at MUTEK-VR Salon in Montreal, Canada (November 9–13th, 2016). During these two exhibitions we gathered qualitative feedback from the audience, which we summarize here along with our observations of the audience’s behaviour.

Easing into the Environment. PBW was designed as a generative piece without a clear beginning or end. The time that the users spent in PBW varied from 5 to 20 min. The users usually spent the first few minutes exploring the extremes of their breathing, to familiarize themselves with the system’s capabilities through exaggerated belly movements while inhaling and exhaling. Interestingly, after a few minutes of vigorous exploration, users would slowly ease into the environment, and their breathing would become slowly paced. This type of breathing would typically remain stable until the end of each session (until the user decided they were done).

Meaning-Making and Re-evoked Memories. Even though we did not design PBW with any particular narrative in mind, the majority of users we spoke to had their own interpretation of what the narrative was. Some of them constructed the narrative, others re-lived some of their past experiences. We believe that a major role in the meaning-making process is held by users themselves, who invest their “beholder’s share” [17]. In other words, users respond to the ambiguity and lack of details by projecting their own experiences and imagination relying on top-down processing [17, p. 58].

4 Methodology

4.1 Study Design

In order to investigate the potential benefits of predictable, embodied interaction through breathing on a user’s affect, enjoyment, engagement, and presence we designed an experimental comparison of interactions using two versions of PBW to support two different experimental conditions: (a) metaphoric mapping; and (b) reversed mapping.

4.2 Conditions and Mappings

The original piece, PBW, was modified to support two experimental conditions that differed in interaction mapping between breathing frequencies and the changes in the environment.

Condition 1: Metaphoric mappings: In this condition, metaphoric mappings of audio and changes in VE are based on cognitive schema developed from everyday actions and interactions such as “more is up, less is down” [21]. Metaphoric mappings are widely exploited in the design of everyday objects (sliders moved up to “crank the volume up”) because the underlying concepts are understood beyond conscious awareness. For this reason metaphoric mappings of interactions are considered to be “intuitive” and require unconscious effort [21].

PBW was originally designed following the logic of metaphoric mappings. The vertical movement of the participant in the environment follows the logic of “more is up”: the more air you inhale, the higher you move. When participants inhale they rise in the environment, and when they exhale they sink, similar to what happens when exhaling while swimming. The exact position depends on the amount of air inhaled/exhaled; therefore the participant can be above the water (a big breath in), under the water (a deep exhale), or anywhere in between if they maintain shallow breathing. In this metaphoric condition, we did not change the mapping between the respiratory interaction and the generative audio. We use the same mapping that we explain in Sect. 3.2 to generate the audio output.

Condition 2: Reverse mappings: In this condition, we reversed the metaphoric mapping in order to investigate how this might affect participants’ experiences in regard to their affect, engagement, immersion and overall satisfaction. In this condition, when a participant breathes in they sink, and rise when they exhale. This is a simple intervention yet clearly observable by the participants. The waves of the ocean were still mapped to arousal level. Moreover, we reversed the mapping between the respiratory sensor data and audio sample selection. As the user breathes more frequently, the reactive agent chooses samples with lower eventfulness; and vice versa.
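The difference between the two conditions can be summarized in a short sketch. This is an illustrative reading of the descriptions above, not the code used in the study: the `Mapping` enum, the normalization of breath level and breathing rate to [0, 1], the height range, and the linear interpolation are all assumptions.

```python
from enum import Enum


class Mapping(Enum):
    METAPHORIC = "metaphoric"
    REVERSE = "reverse"


def _interp(t, lo, hi):
    """Linear interpolation between lo and hi for t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("input must be normalized to [0, 1]")
    return lo + t * (hi - lo)


def elevation(breath_level, mapping, max_depth=-1.0, max_height=1.0):
    """Vertical position of the user.

    breath_level is assumed normalized: 0.0 = full exhale, 1.0 = full inhale.
    Metaphoric: inhaling raises the user. Reverse: inhaling sinks the user.
    """
    t = breath_level if mapping is Mapping.METAPHORIC else 1.0 - breath_level
    return _interp(t, max_depth, max_height)


def target_eventfulness(breath_rate, mapping, ev_min, ev_max):
    """Eventfulness the agent aims for when choosing samples.

    breath_rate is assumed normalized: 0.0 = slowest, 1.0 = fastest breathing.
    Metaphoric: faster breathing selects more eventful samples.
    Reverse: faster breathing selects less eventful samples.
    """
    t = breath_rate if mapping is Mapping.METAPHORIC else 1.0 - breath_rate
    return _interp(t, ev_min, ev_max)
```

In this reading, the reverse condition simply mirrors the normalized input before the same interpolation is applied, so both conditions cover the same range of heights and eventfulness values. The mapping from eventfulness to the ocean waves is left untouched in both conditions.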

Based on the above-mentioned cognitive schema and metaphor theory [21] we hypothesized that interaction based on metaphoric mappings will be more engaging and will enhance the affective properties of the audio more than the reverse mapping condition.

4.3 Participants

Twenty-four participants (16 female) were recruited using an online participant recruitment system, and randomly assigned to one of the two experimental conditions to start with. Participants’ ages ranged from 19 to 58 (mean: 22.3, SD: 8.03). The majority of participants had never tried VR before (14/24). All participants reported good health and normal vision.

4.4 Experimental Setup

The experiment was performed in the iSpace lab, SIAT, SFU. The participants were seated, one at a time, in a dark room, at the computer station. Depending on their assigned experimental condition, one of the two VE experimental conditions was presented on an Oculus DK2 HMD (resolution \(960 \times 1080\) per eye, refresh rate of 75 Hz). The audio component of the VE was played on noise-cancelling headphones. Participants wore two breathing sensors (Thought Technology) positioned on the abdomen and chest.

4.5 Procedure

Upon arrival in the lab, we informed participants that they were participating in an exploratory study in which we were interested in their engagement with the VE, measured through affect assessed before and after exposure and through additional questionnaires. Next, participants read a written description of the study and signed an informed consent form. The participants were informed about their right to withdraw at any point and instructed to report to the experimenter any feelings of vertigo, nausea, or headache as they arose, upon which the experiment would be terminated. Each participant completed two eight-minute sessions (for example, condition 1: metaphoric mapping, and condition 2: reverse mapping). The order of conditions was counter-balanced across participants. After each session, the participants were interviewed.
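One common way to combine random assignment with a counter-balanced order is to shuffle the participant list and then alternate the starting condition. The sketch below illustrates this idea only; the function name, the seed parameter, and the alternation scheme are assumptions and do not describe the procedure actually used by the recruitment system or the experimenters.

```python
import random

CONDITIONS = ("metaphoric", "reverse")


def counterbalanced_orders(participant_ids, seed=None):
    """Randomly assign each participant a session order, balanced across orders.

    Participants are shuffled, then alternately given metaphoric-first and
    reverse-first orders, so the two orders differ in size by at most one.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    rng.shuffle(ids)
    orders = {}
    for i, pid in enumerate(ids):
        first = CONDITIONS[i % 2]
        second = CONDITIONS[(i + 1) % 2]
        orders[pid] = (first, second)
    return orders
```

For 24 participants this yields 12 who start with the metaphoric mapping and 12 who start with the reverse mapping, and every participant experiences both conditions.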

4.6 Data Collection

Before the experiment, the participants were asked to fill in the affect grid and the six-item State-Trait Anxiety Inventory (STAI-6) [22]. After each exposure, the participants filled in the affect grid and STAI-6 again, without seeing their previous responses. In addition, they were asked to answer a questionnaire containing twenty-one questions. Our questionnaire is a modified version of the Game Engagement Questionnaire [7], used for assessing levels of engagement through the lenses of four categories: flow, immersion, engagement, and presence. Afterwards, the participants were interviewed and the interviews were audio recorded.

4.7 Data Analysis

Data analysis was performed on the data from twenty-two participants. Data from two participants had to be discarded: one participant experienced anxiety in the middle of the first exposure and the experiment was stopped at that point. The other participant did not report motion sickness as it occurred and instead continued, but was unable to complete all of the questionnaires. Quantitative data was analyzed through inferential statistics, as explained below in the Findings section. Interviews were transcribed and analyzed using a grounded theory approach. The deductive approach to coding originated from the semi-structured interview questions that focused on the experience: feelings, thoughts, actions performed, attention, intentions, narrative, evoked memories, and difficulties of using the system.

5 Quantitative Findings

5.1 Questionnaire Findings

A two-way within-subject ANOVA was run on a sample of 22 participants to examine the effect of order and mapping on the different questionnaire items. Below we only report significant main effects and interactions.

Perceived reactivity of the environment to the user: There was a significant main effect of order, in that participants perceived the environment as more reactive in their second exposure, \(F(1,40) = 2.95, p = .013\), as illustrated in Fig. 5 right.

Fig. 5. Main effect of order on the questionnaire dependent variables “I purposefully used my breath to change the sounds and visuals” (left) and “The environment reacted to me” (right). Error bars depict one standard error of the mean. Grey dots depict individual participants’ mean values.

Users’ engagement to change the sounds and visuals: Participants purposefully used their breath to manipulate the environment in their second exposure more than in their first exposure, \(F(1,40) = 2.20, p = .016\) (see Fig. 5 left).

Paid attention to the audio: Participants paid more attention to the audio in the metaphoric as compared to the reverse mapping condition, \(F(1,40) = 1.76, p = .039\) (see Fig. 6 left).

Desire for experience to last longer: There was a significant interaction between order and mapping for the questionnaire item “I wish it lasted longer”, \(F(1, 39) = 6.14, p = .0177\) (see Fig. 6 right). Planned contrasts showed that after the second session participants were more inclined to wish for a longer experience if this second experience was the metaphorically mapped condition versus the reverse mapping condition, \(F(1, 39) = 5.56, p = .0233\). If the metaphoric condition was experienced as the second session, participants were also more inclined to wish for a longer experience than if the metaphoric condition was experienced first, \(F(1, 39) = 5.22, p = .0278\).

Fig. 6. Main effect of mapping on the questionnaire dependent variable “I payed my attention on audio” (left) and interaction between order and mapping on “I wish it lasted longer” (right). Error bars depict one standard error of the mean. Individual dots depict individual participants’ mean values.

5.2 Affect Grid and STAI-6

A 2-way ANOVA for the factors order {baseline before the first session; after session 1; after session 2} and mapping {metaphorical; reverse} and the dependent variables arousal and pleasantness scores from the affect grid did not show any significant main effects or interactions. In regard to the six questions included in STAI-6, we found a significant difference in baseline (pre-exposure) scores between the two groups (one group that was assigned to metaphoric mapping condition first, and the other one that started with the reverse condition), even though participants were randomly assigned to the two groups. Due to these group differences we did not further analyze the STAI-6 results.

6 Interviews

The majority of the participants in our study were undergraduate students with no prior exposure to virtual reality. Through semi-structured interviews after each condition we hoped to gather insights that would help us build a better understanding of how different interaction mappings contribute to the affective properties of the environment and to the overall experience, regardless of previous experience with the technology. The themes from the semi-structured interviews served as a basis for the non-linear accounts of various experiences presented here.

6.1 Exploring the Unknown: Phases

The majority of the participants verbally shared their excitement to try VR for the first time. As we noticed, the first phase of interaction is exploration of their agency by breathing in and out, testing the limits of the system (how high or low they can get), and familiarizing themselves with the elements in the environment. After this exploration phase, they eased into the environment.

“In the beginning I breathe in different levels so I can see how the image will move. After I realized how it works, I tried different kinds of breathing. I tried even to go forward but I couldn’t. Oh, at the beginning I was a little bit worried that image will be intimidating, but after I realized it was ocean it was more relaxed. Then I tried different kind of stuff that didn’t work so I kept breathing” [P9].

6.2 Regardless of Mapping, Second Trial Is Enjoyed More

The participants reported second trials as more enjoyable very consistently, regardless of the mapping. Even though participants were informed how the system works, they used their first trial to familiarize themselves with the environment to allow for more profound interaction in the second trial.

“This is my first time to try VR, in the first environment I felt... don’t want to say stressed... maybe anxious a little bit, a little bit excited, the second time my perception was: ok, this is stuff I already know, it’s like an old friend, I know what to expect, I know what should I do, observe... I enjoyed it more the second time” [P1].

The lack of anticipation of a new environment one is immersed in resulted in increased relaxation in the second exposure.

“The first time was like giving a toy to a child... this time I was enjoying the feeling of calm... I wanted to take good relaxation time now” [P2].

“The first time I was not sure what you would ask me, or what to expect... this time I knew what was coming... There were some parts that were intense... but I was immersed in the simulation... I knew nothing crazy is going to happen, I was more calm” [P14].

6.3 Metaphoric Mapping Feels Intuitive, but Reverse Is More Playful

Those participants who were aware of the differences in interaction mapping between the two conditions articulated a preference for the metaphoric mapping. Descriptions of the metaphoric mapping such as intuitive and natural emerged from participants’ comments, and were usually linked to themes such as relaxation and being calm.

“I practised being at the water level, tried going down below blue waves, but you need a big breath for that. The other one [metaphoric] felt more natural, I guess, because you breathe in and go up, breathe out and go down, and this one felt weird because it is opposite” [P5].

“I feel much better than in the first one (referring to reverse mapping)... it felt so correct, when I am in the water you exhale and go down... I felt more calmer than when I came in” [P2].

In contrast, the reverse mapping was perceived as more stimulating, engaging, and interesting.

“They are both good for different (reasons): reflected more in the first one (metaphoric), in this one I was more playful (reverse)” [P5].

“Not much different but felt more interesting, it was counter-intuitive” [P7].
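The difference between the two conditions that participants describe can be illustrated with a minimal sketch. It is not the PBW implementation: the function name `breath_to_height`, the normalized breath signal (0 for a full exhale, 1 for a full inhale), the linear mapping, and the height range are all assumptions made for illustration. The only detail taken from the accounts above is the direction of the mapping: in the metaphoric condition breathing in moves the viewpoint up and breathing out moves it down, and the reverse condition inverts this.

```python
def breath_to_height(breath, mapping="metaphoric", low=-1.0, high=1.0):
    """Map a normalized breath level to a vertical viewpoint position.

    breath:  hypothetical sensor value, 0.0 = full exhale, 1.0 = full inhale.
    mapping: "metaphoric" (inhale -> up) or "reverse" (inhale -> down).
    low/high: assumed bounds of the vertical range, in arbitrary units.
    """
    # Clamp the input so out-of-range sensor readings stay within bounds.
    level = min(max(breath, 0.0), 1.0)

    if mapping == "reverse":
        # Invert the direction: a full inhale now maps to the lowest point.
        level = 1.0 - level
    elif mapping != "metaphoric":
        raise ValueError(f"unknown mapping: {mapping!r}")

    # Linear interpolation between the lowest and highest positions.
    return low + level * (high - low)
```

Under these assumptions, a full inhale places the viewpoint at the top of the range in the metaphoric condition and at the bottom in the reverse condition, while a half breath yields the same mid-level position in both.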

6.4 Somatic Experiencing: Awareness of Breath and Emerging Past Experiences Through the Changes in the Environment

Visual representation of the ocean coupled with the movement often triggered memories in participants, followed by strong bodily sensations such as: floating, dropping, or even sensations that

“...I could almost feel, like being submerged under the water and then being brought back up... like if there was an invisible wrapper, kind of like a real water but not as... kind of real water but at slower pace” [P14].

“It was exactly like when you go in the water, it had the feeling of being calm... it felt like floating chamber... the idea is the same, idea is that water makes you feel calm, floating in the water” [P2].

“When I was learning to swim... they tell you to focus on your movement, and breathing... I related that to this. What made me feel different was motion... I liked the first one (reverse), it made me feel better... the second one (metaphoric) felt more like I was floating” [P4].

A few participants reported that the music heightened their awareness of breath.

“I think it was music. Suddenly I felt like I can feel my heartbeat. The rhythm of the music... it was in the parallel with my breathing, and that’s when I noticed (her breath)” [P17].

“Oh, when the music gets excited I know something might happen, and then I wait for it happen. Oh, I also breathe to let it happen more faster” [P9].

However, some participants experienced tension in response to the audio they were listening to:

“I felt a heartbeat and then when it was music every time I went down... I felt dramatic effect to it... so I was like: something is wrong, I should do something about it... it was triggering fear even though I knew it’s VR... when the sky turned black, sounds triggered fear” [P3].

6.5 Loss of Control Triggers Fear?

From the conversations with the participants, we realized that those who did not make a connection between their breath and the changes in the environment were more likely to be distressed by the audio or visuals.

“I tried to control my breathing... and then once the loud music started I kind of... can’t control it any more, and I felt tense... I lost control of how I wanted to go down to the blue, and that’s while loud music started to fade and I regained control... then I started to take deep breaths and started to calm myself...” [P21].

Once the participants regained control over the system, their tension lessened.

“Fear doesn’t come to me this time, I felt I have a control over my body... to stay in one state, tried to be at one situation... either above or down” (the second exposure, metaphoric mapping) [P2].

6.6 Imagine This Was a Tool...

During the interview, the participants were asked: “If I tell you that what you just experienced is a tool, what would you use that tool for?” Two themes emerged as the most dominant: a relaxation tool, and a tool for overcoming anxieties triggered by water.

“I would use it for, like, calming people down... cause this probably can calm many people down who are stressing about stuff... it gives you something to focus on, you are focusing on your breathing and something in front of you, so it kind of distracts you from everything else... cause you think of, if you start panicking, I guess, and you breathe really fast and go up and down like crazy, and then you would be like: what’s happening in the world, and then your focus immediately is on the image in front of you instead anything outside.” [P6].

“I don’t know... something to do with calming people down, when their heads are somewhere else” [P5].

“Zoning out, not thinking about whatever is going in your head... If things were steady up the water... I would feel more relaxed” [P2].

7 Discussion

In this section we discuss the main findings of the presented study and our understanding of the participants’ lived experiences. We asked two questions: how can these environments change us through embodied interaction design, and how can an embodied interaction design that employs the user’s subtle breathing movements facilitate these changes? The answer lies across the multiple accounts gathered here. The richness of the insights revealed a wide range of factors that can affect the experience and that we did not take into account when planning the study. Finally, we discuss the insights as we formalize them into a set of design considerations for embodied interaction design in VR.

7.1 Familiarity First, Engagement After

Analysis of the questionnaire responses showed that the participants perceived the environment as more reactive to their input in their second trial, regardless of the mapping. We believe that this can be explained by the novelty effect or by a lack of understanding of the system’s nuances. Even though we explained how the system works prior to each trial, many of the participants did not make any connection between their breathing and the changes in the environment in their first trial. This would also explain our second finding: users’ engagement in manipulating the environment through their breathing was higher in the second trial. Once they knew how the system worked, they engaged with it more. The dynamics of this familiarization emerged in the interviews, in which the majority of the participants said that at the beginning they explored the environment and tested the limits of the interaction, and then eased into it, pacing their breathing in less forceful and more pleasurable ways. Two participants were not aware of their agency at all. One reason might be that these participants employ chest breathing more than abdominal breathing, a possibility we will explore in future work.

7.2 Tension and Relaxation, at the Same Time

We investigated whether different mappings can shift participants’ affect toward the overall affect of the audio corpus. The audio corpus was intentionally centred around neutral pleasantness with a tendency towards positive pleasantness and neutral to low arousal (playfulness), positioning the corpus in the area of relaxed feelings on the affect grid. The inferential analysis of the affect grid responses did not yield any significant differences between the two mappings, trial order, and the baseline. This might be explained by the overall subtle changes in affect across the sample. However, a few participants reported feelings of tension. From the interviews we learned that many of them found the change of the sky colour from grey to black dreadful, and that this element triggered anxiety-like feelings in several participants. One participant ended the session after four minutes, stating that the environment caused her distress and that she was not able to continue. Other participants who did not make the connection between their agency and the changes felt tense too. In these cases, the music added tension to the dark environment. Despite the reports of felt tension, when asked what they would use this tool for, the majority of the participants responded that they would use it for relaxation. Even though some elements caused distress, the participants recognized the calming qualities of the system. This finding is of particular interest for our future work.

7.3 The Context Is the Key

Originally, we designed PBW as an art piece grounded in the research questions we asked here. As an art piece, we exhibited it in galleries, where we verbally collected the experiences of audience members who interacted with the piece. The majority of them recognized its relaxing qualities and would stay immersed for up to 20 minutes. The quality of the experience changed dramatically when we moved the setup from the gallery to the lab. The main difference was in the expectations and the openness to the experience. The audience in a gallery is there because they would like to experience something new. Our participant pool consisted of fairly young undergraduate students, who might be very different from those who initially experienced the piece in an art gallery context. The laboratory setting, no matter how hard we tried, still felt like a setting for an experiment rather than for an experience. This might have affected our participants’ responses, and we consider it an important factor to be accounted for in future studies that combine art and research questions.

8 Conclusion

In this paper, we introduced the system Pulse Breath Water and investigated the effectiveness of embodied interaction design, through two different mappings (metaphoric and “reverse”), for enhancing the affective properties of the system. This research encompasses two directions of affective research, in VR and in audio, and combines them into one system in an attempt to gain a better understanding of their combined effect on the user’s engagement, affective states, immersion, and overall experience.

In this paper we contribute to a better articulation of the affective properties of virtual environments that combine visual and audio components into one system. We presented some of the individual accounts of lived experiences and showed that the majority of the participants, when asked to imagine that this was a tool, replied that they would use it for relaxation. We built our system on the premise of neutral pleasantness and low arousal, properties that can be translated into feelings of being relaxed. This gives us a direction for future work: to research the potential of the system to induce a wider range of affective states. We believe that the insights presented here will bring us closer to the final goal of creating a system that not only “reacts” to a user’s breathing but evolves into an immersive artificial intelligence system capable of taking initiative and changing a user’s affective states.