1 Introduction

Virtual reality (VR) has been a popular research topic since the 1990s and has recently received considerable attention from the mass market. VR technologies allow users to enter a virtual environment, and numerous VR applications even allow multiple users to enter a virtual world simultaneously. For example, Star Trek: Bridge Crew (2017) has a VR mode that allows four players to be on the bridge of a starship and collaborate on exploratory missions in an imaginary world. Social behaviors that are mediated by VR technologies may involve different combinations of communication formats (e.g., voice and gestures vs. voice only) and various location settings (e.g., co-location vs. virtual teleportation). Thus, a gap in the literature remains to be filled. However, the traditional research methods used in the social sciences have limitations when applied to VR-mediated social behaviors. An integration of methods can aid researchers in measuring and observing social behaviors in VR. This study aims to explore such an integrated approach.

Various scenarios of VR use involve social behavior. Some multiuser VR applications support users who are physically co-located, whereas others are designed to support remotely located users. VR systems are also designed to support social interactions between VR users and outsiders [5, 8]. Streaming gameplay is a popular phenomenon that includes VR gameplay. The interactions between a VR broadcaster and the audience can result in new forms of social behavior. Different types of social behavior manifest in all these scenarios. Furthermore, the forms of communication vary from verbal (e.g., spoken words) to non-verbal communication (e.g., gestures, orientations, positions of physical bodies and avatars, and use of virtual objects). Multiuser VR applications, such as multiplayer VR games and collaborations in design, have diverse contexts [14]. Such diversity in contexts opens up a wide range of research foci, such as team dynamics, trust, and persuasion. Thus, an integrated approach that can address this new class of research questions on social behavior in VR is necessary.

This study focuses on the VR applications based on head-mounted displays (HMDs). Therefore, examples such as [15] are disregarded.

This paper discusses the traditional research methods and their suitability for different steps of a typical experimental session. It then proposes a classification of various scenarios of VR-mediated social behaviors and discusses the challenges and considerations of integrating research methods when studying each type of scenario. This paper primarily discusses issues in the data collection stage. Analysis methods of interactions (e.g., [2, 17]) are excluded.

2 Four Traditional Methods

This study focuses on four research methods that researchers can adopt in an experiment or user study on VR-mediated social behavior and multi-user VR applications. These methods, namely, questionnaire, interview, observation, and focus group, are well established and commonly used in studies on social science and human–computer interaction.

These methods were chosen because they are well established, which means that researchers can use previous works as methodological references. The findings from these methods can be compared with previous studies that applied the same methods. Furthermore, these methods do not require special tools, such as eye-tracking and electroencephalography devices. Researchers who have relatively simple settings in their laboratory can adopt these methods, which generally do not require changes in or amendments to the studied VR applications. Modifying a VR application is very difficult, if not infeasible, when the source code is unavailable, which is frequently the case for commercial VR applications.

Questionnaire.

Questionnaires [3] allow researchers to collect quantitative and qualitative feedback from participants. Established questionnaires measure different aspects of the internal states of participants, such as immersion [9]. Questionnaires can collect responses from participants in a structured manner without the assistance of research staff. The questionnaire can be in a paper- or computer-based form. Participants answer the questions by themselves. Thus, few research staff members are required even if several participants are involved in a session.

Questionnaires require the full attention of participants. The participants must pause their current task and focus on responding to the questions. In the context of VR applications, participants should therefore be asked to fill out a questionnaire after their experience with a VR application. The drawback is that their immediate responses cannot be captured. This drawback is particularly important for social behaviors because a typical trial or experimental session would involve a series of interactions and communications among the participants. Capturing the participants’ responses to one another’s behavior is important.

Asking the participants simple questions during exposure to a VR application is possible. This process is less disruptive than asking them to fill out a questionnaire. Researchers can ask brief questions to all participants who are in a VR world during a session and ask these participants to respond via hand or body gestures (e.g., “Please indicate with your fingers how excited you are now. Five means extremely excited. One means not excited at all.”). Responses with hand or body gestures can prevent participants from knowing one another’s responses and being influenced by them. This brief questionnaire approach applies only to simple questions that require simple answers. Simple answers are expressed with simple gestures, such as a thumb up or down for yes or no and a number of fingers to rate on a 5-point Likert scale.
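The gesture-based probes described above can be logged in a structured form so that they can later be lined up with recordings. The following is a minimal sketch under stated assumptions: the names `ProbeResponse`, `record_rating`, `decode_thumb`, and `mean_rating`, as well as the gesture labels, are hypothetical and serve only as illustration. The sketch shows one way a researcher might validate a finger-count answer on a 5-point scale and timestamp it relative to the start of the exposure.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

# Hypothetical gesture labels for yes/no answers (illustrative only).
THUMB_ANSWERS = {"thumb_up": True, "thumb_down": False}


@dataclass(frozen=True)
class ProbeResponse:
    participant: str   # participant ID assigned by the researcher
    elapsed_s: float   # seconds since the start of the exposure
    question: str      # short label for the question asked aloud
    rating: int        # number of fingers shown, 1 to 5


def record_rating(participant: str, elapsed_s: float,
                  question: str, fingers: int) -> ProbeResponse:
    """Validate a finger-count answer on a 5-point scale and store it."""
    if not 1 <= fingers <= 5:
        raise ValueError(f"expected 1 to 5 fingers, got {fingers}")
    return ProbeResponse(participant, elapsed_s, question, fingers)


def decode_thumb(gesture: str) -> bool:
    """Map a thumb gesture to a yes/no answer."""
    return THUMB_ANSWERS[gesture]


def mean_rating(responses: List[ProbeResponse], question: str) -> float:
    """Average the ratings that all participants gave to one question."""
    return mean(r.rating for r in responses if r.question == question)
```

Keeping the elapsed time with each answer is a design choice that lets the response be matched to the corresponding moment in a video recording during a retrospective interview.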

Interview.

Interviews [10] allow researchers to guide participants to express their viewpoints in a humanistic manner. Researchers can carefully control the pace and even adjust the content and/or order of questions in accordance with the behavior of the participants (e.g., asking the participants for the reasons for performing certain actions in a VR usage session). Participants can ask for clarification if the questions are unclear to them. An interview can be an option for studying VR-mediated social behaviors. This method is useful in collecting qualitative feedback from the participants but has the limitation of requiring their attention and thus might interrupt their exposure to VR stimuli. Interviews can therefore be conducted retrospectively and are typically performed after the participants have experienced a VR application. Retrospective interviews can include recordings of a session of VR usage to remind participants about their experience and serve as a reference for their answers (e.g., referring to moments when they performed certain actions). Retrospective interviews with recordings of VR usage can assist participants in revisiting their experience and expressing their emotions and cognitions in a social interaction in a VR context.

An interview requires at least one researcher for each participant. In a session that involves multiple participants (which is typically the case when studying VR-mediated social behaviors), the participants are interviewed at nearly the same time, which requires multiple research staff members to support the procedure.

Observation.

Observation [7] can be used to study social interactions, including hand gestures, body movements, and facial expressions. When applied to VR contexts, researchers must observe behaviors in both the real and VR worlds during VR usage. If participants are co-located, then researchers can directly observe the behavior of multiple participants simultaneously. A behavior in the real world may carry a different meaning in a VR world. Observations of VR-mediated social behaviors therefore require a mapping between the two worlds. The mapping can be achieved by pairing a view of the real world, obtained through direct observation or a live video camera, with a live view of the VR world shown on a monitor. A live video camera can also record a session and offer recordings to support a retrospective interview.

Focus Group.

Researchers can consider conducting focus groups [12] with a group of participants retrospectively to obtain collective feedback from multiple participants simultaneously. Similar to interviews, recordings of the participants’ VR usage can support retrospective focus groups. The drawback is that responses from individual participants can influence or be influenced by their peers. Because every participant experiences different parts of a VR usage session, a retrospective focus group may require multiple recordings from different views to be displayed in a synchronized manner.

2.1 Integrating Traditional Methods

This section describes a framework for integrating the four traditional research methods in an experiment session of multi-user VR applications. A typical experimental session that explores social behavior in a multi-user VR application involves the participants’ exposure to a VR application. A session can be divided into three stages, namely, “before exposure,” “during exposure,” and “after exposure” (Fig. 1).

Fig. 1. Integrated framework of the traditional methods in an experiment session

Before Exposure.

Before the participants are exposed to a VR application, researchers must administer a questionnaire that collects the participants’ demographics, previous experience with VR and similar applications, and reference values for measures involved in the study (e.g., pre-exposure attitude towards certain topics).

During Exposure.

Researchers can observe and record the behavior of the participants during an exposure to a VR application. The researchers can take video recordings and screen captures while observing to support the retrospective interview and focus group. Researchers can also ask the participants simple questions at certain points during the exposure. However, this activity should be kept to a minimum to avoid intruding on the participants’ exposure to the VR application.

After Exposure.

Researchers can measure the participants’ responses after their exposure to a multi-user VR application. Researchers should focus on individual responses first by using a questionnaire and retrospective interview and then proceed to the collective response by conducting a focus group. The participants can thus express their own views in the questionnaires or individual retrospective interviews before listening to other participants in the focus group. This order prevents participants from influencing one another while individual responses are collected.
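The three-stage framework can also be written down as a simple session plan that a research team checks before running a session. The sketch below is illustrative only: the names `Stage`, `SESSION_PLAN`, and `individual_before_collective`, and the "individual"/"collective" labels, are assumptions introduced here rather than part of the framework itself. It encodes the methods in the order described above and verifies that, after exposure, every individual measure precedes any collective one.

```python
from enum import Enum


class Stage(Enum):
    BEFORE = "before exposure"
    DURING = "during exposure"
    AFTER = "after exposure"


# Ordered steps as (stage, method, scope). The scope labels are an
# illustrative assumption used only for the ordering check below.
SESSION_PLAN = [
    (Stage.BEFORE, "background questionnaire", "individual"),
    (Stage.DURING, "observation and recording", "collective"),
    (Stage.DURING, "brief gesture-answered questions", "individual"),
    (Stage.AFTER, "post-exposure questionnaire", "individual"),
    (Stage.AFTER, "retrospective interview", "individual"),
    (Stage.AFTER, "retrospective focus group", "collective"),
]


def individual_before_collective(plan) -> bool:
    """Return True if no individual step follows a collective step
    within the after-exposure stage."""
    seen_collective = False
    for stage, _method, scope in plan:
        if stage is not Stage.AFTER:
            continue
        if scope == "collective":
            seen_collective = True
        elif seen_collective:
            return False
    return True
```

A plan that schedules the focus group before the individual interview would fail this check, which mirrors the concern that participants may influence one another before their individual responses are collected.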

3 Classification and Dimensions

This section presents the proposed classification of scenarios of VR-mediated social behaviors. The classification is aimed at identifying and discussing the challenges faced by researchers who attempt to investigate social behavior in each scenario (Fig. 2).

Fig. 2. Classification of VR-mediated social behaviors

3.1 Co-presence

The first dimension of the classification is “co-presence,” which refers to the degree to which the users are all co-present in a VR world. One end of the dimension is full co-presence, in which every user wears an HMD VR device and enters the VR world.

In certain situations, not all users are in a VR world. HMD VR technologies are becoming popular, but not every user owns such a device. There are cases where only one or some users of a VR application wear HMDs and enter the VR environment. Other users can still join the VR application via other devices, such as tablet devices [1, 8], but their presence is not equivalent to that of users who wear an HMD VR device. In a digital game context, this scenario is called asymmetric VR gaming [8]. In these situations, the co-presence is partial.

The co-presence dimension is based on the classification by Kraus and Kibsgaard [11]. However, their classification disregards cases in which several users are partially co-present with other users but are all physically co-located [5, 8].

3.2 Co-location

The second dimension, “co-location,” refers to the extent to which the users of a VR application are physically co-located. One end of this dimension is physically co-located. For certain multi-user VR applications (e.g., Star Trek: Bridge Crew (2017)), if multiple sets of HMD VR devices are co-located, then multiple users can enter the VR world together. If users are physically co-located, then they can choose to communicate directly in the real world (e.g., shouting at one another while playing a VR game together) or via the VR application. Researchers who study such social behavior must capture and observe the interactions in the VR and the real worlds (and potentially the transitions between the two worlds).

The other end of the dimension is connected. That is, several or all users of a multi-user VR application are connected but not physically co-located. The rationale for distinguishing the aspect of co-location is the difference in considerations when studying the social behavior involved. If all the users are not physically co-located and only connected via a network, then these users must communicate via the application. Researchers may further focus on the types of information (e.g., verbal and social cues) that can be transmitted via the studied network-capable VR applications.
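The two dimensions can be expressed as a small data structure that maps a study setup to one of the four resulting scenarios. The following sketch is offered under stated assumptions: the function `classify` and its inputs (number of HMD users, number of non-HMD users, and number of physical sites) are hypothetical simplifications introduced for illustration, and a real setup may need finer distinctions.

```python
from dataclasses import dataclass
from enum import Enum


class CoLocation(Enum):
    CO_LOCATED = "Physically Co-located"
    CONNECTED = "Connected"


class CoPresence(Enum):
    FULL = "Full"
    PARTIAL = "Partial"


@dataclass(frozen=True)
class Scenario:
    co_location: CoLocation
    co_presence: CoPresence

    def label(self) -> str:
        """Human-readable name of the scenario."""
        return (f"{self.co_location.value} and "
                f"{self.co_presence.value} Co-presence in VR")


def classify(hmd_users: int, non_hmd_users: int, sites: int) -> Scenario:
    """Classify a multi-user setup along the two dimensions.

    Assumption: co-presence is full only if every user wears an HMD,
    and users are co-located only if they share one physical site.
    """
    if hmd_users < 1 or hmd_users + non_hmd_users < 2 or sites < 1:
        raise ValueError("need at least one HMD user, two users, one site")
    presence = CoPresence.FULL if non_hmd_users == 0 else CoPresence.PARTIAL
    location = CoLocation.CO_LOCATED if sites == 1 else CoLocation.CONNECTED
    return Scenario(location, presence)
```

For example, `classify(hmd_users=1, non_hmd_users=3, sites=1)` would describe a setup in which one HMD user and three tablet users share a room.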

4 Methodological Considerations in Each Scenario

This section discusses the considerations of using the research methods discussed in each category. The considerations before, during, and after the participants’ exposure to a multi-user VR application of the category are presented.

4.1 Physically Co-located and Full Co-presence in VR

VR applications that support multiple co-located users to enter a VR world enable many types of social interactions. This type of VR application allows users to interact via the VR (via the application) and the real world. For example, Chaqué and Charbonnier [4] developed a multi-user VR platform that combines real and virtual worlds. This platform uses tangible objects in the real world as props for the users to interact and is designed for physically co-located users to enter a VR world together. Two players in the example game demonstrated in their paper interact with a simple cardboard box in the real world, but this box appears to be an Egyptian chest in the VR world. Players in the gameplay can cooperate on moving the chest; this scenario is a VR-mediated interaction. The players can verbally communicate with one another directly in the real world.

Zaman et al. [19] developed and user-tested a collaborative VR application that allows multiple physically co-located users to collaborate on a spatial design task in a VR world. The participants performed several collaborative design tasks in pairs in a series of user-testing sessions with the system. The researchers audiotaped the sessions and performed a content analysis of the transcripts to study verbal communication. After the test, each participant filled out a questionnaire with Likert-scale questions.

Before Exposure.

Before participants are exposed to a VR application of this category, researchers can ask each participant to fill out a questionnaire. As the participants are physically co-located, the questionnaire can be administered by one member of the research staff.

During Exposure.

Researchers can consider directly observing any interactions in the VR and real worlds. Studying behavior in the real world can be challenging. If a behavior is tracked by an application, such as the movement of the cardboard box in [4], then the application can record tracking information. Otherwise, researchers need other tools to observe and capture interactions in the real world. The interactions may include body movement (with or without props) and verbal communication. Because co-located participants interact in a 3D space, capturing their interactions is a challenge. Setting up several video cameras may capture the interactions of two to three users. One risk is that the participants’ bodies may block the view of the cameras. This risk increases with the number of participants in a session. Although direct observation by researchers can help fill the gaps not captured by the cameras, the risk of occlusion remains.

The cognitive and emotional states of the participants are difficult to study, as in other immersive technologies. Asking the participants to fill out a questionnaire is possible, but this approach disrupts the experience. One alternative is pausing the exposure to the VR application and asking the participants simple questions. As the participants are co-located, this course of action can be completed easily. The questions should be short to minimize interruption to the participants’ experience. As discussed in Sect. 2, the questions should be straightforward so that the participants can respond with simple hand gestures to prevent the participants from influencing one another.

After Exposure.

Retrospective interviews with video recordings of a session may provide researchers a window to the internal states of the participants. However, the participants’ immediate responses in their interactions cannot be examined. In addition, the session spreads across the VR and real worlds. Body gestures and movements observed in the real world should correspond to certain actions in the VR world. In [19], partners appear as virtual humanoid hands in the VR world. Recordings of both the real and VR worlds are therefore needed to support retrospective interviews.

In the scenarios of this category, participants are co-located. They enter a VR world together in the same physical location. A retrospective focus group can be conducted right after the collection of individual feedback (with either questionnaire or individual interview or both) to allow the participants to discuss their shared experience.

4.2 Connected and Full Co-presence in VR

VR applications that fall under this category support multiple users with HMD VR devices to remotely enter a VR world. All the interactions between the users are mediated and transmitted by the VR applications. Sra [18] put forward a mechanism to present different physical constraints faced by different users who are remotely connected to a VR world.

One challenge is that players or users only see the representation of one another. The representation can be an avatar (e.g., in Facebook Spaces), an object, or a 3D scan of a player. For example, Facebook Spaces allows users of HMD VR devices to meet remotely in a VR world. The users can draw 3D objects, chat, and view panoramic imagery together. Users are co-present in the VR world as avatars.

Before Exposure.

Researchers can consider a background questionnaire before an exposure to a VR application. The challenge lies in the distributed nature of the experimental setup. To simulate a remotely connected scenario, participants must be separated from one another before entering a VR world. Each participant in a session needs a separate location. Therefore, additional research staff and space are necessary. As the participants, researchers, and equipment are not co-located, the researchers need to manage time carefully and communicate efficiently with one another so that the procedure for individual participants can be coordinated. For instance, multiple participants need to enter the VR world at the same time to prevent individual participants from being exposed to the world alone.

During Exposure.

As participants are separately located, multiple researchers are required to observe their interactions. A researcher can observe a participant in one location. The notetaking of the researchers should be synchronized. The participants’ behavior (e.g., waving hands) may represent an interaction with other participants in a VR world. Therefore, mapping is necessary between the behavior in the physical and VR worlds. The researchers must observe the participants’ behavior with a reference to a view in the VR world, which can be a live view of the participants’ HMD view or a spectator view provided by a VR application.
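One way to synchronize notetaking across locations is to have every researcher timestamp notes against a shared session clock and merge the logs afterwards. The sketch below illustrates this idea under stated assumptions: the `Note` record, its fields, the functions `merge_notes` and `pair_real_with_vr`, and the two-second matching window are all hypothetical choices made for illustration. It merges per-site logs into one timeline and pairs each real-world note with VR-world notes about the same participant that occur close in time, as a starting point for the mapping between the two worlds.

```python
from itertools import chain
from typing import Iterable, List, NamedTuple, Tuple


class Note(NamedTuple):
    t: float          # seconds on a clock shared by all researchers
    site: str         # location of the researcher who took the note
    participant: str  # participant whose behavior is described
    world: str        # "real" (physical behavior) or "vr" (view of the VR world)
    text: str


def merge_notes(*logs: Iterable[Note]) -> List[Note]:
    """Merge the logs from all sites into one time-ordered timeline."""
    return sorted(chain(*logs), key=lambda n: n.t)


def pair_real_with_vr(timeline: List[Note],
                      window_s: float = 2.0) -> List[Tuple[Note, Note]]:
    """Pair each real-world note with VR-world notes about the same
    participant that fall within window_s seconds of it."""
    pairs = []
    for real in (n for n in timeline if n.world == "real"):
        for vr in (n for n in timeline if n.world == "vr"):
            same_person = vr.participant == real.participant
            if same_person and abs(vr.t - real.t) <= window_s:
                pairs.append((real, vr))
    return pairs
```

The pairs are only candidates; a researcher would still need to confirm, for instance from the spectator view, that a physical hand wave and an avatar gesture belong to the same interaction.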

After Exposure.

After exposure to a VR application, each researcher can administer a post-exposure questionnaire or conduct a retrospective interview with each participant. If a focus group is planned, then the researchers need to bring the participants together after collecting individual responses. As the participants are not physically co-located, they may not have seen one another before the exposure. In this case, the focus group may need a warm-up period. The participants may also need an introduction to understand that the other participants are actually the other users they met during the exposure.

4.3 Physically Co-located and Partial Co-presence in VR

In cases under this category, all users are physically co-located. Therefore, part of their interactions can happen in the real world. Recently, researchers have become interested in VR applications that allow HMD VR users to interact with non-HMD users. One reason is that HMD VR devices are not as popular as smartphones. Not every user has an HMD VR device. A few VR games allow multiple users to engage in a VR world even if only one set of VR equipment is available. Other users can engage in the VR world via commonly available devices, such as tablets. Gugenheimer et al. [8] proposed a system called ShareVR that allows non-HMD users to interact with HMD users and be part of the VR world experience. ShareVR consists of a floor projection showing the VR scene in the real world, mobile displays with tracking capabilities for non-HMD users, and an HMD VR device. Gugenheimer et al. [8] called this type of interaction co-located asymmetric interaction. They presented use cases of digital games and a drawing application. In their user study evaluating ShareVR, Gugenheimer et al. [8] asked the participants to fill out a questionnaire after experiencing ShareVR.

Another motivation for VR applications under this category is that HMD VR devices are designed to fully immerse users. The HMD blocks the user’s attention to the real world. A user who wears an HMD and enters a VR world is cut off from interactions with anyone in the real world, even when such interactions are necessary. Chan and Minamizawa [5] proposed and developed an HMD called FrontFace that supports communication between co-located HMD users and outsiders. The innovative HMD consists of an external display that shows a user’s visual attention (eyes) and a reference to the user’s position in the VR world. One of the proposed applications is a VR classroom in which instructors can observe learners wearing FrontFace to examine their learning status. Chan and Minamizawa [5] proposed methods for outsiders to interact with a FrontFace user. They conducted a trial in which a small group of three participants experienced the innovative HMD and gave feedback on it.

Before Exposure.

Researchers can administer a background questionnaire at the beginning of an experiment session.

During Exposure.

The first challenge is that exposure to a VR system differs between HMD and non-HMD users. As all participants are co-located, researchers can directly observe the participants’ interactions. The researchers need to carefully observe how the behaviors of non-HMD users are represented in a VR world. To do so, they should directly observe the users’ behavior in the real world alongside their view of the VR world. Capturing all the interactions means covering both the real and VR worlds. The methods to capture the behavior of HMD and non-HMD users may vary.

After Exposure.

After exposure, researchers can collect individual responses from the participants through a questionnaire and retrospective interview. HMD users and non-HMD users are exposed to different aspects of a VR application. The researchers should consider whether all the participants should be asked the same questions. Alternatively, the questions for HMD and non-HMD users can differ and be specific to their mode of engaging in a VR world.

4.4 Connected and Partial Co-presence in VR

VR applications under this category support users with and without HMD VR to remotely enter a VR world. One use case is supporting HMD users and non-HMD users to remotely collaborate on tasks. Cergeaud et al. [6] proposed a method that allows HMD and non-HMD users to collaborate remotely in a common VR meeting. The system consists of physical objects (or props) to support object manipulation. Cergeaud et al. [6] conducted an interview after letting participants try the system.

Another scenario is online streaming of VR gameplay. Streaming platforms, such as Twitch, support VR gameplay, in which a streamer wearing an HMD plays a VR game while the audience watches the stream on regular screens.

Before Exposure.

Before participants enter a multi-user VR application, researchers can ask them to fill out a pre-exposure questionnaire. As the participants are separately located, multiple research staff members are necessary.

During Exposure.

As discussed above, pausing a session and asking all participants simple questions is possible. In cases under this category, the participants are not co-located, so the pause needs to be synchronized across locations. Unlike in the co-located scenarios, however, the participants should answer verbally instead of with gestures.

Direct observation of remotely located participants requires multiple researchers. The researchers should also consider how the non-HMD participants are represented in a VR world and how they can interact and communicate with the HMD participants. Doing so determines how the researchers observe the participants’ behavior. For example, if they are spectators who can comment on the HMD participants (e.g., spectators of VR gamers on Twitch), their commenting behavior needs to be observed in the application.

Recordings of the whole session should involve video recordings of the behavior of HMD and non-HMD participants in the real world and screen recordings of live views of HMDs and non-HMD devices.

After Exposure.

Researchers can conduct retrospective interviews individually with each participant. Similar to the procedure before the exposure to a VR application, multiple researchers are needed. In a retrospective focus group, an introduction and warm-up may be necessary because the participants have not met before the exposure. They may not realize that the peer participants in a focus group are those who interacted with them in the VR world. The recordings of HMD and non-HMD devices must be shown in the session to remind them about their experience. As such a retrospective focus group session should happen shortly after an exposure to a VR application, there is little time for editing the videos, and the recordings from different devices may not be easy to combine. The researchers may need to prepare multiple monitors to play the recordings simultaneously.
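When recordings cannot be edited into a single video in time, playing them side by side on multiple monitors still requires a common starting point. The following sketch shows one possible way to compute it, assuming that each recording's start time on a shared clock is known; the `Recording` class, the function `playback_offsets`, and the example file names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Recording:
    name: str          # e.g. an HMD view, a tablet view, a room camera
    start_s: float     # start time on a clock shared by all devices
    duration_s: float  # length of the recording in seconds


def playback_offsets(recordings: List[Recording]) -> Dict[str, float]:
    """Return, for each recording, the position to seek to so that all
    recordings start at the first moment covered by every one of them."""
    common_start = max(r.start_s for r in recordings)
    common_end = min(r.start_s + r.duration_s for r in recordings)
    if common_end <= common_start:
        raise ValueError("the recordings do not overlap in time")
    return {r.name: common_start - r.start_s for r in recordings}
```

With these offsets, a researcher can seek each player to its own position and start playback together, which avoids re-encoding the videos before the focus group.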

5 Toward an Integrated Approach

This paper classifies scenarios of VR-mediated social behavior. The proposed classification consists of two dimensions that distinguish the extent to which the participants of multi-user VR applications are located physically in the same place and the extent to which they are virtually present in a VR world. There are existing classifications of immersive environments [11] and shared-space technologies [1]. Kraus and Kibsgaard [11] classified the communication of multiple users in immersive environments. Benford et al. [1] classified shared-space technologies according to transportation, artificiality, and spatiality. However, these two studies are not focused on methodological considerations. The current study provides a framework to systematically consider the concerns in the major steps of experimental procedures involving multi-user VR applications in different settings.