
1 Introduction

Interaction techniques are the combination of software and hardware [1] used to interact with a Virtual Environment (VE). They are characterized by 3 concepts: navigation, selection and manipulation, and system control [2]. Navigation allows users to travel in the VE from place to place while adjusting their point of view (i.e., steering) [3]. Selection is the action of picking a target in the VE. System control refers to interactions between the user and external software functionalities, such as a menu outside the VE.

The user's performance when completing a virtual task depends on the interface. Indeed, interaction techniques involve mental workload and can modulate user performance positively or negatively when the technique is not intuitive enough [4]. Workload is the amount of mental resources needed at once for a task [4]. High workload affects performance because human cognitive resources are limited in energy [5]: if too much workload is required to carry out the task, the user may miss information or commit errors. The level of workload depends on the task complexity, environmental factors, and the user's abilities and knowledge [6]. Using the mouse involves only a low workload during task realization because most people use it in their everyday life and know perfectly how to manipulate it. Other, less familiar interaction techniques could lead to more mental workload. There is therefore a need to understand how different interaction techniques impact the user's performance according to his/her familiarity with them.

The use of an interaction technique relies on motor functions, such as gesture control by hand [3], and on cognitive functions [7, 8], such as spatial abilities [9]. As the user practices an interaction technique, its use becomes automatic and no longer consumes as many attentional resources [10], and thus involves little workload. We can consider individuals who have used the computer mouse frequently for several years to be experts. Thus, we can assume that an expert mouse user will have better results during a task in a VE because he/she will make fewer errors related to the usability of the interaction technique than a “novice” user. So, as discussed above, the interface (or the interaction device) uses a part of the cognitive functions to allow interaction with the virtual world. On the other hand, to develop a useful virtual tool for neuropsychological assessment, capable of giving the clinician an effective measurement of cognition, it is essential to understand the user's abilities with the interaction technique. It is important to be able to dissociate which part of cognition is used by the HCI and which part is actually devoted to the cognitive task itself. For example, the principal variable used to qualify patient performance during a virtual cognitive task is often completion time. For instance, a longer completion time was used to discriminate patients with mild cognitive impairment (MCI) from healthy elderly who completed the virtual task in less time [11]. However, completion time can be modulated by the user's abilities with an interaction technique: a user who has never used a mouse before the virtual test could take more time than a familiar one even if their cognitive abilities are comparable.

This study aims to detect how the skill of a participant using an interaction technique can be qualified with behavioral and physiological data. To assess user abilities with an interaction technique, participants performed common activities in VEs (i.e., a training step and a pointing task) with several interaction techniques: two 2D interaction techniques, a gamepad (i.e., an Xbox controller) and a mouse, and one 3D interaction technique, the Razer Hydra. They also answered questionnaires about their computer and video game usage.

2 Related Work

Navigation efficiency is mainly assessed by completion time and is linked to user performance [12], where a long completion time is associated with poor performance. Selection occurs in a 2D VE or in a 3D VE. During 2D selection, the user picks a target by moving the selection cursor along the x and y axes, whereas during 3D selection he/she moves the cursor along the x, y and z axes and must also control the depth during selection. In 2D, the common selection techniques are pointing and drag-and-drop [13]. When pointing, the user places the selection cursor on a target and then clicks on it, whereas during drag-and-drop he/she selects an item and moves it to the desired place before dropping it. Both adult [13] and child [14] users are more efficient with pointing than with drag-and-drop. Moreover, mental workload during drag-and-drop tasks is higher for the elderly than for adults, whereas there is no difference in mental workload between the two groups during pointing [15]. Workload can be assessed in an objective way by recording physiological data. Heart Rate (HR) and Heart Rate Variability (HRV) are sensitive to the different states of the autonomic nervous system and can be used to assess mental workload. HR is faster during complex tasks and high-workload situations [16, 17], whereas HRV is lower [18].

Fig. 1. Illustration of the pointing task in the ISO 9241-9 standard

To assess the usability of a selection technique, ISO 9241-9 [19] proposes a standard pointing task where the user must select as quickly as possible several targets with different positions and sizes (Fig. 1). The results can be analyzed with Fitts's Law [20] to predict user performance with a selection technique according to target position and size (Eq. 1).

$$\begin{aligned} MT = a + b\cdot log_{2}\left( \frac{D}{W}+1 \right) \end{aligned}$$
(1)

Where MT is the movement (completion) time, D the distance to the target and W its width. The logarithmic term is the index of difficulty; a and b are empirical constants determined by linear regression. Fitts's Law was later adapted through the Shannon formulation and the throughput measure (Eq. 2), where De is the effective movement amplitude in pixels, \(SD_{x}\) is the standard deviation of the distance between the selection point and the centre of the target in pixels, and MT is the time to hit and select a target in seconds. Throughput (bits/second) assesses the usability of a selection technique by combining speed and accuracy.

$$\begin{aligned} TP = \frac{log_{2}\left( \frac{De}{4.133\cdot SD_{x}} +1\right) }{MT} \end{aligned}$$
(2)
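For illustration, the following minimal R sketch computes throughput per Eq. 2 from logged pointing trials. It is not the authors' code: the data frame and its column names (amplitude, offset, time) are assumptions made for the example.

# Sketch of effective throughput (Eq. 2) from one sequence of pointing trials.
# Assumed columns: amplitude = movement amplitude in pixels, offset = distance
# between selection point and target centre in pixels, time = trial time in s.
compute_throughput <- function(trials) {
  De  <- mean(trials$amplitude)           # effective movement amplitude De
  SDx <- sd(trials$offset)                # spread of selection endpoints SD_x
  MT  <- mean(trials$time)                # mean time to hit and select a target
  log2(De / (4.133 * SDx) + 1) / MT       # throughput in bits/second
}

# Example with simulated values for one 13-target sequence
set.seed(1)
trials <- data.frame(amplitude = rnorm(13, mean = 500, sd = 20),
                     offset    = rnorm(13, mean = 0,   sd = 15),
                     time      = rnorm(13, mean = 0.9, sd = 0.1))
compute_throughput(trials)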

Few studies analyze the user profile during a virtual task and qualify users as novice or expert. Rosa et al. [21] isolated the user profiles associated with the most effective experience in VR using a correspondence analysis combined with a cluster analysis. They showed that the PC gamer profile had a better experience in the VE than console gamers or non-gamers, the latter being the profile with the most cybersickness. Hourcade et al. [8] tested elderly subjects during a selection task with or without selection assistance software (i.e., PointAssist). They showed that participants expert with the mouse obtained better results without the PointAssist software: assistance is not useful if users are expert with the technology. In addition, the authors showed positive correlations between computer and mouse use and target selection (clicking on the target). Individuals who frequently use the mouse and the computer performed the selection task better and triggered the assistance least often.

Another study assessed participants during a daily shopping activity in a VE, where they could interact in the virtual shop with a gamepad. The novice subjects, isolated with a modified t-test, took significantly more time to complete the task than the expert participants [22].

3 Method

3.1 Participants

Thirty student volunteers were recruited from the local university. The participants were randomly divided into 3 groups. The first group, composed of 3 women and 10 men (age M: 23.7; SD: 3.5), used the gamepad. The second, composed of 3 women and 6 men (age M: 23.9; SD: 5.3), used the mouse. The third, composed of 5 women and 3 men (age M: 23.6; SD: 3.9), used the Razer Hydra. No participant had used the Razer Hydra before.

3.2 Tasks

Training. To understand how to use the interaction technique and be able to freely navigate and select items in the VE, participants trained in a virtual apartment composed of 6 rooms. Completion time, distance travelled, number of clicks and missed clicks were recorded.

Pointing Task. We used a pointing task based on the ISO 9241-9 standard. In this task, 13 targets were positioned in a circle and participants clicked on each target. Targets were spheres 16 cm wide. Only the active target was displayed on the screen and participants received audio feedback when they missed the target (i.e., an error). Time, errors and throughput were recorded. As we wanted to explore whether a quick pointing task could be useful to discriminate participants, we used only one sequence of 13 targets during the task.
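As an illustration, the R sketch below lays out 13 targets on a circle and generates the alternating selection order typically used in ISO 9241-9 tasks. The radius, centre and exact ordering are assumptions for the example; the paper only specifies 13 targets 16 cm wide.

# Assumed circular layout: 13 evenly spaced targets, selected in an order
# that hops to the roughly opposite side of the circle at each step.
n_targets <- 13
radius    <- 0.5                                             # assumed radius
angles    <- 2 * pi * (seq_len(n_targets) - 1) / n_targets
targets   <- data.frame(x = radius * cos(angles),
                        y = radius * sin(angles))

# Selection order 1, 8, 2, 9, ... obtained by stepping ceiling(n/2) targets
# around the circle at each selection
step      <- ceiling(n_targets / 2)
order_idx <- (cumsum(c(0, rep(step, n_targets - 1))) %% n_targets) + 1
targets[order_idx, ]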

3.3 Apparatus

The experiment was conducted on a computer with an Intel® Xeon® processor, an NVIDIA GeForce GTX 1080 and 32 GB of RAM, running Windows 10. The VE was displayed on a \(50''\) Samsung television with a \(1630 \times 768\) resolution.

3.4 Procedure

After signing the protocol agreement, participants, with the help of the investigator, placed 3 sensors on the chest to collect HR and HRV data. ECG data were recorded with the BITalino (R)evolution board kit and the OpenSignals software. The ECG sensors were placed under the right clavicle (+ electrode), under the left musculus pectoralis major (− electrode) and under the left clavicle (reference electrode) [23]. Participants then began the training step, in which they visited a virtual apartment. Guided by the researcher, all participants visited the VE in the same order. Once the visit was done, they saw three boxes in the kitchen and had to select the three boxes, one by one, and drop them on a closed surface. Participants could spend more time acting in the VE and end the training when they felt comfortable with the use of the interaction technique. After the training step, they performed the pointing task and completed questionnaires about their PC and video game usage on a 5-point Likert scale.
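For reference, the hedged R sketch below derives HR and a common time-domain HRV index (RMSSD) from R-peak timestamps. It is not the authors' processing pipeline: the peak detection on the ECG signal exported by OpenSignals is assumed to have been done beforehand, and RMSSD is used here only as one plausible HRV measure.

# HR and RMSSD from R-peak times given in seconds
hr_hrv <- function(r_peaks) {
  rr    <- diff(r_peaks) * 1000           # RR intervals in milliseconds
  hr    <- 60000 / mean(rr)               # mean heart rate in beats per minute
  rmssd <- sqrt(mean(diff(rr)^2))         # root mean square of successive RR differences
  c(HR = hr, RMSSD = rmssd)
}

# Example on simulated R-peak times (around 75 bpm)
set.seed(2)
hr_hrv(cumsum(rnorm(120, mean = 0.8, sd = 0.05)))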

4 Results

All data were analyzed with the R software [24]. First, for each interaction technique, we conducted a hierarchical cluster analysis (HCA) with the Agglomerative Nesting algorithm and the average linkage method. The HCA differentiated at least two subgroups. To describe the subgroups, we compared them using a t-test when the data distribution was normal and a Mann-Whitney U test otherwise. Moreover, confidence intervals were plotted as a complementary, less dichotomous interpretation of the results than the p-value [25]. To compare the 3 interaction techniques, we used an ANOVA or a Kruskal-Wallis test depending on the normality of the distribution.
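A minimal R sketch of this analysis pipeline is given below, using the agnes() function of the cluster package for Agglomerative Nesting with average linkage. The per-participant data frame, its column names and the simulated values are assumptions for illustration only, not the authors' data or code.

library(cluster)                                   # provides agnes()

# Simulated stand-in for the per-participant measures (assumed column names),
# built with two built-in skill levels so the example clustering is non-trivial
set.seed(3)
skilled <- rep(c(TRUE, FALSE), c(7, 6))
users <- data.frame(
  throughput    = rnorm(13, ifelse(skilled, 1.0, 0.6), 0.05),
  pointing_time = rnorm(13, ifelse(skilled, 20, 30), 2),
  training_time = rnorm(13, ifelse(skilled, 250, 350), 20),
  errors        = rpois(13, ifelse(skilled, 2, 5)),
  hr            = rnorm(13, 75, 8),
  hrv           = rnorm(13, 40, 10))

features <- scale(users[, c("throughput", "pointing_time", "training_time",
                            "errors", "hr", "hrv")])
hca      <- agnes(features, method = "average")    # Agglomerative Nesting, average linkage
plot(hca, which.plots = 2)                         # cluster dendrogram
subgroup <- factor(cutree(as.hclust(hca), k = 2))  # cut into two subgroups

# Compare the subgroups on one variable, choosing the test from normality
compare_subgroups <- function(x, g) {
  if (shapiro.test(x)$p.value > 0.05) t.test(x ~ g) else wilcox.test(x ~ g)
}
compare_subgroups(users$throughput, subgroup)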

4.1 Profile Analysis

Gamepad Users. The cluster dendrogram clearly separates two groups (Fig. 2), called subgroup 1 (\(n = 7\)) and subgroup 2 (\(n = 6\)). Results are presented in Table 1. Throughput for subgroup 1 is higher than for subgroup 2 (\(t = 3.35; p = 0.008\)). Subgroup 1 completed the pointing task in less time than subgroup 2 (\(U = 31; p = 0.04\)). Subgroup 1 spent less time completing the training step than subgroup 2 (\(U = 30; p = 0.04\)). Subgroup 1 clicked more times in the training VE than subgroup 2 (\(t = 6.64; p = 0.001\)). Subgroup 1 made more missed clicks during the training step than subgroup 2 (\(t = 3.41; p = 0.01\)). Subgroup 1 had played video games for more years than subgroup 2 (\(t = 4.25; p = 0.002\)) and played video games more often than subgroup 2 (\(U = 6; p = 0.01\)).

Fig. 2. Two subgroups of gamepad users can be discriminated in the cluster dendrogram. Participant numbers are displayed on the x axis.

Table 1. Mean (M), standard deviation (SD) and p-value for the gamepad users during the pointing and training tasks.

Mouse Users. The cluster dendrogram separates two groups of observations: subgroup 1 (\(n = 6\)) and subgroup 2 (\(n = 3\)). Throughput tends to be higher for subgroup 1 (\(M = 0.22\)) than for subgroup 2 (\(M = 0.16\)), but the difference is not statistically significant (\(t = 2.89; p = 0.06\)). Visualizing the confidence intervals of subgroups 1 and 2 of mouse users, we cannot conclude that throughput for subgroup 1 is higher than for subgroup 2 (Fig. 3). Subgroup 1 took less time to complete the pointing task than subgroup 2 (\(U = 18; p = 0.02\)) and less time in the training VE than subgroup 2 (\(U = 0.5; p = 0.02\)). Subgroup 1 had used a PC for longer than subgroup 2 (\(U = 3; p = 0.05\)) and had played video games for longer than subgroup 2 (\(U = 0; p = 0.008\)).

Razer Hydra Users. The cluster dendrogram separates 2 subgroups of observations: subgroup 1 (\(n = 4\)) and subgroup 2 (\(n = 4\)). Throughput is higher for subgroup 1 than for subgroup 2 (\(t = 3.61; p = 0.01\)). Subgroup 1 made fewer errors during the pointing task than subgroup 2 (\(U = 12; p = 0.05\)) and completed the task in less time than subgroup 2 (\(U = 12; p = 0.05\)).

Fig. 3. Confidence intervals of throughput for subgroups 1 and 2 with the 3 interaction techniques.

4.2 Comparison of Interaction Techniques

Training. Razer Hydra users navigated for a longer time in the training VE than mouse users (\(p = 0.02\)) and gamepad users (\(p = 0.004\)). They also made more errors (i.e., missed clicks) than participants using the mouse (\(p = 0.04\)). Other variables were not significantly different.

Pointing Task. Throughput is higher for the mouse than for the gamepad (\(p = 0.001\)) and the Razer Hydra (\(p = 0.001\)), and throughput with the gamepad is higher than with the Razer Hydra (\(p = 0.001\)). Completion time of the pointing task is shorter for the mouse than for the gamepad (\(p = 0.001\)) and the Razer Hydra (\(p = 0.001\)), and shorter for the gamepad than for the Razer Hydra (\(p < 0.001\)). Mouse users made fewer errors (i.e., missed clicks) than gamepad users (\(p = 0.01\)) and Razer Hydra users (\(p = 0.01\)), and gamepad users made fewer errors than Razer Hydra users (\(p = 0.01\)). Other variables are not significantly different.

HR and HRV. No statistical difference in HR or HRV was found between interaction techniques during the training step or the pointing task.

5 Discussion

This study aims to understand how to characterize user abilities with an interaction technique. To do so, we observed 3 groups of participants interacting with the gamepad, the mouse and the Razer Hydra. They performed a training step and a pointing task in a VE.

Unsupervised clustering algorithms such as HCA can organize observations into at least 2 subgroups according to several variables. Here, we used variables related to the use of the interaction technique (e.g., completion time, number of errors) and to workload (i.e., HR and HRV). Within each group, the HCA discriminated two subgroups of users: a first subgroup of skilled users and a second subgroup with lower abilities with the interaction technique.

Several variables differ significantly between subgroup 1 and subgroup 2; among these, throughput is a recurrent one. Indeed, the subgroups 1 have a better throughput than the subgroups 2 of gamepad and Razer Hydra users. The subgroups 1 contain the participants more familiar with video games or PC usage. For example, subgroup 1 of gamepad users is characterized by video game players. These participants have a better throughput than subgroup 2 and explored the training VE more: they clicked more on inactive and active items in the VE to see which reactions they could expect, and did so in less time than subgroup 2. In addition, subgroup 1 of mouse users completed the pointing task in less time, had used a PC for longer and played more video games than participants in subgroup 2. The throughput difference was not significant between the subgroups of mouse users, perhaps because most mouse users were already very familiar with the mouse: 8 of them used a computer every day and the last one used it several days a week. The skill of a participant with an interaction technique could thus be estimated from several parameters, such as how accustomed he/she is to the interaction technique. For instance, console gamers are familiar with the gamepad. Indeed, the more familiar the user is with an interaction technique, the more natural and easy the control of the input device appears [26]. A natural interaction technique may be not only a technology that maps real, common gestures into the VE, but also a familiar one. The mouse and keyboard are perceived as more natural than the Razer Hydra [27], even though the mouse and keyboard are desktop-based and the Razer Hydra is a semi-natural interaction technique according to Nabioyni and Bowman's taxonomy [28].

The results of the HCA show that, in each condition, subgroup 1 is more skilled with the interaction technique: its members have a better throughput or have used common interaction techniques for longer than the others. The calculation of throughput is a good way to discriminate skilled participants from those less accustomed to an interaction technique; indeed, the HCA mainly discriminates the subgroups on throughput and completion time. We found no significant difference in the ECG data between subgroups 1 and 2 or between the 3 interaction techniques. The HCA did not separate the subgroups on a workload measure, and the mouse, the gamepad and the Razer Hydra seem to involve a similar workload across users. As in previous studies, completion time is shorter with the mouse than with the gamepad and there are fewer errors with the mouse than with the gamepad [26, 27]. The use of the mouse is associated with the best results, in part because people have used computers for several years and often every day; the use of the mouse costs little effort [28]. Razer Hydra users had lower performance in the pointing task than the mouse and gamepad groups. These results are consistent with other studies in which the Razer Hydra performed worse and induced more errors than the gamepad and the mouse during a pointing task [29] or a navigation task for the elderly [30]. The Razer Hydra is not a common interaction technique; users were not familiar with it and needed more time to be comfortable with it than with the gamepad or the mouse.

HCA is a good statistical method to conduct profile analysis and then identify the variables that best characterize the profiles. A short pointing task with only one sequence of trials, here 13 targets, may be an efficient way to understand a participant's abilities with an interaction technique: users with higher abilities have a better throughput than those with lower abilities. Knowing the skill of the user with an interaction technique could help to better interpret the completion time variable during cognitive tests in VEs. For instance, a gamepad user with good abilities (e.g., a high throughput) who completes a virtual test with a long completion time may have more cognitive issues than a gamepad user with few skills with this interaction technique. Indeed, the realization of complex tasks in a VE involves a certain amount of workload. A user unfamiliar with an interaction technique would have to devote mental workload both to task realization and to the use of the interaction technique. So, his/her workload would be high and he/she may commit errors or take a long time to complete complex virtual tasks even if he/she has no cognitive issues.

Future studies should include more trial sequences during the pointing task to explore how many trials are necessary to distinguish the skills of mouse users, because one sequence of 13 trials is not enough. There is also a need to compare, during cognitive virtual tasks, the performance of participants with and without good skills with an interaction technique. Indeed, one subgroup (i.e., with good or poor abilities) may show higher, lower or no significantly different performance compared to the other.