1 Introduction

Today, digital artifacts come in various shapes and dimensions. As devices become both smaller and larger (e.g., smart watches and wall-sized displays), traditional ways of interacting with interfaces, such as the windows, icons, menus, pointer (WIMP) paradigm, become increasingly ineffective and impractical. Gestures are a promising alternative method of interaction. To date, various gesture recognition devices and gestural interfaces have been presented [1,2,3,4,5,6,7]. The interaction modalities for these devices predominantly fall into three categories: handheld devices, touch gestures, and freehand gestures.

Among these gesture types, on-skin touch gestures and freehand gestures stand out because they offer an interaction model in which intermediary devices, such as remote controllers, are no longer needed. Previous studies have explored these models separately through various device implementations [2, 7, 8] and user-centered studies [9,10,11,12]. However, there is still a gap in the field regarding a comparison of the user experience of these two gesture types. Designers lack knowledge about the comparative strengths and shortcomings of these modalities and about the application fields each one suits. To produce this design knowledge, we investigate users’ preferences about these gesture types and the conditions in which one would be advantageous over the other.

To compare on-skin touch and freehand gestures, we procured and adapted two user-elicited gesture sets: a skin-to-skin touch gesture set obtained in our previous study [13] and a freehand gesture set obtained by Vatavu [14]. Our goal was to explore users’ intuitions and preferences regarding these gestures. Twenty participants evaluated thirteen computer tasks and their corresponding gestures from each set, for a total of twenty-six gestures. We used the NASA Task Load Index (TLX) [15] to collect users’ subjective evaluations of the gestures. We added four 7-point Likert scale items about social acceptability, learnability, memorability, and the ‘goodness’ [16] of gestures to this index.

Our findings reveal that on-skin touch gestures were less physically demanding and more socially acceptable than freehand gestures. This suggests that on-skin touch gestures are more suitable for daily use, where time and space are limited resources, and more appropriate for controlling smaller personal devices such as smartphones. In comparison, freehand gestures were more convenient for large displays. Since they were found to be more engaging, they may be more suitable for entertainment contexts such as TVs or gaming consoles. Overall, our results suggest that different gesture types have different advantages in different contexts. Our work contributes to the HCI community by informing designers and developers who choose and design new gestural interfaces for various devices and their ambient displays.

2 Related Work

2.1 Gesture-Based Interfaces

As display sizes diversified, new interaction modalities became necessary to control the vast number of technological devices of different sizes. Interfaces based on accustomed modalities such as the WIMP paradigm have shifted towards novel modalities such as gestural interfaces to fill this gap. Studies investigated gestural interactions with various devices, such as home appliances [17] and ambient displays [18], with the aim of evaluating the types of gestures proposed. Others tried to understand and define gestures for these diverse contexts [19, 20]. However, participatory experiments, mainly on large-screen implementations [1], revealed users’ preferences and shifted the focus to the ‘intuitiveness’ of gestures [21,22,23]. To achieve this intuitiveness, several studies designed gestures through user-elicitation methods instead of pre-defined design methods [24,25,26].

Regarding the user-centered approach to gesture design, Nacenta et al. underline the importance of user-elicitation methods, as they produce more memorable results [27]. They further argue that users explicitly prefer user-elicited gesture sets over pre-defined sets because they seem more usable. However, the reason behind this preference is still ambiguous. To clarify this ambiguity, Heydekorn et al. evaluated a user-elicited gesture set in a usability test [28]. The participants of that study were able to use an interactive display spontaneously, through touch gestures they had not been taught, which indicates the benefit of intuition for controlling ambient displays.

2.2 Gesture Types

Many interaction modalities have been presented for controlling ambient displays; among them, handheld devices, touch gestures, and freehand gestures have predominantly been created through user-elicitation methods. Of these gesture types, on-skin touch gestures and freehand gestures stand out because they offer an interaction model that excludes intermediary devices, such as remote controllers or touch-sensitive displays. In this section, we address these two gesture types, which we incorporated in our study.

On-skin Touch Gestures.

In this gesture type, the input is taken from various forms of contact between two skin surfaces. There are different subsets under this category, with various elicitation and implementation methods. For example, Cahn et al. created a set called Single-hand Microgestures (SHMGs), in which users touch different parts of their palms with different actions to carry out the referent, using only a single hand [29]. Although SHMGs clearly offer a more subtle, discreet, and mobile interaction with devices, the study does not propose an implementation method other than external hand-tracking sensors. Several other studies proposed implementations of on-skin touch gestures using an armband [2], a wristband [3], or a smart watch [5] for partial recognition of body parts. Skinput can even detect multiple parts of the body through acoustic transmission, using an armband [7]. All of these studies propose a method to measure the input of a single user. In contrast, Nakatsuma et al. use another armband to measure the electrical capacitance between two users by active bioacoustic measurement [4]. This creates new application fields for on-skin touch gestures by adding a second user to the equation; however, the lack of user experience research is still an issue for on-skin gestures.

Freehand Gestures.

In this gesture type, the input is produced by moving one’s hand in mid-air. Studies investigated freehand gestures by evaluating and defining the gestures [30] and by understanding users’ preferences and creating a taxonomy [31]. When creating freehand gesture sets, studies mainly focused on the devices to be controlled. For example, Henze and Hesselmann created a user-elicited gesture set for music playback [32], whereas several other studies created user-defined gesture sets for controlling televisions [33, 34]. These studies help users control the referents needed for specific devices; however, they do not evaluate the general perception of freehand gestures from the users’ perspective. Other studies focused on feedback for freehand gestures, which lets users know whether they performed a gesture correctly. Hood and Karvinen proposed haptic feedback for this purpose [35, 36]. Nonetheless, this still does not fully address users’ experience with ambient devices.

2.3 Comparison of Gestures

Until now, studies have evaluated these gestures within the boundaries of their own sets. Both user preference and elicitation studies typically concern a single type of gesture set, although a few studies compare one type to another. BodyScape is a device implementation that can recognize both freehand and on-skin touch gestures [6]. The study both compares and combines these two types of gestures for large displays. However, it does not compare the gestures one by one, and it does not report the results of this comparison. Instead, the study reports a combined freehand and freehand-on-body elicitation study. Moreover, the on-skin touch gesture set used for comparison is not a user-elicited set, and some of its gestures involve extreme actions such as touching the feet. In another study, Jakobsen et al. compare touch and freehand gestures for large displays [37]. They reported that although touch gestures were faster to perform and made it easier to select small targets, freehand gestures were preferred over touch gestures when the affordance of movement was taken into account. Both of these studies investigate the advantages of one type of gesture over another; however, they are limited to the single scenario of controlling a large display.

Addressing this concern, Vatavu compares handheld and freehand gestures for ambient home entertainment displays [14]. He reports that users prefer handheld devices for performing gestures because they prefer buttons and familiar actions such as those of the WIMP paradigm. The work illustrates users’ experience with two different gesture types, yet it does not compare usage scenarios with new interaction modalities in which no accustomed intermediary devices are used. The results demonstrate users’ bias towards already known interactions. In contrast, we strive to understand users’ preferences for new interaction modalities in different contexts.

The literature review suggests that despite the shift toward users’ experience of different gesture types, there is still a gap in the field regarding a comparison of user experience for new modalities. There is a lack of design knowledge to inform researchers about which gestures will be advantageous for varying technological devices and contexts. We aim to explore users’ preferences for these gesture types comparatively to produce design knowledge in this uncharted area. Thus, we designed a study to compare on-skin touch and freehand gestures and to observe the conditions in which one would be advantageous over the other.

3 Methodology

3.1 Participants

Twenty individuals (12 females and 8 males) participated in our study. Participants’ ages ranged from 18 to 26 (M = 21.15, SD = 2.01), and they were all university students at various levels of education, from undergraduate to PhD. All participants were right-handed and regular technology users with no professional relationship to design and/or HCI. Although we had conducted a previous user-elicitation study to create the on-skin gesture set, none of the participants had been involved in creating that set, and they performed the gestures for the first time.

3.2 Setting

We conducted the experiment in an audio studio at our university to minimize external stimuli and control for possible extraneous variables such as lighting. There were three computers in the room (Fig. 1). The first (A) recorded videos via two external cameras, one in front of the participant (A1) and one above (A2). The second (B) displayed the survey to the participants on an external screen (B1). The third (C) sent the videos and the actions of the gestures to an LCD TV (C1) that was visible to the participants. One of the two experimenters (D) also used this computer to perform Wizard-of-Oz (WoZ) actions. The interface displayed to the participants was an edited Microsoft PowerPoint presentation, in which the actions of the tasks were triggered by a simple click from the wizard.

Fig. 1. The setting of the experiment: (A) Computer no. 1, (A1) Camera no. 1, (A2) Camera no. 2, (B) Computer no. 2, (B1) Survey screen, (C) Computer no. 3, (C1) LCD TV, (D) Wizard-of-oz, (F) Participant

3.3 Gesture Sets

Freehand.

We obtained the freehand gesture set from previous work by Vatavu [14]. In that study, he conducted a user-elicitation experiment with twenty participants (12 females and 8 males) with various technical backgrounds. As in our study, the participants were all right-handed. He collected the gestures using the Xbox Kinect sensor. In his study, he obtained 22 freehand gestures for the corresponding tasks, with some tasks having more than one referent. For this study, we chose the 13 tasks that corresponded to those in our previous study [13] and selected the gestures with the highest agreement scores reported by Vatavu (Fig. 2).

Fig. 2. Freehand gesture set for 13 tasks

On-skin Touch.

We used the on-skin touch gesture set from our previous work [13]. Nineteen undergraduate students (9 females and 10 males) participated in that study, which produced two on-skin touch gesture sets: an intuitive set and an exclusive set. Each set included 26 tasks, and again we selected the 13 tasks that corresponded to Vatavu’s set [14]. We mainly chose the referents from the intuitive gesture set because of their higher agreement scores; however, because they were intuitive, some of the referents were very similar across different tasks. In these cases, we assigned the referent to the task for which it had the highest agreement score and replaced the referents of the other tasks with those from the exclusive gesture set. As a result, we obtained an on-skin touch gesture set with 13 referents with the highest agreement scores (Fig. 3).

Fig. 3. On-skin touch gesture set for 13 tasks
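The selection rule described for the on-skin touch set can be illustrated with a minimal sketch. It assumes that each set maps a task to a single gesture label with an agreement score and that ‘very similar’ referents can be treated as identical labels. The function name, task names, gesture labels, and scores are hypothetical placeholders, not data from [13].

```python
def select_gestures(intuitive, exclusive):
    """Pick one gesture per task, resolving duplicates by agreement score.

    intuitive / exclusive: dict mapping task -> (gesture label, agreement score).
    Assumption: two referents count as "similar" only if their labels are equal.
    """
    # For each gesture label, remember the task with the highest agreement score.
    best_task = {}
    for task, (gesture, score) in intuitive.items():
        current = best_task.get(gesture)
        if current is None or score > intuitive[current][1]:
            best_task[gesture] = task

    selected = {}
    for task, (gesture, _) in intuitive.items():
        if best_task[gesture] == task:
            # This task keeps the intuitive gesture.
            selected[task] = gesture
        else:
            # Another task claimed the gesture; fall back to the exclusive set.
            selected[task] = exclusive[task][0]
    return selected


# Hypothetical example data (not taken from the study).
intuitive_set = {
    "next": ("swipe left", 0.6),
    "previous": ("swipe left", 0.4),
    "accept": ("tap palm", 0.5),
}
exclusive_set = {
    "next": ("pinch", 0.3),
    "previous": ("swipe right", 0.3),
    "accept": ("double tap", 0.2),
}
result = select_gestures(intuitive_set, exclusive_set)
```

The sketch does not handle the case where a replacement from the exclusive set collides with a gesture that is already selected; such cases would need manual review.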

3.4 Procedure

First, the participants were welcomed and seated. While the first experimenter informed the participant about the experiment and handed over the informed consent forms, the second experimenter started the video recordings. Both the experimenters and the participant signed two copies of the consent form, one for the participant and one for the experimenters. Then, the participants were positioned in front of the main screen, where they were visible to the cameras. Participants were told that they would see two gesture sets on the screen, both containing the same 13 tasks but with 13 different corresponding gestures. The order of the sets was counterbalanced across participants, so that some saw the on-skin touch gesture set first and others the freehand gesture set first. The order of the tasks was also randomized for each participant and each set.
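The counterbalancing and randomization scheme can be sketched as follows. The alternation of the starting set by participant index, the task labels, and the optional seed are assumptions made for illustration; the procedure itself only specifies that set order was counterbalanced and task order randomized for each participant and each set.

```python
import random

# Placeholder labels for the 13 tasks (the real task names are not listed here).
TASKS = [f"task_{i}" for i in range(1, 14)]
SETS = ("on-skin", "freehand")


def build_schedule(participant_index, seed=None):
    """Return [(gesture set, shuffled task order), ...] for one participant."""
    rng = random.Random(seed)
    # Assumed counterbalancing rule: even indices start with on-skin,
    # odd indices start with freehand.
    first = SETS[participant_index % 2]
    second = SETS[1 - participant_index % 2]
    schedule = []
    for gesture_set in (first, second):
        order = TASKS[:]    # copy so the master list stays untouched
        rng.shuffle(order)  # independent task order for each set
        schedule.append((gesture_set, order))
    return schedule


schedule_p0 = build_schedule(0, seed=1)
schedule_p1 = build_schedule(1, seed=1)
```

Passing a seed makes a participant's schedule reproducible, which can help when matching video recordings to the presented task order.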

As the process began, the participants were asked to watch the video of each gesture twice, with the task name displayed on top, and to repeat the gesture when the command screen appeared. They were told that if they repeated the gesture as shown, ‘the machine’ would recognize it and carry out the action for the corresponding task. We first presented a sample gesture (e.g., open menu) for each set to demonstrate the process. After they successfully repeated the gesture and the wizard initiated the action, they filled out our 7-point Likert scale survey, which consisted of the NASA Task Load Index (TLX) and our additional questions on social acceptability, learnability, memorability, and goodness (Table 1). As the participants filled out the survey, we went over the questions together to make sure they were understandable. When the participants were done with the sample survey, we recorded their demographic information and assigned their groups (e.g., on-skin gesture set first).

Table 1. 7-Point Likert Scale survey questions

Next, we continued with the designated gestures. The participants again watched each video twice, repeated the gesture until they were successful, and filled out the survey for each gesture. Note that although we presented a single large display to control with gestures in order to shorten the process, we continuously reminded the participants to think of the various ambient devices they use. They were also encouraged to think aloud and comment on anything that came to mind. After they finished all 13 tasks for the first set, we again showed a sample gesture and repeated the procedure for the second set. Subsequently, we seated the participants again and conducted a semi-structured interview about the process. Here we also informed them about the WoZ process. In total, the procedure lasted approximately 30 min.

4 Results and Discussion

4.1 Survey Results

Two of the participants were dropped from the analysis because they were outliers on multiple items, leaving 18 participants for the final analysis. A repeated-measures ANOVA was conducted for the items in the 7-point Likert scale survey, controlling for the order effect of seeing either gesture set first. Results showed that freehand gestures (M = 1.62, SD = 0.56) were found to be more physically demanding than on-skin touch gestures (M = 1.28, SD = 0.33), F(1,16) = 10.55, p < 0.01. Freehand gestures (M = 6.07, SD = 0.91) were also less socially acceptable than on-skin gestures (M = 6.62, SD = 0.42), F(1,16) = 10.77, p < 0.01. For all other items in the survey, mean differences between freehand and on-skin gestures were not significant, p > 0.05.
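For a within-subject factor with two levels, the F statistic for the gesture-type effect in such a design can be computed from per-participant difference scores. The sketch below illustrates this under the assumptions that order is a two-level between-subject factor with equal group sizes and that each participant contributes one mean score per gesture type. The function name and the numbers are hypothetical and are not the study data.

```python
from statistics import mean


def within_effect_f(diffs_by_order):
    """F test for a two-level within-subject factor in a mixed design.

    diffs_by_order: dict mapping order group -> list of per-participant
    differences (e.g., freehand score minus on-skin score).
    Assumption: equal group sizes, so the grand mean of the differences
    equals the mean of the group means.
    """
    groups = list(diffs_by_order.values())
    if len({len(g) for g in groups}) != 1:
        raise ValueError("this sketch assumes equal group sizes")

    n_total = sum(len(g) for g in groups)
    grand_mean = mean(d for g in groups for d in g)

    # Error term: variation of differences around their own group mean,
    # i.e. what remains after removing the order effect.
    ss_error = sum((d - mean(g)) ** 2 for g in groups for d in g)
    df_error = n_total - len(groups)
    ms_error = ss_error / df_error

    f_value = n_total * grand_mean ** 2 / ms_error
    return f_value, (1, df_error)


# Hypothetical difference scores for two order groups of three participants.
f_value, dfs = within_effect_f({
    "on_skin_first": [1, 2, 3],
    "freehand_first": [2, 3, 4],
})
# f_value is 37.5 with degrees of freedom (1, 4)
```

With 18 participants in two order groups, the error term has 18 - 2 = 16 degrees of freedom, consistent with the F(1,16) values reported above.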

4.2 Mental Model Observation

In this section, we share the results of the semi-structured interviews together with our insights regarding participants’ behavior during the study. More participants preferred freehand gestures (8 participants) than on-skin touch gestures (5 participants). However, another 5 participants stated that each set has advantages over the other depending on the end device, and thus they wanted to use both sets. They indicated that the preference could easily shift from one device to another, so there should be a personalization option through which the user can decide which modality to use. Below, we discuss the pros and cons of each gesture set relative to the other in the given contexts.

Physical Demand.

One of the significant items in our comparison analysis was physical demand. Four participants described freehand gestures as ‘large.’ Five further participants described them as ‘tiring’ and ‘difficult.’ On the other hand, 3 participants found on-skin touch gestures ‘easy.’ The significant result may be due to the higher physical demand inherent in freehand gestures, which indeed take up considerable space and effort. Their use of larger space felt excessive to some participants, while the on-skin touch gestures were easier because they require less effort.

Intuitive vs. Artificial.

We observed that most of our participants perceived the palm as a multi-touch sensor. They transferred the metaphor of accustomed devices such as the smartphone or the tablet onto their hands and perceived on-skin touch gestures as similar. We therefore observed a legacy bias from standard smartphone touch gestures in the on-skin touch gesture set, with 5 participants pointing out that these gestures were ‘habitual.’ One participant expressed this by describing on-skin touch gestures as “transporting the touchpad to the palm.” Accordingly, another participant described them as ‘artificial,’ pointing to their man-made qualities. Participants evaluated accustomed gestures (e.g., swipe left for “next”) as ‘boring.’ On the other hand, many of the freehand gestures were taken from daily life and are ones that people naturally perform while manipulating actual objects. Two participants even reported that they are ‘suitable for daily life.’ Another 2 participants found freehand gestures ‘intuitive.’ Additionally, the interviews revealed that gestures derived from symbols (e.g., thumbs up for “accept”) were liked more because participants considered them more memorable and said that they “made more sense.”

Social Acceptability.

The other significant item in our comparison analysis was social acceptability. Twelve participants reported that they would prefer on-skin touch gestures in public contexts, and freehand gestures were also rated as less socially acceptable in the survey. We believe this relates to several factors, such as the size of the gestures, their relatively covert nature, and their ‘artificial’ quality. First, as many participants indicated, freehand gestures take up more space, which is a problem when performing gestures on the street or on crowded public transportation. The possibility of intruding on strangers’ personal space was one of the main reasons why these gestures would not be socially acceptable in public. Second, on-skin touch gestures are usually performed within the palm area and can easily be concealed from the public by positioning the hand appropriately. Since they take up little space, they can easily go unnoticed, giving users more privacy in their use of the sensor. Finally, on-skin touch gestures are perceived to be more man-made, while freehand gestures resemble gestures used in daily communication. Therefore, some participants thought freehand gestures could be perceived as rude in public if strangers confused command gestures with communicative gestures. Since on-skin touch gestures are clearly directed towards an electronic device, they have higher social acceptability.

Areas of Use.

Participants suggested many application areas and contexts for both gesture types. Overall, on-skin touch gestures were mostly seen as appropriate for controlling ‘smaller personal devices’ or devices that require more ‘precision.’ Two participants reported that they would prefer these gestures for ‘reading’ or ‘writing.’ On the other hand, freehand gestures were found to be more ‘fun’ (2 participants) and ‘immersive’ (1 participant), which made them suitable for ‘large displays’ (7 participants). Five participants also indicated that they could be used to control ‘public displays,’ such as the interface of an automat or a presentation in a meeting. Further, 2 participants suggested a use for ‘gaming’ in connection with immersion, and another wanted to interact with ‘holograms’ using freehand gestures. Participants believed they could have more fun with these gestures and increase immersion in multimedia by performing such large, intuitive gestures.

5 Conclusion

In this study, we compared user-elicited freehand and on-skin touch gestures through a participatory user experiment. Twenty participants completed 13 tasks with the corresponding gestures from each set and filled out our survey. Our results revealed that on-skin touch gestures were less physically demanding and more socially acceptable. On the other hand, freehand gestures were found to be more intuitive. Furthermore, participants described them as more fun and immersive.

Based on our results, future interaction designers should take into account that smaller and artificial gestures, like on-skin touch gestures, are more appropriate modalities for devices used in public, such as mobile phones, mp3 players, smart watches, or perhaps even POS machines. Users prefer them because these gestures differ from those one naturally performs. They are less likely to confuse bystanders because they are clearly intended to perform or control some action. The subtler nature of these gestures also helps to conceal the action if desired. Moreover, this nature allows smaller movements, which made participants think that these gestures are more appropriate for smaller devices and devices that require precision. In a sense, most of the devices we use in public are small because they need to be mobile and easy to carry. Thus, there is also a link between the preference for small devices and the public-use advantage of on-skin touch gestures.

On the other hand, designers should also take into account that intuitive and immersive gestures, like freehand gestures, are more appropriate modalities for fun contexts such as gaming, watching movies, listening to music, sports, or perhaps even cooking and using other home appliances. Users preferred these gestures because, compared to on-skin touch gestures, which were found boring, freehand gestures are more engaging. They require larger parts of the body and wider motions, which immerse users in the action they perform. This is one of the reasons why they are also preferred for private activities, because true immersion can hardly be achieved with spectators. Furthermore, the immersion and wide motions of these gestures are the reason why they are preferred for controlling large displays. Controlling devices ranging from televisions to large billboards, or even an automat, was more convenient for our participants with these gestures. Thus, we can speculate that giving a presentation with these gestures would be convenient and more engaging, even though it takes place in a rather public environment.

Although we presented the advantages of each gesture set over the other in different contexts to inform designers about these modalities, note that many of the participants preferred to customize these sets. They want to use both sets according to their needs, which can change across situations. For instance, they would prefer the on-skin gesture set to control their smartphones during a crowded bus trip, but freehand gestures to control the same smartphone at a house party where they choose the music. Therefore, while each set has clear advantages over the other, interaction designers should also take into account that these advantages are mainly context-related and that these contexts change over time. Thus, the most user-friendly approach is to provide a customizable interaction modality that users can adapt to their needs.