1 Introduction

Advances in modern technology place humans in contexts where machines may someday be partners rather than tools [24]. To achieve this vision, machines will need to engage in team-based behaviors in collaboration with humans. Some of these team-based behaviors may involve monitoring human physiological activity and task performance. A good teammate, after all, is aware of when her/his teammates are stressed, overloaded, or simply not engaged. Team members often use this information to support or “back up” another team member, and this back-up behavior exemplifies being a good teammate [1]. Understanding and acting on degraded human performance (whether physiological or task-based) could be one way for advanced technology to augment human performance in military environments, yet little is known about how such technologies would be accepted or rejected by military operators.

Modern society has witnessed an explosion of devices for monitoring health, activity level, and other factors. Yet the bulk of these tools are voluntary and, while useful for tracking fitness, many are entertainment-centric and not tied to an augmentation strategy on the part of the technology. In other words, these tools provide information only, and in most cases it is up to the human to act on their guidance. From a levels-of-automation standpoint, this constitutes information acquisition and analysis [19], which sits on the lower end of the levels-of-automation spectrum. What happens when these systems begin to integrate information about human states into their decision processes? Furthermore, what happens when these systems are granted authority to redirect their actions based on an understanding of human states and performance thresholds? Imagine a world where one’s watch can dictate whether a driver is alert enough to drive, or where one’s fitness monitor prohibits the purchase of a desired tasty treat. With the advent of novel technologies designed to sense and augment humans, researchers must consider human acceptance, or trust, of these technologies and their behavior.

Trust refers to one’s willingness to be vulnerable to another entity [14]. Recent literature has reviewed the construct of trust as it relates to trust of machines [9, 22]. Much of this literature has examined the effects of reliability, performance, and error types on trust and other outcomes [9, 18, 21, 22]. An emerging trend within this literature, however, focuses on concepts such as transparency, i.e., methods for establishing shared awareness and shared intent [see 10]. Transparency manipulations have been shown to influence trust of automated systems [8, 13, 15]. Most of the transparency-based designs examined in prior research use interface-based features to convey information about the real-time activities of an automated tool, or use an interface to display the rationale for an automation’s decision or recommendation. These are examples of Robot-to-Human (RtH) transparency as discussed by [11], wherein the robot (or, in this case, the automation) communicates task-based and analytical awareness information to the human in an effort to foster greater shared awareness. Among the various transparency facets, Robot-of-Human (RoH) transparency refers to a system using information about the human’s state to guide its interaction with the human and to explain its behavior. Knowledge of human workload, stress, boredom, or degraded physiological capacity could be instrumental in determining when a system should intervene in a human operator’s task. The awareness of human states, and a system’s augmentation in relation to those states, has been examined in the literature on adaptive automation.

Adaptive automation is automation that can invoke a higher or lower level of automation based on an operator’s state in critical situations, such as safety-critical situations [2]. It is believed that adaptive automation can reduce human-automation interaction errors [2, 6]. Research by [5] examined a form of automation that used human electroencephalography (EEG) signals to gauge the human’s cognitive workload and, in turn, determine the appropriate time to interrupt the human operator without overloading her/him. They found performance benefits for the system under high-workload conditions [5]. Adaptive systems that trigger based on human performance decrements have been present in the automotive community for years. Such systems may engage in augmentation strategies such as (1) arousing a driver’s attention to encourage greater attention allocation to potential risks, (2) providing warnings that encourage the driver to make appropriate decisions and take actions to avoid accidents, and (3) using fully-automated control systems to take action when no action by the human is detected and an action is needed to avoid an accident [10]. The military, too, has recently fielded a fully-automated safety system that will assume control of certain aircraft (e.g., F-16 fighters) to prevent ground collision [8]. These are examples of adaptive systems that monitor human performance thresholds. Yet little is known about how humans view systems that are capable of sensing our physiology and altering their actions based on that understanding. The current research investigates several possible sensing capabilities and gauges operator acceptance of these methods. The current study also examines the Perfect Automation Schema (PAS) as a predictor of these attitudes.

The construct of PAS has recently gained attention as a trust antecedent. PAS has been conceptualized as a two-factor construct consisting of High Expectations (HE) and All-or-None beliefs (AoN) [16]. People with higher HE believe that automated systems are highly reliable, whereas those with AoN beliefs feel that any fault on the part of the automation means that the whole system is broken. Research has shown that the HE and AoN facets of PAS do influence trust perceptions; however, studies have been inconsistent as to whether it is HE or AoN that relates to trust perceptions [16, 20]. As such, PAS was included to examine if and how it is related to acceptance of sensing technologies.

The whole purpose of developing technologies that are capable of sensing and reacting to human states is to promote more effective teaming between humans and machines. It is clear that autonomous systems, and the human ability and preference to team with these technologies, are an important part of future research doctrine, notably within the Department of Defense (DoD) [3, 4]. However, there are many research challenges associated with the notion of human-machine teaming (HMT). Using machines as teammates also calls into question how such systems will be evaluated. Analogies can be drawn using human-human teaming as a comparison, and there is a vast literature on human-human teams that may inform the design and evaluation of human-machine teams [24]. In the military context, the Air Force is exploring the concept of an autonomous wingman. Evaluation strategies for concepts like this need to account for the natural variance that occurs through human-human interaction when humans are used as a comparison group, lest the evaluation be biased. As such, the current study examined pilot trust of different types of human wingmen.

Two factors must be considered when using human teams as a benchmark for trust in an autonomous system designed to team with humans: familiarity and experience. It is likely that the human team members used as a benchmark will be familiar with one another, and this familiarity should positively influence trust perceptions [17, 23]. Thus, trust comparisons between human teams and HMTs may be contaminated by human familiarity, at least when the comparison is poorly designed. Experience, specifically task experience, should also influence trust perceptions. Experience should be associated with greater learned trust [as noted by 8] and with greater perceived ability, which is a known trust antecedent [14]. If trust of a human teammate with considerable task experience were compared to trust of an autonomous system, the system would likely be subjected to a trust-based bias favoring the human simply on the basis of experience. Thus, the current study investigated how familiarity and experience influence trust of a human wingman, both to demarcate the impact of these factors and to show the potential bias that would inevitably pervade poorly designed human-versus-machine comparisons in the HMT domain.

Given that several sensing capabilities were examined in the present study, no explicit hypotheses are posited to suggest greater or lesser acceptance of one type over another. Rather, this study describes acceptance levels across the different types. It was expected that the HE facet of PAS would be associated with greater acceptance, and the AoN facet with less acceptance, of the sensing technologies. Finally, it was expected that trust of a human wingman would increase as familiarity and experience increased.

1.1 Participants

Seventy-four F-16 pilots served as the participants for this research. They averaged 1700 flight hours, and each was an operational pilot rather than a trainee.

1.2 Materials and Procedure

As part of a larger study on pilot trust of automated collision avoidance technologies, pilots were asked to respond to an online survey that gauged their acceptance of sensing technologies. The sensing technologies varied in focus and design intent, as noted below. The pilots were also asked to respond to three items gauging their trust of a human wingman, and they completed items measuring the Perfect Automation Schema.

1.2.1 Sensing Technologies

Using a 7-point Likert scale where 1 = strongly disagree and 7 = strongly agree, participants were asked to rate their agreement with the following items (each item was prefaced by “I would be comfortable with an automated system on my aircraft that…”): (1) monitored my heart rate, (2) monitored my brain activity, (3) assessed my task performance, (4) assessed my mental alertness, (5) changed its behavior based on an understanding of my brain activity, (6) changed its behavior based on an understanding of my task performance, (7) changed its behavior based on an understanding of my mental alertness.

1.2.2 Wingman Trust Items

Using a 7-point Likert scale where 1 = strongly disagree and 7 = strongly agree, participants were asked to rate their agreement with the following items: (1) I would trust a human wingman who was unfamiliar and inexperienced, (2) I would trust a human wingman that was unfamiliar but experienced, and (3) I would trust a human wingman who was familiar and experienced.

1.2.3 Perfect Automation Schema

There were two scales related to the Perfect Automation Schema: High Expectations (HE) and All-or-None (AoN) beliefs [16]. Using a 7-point Likert scale where 1 = strongly disagree and 7 = strongly agree, participants were asked to rate their agreement with the following items: [HE items] (1) Automated systems have 100% perfect performance, (2) Automated systems rarely make mistakes, (3) Automated systems can always be counted on to make accurate decisions, (4) Automated systems make more mistakes than people realize [reverse-coded]; [AoN items] (1) If an automated system makes an error, then it is broken, (2) If an automated system makes a mistake, then it is completely useless, (3) Only faulty automated systems provide imperfect results.
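
To make the scoring of these two facets concrete, the brief sketch below illustrates how HE and AoN scale scores could be computed from the item responses, including reverse-coding of the fourth HE item on the 7-point scale. The pandas-based approach, column names, and example values are illustrative assumptions; the original study does not report the software used for scoring.

```python
import pandas as pd

# Hypothetical wide-format responses: one row per pilot, items rated 1-7.
df = pd.DataFrame({
    "he1": [6, 5], "he2": [5, 4], "he3": [6, 5], "he4": [2, 3],  # he4 is reverse-coded
    "aon1": [2, 3], "aon2": [1, 2], "aon3": [2, 2],
})

# Reverse-code the fourth High Expectations item on a 7-point scale (1 <-> 7).
df["he4_r"] = 8 - df["he4"]

# Each facet score is the mean of its items.
df["high_expectations"] = df[["he1", "he2", "he3", "he4_r"]].mean(axis=1)
df["all_or_none"] = df[["aon1", "aon2", "aon3"]].mean(axis=1)

print(df[["high_expectations", "all_or_none"]])
```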

2 Results

As shown in Table 1, the pilots unexpectedly reported similar comfort levels for all of the sensing items. As shown in Table 2, the Perfect Automation Schema was associated with some of the sensing items. Specifically, HE was marginally associated with greater comfort with technologies that change their behavior based on one’s task performance. AoN was associated with less comfort with technologies that assess mental alertness and those that can change their behavior based on one’s physiological activity. AoN was also marginally associated with less comfort with technologies that assess task performance and technologies that change their behavior based on one’s mental alertness. As shown in Fig. 1, the wingman analyses followed the expected trend of trust increasing as familiarity and experience increased. Trust varied based on the familiarity and experience of the wingman, F(1, 73) = 256.39, p < .001; trust was lowest for unfamiliar-inexperienced wingmen and highest for familiar-experienced wingmen. The differences were reliable at each increment of familiarity/experience.

Table 1. Descriptive statistics for the sensing items
Table 2. Correlations between the sensing items and High Expectations (HE) and All-or-None (AoN) beliefs.
Fig. 1. Wingman trust by familiarity and experience.
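
The analyses summarized above could, in principle, be reproduced along the following lines. The sketch below assumes hypothetical column names and a pandas/SciPy/statsmodels toolchain rather than the software actually used in the study: it correlates each sensing-comfort item with the two PAS facets (the Table 2 analysis) and runs a repeated-measures ANOVA on the three wingman trust ratings (the analysis underlying Fig. 1).

```python
import pandas as pd
from scipy.stats import pearsonr
from statsmodels.stats.anova import AnovaRM

# Assumed data layout: one row per pilot with the seven sensing-comfort items
# (sense_1 ... sense_7), the two PAS facet scores, and the three wingman trust items.
df = pd.read_csv("pilot_survey.csv")  # hypothetical file name

# Correlations between each sensing item and the PAS facets (Table 2 analogue).
for item in [f"sense_{i}" for i in range(1, 8)]:
    for facet in ["high_expectations", "all_or_none"]:
        r, p = pearsonr(df[item], df[facet])
        print(f"{item} x {facet}: r = {r:.2f}, p = {p:.3f}")

# Repeated-measures ANOVA on wingman trust across the three familiarity/experience
# conditions (Fig. 1 analogue); reshape to long format first.
long_df = df.melt(
    id_vars="pilot_id",
    value_vars=["trust_unfam_inexp", "trust_unfam_exp", "trust_fam_exp"],
    var_name="wingman",
    value_name="trust",
)
print(AnovaRM(long_df, depvar="trust", subject="pilot_id", within=["wingman"]).fit())
```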

3 Discussion and Implications

Teaming relationships with advanced technology are complicated, and success in these relationships will be predicated on the ability of humans and machines to establish shared awareness and shared intent. Contemporary researchers have suggested that bidirectional transparency may be one method to help foster shared awareness between humans and machine partners [11]. This will require that machines ingest and integrate information about human states (physiological and psychological) into their actions. The current study examined human acceptance of general sensing technologies that varied in their input (i.e., physiological metrics such as heart rate or neurological signals, task performance, or mental alertness) and their targeted response (e.g., mere assessment versus augmentation). The results showed that operational pilots did not favor (or disfavor) one sensing type over another. Pilots evidenced moderate comfort with sensing technologies that spanned the gamut of capabilities, from sensing heart rate and neurological activity, to assessing task performance and mental alertness, to taking action based on the sensed signals. The pilots did not seem to differentiate between technologies that simply sensed and those that sensed and augmented the human based on the sensed information.

It is possible that pilots are becoming more accustomed to technologies that sense and augment them in operations. The Air Force recently fielded the Automatic Ground Collision Avoidance System (AGCAS), which senses the pilot’s flight performance and automatically recovers the aircraft when a ground collision is imminent. Despite the fully-automated nature of AGCAS, pilots have grown to trust the system and have accepted it as a useful safety technology [8]. Thus, it is possible that operators such as fighter pilots are becoming more accepting of sensing and augmentation technologies in general. However, an alternative explanation is that the technologies described in the current study were presented at too high a level to provoke resistance. In all cases, the technologies examined in this study were described without details on how the system would sense (i.e., what sorts of sensors would be used), how the data would be used, and what the implications of the augmentation would be for the pilots. Many pilots may be comfortable with commercial products that gauge physiology, for instance, yet these same pilots may resist wearing a cumbersome set of electrodes under their flight helmet due to the physical discomfort and the lack of familiarity with such systems. None of the technologies discussed in the current study included details about how they would be used in the cockpit, and this lack of detail may have promoted more innocuous perceptions of the technologies in general. The intended use of the data is also very relevant. If pilots were to think that the sensing technologies could be used punitively or might result in flight disqualification, then the technologies would certainly face greater resistance. Finally, there were no details about how the augmentation would occur. Specific details about how a system would engage in augmentation may draw greater resistance than general ideas of augmentation. While the current study sought to understand acceptance of sensing and augmentation technologies in general, the lack of details associated with the technologies may have masked potential resistance among pilots. Future research should examine acceptance of specific sensing and augmentation technologies. One such study examined pilot trust of an automated air collision avoidance system and noted several trust barriers reported by pilots for this technology [12].

In terms of trust antecedents, PAS appeared to be related to acceptance of the technologies. Specifically, AoN beliefs were related to less acceptance of the sensing and augmentation technologies. The tendency toward AoN beliefs is associated with the perception that advanced technologies are either always useful or always ineffective. The dichotomous view of technology exemplified by AoN beliefs could be a useful predictor of operator acceptance and/or rejection of novel technologies. Surprisingly, HE was not associated with acceptance of the technologies, with the exception of a marginal positive relationship with acceptance of technologies that augment based on one’s task performance. The present findings add to a growing literature on individual differences of the trustor that are associated with trust in automation [16, 20]. Variability in trust can also be based on features of the trustee.

Trust of a teammate can be based on a number of factors. The current study examined how familiarity and perceived task experience influence trust. As expected, trust of a wingman increased with greater familiarity and higher task experience. While this finding is not surprising, it raises an important point for HMTs. Test plans that seek to compare the effectiveness of an HMT may use human-human teams as an analogous comparison, and this comparison is an understandable benchmark. HMTs should be at least as effective as their human counterparts, right? Well, maybe… However, comparisons to human teammates can artificially bias evaluations in favor of the human teams if not properly designed. In particular, human-human teammates will likely enjoy a de facto trust benefit by virtue of their greater familiarity in comparison to an unfamiliar machine under test. In this case, factors such as perceived task experience and familiarity must be accounted for in the comparisons. This accounting, however, is easier said than done. Specifically, to develop task experience and familiarity, there is a need for a teaming agent that facilitates teamwork between the human and the machine by adapting to each partner’s (human or machine) preferences for interaction. At the same time, it is important that this facilitated interaction be transparent and bidirectional (i.e., comprising RtH and RoH), which can be achieved through interactive “training” in which the human and machine learn about each other while taking each partner’s preferences, strengths, and weaknesses into consideration. A seminal research effort toward building this type of teaming agent has been spearheaded for the NASA Reduced Crew Operations Program [7], but further research is needed to conceptualize and build an agent that can generalize to any application.

The current study has a number of implications. First, pilots reported moderate acceptance of general technologies that seek to sense and augment them in various ways. Researchers should continue to gauge pilot comfort with novel technologies to avoid fielding a new tool that will be rejected by operators. According to the present data, pilots did not seem bothered by higher-level automated systems that not only sense but also augment them. Care needs to be taken that future sensing and augmentation systems are not resisted for lack of trust. Engineers and researchers should consider pilot preferences for sensor placement and feasibility, as pilots may show significant resistance to sensors that are painful, distracting, or disliked. Further, the use, and the implications of use, of the technologies need to be considered. Technologies that carry the potential for punitive action, or those that could impact a pilot’s flight readiness, may be met with resistance. The technologies explored in the current study may have lacked the specific details needed to reveal these nuances.

Operator individual differences such as the AoN component of the PAS could be useful predictors of resistance to novel technologies. Engineers who seek to field new technologies need to be aware that individuals naturally vary in their acceptance of new technologies. Yet many of these individual differences can be assessed and used to identify individuals who may be more resistant to the technologies. If individuals are believed to be resistant, care must be taken to avoid overselling unreliable tools. Instead, designers should use transparency guidelines to promote shared awareness and shared intent between humans and new technologies [2, 8, 11, 13, 15].

Finally, when evaluating trust of technologies in the context of an HMT, researchers should be careful about using human teams as a comparative benchmark. Poorly designed comparisons between an unfamiliar technology serving as a teammate and teams of humans who have experience working together will introduce an unfair bias against the technology. Comparisons might instead be made in two ways: (1) with human teams whose members have no experience working together and are not familiar with one another; or (2) with human-machine teams that include an agent whose role is to facilitate teamwork by adapting to each partner’s preferences for interaction. This will create an even playing field between the HMT and the human teams.