1 Introduction

This paper focuses on human–robot cooperation in work environments from the perspective of human–robot interaction, and on the sociological insights gathered so far in experiments conducted in the Fabrication Laboratory “MTI-engAge” at the Technical University of Berlin [1]. Given the typical range of application of robots, work safety has so far mainly been treated as an issue of avoiding somatic harm. However, when it comes to social robots and to the design of settings of direct human–robot cooperation, cognitive aspects should also be taken into consideration. Until now, the issues discussed in this paper were not relevant, since most of the robots used in work environments are industrial robots used in production processes. In these typical industrial settings, safety is ensured by separating the robots from the workers through safety zones or even fences. A new application area for robots is becoming increasingly important as the development of service robots is gaining relevance in terms of reliability and versatility. However, taking part in collaborative actions with human workers raises completely new challenges and demands new approaches for addressing safety issues.

This paper emphasizes that even if the interaction with the robot is intuitive and therefore unproblematic, it will pose a danger to safety because the human’s cognitive tasks are increasingly reduced as the robot assumes the workload. If the robot fulfills the collaboration task with the human in a routinized way (unlike humans would), the human will adapt to the robot’s predictable behavior very quickly and in doing so reduce his or her attention to a quite dangerous level. To address these new challenges and develop strategies to meet them, I argue that the construction of interactive working robots can benefit from sociological insights and basic assumptions about interaction among humans. From a sociological point of view, interaction among humans is always at risk of failing. The probability of failing is usually assumed to be higher than that of succeeding. Therefore, one intriguing question is how humans ensure a smooth course of interaction. By evaluating human–human interaction, sociological insights may help to deduce which issues are relevant for ensuring safety in upcoming human–robot collaboration settings. A main aspect lies in the use of a conceptual instrument that our research group has termed a “behavioral crisis pattern”. The term “crisis” is a conceptual term taken from a very circumscribed approach within sociological theory known as “Ethnomethodology.” Ethnomethodology in particular highlights the constant state of failure within interaction and the permanent repair strategies that interacting humans adopt to cope with potentially unexpected behavior and with the slight overall contingency of the situation.

A crucial aspect for HRI in work environments is avoiding strong routinization and a strong reduction in the human’s attention. This problem can arise and pose a danger to safety especially if the robot assumes a large part of the workload. A systematically induced crisis, in the sense of confronting the human with credible contingency in the form of small amounts of unexpected behavior, could be a very effective solution. The humans cooperating with robots are not involved in a behavioral crisis in an everyday sense; the term “crisis” is used in an ethnomethodological sense to describe robot behavior that slightly surprises the human user. To achieve this goal and the intended benefit, the robot simply must act in a way that differs somewhat from the human’s expectations. When expectations are never met with such surprise after a fair amount of interaction experience, humans automatically tend toward routinization.

2 The Peculiar Relationship of Failure and Safety in Human-Robot Cooperation

Designing a safe human-robot interaction for collaborative work settings is a comparatively new challenge. From a sociological point of view, it is crucial to analyze basic elements and patterns of human-human interaction (HHI). A common theoretical approach to HHI on the micro level emphasizes symbols and their performative use in a constructivist sense. Especially in collaborative work environments, the interaction between a human and a robot is basically carried out through actions and gestures, and less through communication in a strict sense (e.g. using words or talking with each other). The interaction is therefore based upon the understanding of gestures and actions as social cues and the ability to read them correctly, but also to anticipate the expectations of the collaborating partner. The situation becomes increasingly social when it is built upon expected expectations, or at least when the human could describe it that way. Even if the robot is following a deterministic path, we can assume that the situation counts as social if the human ascribes expectations to the robot and acts on them (either way: fulfilling or willfully disappointing them). The described “as if” situation is typical for artificial intelligence research and can be traced back to one of the most influential and paradigmatic approaches within the field, the so-called “Turing Test” by Alan Turing [2, 3].

What is intriguing is Turing’s idea of designing a test based on genuinely sociological assumptions [34]. The main underlying assumption of the so-called Turing Test is the ascription of intentionality and contingency by the human to the artificial intelligence (A.I.). This concept is very familiar from most of the major sociological theories concerning the micro level of society, i.e. the interaction system. Alter and ego relate to each other assuming that they are interested in exchanging whatever could be of interest. Furthermore, each ascribes capabilities to the other without having any proof of their existence. This holds especially for the capability of being intentional and of acting on the basis of expected expectations, and even more for the presupposition that the other is also acting on this basis and therefore expects ego to act oriented by expected expectations. It is of course quite hard to believe that a human can really assume that a robot or an A.I. is consciously and intentionally expecting expectations, and furthermore that it assumes the human is expecting that. However, Turing’s ingenious move was to change the game by pointing out that what matters is not what is really going on in the mind (or, respectively, in the operations and input–output relations of the machine) but the mere assumptions that the human has or develops about the machine, i.e. the robot or the A.I., and that are used as guidelines for his or her own future acting and thinking. This way of putting the challenge was, and for many reasons still is, a completely new and different approach to the riddle of how to frame the interaction between a human and a machine. Most approaches in HRI still operate on the implicitly acknowledged assumption that the relation between a human and a robot has to be modeled by taking into consideration how the human mind functions and processes information. In doing so, they make the solipsistic mind the cornerstone of the (most probably unmanageable) solution. In opposition to this viewpoint, the Turing Test puts the emphasis on the sole ascription of qualities, even if they do not really exist: if the human acts toward the machine assuming that it is capable of expecting expectations, of intentionality, of always taking contingency into consideration as the outcome of the interaction, etc., the human’s action will define the machine as assumed. This follows the Thomas theorem, “If men define situations as real, they are real in their consequences” [35,36,37]: if the human assumes that the robot is acting towards him or her in a very similar or even precisely the same way as a human would, he or she will act towards the robot in a way that establishes the robot as human-like as possible, at least within the situational context of the interaction in which they are involved.

In line with an orientation towards expected expectations, a basic assumption is that humans engaged in typical HHI situations also expect that the interaction could fail, and that they are familiar with slight breaches of their expected expectations and of their assumed course of the interaction. These general theoretical propositions regarding HHI are put in the foreground by the approach developed by Harold Garfinkel, first coined as “Ethnomethodology” in the 1950s. Ethnomethodology follows the basic assumption that social reality is not predetermined by fixed structures but created in situ between the interacting actors, who act on the basis of methods. It is “the investigation of the rational properties of indexical expressions and other practical actions as contingent ongoing accomplishments of organized artful practices of everyday life” [4]. In other words, each action needs an interpretation to be understandable and suitable for follow-up actions between two individuals (interaction). Therefore, ethnomethodology assumes that the meaning of an action as a symbol cannot be deduced from the symbol itself, as it is always vague and preliminary and pervaded by indexical expressions (e.g. ‘this’, ‘here’), but emerges in the social context between the humans interacting with each other. To make their actions accountable, the involved humans use “methods” to understand how an action should be understood. These “ethnomethods” are reflexive by nature: “the activities, whereby members produce and manage settings of organized everyday affairs are identical with members’ procedures for making those settings ‘accountable’” [5].

Another important concept in Ethnomethodology, and for the approach presented here to designing HRI in collaborative work environments, is sequentiality. Sequentiality describes the fact that it is anything but random when, i.e. at what point in the interaction, something happens. The sequential order is to be taken seriously, and actions can be interpreted by formulating hypotheses and taking the following action as a verification or falsification (sequential interpretation).

As we have already stated in recently published papers [6,7,8], a useful instrument for working out the quality of HRI is the ethnomethodological instrument of “Breaching Experiments”, which was developed by Harold Garfinkel [9, 27, 28] to identify the strategies (ethnomethods) that humans adopt to achieve a successful interaction with each other. We adopted this approach to gain new insights related to several studies in the field of HRI that had already led to the assumption that the gaze of the robot is crucial for the assessment of the interaction. The gaze of the robot has a significant impact especially on the successful and safe execution of cooperative tasks [10,11,12]. Even if a “point of interaction” (however embodied: as a face with eyes, just a light, or a very simple emoticon like a smiley) is completely irrelevant for the mere functionality of a robot within work environments, it could nonetheless be vital for a healthy HRI design that relieves cognitive load. In typical HRI settings, the general framework of this kind of research is mostly referred to quite vaguely as the “point of interaction” [19, 33]. The so-called point of interaction can be very different things, ranging from a specific interface to a specific action, and it therefore shows a strong affinity to HCI research in general. In the research presented here, the point of interaction is assumed to be crucial, because every interaction between a social robot and a human is assumed to be focused on and oriented towards it.

To understand the key factors that define HRI as sound and superior to developments focusing on mere functionality, it is insightful to take HHI as a reference. Even if the implementation of HRI deviates from the standards of HHI, the orientation towards HHI is the key to designing a proper and human-centered configuration of HRI. To achieve these goals, a conceptual framework is needed that is based on basic sociological assumptions about the main factors characterizing interaction among humans; such a framework becomes increasingly important the more interactive the relationship between robots and humans becomes. The framework should be able to identify the crucial features of a successful interaction and, by doing so, also increase the willingness of workers to engage in HRI.

From a sociological point of view, it is fruitful to consider HHI as a foil for the design of a safe HRI in work environments. It is therefore of paramount importance to consider the social addressability involved in the above-mentioned general model of HHI. Addressability is an abstract prerequisite for interaction among humans in general, which has to be manifested and materialized for embodied forms of face-to-face communication. A point of interaction within an HRI setting becomes a social address mainly through ascription by the entity that is dealing with it within the process of interaction. The reasons leading to the ascription of addressability are culturally shaped. The point of interaction serves as a mediator for the communicative act. A so-called point of interaction within HRI research should be defined as a social address in order to be able to work with more elaborate concepts for the further understanding of gazes.

When it comes to the relevance of the point of interaction, studies in the field of HRI from the past decade as well as current ones focus on the gaze and the gazing of the robot. Research on social cues in HRI was, and in most cases still is, primarily concerned with the use of the so-called social gaze and its importance in typical HRI settings. The research conducted so far mostly concludes that social gaze has a favorable impact on nearly all the major aspects that influence whether the human assesses the interaction as a positive experience. Fischer et al. [32], for instance, summarized their research as follows:

“Our qualitative and quantitative analyses show that people in the social gaze condition are significantly more quick to engage the robot, smile significantly more often, and can better account for where the robot is looking. In addition, we find people in the social gaze condition to feel more responsible for the task performance. We conclude that social gaze in assembly scenarios fulfills floor management functions and provides an indicator for the robot’s affordance, yet that it does not influence likability, mutual interest and suspected competence of the robot.” [32, p. 204]

In a quite typical setting for HRI research design, the gaze of the robot was tested in a tutoring situation. In this case the human had to teach the robot how to perform a cooperative task together with him or her. Both Moon et al. [10] and Zheng et al. [11, 12] presented similar results using an HRI test setting similar to that of Fischer et al., but with the focus not on the cooperative task in general but on a more concrete and haptic task, namely handover situations. They were able to “provide empirical evidence that using humanlike gaze cues during human-robot handovers […] the timing and perceived quality of the handover event” [10, p. 2] can be improved. In most of the social gaze studies carried out in HRI in the recent past, the underlying models were oriented towards similar human–human interaction situations. The settings chosen to carry out the tests are modeled on the analogous human–human interaction situation. Beyond situations that are typical for work cooperation and handovers, the possibility of using the gaze as acknowledging feedback in content-based exchange was also tested [31]. The overall results are similarly positive: “We argue that a robot – when using adequate online feedback strategies – has at its disposal an important resource with which it could pro-actively shape the tutor’s presentation and help generate the input from which it would benefit most.” [31, p. 268]

The quotes and overall assessments presented above show that most recent research dedicated to the relevance and function of social gaze within HRI has come to the largely shared conclusion that considering and properly implementing social gaze is of paramount importance for a successful outcome, i.e. a satisfying as well as effective and efficient interaction between humans and robots.

The basic elements of typical HRI situations can exhibit a wide range of complexity. However, the interaction between a human and a robot can be characterized as social regardless of the degree of complexity. Human interactions, too, involve a wide range of degrees of sociality. HHI can range from routine and rule-governed settings to contingent forms and complex behaviors. The latter can best be described with a vocabulary of intentionality [13].

HHI is mainly structured by expectations. These expectations are mostly built upon the anticipation of the other’s expectations (expected expectations). Even if expectations are quite seldom disappointed in everyday contexts, a certain amount of deviation from the expected behavior is a phenomenon humans are familiar with in social interactions. This characteristic of HHI was framed by Luhmann as a situation of double contingency [14]. This description emphasizes the fact that among humans it is always possible, and quite common, to act differently. For this reason, uncertain expectations are more stable than certain ones [15].

Also interesting is the fact that humans have a strong tendency to behave socially towards robots, due to their representation in mass media as embodied and socially embedded [16], but also towards all sorts of technological objects, media and nature [17, 18]. Anthropomorphizing is an evolutionary asset of human behavior, since most of the competence to understand the environment is rooted in social interactions. This strategy is simply transferred to other areas and eventually enriched with specific knowledge regarding an artifact or objects of nature [17]. Anthropomorphization is a cornerstone of intentionality, which is closely linked to the ascription of agency to the robot [19].

As already stated, a sociologically inspired view (“breaching experiments” (Garfinkel) combined with “frame analysis” (Goffman)) for evaluating HRI can work out the specifically social aspects [20]. The breaching experiments, developed by Garfinkel to demonstrate the fragility of social order, are used to make visible the latent mechanisms humans use to coordinate themselves within their social environment [5, 29, 30]. A breach in this regard is used to reveal the strategies adopted to navigate through social contexts and create a certain state of “normality”. Conducting breaching experiments in HRI can show whether humans transfer these basic social-interaction strategies (ethnomethods) to the interaction setting with the robot. This is especially the case if the robot induces flaws in the interaction experience and the human adopts repair strategies to keep the interaction going. In these cases, the HRI setting becomes comparable to an HHI situation and is framed by the human as social [6]. To see the benefit of this approach, it is helpful to underline a few basic assumptions about how ethnomethodology understands interactions between humans (HHI). From the viewpoint of ethnomethodology, in each HHI the humans must establish anew the meaning of the symbols (words, actions, etc.) they are using in order to exchange information and/or cooperate with each other. For this reason, this approach assumes that in most interactions slight adjustments to clarify the meaning of an action, a word, etc. are inevitable. The meaning of a word, an object, etc. is never 100% clear and is always established ex post as an effect of the interacting partner’s reaction. Therefore, HHI demands and automatically leads to a certain degree of awareness on the part of the interacting persons. An HRI design that tries to replicate these behavioral patterns, which are very typical for HHI, is in this regard a very gentle way of increasing safety, since it takes a very common feature of HHI and proposes to transfer it to the design of HRI.

With these thoughts in mind, the implementation of basic social features can result in a superior and safer interaction experience even with non-humanoid robots in work environments. These goals are becoming increasingly important as the cooperation between workers and robots becomes interactive. For this new direction in the field of human–robot cooperation in work environments, it is crucial to highlight the difference between interacting with a machine and interacting with a human: a machine becomes predictable after a certain amount of time. Humans, on the one hand, have the capability to surprise one another; on the other hand, the meaning of every interaction must be deduced from a definition of the world that is established anew with each interaction. Therefore, robots should of course not manifest actual crisis behavior, but rather implement a sort of behavior that leads to a very common experience in HHI: just small perturbations, which of course should not disturb the humans’ perception of the robot’s reliability and trustworthiness.

Studies conducted with feedback systems designed to decrease the human’s workload concluded that the higher the degree of reduction, the higher the risk of reduced attention [21]. Following the above-stated assumptions about the deviations and expectation flaws that are typical in HHI, and that lead to the breaching experiments as an instrument, a behavioral pattern could be designed [Societies]. The aim of such a behavioral pattern is to trigger a crisis with regard to the human’s expectations toward the specific HRI sequence. The goal of this procedure is to create the illusion of an interaction situation that seems to be social in nature. Presumably the anthropomorphization of the robot is an important precondition for the realization of a crisis that the human ascribes as social. However, in most cases robots provoke these basic intentional ascriptions per se, which easily results in the human interaction partner assuming a certain agency [17, 19].

When considering different aspects of behavioral crisis patterns in working environments characterized by cooperation between humans and robots, it is very important to take the complexity of the setting into consideration. Following the description of the term interaction pattern by Kahn et al. [29], the robot’s behavior should vary only slightly from the action expected by the human in order to achieve a breach. The robot does not have to deviate from the expected behavior as a whole, or even from a substantial part of it. It is crucial to distinguish the different framings involved in working environments characterized by human–robot cooperation. Depending on the level of complexity, the realization of the breach can differ quite a lot. The same effect can be implemented at the different levels of complexity by following the insights of social theories focusing on interaction: ethnomethodology [5], symbolic interactionism [22], role theory [23, 24] and (in part) structural functionalism [25, 26].
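
To make the idea of a slight, bounded variation more concrete, the following is a minimal sketch of how such a pattern might be scheduled in software. It is an illustration under stated assumptions and not the implementation used in the experiments: the class name `BehavioralCrisisPattern`, the parameters `breach_probability` and `max_relative_deviation`, their default values, and the choice of a handover delay as the varied quantity are all hypothetical, since the text gives no numerical specification.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BehavioralCrisisPattern:
    """Hypothetical scheduler for small, bounded deviations from routine."""

    breach_probability: float = 0.1      # assumed share of cycles with a breach
    max_relative_deviation: float = 0.2  # assumed cap: at most 20% off nominal
    rng: random.Random = field(default_factory=random.Random)

    def next_value(self, nominal: float) -> tuple[float, bool]:
        """Return (value to execute, whether this cycle was a breach)."""
        if self.rng.random() >= self.breach_probability:
            return nominal, False  # routine cycle: behave exactly as expected
        # Breach cycle: vary slightly, never beyond the configured bound.
        factor = 1.0 + self.rng.uniform(
            -self.max_relative_deviation, self.max_relative_deviation
        )
        return nominal * factor, True


if __name__ == "__main__":
    pattern = BehavioralCrisisPattern(rng=random.Random(42))
    for cycle in range(10):
        # Assumed example quantity: handover delay in seconds.
        delay, breached = pattern.next_value(nominal=1.5)
        print(f"cycle {cycle}: delay={delay:.2f}s breach={breached}")
```

Keeping the deviation bounded and infrequent reflects the requirement that the perturbation should sustain awareness without undermining the perceived reliability of the robot. Suitable values for both parameters would have to be determined empirically for each setting and level of complexity.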

3 Conclusion

The suggestion presented here, to prevent reduced awareness in human–robot cooperation tasks by implementing a behavioral crisis pattern with the goal of inducing a crisis, could easily be misunderstood. The proposed approach aims at creating a certain amount of awareness and not alertness, or only in rare cases a soft, low-level alertness. It is very important to keep in mind how ethnomethodology describes everyday HHI situations, namely by emphasizing the strategies that are used to avoid failure in interaction (which from this point of view is much more probable than success). It is also very important to remember that this means that interactions between humans are successful not because they follow solid structures but because of the capability of humans to constantly rearrange the meaning of an action, a word, etc. and to cope with a definition of a symbol, or of the course of interaction, that constantly differs from how it was planned and/or anticipated beforehand. Ethnomethodology tends to highlight the management of the micro-crises that are usual in HHI over the rules and solid patterns that one is tempted to see in HHI. The parallel between the description of HHI provided by ethnomethodology and the approach presented above for the design of a safe HRI lies in realizing just very small perturbations rather than actual crises (in a strict sense). In this way, the perception of the reliability and trustworthiness of the robot(s) will not be destabilized. Awareness is merely kept at a very familiar level, similar to the awareness that is in place when humans engage in HHI.