
1 Overview

Play is an essential social, emotional, and intellectual developmental activity for children [3, 19], including those with Autism Spectrum Disorder (ASD). While parents, educators, and therapists of autistic children strive to create play opportunities, play is not a simple prospect for autistic children [10]. During play, we draw upon our social, verbal, and non-verbal communication skills. These are the very abilities impacted by autism [1]. Play enables children to develop the ability to see the world through someone else’s eyes, a process described by the “Theory of Mind” (ToM), a developmental psychology theory developed by Alan Leslie [11, 16]. Simon Baron-Cohen’s research in the autism field has shown that this awareness of another’s perspective can be a challenge for people with autism [2, 13]. Children with autism are also likely to be highly sensitive to sensory inputs [1]; therefore, they can experience discomfort with the dynamic environmental changes often associated with play, including loud noises and bright lights. Even the unstructured nature of playtime itself can be challenging for many autistic children, who may have difficulty managing transitions and typically prefer structure. Given the juxtaposition of the importance of play with its challenges, our research investigated the current play practices of autistic children and the opportunities for technology to facilitate play.

Remote technology has become increasingly pervasive and affordable, enabling people to communicate while not physically co-located. For example, FaceTime from Apple and Skype from Microsoft come pre-installed on many mobile and desktop devices. These video-based technologies enable a user to view and hear a remote partner. The IllumiShare, a prototype system created by Microsoft Research, adds the ability for partners to share an arbitrary surface. The IllumiShare device is a lamp with a web camera and a projector embedded next to the light bulb [14]. In this system, each user has a desktop with physical objects in front of them, overlaid with a projection of their partner’s desktop and objects. Figure 1 illustrates paired IllumiShare desktops. The girl is placing her hand on her desktop, which is projected to her partner. He is tracing her hand with his pen. The illuminated rectangle on the table establishes a space for joint interaction [5].

Fig. 1.

In the left-hand image, the girl places her hand on her IllumiShare desktop and sees a projection of her partner tracing her hand. She can also see his face and hear him via Skype. In the right-hand image, the boy traces the projection of his partner’s hand.

Fig. 2.

In the left-hand image, A’ says, “Hey, let’s thumbs up!” and extends his right hand. In the right-hand image, his thumbs up is projected to D via Skype and IllumiShare.

Fig. 3.

D returns A’’s thumbs up.

In exploring the use of IllumiShare and Skype during remote playdates, we drew upon the concept of embodied interaction [7], which addresses the ways in which interactions with and through technology are fundamentally social and embodied. Embodied approaches to design strive to make a system’s mediating interfaces recede in order to make user experiences directly tangible. Previous research exploring how neurotypical (NT) children play using IllumiShare and Skype [14] demonstrated that, when using IllumiShare, physical toys and their digital representations blend together into tangible objects to be used in play. By interacting with these digitally enabled tangible objects, the children engaged in technology-mediated social interactions. We build upon this research to study the remote play of neurodiverse (ND) pairs, each comprising one neurotypical and one neuro-atypical child. From this perspective, we investigated our research questions:

  • R1: What are current playdate practices for families with children with autism?

  • R2: How can remote technologies be used to create authentic play experiences for children with autism?

  • R3: How do children with autism interact with remote technologies during play?

To address our research questions, we interviewed parents of autistic children and found that, while they value play as important to their child’s development, they struggle to arrange face-to-face playdates due to the difficulty of finding appropriate nearby peers and busy schedules filled with medical appointments. We then conducted an exploratory user study with pairs of children, each consisting of a child with autism and an NT playmate. The children interacted via Microsoft Skype and IllumiShare while in different lab rooms. Our interaction analysis revealed that, through embodied interaction with real and virtual objects, the children created meaning in the form of mutual and parallel play, supporting important ToM skills. The children successfully negotiated variations in remote technology features. However, at times, the constraints of the technology impeded joint attention and perspective-taking. We contribute empirical findings based on parent interviews and interaction analysis of neurodiverse playmates playing remotely with tangible objects in a shared surface environment. We recommend design considerations for remote technologies to better support mutual play and ToM skills.

2 Related Work

In this section, we establish a connection between ToM and the cognitive and social skills involved in play. We highlight research on play among neurodiverse peers and recent research on technology-mediated play of children with autism.

2.1 Theory of Mind Skills and Play for Autistic Children

While no one theory holistically describes autism, we draw upon ToM to better understand stages of cognitive development that are related to play. ToM, coined by Premack and Woodruff [18] and extended by Alan Leslie [16], is “the ability to infer other people’s mental states (their thoughts, beliefs, desires, intentions, etc.), and the ability to use this information to interpret what they say, make sense of their behavior and predict what they will do next” [13]. Through conceptual perspective-taking tests designed to detect a child’s awareness that their belief about an experimental situation differs from that of others, Baron-Cohen et al. found that “autistic children as a group fail to employ a theory of mind” [2, p. 43]. Because of this ToM gap, demonstrated by research and seen in practice, researchers, parents, educators, and therapists aim to build the ToM skills of autistic children. They do so by targeting the key skills of engaging in pretend play, developing emotional literacy, and understanding information states (perspective taking and joint attention) [13, p. 2]. These skills are exercised during play. By conducting our research within the context of play, we used a real-world scenario to investigate the ways children demonstrate ToM skills.

When children play, they are actively engaged in experiences that are pleasurable, intrinsically motivated, and flexible [26]. A key motivator for paired play is enjoying the camaraderie of playmates. Wolfberg et al. [25] found that integrated play groups are an effective forum for both neurodiverse and neurotypical children to develop relationships. Inclusive play provides autistic children the opportunity to develop social communication and reciprocity, while neurotypical children gain knowledge and skills to be flexible and responsive. Based on the benefits of integrated play groups, our study examines the verbal and non-verbal communication and reciprocal exchanges that arise during the free play of autistic children with their neurotypical playmates. We draw from Wolfberg’s model of inclusive play to analyze our user study.

2.2 Technology-Mediated Play for Neurodiverse Children

Research on the play practices of autistic children focuses on the role of digital games in building social relationships and cognitive skills. Boyd et al. examined the use of iPad games, finding that games support building friendships through features that enable children to fluidly join an activity and to coordinate their actions [4]. In research on how inclusive pairs used an iPad picture-taking application, Sobel et al. investigated the role of technology in enforcing cooperation and prompting for interactions [22]. On both dimensions, they had mixed results that depended on the naturally occurring dynamics of the children. This points to the need for flexible game environments that can be adjusted based on the children’s goals and dynamics. Both Boyd et al. and Sobel et al. reported that the children bonded over being able to comment on and share their gaming experiences.

Specifically exploring embodied interactions, Farr et al. researched the social interactions of autistic children playing with tangible objects on an Augmented Knight’s Castle play set [9]. The researchers conducted interaction analysis and coded for play modes (e.g., disengaged; co-operative-social play) plus object-related actions (solitary versus parallel sensori-motor play). They found that the children who were allowed to extend the functionality of the objects (by activating pre-recorded audio) were motivated by the immediate feedback of the objects and sought the attention of others to share the effects. Our research also employs interaction analysis and uses a similar play coding scheme. However, our research is distinct in that we studied remote play in which the role of technology was to convey verbal and non-verbal interactions. In our case, the use of objects was intrinsic to the children’s embodied interactions.

A popular platform for remote play is the Minecraft online gaming environment. Some Minecraft servers, such as AutCraft, are dedicated to players with autism and related neurodiverse conditions. Research on these dedicated servers has found that the players engage in social learning, problem solving, and community building through their use of the game’s communication affordances (chat windows, avatars, and play activity) [20, 27]. Our research builds upon this examination of remote play by exploring the set of communication affordances supported in remote technology (audio, video, and a shared surface).

Overall, this body of knowledge informs our research with its emphasis on inclusive play experiences and exploring the role of technology to support play rather than constrict it. We use the methods of interaction analysis and video coding to identify evidence of ToM skills.

3 Method

We conducted semi-structured interviews with four mothers of children with Autism Spectrum Disorder (ASD) to gain an understanding of current playdate practices. They were recruited via a local autism therapy clinic and a local company’s email list of self-selected employees with an interest in autism. In each hour-long interview, we inquired about their child’s autism diagnosis and characteristics; current school and therapy strategies; current play goals, logistics, positive experiences, and challenges; and the use (if any) of technology for remote communication with friends and family. The children were all male, aged 3.5–11. Their parents identified them as having autism (with one child also having low cognition and a visual impairment). Two of the children attended mainstream classes in public school. One child attended a class composed of 50% ASD and 50% NT children. Another child attended a special education class in public school and spent 1.5 h per day in a general education class. To analyze the interview results, two researchers used the affinity diagramming technique to generate themes.

Next, we conducted an exploratory user study with four pairs of children, each consisting of an autistic child and a neurotypical playmate. We recruited the children from the parents who participated in our interviews and from a local autism community group. All of the autistic children (4 male, ages 9–13) were confirmed by their parents as being on the autism spectrum and were verbal. None of the playmates (3 male, 1 female, ages 8–13) were identified by their parents as being on the autism spectrum. In this report, we changed the children’s initials and indicate which children have autism by appending an apostrophe to their initials. The University of Washington Human Subjects ethics board approved this research.

During each session, the pair of children first interacted face-to-face to help researchers establish a baseline for their interactions. The children were instructed to spend 5 min playing together with any of the toys available on a nearby table (cards, cars, a mermaid doll, and a question-and-answer style book). The NT child was then escorted to another lab room. Each room had similar toys and was equipped with a Microsoft Surface tablet running Skype, plus an IllumiShare device. As shown in Table 1, the pairs cycled through three conditions in random order: (1) Skype audio and video, (2) IllumiShare plus Skype audio and video, and (3) IllumiShare plus Skype audio only. Each pair experienced the “IllumiShare plus Skype audio and video” condition twice, since the IllumiShare was novel technology that none of the children had previously experienced. (The children had previous experience with remote video applications.) They played in each condition for no more than 10 min, with a researcher entering the room to adjust the equipment for each condition. Each session ended with a separate debrief of each child, during which they were asked about their favorite play activity and whether anything about the play session was bothersome.

Table 1. The pairs cycled through 3 conditions. “IllumiShare + Skype” was done twice. Audio and video were on for the “Skype” condition. Only audio was on for the “Skype-Audio only” condition.

Our analysis of video recordings followed Goodwin’s protocols of interaction analysis, a method often used in the systematic investigation of mundane activities ranging from families at the dinner table [8] to interactions of adults with communication disorders [23]. By conducting a detailed interaction analysis, Goodwin describes an “embodied participation framework” comprised of body positioning, artifacts, gestures, gazes, and linguistic markers. Order matters when interpreting gestures, language, and structure in the environment, making it important to develop a coding technique that accounts not just for isolated instances but for sequences of actions. In our analysis, we closely examined specific interactions, prioritizing depth and richness of data over quantity of participants. We followed Erickson’s [8] inductive procedure, in which researchers iteratively view the video corpus to identify major events, transitions, and themes. The first analytic pass occurred during the sessions, as two researchers coded for the Play Categories listed in the first column of Table 2: disengagement, parallel play, mutual play, and negotiation. After all the sessions were completed, we discussed the major events and themes we had coded while observing the sessions and then wrote memos defining, clarifying, and refining observations. The refined Social Play codes and definitions are the result of these analytic activities and are based directly on Wolfberg et al.’s [25] research on integrated play groups. With both empirical and theoretical support for this inductive schema, one of the researchers coded the video corpus to these more refined Social Play codes. For the Isolate and Onlooker-orientation codes, we observed instances where both children showed the behavior, so we appended ASD or NT to the code as appropriate. Finally, we transcribed key excerpts for salient talk, singing, and paralinguistic elements.

Table 2. Codes and definitions for our interaction analysis.

4 Results

4.1 Families Face Challenging Play Logistics

During our interviews with parents, we found that families struggled to incorporate playdates into busy schedules of school, doctor’s appointments, and regular, intense therapy in the following areas: physical, occupational, speech, social skills, Applied Behavior Analysis, and Relationship Development Intervention. The parents value the benefits of playdates (i.e., learning typical social skills; negotiating), noting that mutual play with appropriate-level peers is critical for their children. The parents described difficulties finding playmates with complementary schedules, play skills, and interests. All of the families had previously used Skype or FaceTime on their desktop computers or mobile devices. Two families regularly used FaceTime with family members, for example, a traveling parent or grandparents who live far away. Based on these experiences, parents expressed a primary concern that conversations using these tools can easily become stifled unless the other party is entertaining and engaging.

Additionally, parents expressed worries about the social dynamics presented by their children’s play environments. To these parents of ASD children, positive social engagement in face-to-face play means that their child is very animated, participating in the activity, and laughing at appropriate times. Importantly, parents did not consider eye contact to be a critical component of play. Instead, they valued parallel play, such as sitting side-by-side while playing video games independently. For example, P4’s mother speculated that not being forced to make eye contact with his friend allowed her son to feel more comfortable. Other parents noted similar difficulties their children experienced during face-to-face play due to tendencies toward rigidity and difficulty conforming to play activities. Some parents noted that their children sometimes have problems taking turns and sharing toys. The parents said that, due to these challenges, play can quickly turn into arguments and disruptive behavior (e.g., throwing things). Difficult social interactions, changes in routines, and being in unfamiliar environments can wear out the children, increasing the likelihood that they will need to perform self-soothing behaviors such as hand flapping, putting their head down, hitting their face with a stuffed animal, and, with a parent’s guidance, taking deep breaths.

4.2 Mutual Play Through Embodied Interaction

Taken together, a common focus on each other and materials, along with setting and executing common goals, results in mutual play [25]. In our study, the children’s mutual play began during the face-to-face baseline session, in which they discussed what they thought they were going to do and played with the toys or spun in the chairs. In some pairs, one child was more vocal than the other; in other pairs, both children were equally vocal. When the children were placed in separate rooms, researchers assessed typical markers of conversational alignment [12], such as tone of voice, body language, and amount of dialog. Based on these markers, the children exhibited play dynamics that were consistent with their face-to-face behaviors. Throughout the sessions, the children jointly established their common focus and goals through technology-mediated verbal exchanges and non-verbal communication using their bodies and objects. For example, their mutual play was based on negotiating how to co-create using physical and virtual artifacts (e.g., taking turns tracing each other’s autographs), cooperating to use material accessible to only one child (e.g., a book; digital media from a tablet), and planning how to combine jointly accessible artifacts to create a shared-surface game space (e.g., a card game).

Common Focus.

We observed instances of all aspects of common focus, as defined in our codebook: joint action, mutual imitation, sharing emotional expression, sharing materials, taking turns, giving and receiving assistance, and directives. Upon starting their remote play, all pairs began establishing a baseline of what their partner could see and hear. The children jointly made sense of their environment through this inquisitive form of play. As illustrated in Excerpt 1, we can look at the interactions between A’ and his younger sister, D, whom their mom described as often taking a “nurturing, mothering” role with A’.

Excerpt 1, First Interaction^a

1 D: A’, are you drawing on a separate piece of paper?
2 A’: I’m drawing autographs. ((holding pen))
3 D: Wait, where are you…where are you drawing it on? Are you drawing on the paper that’s taped? ((feels the edges of her paper))
4 A’: (.5 s) N::no.
5 D: Okay
6 A’: (1 s) Hey, let’s thumbs up! ((reaching right hand out toward Skype, into IllumiShare desktop, with his thumb up)) (Fig. 2)
7 D: Thumbs up! ((forms thumbs up with her hands, extending toward Skype and over IllumiShare)) (Fig. 3)

^a We used the Modified Jefferson transcription convention [24]. Turns at talk are numbered for identified speakers. Continuous speech at turn boundaries is shown with = equal signs, while the onset of [overlapping talk is shown with left brackets. EMPHATIC talk is shown in caps, and elong:::ated enunciation is shown with repeated colons. ((Activity descriptions)) appear within double parentheses and in italics, and >comparatively quick speech< appears in angle brackets. We extended the convention in two ways: (1) the child with autism is indicated with an apostrophe, and (2) {{technology descriptions}} appear within double curly braces and in italics.

Joint Attention.

Attending to a person or object is a behavior that tends to be difficult for children with autism. Joint attention was evident when the pair directed their actions toward a common object or activity. The children demonstrated that they were attending to the same object or activity through verbal communication, gestures, moving objects, or drawing pen marks. When a child looked toward the Skype video, we made the general observation that they were looking for the presence of their peer or for visual communication. When a child was looking at another object, such as a book, instead of at the IllumiShare desktop, we observed that they were at least partially engaged by that object. In the case of Pair 3, H’ and K, their sociotechnical interactions were enriched by their personal tablets, which they brought to the study. According to H’’s mother, they primarily played video games when together. During the face-to-face portion of the session, K was preoccupied with his tablet, watching YouTube videos of Minecraft gameplay. H’ started playing with cars. When IllumiShare + Skype was introduced, the two expressed excitement with comments like “wow” and “this is so cool.” H’ spent the majority of the time drawing a cat character, even using his iPad to look up the character as a reference. In one interaction, H’ gave explicit directions to K, asking for visual access to his marks (Excerpt 2). This type of interaction through drawing has been established as an important form of communicative practice [21].

Excerpt 2, Placing Marks

1 H’: Draw in the middle of the paper so I can see what you’re drawing.
2 K: (1 s) I can totally see your drawing too.

In another example, F’ and G spent almost the entire session playing cards, actively engaged in the game and, as conditions changed, expressing delight and talking about what to adjust in their play. In their first condition (Skype), F’ directed play, asking if G could see his cards and telling G to hold up his cards. Sometimes, a physical action was adequate to convey meaning, with no words necessary, as shown by G holding up his cards in Excerpt 3.

Excerpt 3, Verbal and Visual Checks

1 F’: ((holds up cards to the camera)) Show your cards to the other person.
2 G: ((holds up his cards))
3 F’: Currently you have 14 chips, is that correct?
4 G: Yeah.

During mutual play, they directed joint attention toward the artifacts and, when they oriented themselves toward the camera, toward each other.

Taking Turns.

Turn taking is another behavior that tends to pose challenges for children with autism. People can take turns explicitly and asynchronously (such as making a move during chess) or more informally (such as dressing a doll together). In our study, we observed both styles of turn-taking. As an example of asynchronous turn-taking, A’ and D began their session with A’ writing Disney princess autographs, while D followed along, tracing what A’ drew. The IllumiShare + Skype-Audio condition enabled their mutual play of writing and tracing. They negotiated their Disney signature tracing activity with the use of words, drawing, and observing (Excerpt 4). When the IllumiShare was turned off and they had just Skype, they continued drawing autographs in a more parallel-play fashion. They then switched to a “Would you rather?” verbal exchange for a few minutes.

Excerpt 4, Explicit Directions

1 A’: How do you like that? ((After drawing the signatures across the page.))
2 D: Keep it right there, I’m gonna trace.
3 A’: ((watches, waits for her to trace))
4 D: Who’s next?

For Pair 4, N’ and P, a major play activity was choreographing the singing of a song about the U.S. states. In the IllumiShare condition, N’ spontaneously began writing down the state names, and P, recognizing the order of the states from a song they learned in school, began singing them as N’ wrote. N’ and P switched roles as writer and singer. When the condition changed to Skype, N’ did not realize that the IllumiShare was now off. He put his finger on one of the lyrics and asked P to sing, but P said he could not see where N’ was pointing. As shown in Fig. 4, N’ then decided to hold up the paper to the Skype camera so P could see where he was pointing. It took N’ a while to find a successful way to show P where to start singing. This example demonstrates a technological barrier to the children fluidly taking turns in this condition.

Fig. 4.

N’ struggles to point to the correct word while simultaneously showing it to P.

Fig. 5.

A’ is writing “ocelot” and talking to D, who is engrossed in her book.

Common Goal.

During mutual play, children establish common goals about their activities and roles. Mutual goal setting involves negotiation and compromise, requiring a child to express their own views and consider the views of others. We observed negotiation as play began, at points within play when roles or activities were clarified or changed, and as technology conditions changed. The sociotechnical nature of the study required the children to negotiate how to share materials (e.g., playing with cards shared between their physical and virtual spaces) and the IllumiShare desktop. When technology conditions changed, the children needed to accommodate the changed video and audio modes of communication. They re-established how the play activity would proceed and negotiated how they would share materials. The majority of children expressed delight when Skype video was back on and they could see each other. For instance, K commented “I can see your face!”, and both children waved at each other. When Skype video was turned off, the children often adapted by being more vocal. H’ and his friend K had to verbally check in with each other to see whether the other was active and paying attention during their play.

When F’ and his friend G were in the Skype-only condition, they struggled to play a card game without visual access to the desktop. F’ tried to show G his cards on the table by physically re-orienting the tablet running Skype, but the tablet fell on the table. When the IllumiShare came on, they quickly adjusted their actions to leverage the visual access, which eliminated the need to hold up their cards to Skype, thus freeing up their hands (Excerpt 5). The pair used explicit statements and questions to establish a mental model of their own visual perspective and their peer’s, which is a key step in building ToM.

Excerpt 5, Checking In

1 F’: I have a question, can you see these cards on the ground? o::oh::h!
2 G: I can see the cards.
3 F’: Oh, you can actually see them! That is so awesome! Can you see me?

In Excerpt 6, F’ and G figured out how to strategize their play when they lost video and could rely only on IllumiShare + Skype-Audio. (Note that F’ also broke the fourth wall of the research setting, instructing the researchers to “turn the video back on” and contemplating the researchers’ intentions.) They used a verbal exchange, informed by knowledge of what the other child could see, to re-affirm their common goal.

Excerpt 6, Figuring Things Out

{{Skype video is turned off.}}
1 F’: Hello?
2 G: Huh? Our video is turned off?
3 F’: We need to think of a new way to play…that’s what they wanted us to do….Turn the video back on! Just kidding! <Spoken into the room>
I’m thinking…Oh wait, you can still see the table! So we can still play the same game, but with showing them on the table.

4.3 Nature of Parallel Play and Disengagement

We observed multiple instances of parallel play in the form of onlooker-orientation and parallel-proximity play. We based our coding on evidence of independence, such as (1) attention on different types of objects, (2) absence of dialog or gestures about play goals and activities, and (3) uncoordinated drawing in separate areas of the IllumiShare desktop. For example, A’ and D exhibited parallel play when they drew independently without speaking or coordinating their drawings. As illustrated in Excerpt 7, at one point A’ wanted D to comment on his drawings, but D did not respond. Upon further prompting, D gave a minimal, onlooker-type response, even though she could not actually see his drawing. This example demonstrates the difficulty of sustaining mutual play, especially when the technology was not flexible enough to show A’ what D was actually engaged in.

Excerpt 7, Missed Connection

1 A’: (8 s) ((A’ drawing map of animal park, within IllumiShare frame. D is reading book, out of IllumiShare frame.)) ((A’ folds up his paper with map. Paper ruffling.))
2 A’: Hey, D. ((looking at Skype))
3 D: (1 s) Haha! ((A’ glances down and back at Skype))
4 D: (.5 s) What?
5 A’: =H::here’s a map of Sudden Defiance Animal Map ((unfolding map to reveal it. Holds it vertically to Skype. D only can see A’’s hand and the top edges of A’’s paper. D moving her paper off IllumiShare frame with one hand, keeping her place in book with other hand.))
6 D: Oh, that’s cool. ((Puts down pen, glances at Skype, looks down at book. A’ looking at Skype, which shows user icons.))
7 A’: What’s your favorite so far? ((Leaning his map back so he can see it, which is still out of the IllumiShare frame))
8 D: I like the…um::m…((glances at IllumiShare frame, which does not show the map, then back at book))…the ocelots. (Fig. 5)
9 A’: Usually at my zoo, they’ll be two ocelots. ((holding up two fingers))
10 D: Yeah. ((eyes remain down at book))

Disengaged.

On rare occasions, the children shifted from parallel play to disengagement. One striking example is Pair 3, H’ and K. In their face-to-face condition, H’ and K sat far away from each other, with H’ wandering around the room and K playing on his tablet with his headphones in. This dynamic was occasionally replicated when they were in separate lab rooms. K was intent on streaming music and video from his tablet. The audio streamed automatically to H’ via Skype audio. H’ focused mainly on drawing his cat characters. At one point, K wanted to show H’ a video, so he put his tablet down on the IllumiShare desktop. Unfortunately, H’ said he could not see it, due to the screen glare caused by the IllumiShare lamp, so he resumed drawing. This demonstrates that even when the children attempted to establish a common focus, the technology barrier and the lack of a common goal prevented mutual play.

5 Discussion

To summarize the nature of the children’s remote play, they used fluid, intrinsically motivated embodied interactions while engaging in both mutual and parallel play. Their remote play dynamics were congruent with their baseline face-to-face play dynamics. The children exhibited the ability to focus jointly on a common goal. They exhibited ToM skills such as taking turns and predicting actions when they sang, traced autographs, and played cards. The mutual and parallel play categories established by previous work on face-to-face play extended effectively to remote play. However, the remote nature of their play had its challenges, which we surfaced through interaction analysis. The children needed to adapt their play to account for the changing capabilities and constraints of the remote technology. The new audio and visual capabilities were not always readily apparent to the children. In these cases, their interactions became strained when a child continued playing as if the original communication channel (e.g., Skype video) was still available. The children also had difficulty pulling their peer back into previously established goals if their peer had independently shifted to a new activity. Although disappointing to the child when this happened, disagreements over play activities are a natural aspect of free play. Therefore, we conclude that the children exhibited authentic play while not co-located.

There were some unique qualities of being remote that facilitated play and strengthened the children’s ToM skills. Once in different rooms, the children were motivated to establish a connection with their peer. They articulated their sense of self by explaining their sensory capabilities (what they could hear, see, and do) and compared that to their peer’s experience. The children made discoveries about how they could share materials and adjust the rules of games to play collaboratively. The children engaged in all types of play (mutual and parallel), regardless of the specific remote technology. We found it counter-intuitive that the condition with the richest modes of interaction, IllumiShare + Skype, was not immune to the least desirable mode of interaction, isolated play, while the condition with the least-rich communication channel, Skype, was sufficient for mutual play. This achievement means that children, with today’s publicly available technology, can host authentic playdates.

Another implication of effective technology-mediated play is that the children’s interaction styles were successful over communication channels that are more restricted than face-to-face communication. In our case, and in other research such as Minecraft servers geared toward autism communities [20, 27], there is synergy between autistic children’s communication preferences and technology-mediated communication. A remote technology channel distills social interactions down to dialog (aloud or in chat), body language (projected or embodied in an avatar), and manipulation of objects (physical and digital). Aspects of social interactions that can be difficult for people with autism are limited (e.g., eye contact) or eliminated altogether (e.g., physical touch). Technology-mediated communication also facilitates asynchronous communication, which can be a preference for people with autism [17], thus loosening the expectations of managing synchronous, unpredictable in-person communication. Interestingly, adding the IllumiShare desktop, which enriches the communication channel, also benefitted the children. They could more easily engage in familiar, table-based activities and did not have to engage in extra work to share the materials and their actions. This points to an element of authentic embodied interaction—the value of easily communicating and sharing embodied actions without requiring an extra step to explicitly share the interaction with each other.

Although our research yielded rich qualitative data, we had a limited number of interview and study participants. By conducting the study in a lab, the children were asked to play in an unknown environment with unknown researchers, which can be disorienting for children with autism. Future studies could be conducted in a setting familiar to the child, such as home or school, and perhaps over a longer period of time to evaluate the effect of novel technology. Future research could also explicitly explore symbolic play with objects, which is a component of Wolfberg’s play model and could lead to deeper insights on the role of tangible objects [25].

6 Design Considerations

Based on our exploratory research and given the limitations of our study, we offer the following design considerations for remote technology aimed at enhancing mutual play and ToM skills. We direct our design considerations toward (1) reinforcing embodied interactions to scaffold mutual play and (2) using technology to strengthen shared experiences.

6.1 Reinforcing Embodied Interactions to Facilitate Mutual Play

Our first design direction aims to help children successfully establish mutual goals and focus, which requires navigating shifts in play. The remote technology environment should amplify the children’s embodied interactions to steer the pair toward mutual play. Shifts between mutual and parallel play tended to occur when a child became interested in another activity. These transitions between activities were often accompanied by (1) new objects being used, or (2) verbal and non-verbal communication from a child without reciprocal communication from their peer. Interestingly, the child who left the original activity for a new one was often the neurotypical child, perhaps because the autistic child was often still immersed in the original activity. This points to a common trait among people with autism: the ability to deeply pursue interests [1] and engage in activities that are personally motivating [6]. Regardless of who tried to initiate mutual play with a new activity, repeated unsuccessful attempts to engage resulted in both children becoming absorbed in their own activities.

At these vulnerable points, it would be helpful for a child to gain the attention of their peer in a more purposeful and concrete manner. This could be accomplished by making the system aware of the children’s gestural attempts to engage each other. When the pair’s non-verbal, verbal, and object movements indicate a vulnerable point of play, the system could initiate a visual cue (e.g., changing the color hue of the IllumiShare light) or an audio cue (e.g., a beep) to draw the pair’s joint attention. An alternative approach could be to provide the children with controls (e.g., a button on the IllumiShare surface), a “smart” tangible object (one with knowledge of its location, user, and state of play), or a gesture vocabulary, so they can activate sounds and visual cues in the shared environment.
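To make this idea concrete, the sketch below illustrates one way such a detector might work: it treats an unanswered bid for attention (a gesture, utterance, or offered object that receives no reciprocal action within a short window) as a vulnerable point and signals that a joint-attention cue should fire. This is purely an illustrative sketch under our own assumptions; the event names, the response window, and the EngagementMonitor class are hypothetical constructs and not part of IllumiShare or Skype.

from dataclasses import dataclass
from typing import Optional

# Hypothetical event categories a gesture/speech tracker might emit; neither
# IllumiShare nor Skype exposes such an API, so these names are illustrative only.
BID_EVENTS = {"gesture_toward_partner", "utterance_to_partner", "object_offered"}
RESPONSE_EVENTS = {"gesture_toward_partner", "utterance_to_partner", "gaze_at_shared_surface"}

@dataclass
class Event:
    child: str        # e.g., "A" or "B"
    kind: str         # one of the event categories above
    timestamp: float  # seconds since the session started

class EngagementMonitor:
    """Flags a vulnerable point of play: one child bids for the partner's
    attention and receives no reciprocal action within a time window."""

    def __init__(self, response_window_s: float = 8.0):
        self.response_window_s = response_window_s
        self.pending_bid: Optional[Event] = None

    def observe(self, event: Event) -> bool:
        """Feed one event; returns True when a joint-attention cue should fire
        (e.g., shifting the IllumiShare lamp hue or playing a soft chime)."""
        if self.pending_bid is not None:
            if event.child != self.pending_bid.child and event.kind in RESPONSE_EVENTS:
                self.pending_bid = None   # bid was reciprocated in time
                return False
            if event.timestamp - self.pending_bid.timestamp > self.response_window_s:
                self.pending_bid = None   # bid went unanswered: cue the pair
                return True
        if event.kind in BID_EVENTS and self.pending_bid is None:
            self.pending_bid = event      # start tracking a new bid
        return False

# Example: child A bids for attention at t = 0 s; no response has arrived by t = 10 s.
monitor = EngagementMonitor()
monitor.observe(Event("A", "gesture_toward_partner", 0.0))
print(monitor.observe(Event("A", "object_offered", 10.0)))  # True -> trigger a cue

In practice, what counts as a bid or a response, and how long to wait, would need to be tuned with the children and their caregivers rather than fixed in advance.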

To further promote joint attention when the pair is veering off to different activities within the IllumiShare space, there could be features that nudge the pair toward a common area of the IllumiShare desktop. For instance, if one child is focusing on one corner (e.g., drawing animals) and their peer is focusing on another corner (e.g., writing), the IllumiShare could display visual cues to bring their attention together, perhaps using a trail of lights. As with any auto-detection software, care would need to be taken to correctly distinguish between desired (mutual) and less desired (parallel-proximity) activities.
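As a sketch of how such a nudge might be computed, the following assumes the system can track where each child’s recent pen marks fall on the shared desktop (a capability we are assuming, not one IllumiShare provides). When the two children’s activity centroids drift far apart, it proposes a midpoint where a trail of lights could be rendered; the function and threshold names are our own hypothetical choices.

import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) position on the shared desktop, in assumed units (e.g., cm)

def centroid(points: List[Point]) -> Point:
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def attention_nudge(marks_child_a: List[Point],
                    marks_child_b: List[Point],
                    divergence_threshold: float = 20.0) -> Optional[Point]:
    """If the children's recent pen activity is concentrated in distant regions
    of the desktop, return a midpoint where a visual cue (e.g., a trail of
    lights) could draw their attention together; otherwise return None."""
    a = centroid(marks_child_a)
    b = centroid(marks_child_b)
    if math.dist(a, b) < divergence_threshold:
        return None  # already working near each other; no nudge needed
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

# Example: one child drawing in the lower-left corner, the other writing upper-right.
print(attention_nudge([(5.0, 5.0), (6.0, 7.0)], [(50.0, 40.0), (52.0, 43.0)]))

As the paragraph above cautions, distinguishing mutual from parallel-proximity activity is the hard part; distance between marks is only one weak signal among many.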

6.2 Strengthening Relationships Through Shared Experiences

Our second design direction is toward strengthening the bond between the children. As discussed in the Related Work section, children valued being able to comment on their coordinated experiences and share them with others. In our study, the children verbally and physically checked in with each other (e.g., giving each other thumbs up), and often discussed what they were able to see, hear, and do. By taking a step back and discussing their experience, they were establishing a common understanding and frame of reference. The technology could do more to create a sense of joint experience by providing mechanisms for capturing it. The children could take screen captures of each other on Skype and of their IllumiShare desktops. When they experience something funny or confusing, they could request that the system retroactively capture the last two minutes of the experience, for example. They could then review their experience from their own vantage point or from their peer’s. Along with being a compelling way for a child to be immersed in the sensory experience of their peer, this flipped vantage point could help the pair resolve confusion about their play strategies. Ultimately, the children may choose to share their vignettes with other peers and trusted adults, deepening others’ understanding of the children’s play experiences.
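One plausible mechanism for the retroactive capture described above is a rolling buffer that continuously retains the most recent couple of minutes of each child’s stream and discards anything older. The sketch below shows the idea with placeholder frame data and hypothetical child identifiers; none of this reflects an actual Skype or IllumiShare capability.

import time
from collections import deque
from typing import List, Optional

class RetroactiveCapture:
    """Keeps a rolling window of recent frames so that, when a child asks to
    'capture that!', the preceding couple of minutes can be saved and replayed
    from either child's vantage point."""

    def __init__(self, window_seconds: float = 120.0):
        self.window_seconds = window_seconds
        self._frames = deque()  # (timestamp, vantage, frame) tuples

    def push(self, vantage: str, frame: bytes, timestamp: Optional[float] = None) -> None:
        ts = time.time() if timestamp is None else timestamp
        self._frames.append((ts, vantage, frame))
        # Drop frames that have aged out of the rolling window.
        while self._frames and ts - self._frames[0][0] > self.window_seconds:
            self._frames.popleft()

    def capture(self, vantage: str) -> List[bytes]:
        """Return the buffered frames from one child's point of view."""
        return [frame for (_, v, frame) in self._frames if v == vantage]

# Example with placeholder frames and hypothetical child identifiers "H" and "K".
buffer = RetroactiveCapture(window_seconds=120.0)
buffer.push("H", b"frame-1", timestamp=0.0)
buffer.push("K", b"frame-2", timestamp=1.0)
buffer.push("H", b"frame-3", timestamp=200.0)   # earlier frames fall out of the window
print(len(buffer.capture("H")), len(buffer.capture("K")))  # -> 1 0

Reviewing from the peer’s vantage point would simply mean replaying the frames captured on the other child’s side of the connection.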

7 Conclusion

Motivated by the barriers that children with autism face in participating in inclusive play, we examined the opportunity to conduct playdates over remote technologies. Through video conferencing and a shared surface, the children in our study carried out embodied interactions, leveraging their non-verbal communication skills to convey their intentions to, and interact with, their peer. This naturalistic style of play contributed to the authentic, socially appropriate play experiences that our parent interviewees desire for their children. Although it is not possible to observe mental states directly, we observed behaviors that point to mental states supporting a child’s ToM. We found that there is no direct correlation between the type of play (mutual versus parallel) and the specific remote technology. The design of remote technology can amplify children’s embodied interaction and, therefore, scaffold mutual play. Remote play technology can facilitate reflecting on and sharing play vignettes, toward the goal of strengthening social bonds. Although remote technologies provide a less-rich communication environment than face-to-face interaction, their communication affordances and dynamics are adequate for inclusive play, and in some ways may be better suited to children on the autism spectrum.