Researchers have often pointed to a fascinating ability for social and cultural learning in the human species (Bandura, 1977, 1986; Gould, 1979; Morgan, Rendell, Ehn, Hoppitt, & Laland, 2012; Nehaniv & Dautenhahn, 2007), which substantially exceeds other animals’ ability to learn through observation (e.g., Herrmann, Call, Hernandez-Lloreda, Hare, & Tomasello, 2007; Thorndike, 1911). It has been argued that humans’ unique ability for social learning is one of the most important factors in the evolutionary success of Homo sapiens (Boyd, Richerson, & Henrich, 2011; Habermas, 1985; Richerson & Boyd, 2005). Importantly, social learning is not restricted to adult members of human societies (e.g., Cross, Kraemer, Hamilton, Kelley, & Grafton, 2009) but is already present in human children and infants (e.g., Bandura, 1977; Hewlett, Fouts, Boyette, & Hewlett, 2011). That is, knowledge is transferred nonverbally even before language has fully developed: Infants already acquire novel behaviors by observing and imitating the actions of others. Accordingly, it has been argued that imitation plays a crucial role in the acquisition of cultural knowledge, such as language and tool use, and the enculturation of the child (e.g., Hewlett et al., 2011; Shea, 2009; Vygotsky, 1978). These claims are supported by numerous studies demonstrating imitative learning in infants (for reviews, see Barr, 2010; Elsner, 2007; Nadel & Butterworth, 1999; Rovee-Collier, 1999).

Historical and theoretical perspectives on infant imitation

Notwithstanding the theoretical considerations about the relevance of imitation, the cognitive mechanisms subserving imitative learning have remained a topic of avid discussion. Indeed, for almost one century, researchers have been debating how imitation in infancy is possible (e.g., Baldwin, 1906; Bates, 1979; Guillaume, 1925; Parton, 1976; Piaget, 1962). Already in the first half of the 20th century, developmental psychologists discussed the question of how human infants are able to imitate others’ actions. Guillaume (1925) suggested that a perceived action might serve as a signal that induces the same action in the infant, since perceived and executed actions are related to each other by means of associative learning. This happens when the infant executes an action and perceives the visual consequences of his or her own action. Piaget (1962), however, proposed that the perceived action serves as an index that allows the infant to assimilate the other’s action to his or her own (invisible) action scheme. He labeled this phenomenon of immediate mimicry “circular reactions.” Yet, although he described this phenomenon with this novel concept, he did not provide a fully satisfactory account of the underlying mechanisms. In other words, it has been argued that Piaget’s (1962) account merely describes findings without explaining them (e.g., Brainerd, 1978). Additionally, according to Piaget (1962), true (deferred) imitation can take place only after children have entered the preoperational phase at age 2 and have developed internal representations. Recent findings are suggestive of an earlier onset of imitative abilities, thus calling into question Piaget’s (1962) model.

Continuing this discussion, Meltzoff and colleagues (Meltzoff, 1990; Meltzoff & Moore, 1989) suggested that infants are able to relate perceived and executed actions to each other by means of an inborn active intermodal mapping process. They suggested an inherited ability to detect the equivalences between perceived actions of others and (to be) performed movements by the self. This comparison of others with the self leads infants to conceive of other humans as being “like me,” thus forming the basis of a child’s developing theory of mind (Meltzoff, 2007). It should be appreciated that this model has been extremely fruitful for developmental and cognitive science, since it has generated a considerable amount of empirical research, which has provided some support for this model (for a review, see Meltzoff & Moore, 1989). Notwithstanding this fruitfulness, some reviews and empirical investigations with infants, as well as with adults, have suggested that the evidence for this model is actually sparse (e.g., Anisfeld, 1991; Anisfeld et al., 2001; Cook, Johnston, & Heyes, 2013; Koepke, Hamm, Legerstee, & Russell, 1983). Some have explicitly questioned whether the current evidence supports the idea of an inborn matching scheme (e.g., Jones, 2009; Ray & Heyes, 2011). These authors have argued that there is no empirical evidence for imitation in human neonates and that previous findings taken to indicate neonatal imitation can be fully explained by other mechanisms (e.g., exploratory responses, Jones, 1996, 2006; innate releasing mechanism, Jacobson, 1979). Others have concluded that the matching responses seem to be restricted to a single facial gesture (i.e., tongue protrusion; Anisfeld, 1996). Note that the question of the early origins of facial imitation remains the subject of ongoing empirical work.
However, even if we were to concede that there is an inborn matching system specifically for tongue protrusion, this phenomenon cannot explain how infants become fully able to also imitate other behaviors. Moreover, from a theoretical point of view, Moore (1996) has noted that the intermodal matching scheme approach assumes not only an inborn amodal body scheme and an inborn predisposition to imitate, but also inborn self-knowledge and inborn self–other differentiation, since it is assumed to compare interoceptive input, tagged as belonging to the self, with exteroceptive input, tagged as coming from the other. Thus, instead of explaining how the ability and the disposition to imitate develop, the model presupposes them. Additionally, Meltzoff and Moore’s (1989) model, as well as the previously mentioned ones (e.g., Guillaume, 1925; Piaget, 1962), focused predominantly on immediate mimicry of intransitive actions. Cultural learning, however, largely consists in learning about novel transitive actions and deferred execution of the observed behavior (e.g., Tomasello, Kruger, & Ratner, 1993).

One of the most influential process models of imitative learning of novel object-directed actions comes from Bandura (1962, 1977). In a number of studies on children’s imitative learning, he provided compelling evidence that young children’s observation of others’ object-directed actions and the consequences of these actions affects their future behavior toward these objects (e.g., Bandura, Ross, & Ross, 1961). To account for humans’ social learning of novel action knowledge, Bandura (1986) developed a process model of imitative learning in which he stated that imitative learning proceeds over four steps. First, attention must be devoted to the model and the ongoing action. Second, the observed behavior and its consequences need to be encoded and memorized (i.e., the novel knowledge needs to be stored). Third, the action is reproduced. Fourth, reproduction is guided by reward and punishment (i.e., motivational processes). This model convincingly singled out the processes involved in imitative learning and has been very fruitful in generating several lines of research with young children and adults. However, the precise cognitive mechanisms underlying these phases of imitative learning have remained an open question. More precisely speaking, the following questions have remained open: How is the perceived information about the other’s actions and the consequences of the actions in the physical world processed? In which kind of functional networks and modalities is the information stored? How is the information represented and, when needed, remembered? Which kinds of control processes guide the subsequent reproduction of the acquired knowledge?

Similar issues arise with respect to other models. Current theories on imitative learning have focused on the question of whether or not infants’ imitation is affected by top-down processes, such as inborn expectations about the efficiency of observed actions (e.g., Gergely & Csibra, 2003) or an innate expectation about the pedagogical intentions of the model (e.g., Csibra, 2010), or whether or not infant imitation serves social functions besides the acquisition of novel skills and knowledge (Trevarthen & Aitken, 2001). Although some of these models go beyond phenomena of immediate mimicry by focusing on transitive actions and the acquisition of novel action knowledge, they do not address the question of how infants are able to imitate at all.

That is, whereas imitative learning in infancy has been recognized as fundamental in human phylogenetic and ontogenetic development (e.g., Charman et al., 2000; Shea, 2009), the cognitive mechanisms subserving imitative learning in infancy have remained unclear.

A theoretical model of imitative learning

It is the aim of the present contribution to spell out a cognitive model of imitative learning in infancy (and beyond) and to review recent empirical research that provides initial support for this model. Before presenting the model, though, we must clarify which kind of challenges it has to meet. Every theory of imitative learning in infancy has to provide an answer to two questions and, as a consequence, has to be related to two areas of research and theorizing.

First, imitating observed behavior means performing an action. As a consequence, theories of imitative learning need to be related to theoretical accounts of action control, which give us a conceptual framework of how intentional action control is possible (cf. Bertenthal, 2009). That is, every model of imitation has to provide an answer to the question of how infants intentionally control their own actions (on the basis of perceived information about another person’s actions).

Second, when imitating others, infants do not control their actions on the basis of their direct, first-hand experiences with particular actions and their consequences. Rather, the information they use is based on indirect information—that is, the observation of another’s action and the effects of these actions. The second problem posed for a theory of imitation is thus the problem of how observed information is incorporated into the action control system, enabling later imitation.

The present model builds on the ideomotor approach to action control, which suggests that actions are controlled by bidirectional action–effect associations (Aschersleben, 2006; Elsner & Hommel, 2001; Hommel, 2009, 2013; Hommel, Müsseler, Aschersleben, & Prinz, 2001; Kunde, Reuss, & Kiesel, 2012; Nattkemper, Ziessler, & Frensch, 2010). Current ideomotor theories have a long history in cognitive psychology (for an overview, see Stock & Stock, 2004). Historically, they were first put forward by theoreticians such as Lotze (1852) and James (1890/1981). The ideomotor theory of action control states that actions are represented in terms of their sensory consequences. Action knowledge is acquired through the repeated experience of co-occurrences of actions and their sensory consequences (i.e., their effects). The cognitive representations of intentional actions therefore consist of the associations of motor codes (i.e., action representations) with sensory codes (i.e., effect representations). The intention to elicit a particular sensory effect is assumed to activate directly the motor program associated with this effect. Thus, acquired action–effect associations represent the cognitive substrate of intentional action control. This integrated structure of perceptual and motor codes has also been labeled action concepts (Hommel, 1997). This theory provides an integrative framework that is able to explain how humans in general, and infants in particular, come to be able to intentionally control their own actions.

Empirical support has been provided in numerous studies. In line with this theoretical approach, it has been found that the perception of the effect of a previously performed action activates the associated motor program. These findings have been reported in behavioral and neuroimaging studies with adults (e.g., Elsner & Hommel, 2001, 2004; Janczyk, Pfister, Crognale, & Kunde, 2012; Kunde, Hoffmann, & Zellmann, 2002; Kunde, Lozo, & Neumann, 2011; Melcher, Weidema, Eenshuistra, Hommel, & Gruber, 2008), children (e.g., Karbach, Kray, & Hommel, 2010; Kray, Eenshuistra, Kerstner, Weidema, & Hommel, 2006), and infants (Paulus, Hunnius, van Elk, & Bekkering, 2012; Verschoor, Weidema, Biro, & Hommel, 2010).

In addition to ideomotor learning, the present model also integrates findings that the perception of others’ actions leads to an activation of the observers’ motor system (e.g., Bertenthal, Longo, & Kosobud, 2006; Caetano, Jousmaki, & Hari, 2007; Iacoboni et al., 1999; van Schie et al., 2008) when the observed action is in the observer’s motor repertoire (Calvo-Merino, Glaser, Grèzes, Passingham, & Haggard, 2005; Calvo-Merino, Grèzes, Glaser, Passingham, & Haggard, 2006). This suggests that observed and executed actions share a common representational format (e.g., Longo & Bertenthal, 2006; Meltzoff & Moore, 1989; Prinz, 1997). In line with these considerations, motor activation during action perception has also been reported for preverbal infants (Marshall & Meltzoff, 2011; Nyström, Ljunghammar, Rosander, & von Hofsten, 2011; Paulus et al., 2012; Reid, Striano, & Iacoboni, 2011; Saby & Marshall, 2012).

The present ideomotor approach to imitative learning (IMAIL) integrates these theoretical notions. It follows the ideomotor theory on action control in the assumption that all actions are cognitively represented and selected in terms of their effects (e.g., Hommel et al., 2001), and it capitalizes on the observations that perceived actions lead to an activation in the observers’ motor system. Its core notion is the proposition that bidirectional action–effect associations can also be acquired by observational learning and that this acquisition of modality-specific bidirectional associations between motor codes and effect codes through observation constitutes the central mechanisms subserving imitative learning. The following paragraph gives a short summary of the model’s assumed core processes.

Imagine infants perceiving the action of another person. If this action is within their own motor repertoire, the model suggests that a corresponding motor code in infants’ motor system will become activated (i.e., motor resonance). Moreover, the action results in an effect in the physical world (e.g., a light effect when a lamp is turned on). When the effect is interesting enough that attention is devoted to it (cf. Ziessler, Nattkemper, & Frensch, 2004), it leads to an activation of the respective sensory code, which represents this particular effect (henceforth, effect code). This sensory code is a modality-specific representation of the perceived effect and, thus, is stored in the sensory system. The model proposes that the activated motor code and effect code will become associated with each other, forming an action–effect association. This is likely to happen as long as they are simultaneously activated, that is, as long as action and effect are sufficiently contiguous and contingent (e.g., Elsner & Hommel, 2004). In the following, the core processes will be described and discussed in more detail.

Motor resonance

Following the neurocognitive literature (e.g., Borroni, Montagna, Cerri, & Baldissera, 2008; Hari et al., 1998; van Schie et al., 2008), we conceive of motor resonance as the activation of a motor code or motor program through action observation. Importantly, motor resonance has been shown to be effector specific, not only in adults (Wheaton, Thompson, Syngeniotis, Abbott, & Puce, 2004), but already in 14-month-old infants (Saby, Meltzoff, & Marshall, 2013). Moreover, the strength of motor resonance has been shown to be dependent on infants’ motor experience (van Elk, van Schie, Hunnius, Vesper, & Bekkering, 2008). In the study of van Elk and colleagues, 14-month-old infants observed a movie showing another infant either crawling or walking to the other side of the screen. Even though the two actions were conceptually the same on a more abstract level (locomotion from one side to the other), infants showed stronger motor resonance with the crawling than with the walking action, and the strength of this resonance was positively correlated with their own crawling experience. Thus, motor resonance takes place when the observed action matches the observer’s own motor repertoire, that is, when it is performed in a way that is sufficiently similar to the way he or she could perform the action him- or herself.

Here, the question arises as to whether there is a practical way to determine an infant’s motor program before the actual imitation study. Although spontaneous production data are known to be difficult to acquire, clever experimental designs allow such an assessment. For example, Melzer, Prinz, and Daum (2012) were interested in assessing 6- to 12-month-old infants’ ability to perform contralateral reaching movements. To this end, they blocked the infants’ ipsilateral hand by means of providing them with a toy. Then they presented another interesting object at the same side, assessing infants’ ability to produce a contralateral reaching movement. In other cases, infants’ actual motor repertoire can be assessed retrospectively. In an imitation study, Paulus, Hunnius, Vissers, and Bekkering (2011b) presented five groups of 14-month-old infants with a model performing an unusual action. For each group (i.e., condition), the action was performed in a slightly different way. Nevertheless, all infants who performed the particular action did it in exactly the same way, demonstrating the specific way in which this action is in infants’ motor repertoire.

Effects

Action effects are thus assumed to play an important role in infants’ imitative behavior. This corresponds to a number of empirical findings demonstrating the role of salient effects in infants’ action control (e.g., Hauf & Aschersleben, 2008), infants’ action perception (e.g., Király, Jovanovic, Prinz, Aschersleben, & Gergely, 2003), and their action imitation (e.g., Elsner, 2007; Hauf & Aschersleben, 2008; Yang, Bushnell, Buchanan, & Sobel, 2013).

It should be noted that in most current ideomotor approaches, an action effect is defined as any action-related change in the environment. Actions thus consist of the activation of a motor program (i.e., motor code), which causes action execution. Any action produces effects/events, which can be perceived. These effects can be quite salient (e.g., a light or a sound effect), but also can include subtle changes (e.g., the mere displacement of the hand is a visual effect). It is assumed that these events consist of the binding of, in principle, separable cognitive codes that represent the distal features of the event (event files; Hommel, 2004). This view relates to work by Kahneman, Treisman, and Gibbs (1992), who suggested that the cognitive system binds feature codes into temporary episodic representations, labeled object files. We argue, on the basis of our literature review, that infants’ imitation is largely driven by salient action effects, probably as they meet children’s demands for sensory stimulation and/or as they enjoy causing self-produced effects (for developmental theories on infants’ motivation to reproduce effects, see Dweck & Leggett, 1988; Piaget, 1971). It should be noted that salient action effects are not confined to physical effects (e.g., light, sound), but also include social effects (e.g., bringing someone to smile or laugh; cf. Sato & Itakura, 2013).

There are several practical ways to determine whether infants find an effect interesting. On the one hand, one could rely on looking-time-based measures. For example, one could present two effects/events simultaneously and measure infants’ attention to each of these events. On the other hand, one could use action-based methods. For example, one could assess the rate at which infants reproduce one effect over the other (e.g., Klossek, Russell, & Dickinson, 2008).

It is an interesting question whether an effect needs to be perceived or whether its presence can also be inferred. Barresi and Moore (1996), as well as Harris (2000), have argued that in the course of development, young children become able to imagine situations and events that are currently not present (i.e., perceivable). Given findings that the imagination of an event recruits networks similar to those recruited by actual perception (e.g., Kosslyn, 2005; Laeng & Teodoresco, 2002), it is a plausible hypothesis that imagined effects also can lead to action–effect binding. Some evidence for this hypothesis comes from a study by Meltzoff (1995). He demonstrated that 18-month-old infants imitated another’s “intended” action when they only observed a failed attempt. Infants could have predicted the end state on the basis of their knowledge about the physical world. The imagined effect could have been related to the representation of the perceived action, thus leading to a novel action–effect association. Yet it is also possible that infants merely relied on already acquired action–effect associations to imagine the unseen effect (Elsner, 2007) or that other learning processes, such as stimulus enhancement, played a major role (Huang, Heyes, & Charman, 2002). Thus, further research is necessary to examine whether inferred action effects are equivalent to perceived action effects in action–effect binding.

Binding

The underlying mechanism of this binding between action and effect representations could be a simple associative learning mechanism, following Hebb’s (1949) learning theory (i.e., one based on automatic associations between concurrently activated codes; Elsner & Hommel, 2001). Alternatively, action–effect binding could be based on associative learning following the Rescorla–Wagner model (Cooper, Cook, Dickinson, & Heyes, 2013), or on an anticipative learning mechanism (Ziessler et al., 2004). The latter view assumes that the activation of a motor program leads to the expectation of an effect. Consequently, the activation of a motor code through action observation leads to effect anticipation (Paulus, 2012). When the perceived effect is a different one (i.e., a novel action–effect relation is observed), this leads to an adjustment of the existing action–effect association (Ziessler et al., 2004). This anticipative process of action–effect binding can explain findings that toddlers are more likely to imitate actions that are followed by an effect than actions for which the effect is initiated before action onset (Meltzoff, Waismeyer, & Gopnik, 2012, Experiment 3). Whatever the precise underlying mechanism might be, we agree with Elsner and Hommel (2001) that action–effect binding is automatic in the sense that it does not depend on an explicit “intention to learn” (p. 230), thus constituting an example of incidental learning (e.g., Ruffman, Taumoepeau, & Perkins, 2012). As a result of this learning process, infants have acquired an action–effect association, which stores information about the executed action and the caused effect.
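
To make the contrast between the candidate binding mechanisms concrete, the following is a minimal sketch, not an implementation taken from the cited work. It compares a Hebbian update (strengthening in proportion to the co-activation of motor and effect code) with a Rescorla–Wagner-style update (strengthening in proportion to the prediction error). The function names, the learning rate of 0.1, and the asymptote of 1.0 are illustrative assumptions.

```python
def hebbian_update(weight, motor_activation, effect_activation, learning_rate=0.1):
    """Hebbian rule: the association grows with the co-activation of both codes."""
    return weight + learning_rate * motor_activation * effect_activation


def rescorla_wagner_update(weight, effect_present, learning_rate=0.1, max_strength=1.0):
    """Rescorla-Wagner-style rule: the association changes with the prediction error."""
    target = max_strength if effect_present else 0.0
    return weight + learning_rate * (target - weight)


def simulate(n_observations=20):
    """Observe the same action-effect pairing repeatedly under both rules.

    All parameter values are hypothetical and chosen only for illustration.
    """
    hebb, rw = 0.0, 0.0
    for _ in range(n_observations):
        # Motor code (via motor resonance) and effect code are both fully active.
        hebb = hebbian_update(hebb, motor_activation=1.0, effect_activation=1.0)
        rw = rescorla_wagner_update(rw, effect_present=True)
    return hebb, rw


if __name__ == "__main__":
    hebb, rw = simulate()
    print(f"Hebbian strength: {hebb:.3f}, Rescorla-Wagner strength: {rw:.3f}")
```

Under these assumptions the Hebbian strength grows linearly with the number of observations, whereas the error-driven strength levels off as the effect becomes predicted. This is one way in which the two accounts could, in principle, be distinguished empirically.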

Action reproduction

Let us assume that at a later point in time, infants want to reproduce this effect to fulfill a need (e.g., a need for sensory stimulation). We can assume either that particular features of the environment serve as a memory cue that leads infants to recall the possible effects, among which they choose one to reproduce, or that the acquired action–effect association itself is actually an environment–action–effect relation or that it contains information about its execution conditions (for discussions of these options, see, e.g., Heyes, 2013; Hoffmann, Stöcker, & Kunde, 2004; Kiesel & Hoffmann, 2004). Indeed, infant research has provided evidence that a particular environment or the object itself triggers in infants the memory of an object’s typical effect (cf. Bhatt & Rovee-Collier, 1996; Borovsky & Rovee-Collier, 1990) and that 15-month-old infants who observe an actor operating a device and producing an audiovisual effect try to reproduce the effect when the object they are presented with is very similar to the one handled by the experimenter (Yang et al., 2013). When infants find the effect interesting (i.e., enjoy this particular effect; Dweck & Leggett, 1988), they may want to reproduce the effect. The infants’ intention to reproduce the effect (in other words, to attain this goal) shows itself in an activation of the effect representation, that is, the effect code. When this activation passes a particular threshold (i.e., the wish to obtain the effect is strong enough), the associated motor code is activated by spreading activation, which represents an effect- or goal-oriented activation of the motor pattern (Hommel, 2009). When there are no other inhibitory processes involved, this spreading activation leads to the execution of the associated motor program, that is, the previously observed action. As a consequence, the observed action is imitated.
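
The reproduction step described above can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: the class name, the example codes ("press_button", "light_on"), the threshold of 0.5, and the multiplicative spreading rule are hypothetical and are not specified by the model.

```python
from dataclasses import dataclass


@dataclass
class ActionEffectAssociation:
    """A learned link between a motor code and an effect code (names are illustrative)."""
    motor_code: str
    effect_code: str
    strength: float


def reproduce(association, desire, threshold=0.5, inhibited=False):
    """Return the motor code that is executed, or None if no action is triggered.

    `desire` stands for the activation of the effect code, i.e., how strongly
    the infant wishes to obtain the effect. The threshold value is an assumption.
    """
    effect_activation = desire
    if effect_activation < threshold:
        return None  # the wish to obtain the effect is not strong enough
    # Spreading activation from the effect code to the associated motor code.
    motor_activation = effect_activation * association.strength
    if inhibited or motor_activation <= 0.0:
        return None  # inhibitory processes block execution
    return association.motor_code


if __name__ == "__main__":
    lamp = ActionEffectAssociation("press_button", "light_on", strength=0.8)
    print(reproduce(lamp, desire=0.9))
    print(reproduce(lamp, desire=0.2))
```

In this sketch a strong wish for the effect leads to execution of the previously observed action, whereas a weak wish or an active inhibitory process leads to no overt behavior.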

Preexisting knowledge and imitative learning

Theoretically speaking, an association between a particular action and a particular effect is sometimes completely new. In such cases, clear imitation effects are demonstrable (as compared with baseline performance; e.g., Meltzoff, 1988). However, infants are not a tabula rasa; they come into every situation with previously acquired world knowledge, which affects their learning and performance. That is, they have already acquired action–effect associations, and the newly learned associations are embedded in the already acquired network of action–effect associations. Two options are possible. On the one hand, it is possible that one action representation, which is already related to one effect, will be linked with another effect. That is, infants learn through observation to relate a different effect to a motor code. Given the limited number of effectors humans possess, usually more than one effect has to be related to a motor program. Kiesel and Hoffmann (2004) have shown that action–effect associations are acquired in a context-specific manner and that the very same action can therefore be activated by different effect representations.

On the other hand, infants might relate a novel action representation to an effect code that is already related to another action representation. Under such circumstances, when infants want to reproduce the effect, more than one motor program might be associated with the effect code. Given that several actions are related to the effect, several motor programs will become activated and compete for execution of the action. This can explain the findings that infants often not only imitate the observed action, but also, at the same time, use other means to obtain the same effects (e.g., Paulus, Hunnius, Vissers, & Bekkering, 2011a, 2011b). Such an activation pattern of two (or more) competing effectors might best be described in terms of a dynamic neural field model (cf. Erlhagen & Schöner, 2002; Thelen & Smith, 1994). One hypothesis to derive from this model could be that more demonstrations of the novel and unusual action might lead to a stronger activation of this particular motor code. As a consequence, the action might be more strongly linked to the effect representation. This might lead to an enhanced imitation of the observed action. Some findings on an effect of prior experiences on subsequent imitative learning could be interpreted in such a way (e.g., Barr, Marrott, & Rovee-Collier, 2003; Barr, Rovee-Collier, & Learmonth, 2011).
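
The hypothesis that more demonstrations lead to enhanced imitation can be illustrated with a small sketch. It is not a dynamic neural field model; instead, it assumes, purely for illustration, an error-driven strengthening of the novel association and a simple proportional choice rule between the novel and a familiar motor program. The function name and all parameter values are hypothetical.

```python
def imitation_probability(familiar_strength, n_demonstrations, learning_rate=0.2):
    """Probability that the novel (observed) action wins the competition.

    Each demonstration strengthens the association between the novel motor
    code and the effect code. The novel action then competes with a familiar
    action that is already linked to the same effect.
    """
    novel_strength = 0.0
    for _ in range(n_demonstrations):
        # Error-driven strengthening toward an assumed asymptote of 1.0.
        novel_strength += learning_rate * (1.0 - novel_strength)
    total = novel_strength + familiar_strength
    if total <= 0.0:
        return 0.0
    # Proportional choice rule: relative strength determines the outcome.
    return novel_strength / total


if __name__ == "__main__":
    for n in (1, 3, 5):
        p = imitation_probability(familiar_strength=0.8, n_demonstrations=n)
        print(f"{n} demonstration(s): P(imitate novel action) = {p:.2f}")
```

Under these assumptions the novel action is rarely selected after a single demonstration but gains ground with repeated demonstrations, while the familiar means to the same effect remains available. This is consistent with the observation that infants often use both.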

Summary

In a nutshell, IMAIL assumes that during the observation of another person’s actions and the consequences of these actions in the physical world, infants acquire a novel bidirectional action–effect association. This action–effect association consists of a motor code that has been activated due to action perception and a modality-specific effect code that represents the effect of the other’s behavior. Following Bandura’s (1986) classical distinction between social learning and the behavioral demonstration of acquired knowledge, this associated structure is the result of social learning and forms the basis of subsequent imitation. More precisely, when the infant strives to reproduce the same effect, this intention leads to the activation of the associated motor program and, as a consequence, to the imitation of the action.

From a broader theoretical perspective on the early development of social cognition, IMAIL relates to recent approaches that suggest that sensorimotor and associative learning processes might play a greater role in early social-cognitive development than previously expected (e.g., Barr et al., 2003; Gibson & Pick, 2000; Paulus, 2011; Perner & Ruffman, 2005; Ruffman et al., 2012; Uithol & Paulus, 2013). Furthermore, it suggests that these processes are automatic (although not devoid of attentional demands). That is, the model suggests that infants learn from the mere observation of others’ actions and the effects of these actions and that no explicit intention to learn is needed. It relates to findings that even in adults, mimicry is often automatic and unconscious (e.g., Bargh & Chartrand, 1999; Lakin, Chartrand, & Arkin, 2008) and that learning often happens incidentally (i.e., without the awareness that one is learning; for a review, see Perruchet & Pacton, 2006).

Motor resonance and action–effect binding

It has to be asked how infants are actually able to mirror another person’s action. In other words, why does the observer’s action end up being the one that matches the performer’s action? This question is related to the question of what is actually the common representational format underlying observed and executed actions.

On the one hand, it is possible that this ability relies on an inborn capacity to relate others’ behavior to one’s own behavior, perhaps through a mirror neuron system (Rizzolatti & Craighero, 2004) or an innate intermodal matching scheme (Meltzoff & Moore, 1989). Yet these and related ideas are not universally accepted, for empirical (e.g., Anisfeld, 1991; Jones, 2007; Ray & Heyes, 2011) and conceptual (e.g., Uithol, van Rooij, Bekkering, & Haselager, 2011) reasons. This suggests that the ability to mirror might itself be subject to development.

Importantly, the model’s basic principles can be applied as well to explain this phenomenon. It has been proposed that by observing the consequences of one’s own action (e.g., moving an arm), the representation of the action’s effect (e.g., the visual perception of the moving arm) will be linked to the motor code. When the same action (e.g., the arm movement) is subsequently performed by another person (and the perceived effector is visually sufficiently similar to the child’s own), the activation of the perceptual code will automatically activate the associated motor code (Del Giudice, Manera, & Keysers, 2009), leading to motor activation. That is, through acquired associations between the visual percept of a moving effector and its motor code, infants become able to mirror another person’s action. This process is independent of the particular modality, since corresponding effects can also be observed with auditory stimuli (Paulus et al., 2012). As a side note, acquiring such an association is much more difficult in cases in which infants cannot observe their own actions (e.g., facial gestures). It has been suggested that for these particular actions, caregivers’ mirroring of the infants’ behavior might be the crucial causal link (e.g., Heyes, 2010), enabling the infant to relate a corresponding visual percept to an activated motor code. A recent study with adults confirmed that only visual feedback from another person, but not their own proprioceptive feedback, helped participants to imitate their own facial gestures (Cook et al., 2013). In conclusion, the same principles and cognitive mechanisms that underlie imitative learning of novel action–effect relations also subserve the acquisition of the ability to resonate with others’ actions.
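
As a hedged illustration of this account, the sketch below shows how associations acquired during self-observation could later produce motor resonance when a visually similar action is observed in another person. The class, the code labels, the learning rate, and the `similarity` scaling are assumptions made for the example and are not part of the published model.

```python
class MirrorAssociations:
    """Visual-to-motor links acquired by watching one's own movements (illustrative)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # (visual_code, motor_code) -> association strength

    def self_observe(self, motor_code, visual_code):
        """Executing an action while seeing its visual effect strengthens the link."""
        key = (visual_code, motor_code)
        weight = self.weights.get(key, 0.0)
        self.weights[key] = weight + self.learning_rate * (1.0 - weight)

    def resonate(self, visual_code, similarity=1.0):
        """Motor activation evoked by seeing another person's action.

        `similarity` (0..1) stands for how closely the observed effector
        resembles the infant's own; this scaling is an assumption.
        """
        return {
            motor: similarity * weight
            for (visual, motor), weight in self.weights.items()
            if visual == visual_code
        }


if __name__ == "__main__":
    infant = MirrorAssociations()
    for _ in range(10):
        infant.self_observe("lift_arm", "arm_moving_up")
    print(infant.resonate("arm_moving_up"))
    # Facial gestures were never seen during self-produced action: no resonance.
    print(infant.resonate("mouth_opening"))
```

In this sketch, actions whose visual effects the infant has never perceived, such as facial gestures, evoke no resonance until some other source (e.g., a caregiver's mirroring) supplies the missing visual experience.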

In this perspective, the common representational format subserving observed and executed actions consists of the action’s sensory consequences (Prinz, 1997). Given that actions are controlled by intending their effects and given that we process observed actions by means of their effects, effect representations are the shared format of action production and action perception. In this model, effect representations are modality specific rather than “amodal” (Hommel et al., 2001). Yet it should be noted that the size of the shared space between executed and observed actions can differ between actions. On the one hand, self-performed actions include proprioceptive effects that are not available for observed actions. Here, one possibility could be that the perception of the visual sensory consequences of the action might trigger the associated motor code, which then might lead to an activation of the associated proprioceptive effects. On the other hand, sometimes there is no shared effect space. For example, as has been discussed, for self-performed facial gestures, only proprioceptive, but no visual, effects are available, whereas for observed facial gestures, it is exactly the other way around.

It is an interesting speculation as to whether the long premature phase of human infants, during which infants only slowly learn to control their own behavior by watching themselves repeatedly and experiencing the contingencies between actions and effects (for a review, see Rochat, 1998), could be partly responsible for humans’ increased ability for imitative learning (as compared with other animal species; e.g., Herrmann et al., 2007). That is, infants might establish a rich set of action–effect representations through contingently training and perceiving the visual effects of their various body movements during the first year of life. These action–effect associations are then at the basis of additional learning processes through observing other people’s actions, leading to motor resonance when observing others’ actions. Additionally, this basis of imitative behavior in experiencing self-contingencies might also explain the findings of a relation between infants’ behavior in the mirror-self-recognition task and imitative performances (e.g., Asendorpf & Baudonnière, 1993).

In conclusion, in this perspective, infants’ ability to learn through imitation rests on cascading action–effect associations (see Fig. 1). It thus extends current ideomotor approaches to action control (e.g., Hommel et al., 2001) to the area of social and cultural learning. First, infants’ mirroring of the other’s movement rests upon previously acquired (first-order) action–effect associations (i.e., a visual percept of the moving effector has been related to the respective motor code). Such a mirroring process would be short-lived if the action did not lead to interesting consequences (although it might be sufficient to show some immediate effects in terms of mimicry). Second, when this action is followed by an interesting effect (e.g., a light or a sound), infants relate the activated motor code to this novel effect representation, acquiring thus a novel (second-order) action–effect representation. This action–effect association will then be the basis for later imitation.
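The cascade of first-order and second-order associations described above can be made concrete in a small sketch. It is only an illustration under simplifying assumptions, not an implementation proposed by the model: the class name `Observer`, its method names, the string labels for motor and effect codes, and the additive strength increment are all hypothetical choices made here for readability.

```python
from collections import defaultdict


class Observer:
    """Toy observer that stores links between effect codes and motor codes."""

    def __init__(self):
        # effect code -> {motor code: association strength}
        self.links = defaultdict(dict)

    def associate(self, motor_code, effect_code, increment=1.0):
        """Strengthen the link between a motor code and an effect code."""
        current = self.links[effect_code].get(motor_code, 0.0)
        self.links[effect_code][motor_code] = current + increment

    def act_and_watch(self, motor_code, visual_effect):
        """First-order learning: an own movement is paired with its seen consequence."""
        self.associate(motor_code, visual_effect)

    def resonate(self, perceived_effect):
        """Return the motor code most strongly linked to a percept, or None."""
        candidates = self.links.get(perceived_effect)
        if not candidates:
            return None
        return max(candidates, key=candidates.get)

    def observe(self, seen_movement, salient_effect=None):
        """Second-order learning: bind the resonating motor code to a novel effect.

        Nothing is learned if the movement triggers no motor resonance or if
        no salient effect follows it.
        """
        motor_code = self.resonate(seen_movement)
        if motor_code is not None and salient_effect is not None:
            self.associate(motor_code, salient_effect)
        return motor_code

    def imitate(self, desired_effect):
        """Later imitation: intending the effect activates the linked motor code."""
        return self.resonate(desired_effect)


# Hypothetical walk-through of the three phases.
infant = Observer()
infant.act_and_watch("hand_motor_code", "seen_hand_movement")   # own experience
infant.observe("seen_hand_movement", salient_effect="light_on")  # watching a model
print(infant.imitate("light_on"))  # hand_motor_code
```

In this sketch, imitation fails in exactly the two cases the model singles out: when the observed movement has never been linked to an own motor code, and when the observed action is not followed by a salient effect.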

Fig. 1

Schematic illustration of IMAIL. Arrows indicate causal relations; the dashed lines indicate an association that is acquired as a result of a learning process. Person O is the observer who is observing and imitating an action presented by person A. The contents of the clouds symbolize the processes within the cognitive and the motor system of person O. The upper clouds represent the event representations within the cognitive system and the lower clouds the motor representations (represented by a muscle) within the motor system. (a) Development of motor resonance. The activation of a motor program related to a hand movement (represented by a muscle in the lower cloud) leads to action execution. Person O observes the visual consequences (i.e., displacement of the hand) and represents these consequences in the cognitive system (hand in the upper cloud). Consequently, a novel action–effect association is acquired, which comprises the motor code of the hand movement and the visual representation of the hand displacement. (b) Illustration of how person O observes an action of person A. Person A performs a hand action that leads to a light effect (symbolized by the sun). The perceived hand displacement activates the corresponding visual representation (effect code) in person O’s cognitive system. Since this code has previously been associated with the hand movement’s motor code, its activation leads to an activation of the associated motor code (i.e., motor resonance). At the same time, person O observes the salient light effect of person A’s action, which is also represented in the cognitive system (arrow leading from real light effect to the representation of the light effect in the upper cloud). This activated effect representation now becomes associated with the already activated motor code, forming a novel action–effect association. (c) Demonstration of later imitation. Person O would like to reproduce the light effect. This wish is analogous to the activation of the effect’s representation. This leads to an activation of the associated motor program (i.e., the hand motor code), which leads to action execution and the reproduction of the effect. Thus, the observed action is imitated.

Empirical support

Support for IMAIL has been provided in a number of recent behavioral and neurophysiological findings with infants and adults. In the following, this evidence will be systematically reviewed. First, experimental studies from infancy research will be presented. Then, evidence from the adult literature will be considered, suggesting that IMAIL is not restricted to infancy but might also play a role in adult observational learning.

Infancy research

Examining imitative learning in 14-month-old infants, Paulus and colleagues (Paulus, Hunnius, & Bekkering, 2013a; Paulus et al., 2011a, 2011b) assessed the corollaries derived from the ideomotor approach to infant imitation. In a series of studies, they manipulated (1) whether or not an observed action was within infants’ motor repertoire and (2) whether or not it led to a salient and interesting effect (e.g., a sound or a light effect). The findings of these studies provided converging evidence that infants’ imitation rate dropped down to baseline performance either when the action was demonstrated in a way that infants could not relate to their own motor repertoire (as the model demonstrated it in a way untypical for infants) or when it was not followed by a salient outcome.

In a first study, Paulus and colleagues (2011b) examined whether infants’ selective imitation is due to motor resonance or a rational evaluation of another person’s behavior. More concretely, in a previous study, Gergely, Bekkering, and Király (2002) presented two groups of 14-month-old infants with a model bending over a lamp on a table and touching the lamp with her forehead, causing a salient light effect. In one condition (hands occupied), the actor was pretending to be cold, holding a blanket with her hands when performing the head touch. In the other condition (hands free), her hands were free. More infants imitated the head action in the hands-free than in the hands-occupied condition. The authors interpreted their finding as evidence for rational imitation in infancy: They suggested that in the hands-free condition, infants thought that the model must have had a good reason to use her head and not, more efficiently, her hands. Attributing thus some rationality to the head action in the hands-free condition, infants decided to imitate it. Challenging this interpretation, Paulus and colleagues (2011b) noted that when infants themselves produce this particular head action, they always do it with their hands on the table (most likely to maintain a stable posture and body balance). They suggested a motor resonance explanation, namely that the greater overlap between infants’ ability to perform the head action and the model’s way of demonstrating the action in the hands-free condition could have been responsible for the relatively higher rate of imitation. To support this claim, the authors added three novel conditions to the two original conditions, in which they systematically manipulated whether or not the model’s hands were free when demonstrating the head action (i.e., apparent rationality) and whether or not the action was demonstrated in a way that matched infants’ own behavior (i.e., matching the motor repertoire). The results of all conditions supported the motor resonance view. In a subsequent study, Paulus and colleagues (2013a) put the lamp on a rack so that the head-bending/lamp-lighting action could easily be performed by leaning forward. Even though the model’s hands were free in one condition and occupied in the other, infants imitated the head action to the same (high) extent. This was not the case in two baseline conditions. These studies suggested that motor resonance, rather than a rational evaluation of others’ behavior, plays a fundamental role in infant imitation.

Here, it is important to note that in the former study, for infants in all conditions, an action was demonstrated that they were, in principle, able to perform (bending with head over a lamp on a table). Yet, in some conditions, the action was demonstrated in a way that did not closely resemble infants’ manner of performing this action themselves (e.g., having the arms folded across her chest instead of putting them on the table; see Footnote 1 in Paulus et al., 2011a). Note that on a conceptual level, the action is essentially the same in all these conditions, yet the concrete sensorimotor realization is different. The results thus suggest that action mirroring plays a crucial role in imitative learning and that action mirroring takes place on a sensorimotor level.

In another study, Paulus and colleagues (2011a) used the same experimental setup to investigate the role of salient action effects and motor resonance in greater detail. They manipulated whether or not the head action was followed by a salient light effect. Additionally, they examined whether or not motor resonance played a crucial role in infants’ imitation. To this end, the actor demonstrated that the lamp could not be turned on with the hands (the lamp had secretly been switched off temporarily). Then, she showed that the lamp could be turned on with the head. Previous studies with 14-month-old infants suggested that an ineffective demonstration followed by an effective one should support infants’ imitative learning (Király, 2009). Yet again, the action was demonstrated in a manner slightly different from the way it was performed by the infants. Notwithstanding this pedagogical demonstration and notwithstanding the fact that infants at this age are, in principle, able to perform the head action, they did not reliably imitate the action above baseline performance. Furthermore, imitation dropped to baseline when the action was demonstrated in a way that infants could imitate easily but was not followed by a clear and interesting effect (i.e., the light remained off).

Additional evidence for central claims of the model comes from findings that 10-month-old infants relate actions to their learned effects (Perone, Madole, & Oakes, 2011; Perone & Oakes, 2006) and that salient effects play an important role in imitation (e.g., Barr, Wyss, & Somanader, 2009; Elsner, 2007; Hauf & Aschersleben, 2008; Yang et al., 2013). In a study by Elsner and Aschersleben (2003), 9-, 12-, 15-, and 18-month-old infants observed a model acting on a box to which a ring was attached. The ring could be either pressed or pulled. Performing one of these actions led to a specific effect (sound or light). Subsequently, infants were allowed to act on the box. In one condition, the action–effect relation was reversed, while it stayed the same in another condition. The authors found that by 12 months, infants performed more target actions in both observation groups than in a control condition in which infants were not presented with a demonstration. Additionally, by 15 months, infants performed significantly more target actions when the action–effect relations stayed the same than when they were reversed. The results suggest that by their first birthday, infants acquire action–effect relations through observation. A subsequent study using simpler actions (i.e., buttonpresses) found similar results in 9-month-old and, to a weaker extent, even in 7-month-old infants (Hauf & Aschersleben, 2008).

Moreover, given that the model is based on associative learning mechanisms, infants’ performance should be affected by the contingency with which an action leads to an effect. Support for this claim comes from a study by Schulz, Hooppell, and Jenkins (2008). They presented 18-month-old infants with a number of actions causing interesting effects. Importantly, they varied the consistency with which the actions led to the effect. The results showed that infants imitated the action more precisely when it always produced an effect, as compared with when it only sometimes produced the effect. The model can explain the results, since it would suggest that, in the latter condition, action representation and effect code have been less strongly associated with each other, leading to inferior performance.
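The contingency argument can be illustrated with a short numerical sketch. The model as described here does not commit to a particular learning rule, so the delta rule used below (in the style of Rescorla-Wagner updating) is an assumption, as are the function name `association_strength`, the learning rate, and the number of demonstrations.

```python
def association_strength(outcomes, learning_rate=0.2):
    """Update a single action-effect link over a series of demonstrations.

    outcomes: iterable of booleans, True when the action produced the effect.
    The delta rule and the learning rate are illustrative assumptions.
    """
    strength = 0.0
    for effect_occurred in outcomes:
        target = 1.0 if effect_occurred else 0.0
        # Move the strength a fraction of the way toward the observed outcome.
        strength += learning_rate * (target - strength)
    return strength


# Eight hypothetical demonstrations per condition.
always = association_strength([True] * 8)            # effect follows every time
sometimes = association_strength([True, False] * 4)  # effect follows half the time
print(always > sometimes)  # True
```

Under these assumptions, the link formed after inconsistent demonstrations ends up weaker than the link formed after consistent ones, which is the pattern the associative account would need in order to yield less precise imitation in the inconsistent condition.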

Neural evidence for the claim that infants indeed acquire action–effect associations through observation comes from a recent neurophysiological finding reported by Paulus, Hunnius, and Bekkering (2013b). The authors conducted a 1-week training study with a group of 9-month-old infants. Infants observed on a daily basis how a caregiver played in front of them with a novel rattle that produced a specific sound effect when shaken. In addition, the same infants listened every day to a second sound that was presented by means of a voice recorder. Given that infants of this age are able to smoothly grasp and manipulate objects, infants should be able to resonate with this action. That is, they should show motor activation when observing the other’s rattling action. Moreover, the model predicts that they should relate the activated motor program to the representation of the rattle action’s sound effect, even though they never played with the rattle themselves. To test this prediction, Paulus and colleagues assessed infants’ electrophysiological responses to these two sounds plus an additional control sound. Using mu desynchronization as a marker of motor activation (e.g., Marshall & Meltzoff, 2011; Reid et al., 2011), the results provided evidence that infants showed stronger motor activation when perceiving the rattle’s sound than when perceiving the other two sounds, although they themselves had never trained with the rattle. Further direct evidence for the role of action mirroring in imitative learning comes from a recent study by Akano et al. (2013). In this study, the authors showed that infant imitation was directly related to mu wave suppression during the observation of an action in a previous phase. This finding provides strong evidence for the model’s claim that action mirroring during action observation plays a crucial role in imitative learning.

Further work, although not directly testing the model’s corollaries, is in line with its core tenets. For example, recent electrophysiological work demonstrated enhanced motor activation during action observation (e.g., Marshall, Young, & Meltzoff, 2011; Nyström et al., 2011). Moreover, the fact that motor resonance is already effector specific in 14-month-old infants (Saby et al., 2013) supports the model’s central claims. Additionally, the model relates to findings that children’s social learning benefits more from full action demonstrations than from merely demonstrating the final effect (in “ghost conditions” in which the task is operated without sight of an agent performing it; Hopper, Flynn, Wood, & Whiten, 2010).

Furthermore, it can explain the developmental timeline of the emergence of imitative behaviors. Recently, Jones (2007) presented a rich set of data in which she assessed the imitative behavior of 162 infants 6–20 months of age. Interestingly, she presented the children with a range of different behaviors and systematically examined the proportions of infants producing the respective behaviors when they were modeled, as compared with a situation in which another action was modeled (representing thus a kind of baseline appearance probability of the respective behaviors). The results showed that imitation rarely (if at all) happened at 6 months—calling into question some of Piaget’s observations—and increased over the next few months of life. Importantly, the age of appearance was different for the eight different actions modeled. Given the heterogeneous pattern of results, Jones (2007) concluded that “the origins of imitation . . . and the nature of imitation . . . are almost entirely unknown, and waiting to be described and explained” (p. 598). Let us consider whether the present model is able to explain—at least a large part of—the pattern of results.

One striking aspect of Jones’s (2007) findings is that actions with salient action effects are imitated much earlier than actions without clear effects. For example, imitation of hand clapping emerges by 8 months and is reliable at 10 months, whereas moving the hands to the head is not imitated before 16 months. For clapping, the model hypothesizes that infants showed motor resonance during action perception because they could refer the observed action to their own motor repertoire, given that their own ability for bimanual coordination develops between 8 and 10 months (Fagard & Jacquet, 1989; Fagard & Peze, 1997). Given that this action had a clear auditory effect, which could be perceived by the infant, the activated motor code could be linked to an activated effect code (i.e., representing the sound), acquiring thus a novel bidirectional action–effect association. This action–effect association subsequently subserved imitation (i.e., when infants wanted to reproduce the clapping). This is in line with the model’s central claim that salient action effects guide the subsequent imitation of observed behavior. Yet moving the hand to the head was not reliably imitated before 16 months. Here, the model suggests either that the children could not refer the action to their own motor repertoire (partly because one hardly ever sees oneself performing this kind of action and, thus, has no visual representation of it) or that the effect of the action was less attractive for the younger infants. Surely, touching the head with the hand produces a proprioceptive sensation; yet this is the case only for the actor (who perceives his own touch), not the observing infant. The only perceivable effect for the infant is the visual displacement of the hand.

Furthermore, the finding of a striking developmental difference in the imitation of the vowel sounds >ahh< (reliably detected by 8 months) and >eh< (reliably detected by 12 months) has been puzzling, given that both sounds produce clear effects. This difference is even more striking given that other studies have reported an even earlier onset of the imitation of >ahh< sounds of around 5 months (Kuhl & Meltzoff, 1996). Here, it has to be noted that one of the classical theories of phonological development, Roman Jakobson’s (1941) theory of phonological universals (for a current review, see Fikkert, 2000), predicts such a developmental difference in vowel production. He argued that phonological development follows a series of contrasts according to which, on the vowel side, >ah< is acquired first, since the maximal open vowel >ah< is the maximal contrast for the labial stop consonant >p<, which is quickly followed by >m<. Then the vowel >ih< is acquired as a contrast between maximally low and high vowels. Only thereafter is the vowel >eh< acquired. In sum, >ah< and >eh< differ with respect to their ease of production. Given that the present model of imitation in infancy predicts that ease of motor production determines the likelihood of imitation, it predicts a developmental difference in the imitation of >ah< and >eh< sounds. The data of Jones (2007) thus provide empirical support for the present theory of infant imitation. In sum, although IMAIL clearly does not deny that other processes also might affect infants’ imitation rate (e.g., tutoring by parents; Jones, 2007), the previous analyses suggest that the present model does, in general, a good job of explaining the heterogeneous pattern of Jones’s (2007) results (see Footnote 1).

Finally, the model can also be extended to explain other findings on infants’ imitation, in which a complex interplay of several effectors is involved. For example, Pinkham and Jaswal (2011) presented 18-month-old infants with the already mentioned head action, which led to the interesting light effect. An observation group received only this demonstration before having the opportunity to perform the action themselves. Another group of infants (the action group) had the possibility of acting on the object themselves beforehand and discovered that they could obtain the effect by a more usual hand action. The authors reported that more infants imitated the unusual action in the observation group than in the other group. It is noteworthy that infants’ prior experience with the hand action and the effect resembles the classical acquisition phases of studies investigating the acquisition of action–effect associations through first-hand action experience (e.g., Elsner & Hommel, 2001, 2004; Kunde et al., 2002). That is, we have good reasons to assume that this additional experience led to the acquisition of an action–effect association between the action code representing the hand action and the effect code representing the light effect. The present model predicts that when infants subsequently were presented with the actor demonstrating another action that led to the same effect, the infants also would associate this action code with the effect representation. As a consequence, the intention to reproduce the effect led to the activation of two competing motor codes. Given that action representations acquired first-hand are stronger than the ones acquired through observation (cf. Maslovat, Hayes, Horn, & Hodges, 2010), infants thus performed the previously learned hand action rather than the unusual action that had subsequently been demonstrated to them.

Adult findings

Yet action–effect learning through observation does not need to be restricted to infants. If it is indeed a basic mechanism subserving imitative learning, it should also account for adults’ behavior. To directly examine this issue, Paulus, van Dam, Hunnius, Lindemann, and Bekkering (2011) translated a design of Elsner and Hommel (2001) into an observational learning scenario. More precisely, they asked participants to observe another person who was pressing two buttons in an alternating fashion. Each buttonpress triggered a specific auditory effect. In a subsequent test phase, the same tones were presented as stimuli to which the participants had to react as quickly as possible with buttonpresses. A finding of faster responses in the test phase for stimulus–response mappings that are compatible with the action–effect mappings in the observation phase (as compared with incompatible mappings) would provide evidence for the notion that participants acquire action–effect associations via the observation of others’ actions. The results confirmed this prediction. Participants were faster to react with an action to an effect that was previously produced by this action, even though the action had been executed by someone else.

This finding can be explained only under the assumption that participants represented the other’s action–effect association as if it were their own. Evidence that adults might indeed represent others’ responses and their sensory effects as their own comes from a recent study by Pfister, Dolk, Prinz, and Kunde (2013). These authors asked participants to perform a response–effect compatibility task either alone or in a joint condition with another person. They found a compatibility effect in the joint condition, suggesting that the participants represented the other’s action–effect relations. This interpretation is further supported by empirical and theoretical work suggesting that people represent others’ tasks in shared task representations (cf. Sebanz, Bekkering, & Knoblich, 2006).

Finally, the model receives support from neuroimaging findings. Stefan et al. (2005) reported that the pure perception of an action leads to the acquisition of a kinematically specific memory trace of the observed motor action in the primary motor cortex. This suggests a role for mirroring processes in memory formation and motor learning and provides empirical evidence for the first step of the first phase in the present model. That is, it shows not only that the observation of an action leads to an activation in the observer’s motor system, but also that the observed action is stored in the motor cortex.

Conceptualizing developmental change in imitative learning

One central question, though, is how development takes place. In other words, how do developmental changes and improvements in imitative performances occur? Is it possible to conceptualize development within this model, or do we need to introduce some preexisting cognitive principles to guide development?

First, the model makes the prediction that motor development has an important impact on imitative learning. With increasing motor abilities, infants can resonate with more actions. That is, actions that have, beforehand, not been in their motor repertoire will now lead to enhanced motor resonance, which facilitates infants’ acquisition of a novel action–effect association through observation. Accordingly, infants will be able to imitate this action. As a consequence, growing motor abilities might thus be partly responsible for developmental differences in infants’ ability to imitate, as has been observed in numerous studies (e.g., Brownell, 1988; Elsner, Hauf, & Aschersleben, 2007; Fagard & Lockman, 2010).

This point can also explain why imitation has such a late developmental onset, even though its basic mechanism (action–effect binding) relies on very simple associative learning abilities. On the one hand, the acquisition of action–effect associations depends on the ability to produce an action and, therefore, on the maturation of the motor system. On the other hand, recent findings have shown a developmental difference in the acquisition of action–effect associations and their employment for action control (Verschoor, Spape, Biro, & Hommel, 2013), suggesting that using action–effect associations is more difficult than just acquiring them.

Second, sometimes we are confronted with more complex actions that themselves consist of a number of simple actions. Indeed, cognitive science has discussed the idea of motor primitives as elementary building blocks of more complex behavior (e.g., Flash & Hochner, 2005; Thoroughman & Shadmehr, 2000). Even though the single action steps may be within infants’ motor repertoire, infants may not be able to imitate the overall action sequence, since it is initially unknown to them. Here, an ideomotor approach to imitative learning would offer two explanations, dependent on whether the sequence of action steps consists of enabling or constraining actions—that is, whether each action can, in principle, be completed before the other or whether the execution of an action is a constraint for the next action to be possible (cf. Elsner et al., 2007).

In cases in which the action sequence consists of enabling action steps, the model expects that infants will most likely be imitating the action steps that lead to the interesting effect, neglecting the other action steps. Evidence comes from a number of studies. For example, Brugger, Lariviere, Mumme, and Bushnell (2007) presented 14- to 16-month-old infants with different conditions in which an actor modeled two action steps, which either constrained each other or were independent of each other. In one of the conditions (the off-object condition), the second action step led to an interesting effect, while the first action step was unnecessary and unrelated. The results showed that only 10 % imitated this unnecessary action step, while over 60 % imitated the second action step. In another study, Hauf, Elsner, and Aschersleben (2004) presented 12- and 18-month-old infants with a three-step action sequence in which the second action, the third action, or no action led to an interesting effect (an arbitrary sound effect). The authors reported that the action that was followed by the interesting effect was imitated first and more often than the other actions, stressing the role of salient action effects in infants’ imitation.

Yet the picture gets more complicated in cases in which the action sequence consists of action steps that constrain each other. In these situations, it has been found that infants are more likely to just imitate the very first action step of such a sequence (e.g., Barr, Dowden, & Hayne, 1996; Brugger et al., 2007; Elsner et al., 2007). One explanation would be that infants resonated with this first action step (i.e., showed motor resonance) and—due to processing limitations or an inability to segment this first step from the following action steps—related the action to the final effect, leading infants to reproduce only the first step when aiming to achieve the effect. Yet this is typically less the case for older children, who tend to reproduce more and more of the first action steps in a row (e.g., Elsner et al., 2007).

How could development in such a case proceed? One assumption would be that young children need to develop a hierarchy of effects. That is, they acquire the specific action–effect associations related to each single action step. Coding the different action steps in terms of their particular effects allows the observer to construe a hierarchy of these effects (which effect has to be reached first to be able to get to the next effect until the final effect is realized), which then guides reproduction of the observed behavior by achieving step-by-step the whole action sequence. Alternatively, developmental changes could be due to children’s growing motor abilities. Infants might learn an action sequence through their own first-hand action experience (e.g., relating reaching, grasping, and transporting to an overall action sequence or scheme). Once they have acquired this action sequence, they will be able to relate this sequence to an effect without the need to separately consider the single action steps the sequence consists of. Improvements in action control might thus subserve infants’ improvement in imitative performance.

Third, the model is reconcilable with approaches that stress the important role of scaffolding processes in imitative learning. The important insight derived by scaffolding approaches is the understanding that infants are not just observers of a social world but participate in continuous exchange with others (Nelson, 2007; Rogoff, 2003). The social world reacts to the infants and recognizes their helplessness. Unlike many other species (see Byrne & Russon, 1998), human caregivers adapt their actions to the infants’ current capabilities and help them to learn. Highly relevant for theories of infant imitation are findings that caregivers “tune in” to infants’ motor abilities. In particular, Brand and colleagues have provided evidence for a behavioral tendency called “motionese” (Brand, Baldwin, & Ashburn, 2002; Brand & Shallcross, 2008; Brand, Shallcross, Sabatos, & Massie, 2007; Nagai & Rohlfing, 2007). Here, it was found that caregivers modify their actions when they present them to their infants. In particular, they tend to repeat important action parts, they adapt the motor characteristics, and they simplify the actions. Such facilitated action demonstrations are central for imitative learning, since they facilitate the mapping of the perceived action onto infants’ own motor repertoire. As a consequence, the acquisition of a novel action–effect association through observation, which subserves later imitation, is facilitated. Further empirical evidence for this claim was recently provided by Williamson and Brand (2013), who showed that child-directed action demonstrations promoted imitation in 2-year-old children. Taken together, this line of research suggests that infants do not have to solve the difficulties that are related to imitation alone but can rely on significant others who, albeit not on a theoretical level, have implicitly recognized that their infant benefits most when an action is demonstrated in a way that leads to higher motor resonance.

Fourth, although the present model emphasizes the role of sensorimotor processes at the basis of imitative learning, it acknowledges that cognitive and conceptual development, most likely supported by language acquisition, will support the developing child in overcoming some limitations of sensorimotor learning (for general discussions on the impact of language on social understanding, see Moore, 2006; Nelson, 2007). Embedding observed behavior into a semantic system will open another route for imitative learning (Tessari & Rumiati, 2004). In particular, growing knowledge about the body and the relations of different body parts to each other might play an important role in mapping observed actions onto one’s own motor repertoire. For example, research by Brownell, Moore, and colleagues has indicated that only late in the second year of life do children become aware of objective characteristics of their own bodies (Brownell, Zerwas, & Ramani, 2007; Moore, Mealiea, Garon, & Povinelli, 2007). Up to their third year of life, children have difficulty explicitly representing their own body topography (i.e., its shape, structure, and size; Brownell, Nichols, Svetlova, Zerwas, & Ramani, 2010). We can assume that once children become able to represent and reason about their body structure, their ability to imitate others’ actions will no longer rely solely on automatic motor activation through action observation. Rather, children will also become able to relate another person’s action to their own motor repertoire by explicitly reasoning about which of their own behaviors corresponds to the observed action. The employment of the semantic system may help children to overcome some limitations of pure sensorimotor learning, such as when relating others’ actions to a body part that the child normally cannot see on his or her own body. Moreover, language can play a role in the acquisition of goal representations (Kray et al., 2006).

One apparent challenge to the present model is whether it is able to capture the imitation of mimed movements and meaningless gestures (i.e., movements without salient effects). Here, two answers are possible. On the one hand, one could argue that the model primarily captures imitation in infancy. Research has shown that it is only toward the end of infancy that children start to also imitate actions without salient effects (e.g., Jones, 2007). At this age, developmental theorists have assumed the onset of more complex representational abilities including propositional language (e.g., Moore, 2006), which add further competencies to the basic sensorimotor nature of infant learning and might partly transform imitation. On the other hand, it is not actually clear that these seemingly meaningless actions lack effects. Often these kinds of activities are embedded into social routines in which the salient effects are provided by the interaction partners (Nelson, 2007). That is, the parent performs the mimed movements and produces an interesting sound or facial expression (e.g., smiling). Here, the mimed movement could be related to the parent’s smiling, and in an attempt to reproduce the parent’s smile, the child will imitate the mimed movement. For example, performing the (initially meaningless) bye-bye gesture is accompanied by salient auditory effects (the parent’s verbalization of “bye-bye” in an infant-attuned manner of speaking) and visual effects (the parent’s smile), which could lead infants to eventually imitate this gesture (and thus give the gesture its meaning). Importantly, a recent study with adults provided empirical evidence that socially provided effects (here, face movements) can act as salient effects and can result in action–effect binding (Sato & Itakura, 2013).

Predictions

So far, the review has shown that the model has explanatory value, since it is able to incorporate a number of previous findings on infant imitation and social learning; additionally, studies directly examining predictions of the model have provided empirical evidence in favor of the model. Are there any further predictions of the model that can be empirically tested in the future? Does it generate any novel research ideas? In the following paragraphs, we would like to give three examples of predictions that could guide future research in this area.

First, the model proposes that no action–effect association can be learned if the action is presented in a way that does not match infants’ own way of performing the action (i.e., infants’ motor repertoire). Consider, for example, Paulus et al.’s (2013b) finding that 9-month-old infants show motor activation when perceiving rattle sounds that were previously produced by someone else’s rattling action. The present model hypothesizes that infants should not be able to acquire an action–effect association through observation (and to show subsequent motor activation for the perception of the rattle sound) were the demonstrator to shake the rattle in a manner that the infants could not (easily) relate to their own motor repertoire. That is, even though, on a conceptual level, the rattling action might be the same in both cases, the present model would predict differences in subsequent motor resonance for the rattle sound. Note that accounts that assume that infants represent others’ and their own actions on an intermodal or conceptual level (e.g., Meltzoff & Moore, 1989) would not predict such a difference, since the actions are essentially the same.

Second, the model predicts a crucial role for salient effects in infants’ imitative learning, but it is not restricted to physical effects. Given recent evidence that salient social effects also can result in action–effect binding (Sato & Itakura, 2013), the model predicts that infants would also acquire novel behaviors through observation when these behaviors lead to effects in the social environment (e.g., someone smiling, laughing, or making a funny face). This prediction is highly relevant given that it has been argued (e.g., Uzgiris, 1981) that early imitation can serve two functions: a cognitive function (i.e., cultural learning) and a social function (i.e., affiliation). This social side of imitation has recently led to great discussion and to considerations of whether the same or different mechanisms might underlie the two functions of imitation (e.g., Over & Carpenter, 2013). The present theory contributes to this debate by suggesting that, even though imitation might have a number of different beneficial consequences (including cultural learning and affiliation), the cognitive mechanisms subserving imitation in these instances could be essentially the same. A corollary of this view is that even though imitation can have various social consequences, these consequences can be nonintended by-products of imitation that is based on motor resonance and salient effects.

Third, the model provides an answer to the theoretical question of how novel knowledge relates to already acquired knowledge. Does one replace the other (as, for example, is assumed in theories on conceptual development; e.g., Carey, 1991)? The present model predicts unique interaction effects between already established action–effect associations and newly acquired ones (through observational learning). As was suggested above, the model predicts that such a situation would lead to an activation pattern of two (or more) competing effectors, which might best be described in terms of a dynamic neural field model for action execution (cf. Thelen & Smith, 1994). Such a dynamic field model could describe the interplay between already established action knowledge and action knowledge acquired through observation in terms of different action–effect associations, which differ in their strength.
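To illustrate the kind of competition this prediction refers to, the following minimal sketch simulates two effector nodes whose input reflects the strength of their respective action–effect associations. It is an illustration under stated assumptions and not an implementation of the present model or of any published dynamic field model: the continuous field is reduced to discrete nodes, and all function names, parameter names, and parameter values (time constant, resting level, self-excitation, mutual inhibition) are hypothetical and chosen only for demonstration.

```python
import math


def sigmoid(u, beta=4.0):
    """Soft threshold: maps activation to an output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-beta * u))


def simulate_competition(strengths, steps=2000, dt=0.01, tau=0.1,
                         resting=-1.0, self_excite=1.5, inhibit=2.0):
    """Euler-integrate mutually inhibiting effector nodes.

    strengths: one input value per effector, standing in for the strength
    of the action-effect association tied to that effector (hypothetical
    units). Returns the final activation of each node.
    """
    u = [resting] * len(strengths)
    for _ in range(steps):
        f = [sigmoid(x) for x in u]
        total = sum(f)
        new_u = []
        for i, x in enumerate(u):
            # Each node excites itself and is inhibited by all other nodes.
            lateral = self_excite * f[i] - inhibit * (total - f[i])
            du = (-x + resting + strengths[i] + lateral) / tau
            new_u.append(x + dt * du)
        u = new_u
    return u


def winning_effector(strengths):
    """Index of the effector with the highest final activation."""
    u = simulate_competition(strengths)
    return max(range(len(u)), key=lambda i: u[i])


if __name__ == "__main__":
    # Effector 0: established association; effector 1: weaker, newly observed one.
    print(simulate_competition([2.0, 1.5]))
    print(winning_effector([2.0, 1.5]))
```

In this sketch, both nodes are initially activated by the anticipated effect, but mutual inhibition lets the node with the stronger association suppress the other. Graded differences in association strength thus translate into the selection of one effector, while the weaker association remains represented in the input rather than being replaced.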

Conclusion

The question of how infants are able to imitate was also a major issue of discussion in the early phases of developmental psychology. The present model relates to a debate between Guillaume and Piaget (Guillaume, 1925; Piaget, 1962). It suggests that learned associations between actions and effects might play an important role and is thus in line with the considerations initially put forward by Guillaume. It also provides a cognitive framework for the process model put forward by Bandura (1977). In IMAIL, Bandura’s memorizing phase refers to the activation of motor codes and sensory (effect) codes through action perception and the associations between both types of codes. The acquired action–effect association is assumed to subsequently guide infants’ imitation. The notion that imitation is guided by the desired effect relates to Bandura’s (1977) notion that imitation is guided by reward and other motivational processes. Infants’ inclination to reproduce the effect leads to the activation of the associated motor program, which in turn leads to the execution of the action. This ideomotor process represents the cognitive mechanisms underlying the motor reproduction phase in Bandura’s (1977) model. Yet it should be noted that the present model deviates in one core element from Bandura’s (1977) theory. Bandura’s (1977) social learning theory—following the behaviorist tradition—conceptualized the actor’s motivation to reproduce an action as based on (observed) reinforcement and punishment. In contrast, the present model proposes that imitation is based on and motivated by the representation of the action’s associated effect; that is, it conceptualizes imitation as intentional or goal-directed behavior.

The present model of imitative learning has a number of advantages. It presents a cognitive account of imitative learning in infancy that is based on simple associative learning mechanisms. That is, it is parsimonious with respect to theoretical assumptions about the mechanisms underlying infant imitation. By focusing on infants’ ability to acquire bidirectional action–effect associations, an ability that has been documented in the literature (e.g., Paulus et al., 2012; Verschoor et al., 2010), it does not capitalize on further sophisticated cognitive abilities. By stressing the perceptuo-motor nature of imitative learning, it relates to sensorimotor approaches to early social cognition (e.g., Uithol & Paulus, 2013) and adds to a long-standing sensorimotor research tradition on infant imitation (e.g., Kaye & Marcus, 1981).

In conclusion, although researchers widely agree on the importance of imitative learning for human development (e.g., Richerson & Boyd, 2005), the cognitive mechanisms underlying infants’ ability to acquire novel action knowledge through the observation of others’ actions have remained unclear. The present contribution develops a cognitive model of imitative learning in infancy (and beyond), which aims at specifying the neurocognition of imitative learning. Extending the ideomotor approach (e.g., Hommel et al., 2001; James, 1890/1981) to the realm of social learning, it suggests that imitative learning is based on the acquisition of novel action–effect associations through action perception and infants’ subsequent employment of these associations in guiding their action production.