In the original version of the Simon task (Simon & Rudell, 1967), single participants have to choose between a left- and a right-hand key press according to a non-spatial attribute of a stimulus presented either on the left or on the right of a fixation point. The spatial correspondence between stimuli and responses is a main determinant of the participants' performance. Performance expressed both in terms of error rate and mean reaction time (RT) is better when the required response corresponds spatially to the irrelevant stimulus location (ipsilateral association) than when it does not correspond (contralateral association). This effect is termed the “Simon effect” (Hommel, 1993, 2011; Simon, 1990). It has been proposed that the critical variable for the Simon effect is the correspondence between the perceptual representations of the stimulus and response events that the participant codes “left” and “right” relative to a reference frame (see Hommel, 1993, 2011). On this view, choosing between spatially coded responses is a prerequisite for the Simon effect to emerge: without the need to discriminate between these response representations, ipsilateral and contralateral associations would be equivalent (Dolk et al., 2014). In other words, the key factor for the Simon effect would be the number of concurrently active response representations. In what follows, we shall refer to this concept as “referential coding” (Dolk et al., 2014).

While the Simon effect is typically obtained in single participants performing a choice RT task, Sebanz et al. (2003) showed that an analogous (although smaller) effect can emerge in a joint Simon task, in which the task is shared between two co-actors so that each of them performs a Go/No-Go task. This effect was initially called the “Social Simon effect” because it was first obtained in a social context, in which participants performed the Go/No-Go task together with a co-actor, each individual operating one of the two response keys (Guagnano et al., 2010; Sebanz et al., 2003, 2005). During the past ten years, the Simon task paradigm has been intensively used to investigate how cognitive processing could be influenced by the presence of others (for a review, see Dolk et al., 2014). Results consistently showed that when two participants, seated side by side, share a task, ipsilateral associations lead to shorter RTs than contralateral associations, an effect of about 10 ms eventually termed the “joint Simon effect” (JSE). Because this spatial correspondence effect seems to disappear under most circumstances when participants perform Go/No-Go tasks in isolation (Dolk et al., 2014; Hommel, 1993), it has been proposed (Sebanz et al., 2006) that the presence of a co-actor motivates the participants to code their unique response in terms of left or right relative to the other’s response. Such interpretations assume that when co-acting, the actions of the other person are integrated into one’s own body schema and participants behave as if they were choosing between alternative responses even though such a choice is objectively unnecessary. This co-representation phenomenon would thereby induce a JSE while the participants objectively perform a Go/No-Go task.

Importantly, Dolk et al. (2011, 2013) showed evidence of a JSE (or “cSE” in reference to Donders’ type c task) simply with visual and/or auditory irrelevant objects placed near single participants. This finding demonstrates that the presence of a co-actor per se is unnecessary to induce a JSE, and suggests that attention-attracting events are responsible for the emergence of a cSE, interpreted in terms of referential coding (Dolk et al., 2011, 2013). Accordingly, the irrelevant object is represented as if it constituted a response alternative to the required response. In the course of a trial, the participant would thus choose between the response and its putative alternative representing the irrelevant object according to their relative lateral locations.

While referential coding provides a suitable interpretation of both the Simon effect (SE) and the cSE (whatever the social or non-social nature of the attractive event), alternative models do not assume that the associations between the stimulus and response events are mediated by representations. Instead, these associations could be more or less direct (or strong), from an associationist stance (De Jong et al., 1994; Kornblum et al., 1990). On this view, what matters for the SE is the strength of individual stimulus-response associations: ipsilateral associations are stronger and thus lead to shorter RTs than contralateral ones. Accordingly, the SE can emerge in Go/No-Go tasks even when no salient event is present in the participants' peripersonal space, simply because ipsilateral associations are stronger than contralateral ones. Henceforth, we shall refer to this notion as “direct activation” (Kornblum et al., 1990).

Crucially, referential coding and direct activation make different predictions with respect to the emergence of a cSE when performing the task in isolation. Direct activation leads one to expect that ipsilateral stimulus-response associations are stronger and faster than contralateral ones. Thus, a cSE similar to that initially reported by Callan et al. (1974) should emerge. Note that the direct activation notion (Shiu & Kornblum, 1999) is silent with respect to the presence of an additional response button in the participant’s peripersonal space. In contrast, according to the referential coding account, the cSE is conditioned by the presence of visual and/or auditory action events sufficiently attractive to change the weight of the spatial location codes (Dolk et al., 2011, 2013; Stenzel & Liepelt, 2016). Assuming that attention-attracting events are responsible for the cSE (Dolk et al., 2013), no cSE is expected in the absence of such events.

However, a close look at the current literature suggests that it might be premature to accept the null hypothesis regarding the emergence of a cSE in Go/No-Go tasks performed in isolation. While retrospective analysis of the literature should be taken with caution because of publication bias, a review of previous studies suggests that it might be reasonable to question the absence of a cSE (Table 1). If a cSE emerges in the absence of attracting events, it is very small (a few milliseconds) compared to that obtained in two-choice RT tasks (25–30 ms). Such a small amplitude may give a clue regarding the reasons for which it has often been considered to be absent. Of the 28 experiments reported in Table 1, 24 report ipsilateral stimulus-response associations faster than contralateral ones and only four report a reverse effect. Note that the statistical significance of the effect is often unspecified, and when the statistics are available it has sometimes been reported to be marginally significant (e.g., Tsai et al., 2006), suggesting a lack of statistical power. The present study was conducted to assess whether or not a cSE is present when performing a Go/No-Go Simon task in isolation. Specific attention was paid to increasing the statistical power by substantially enlarging both the number of participants (48) and the number of trials completed by each participant (960 trials), almost twice the number of trials compared with past studies. This further allowed us to perform RT distribution analyses in order to reveal the temporal dynamics of information processing in this task.

Table 1 Overview of studies reporting a Simon effect during visual and auditory Go/No-Go tasks performed in isolation

In addition, we sought to explore the influence of irrelevant objects placed nearby single participants on the cSE. So far, as stated above, this interaction has been interpreted in terms of referential coding, an interpretation that derives from the theory of event coding (TEC; Hommel et al., 2001, 2009). In TEC, feature overlap between event representations creates the conditions for the SE (and other compatibility effects). Reasoning within this frame, we expected the lateral representations of the response and of irrelevant objects to be facilitated when the irrelevant objects share features with what would be an actual alternative response. To document this issue, in one condition of the experiment we introduced a supplementary inactive response key laterally positioned next to the key on which the Go response was to be produced.

Method

Participants

Forty-eight students volunteered (36 females; age range 18–31 years, M = 21.39 years, SD = 2.86). They were paid (5€) for taking part. Participants were all right-handed, and had normal or corrected-to-normal vision. Informed written consent was obtained according to the Declaration of Helsinki.

Design and cognitive tasks

A single participant was centrally positioned in front of a device and was required to respond with one effector (left or right hand) to one of two non-spatial stimulus attributes (green or red color). Half the participants performed the Go/No-Go task seated in front of a table equipped with two lateralized response keys sharing the same spatial configuration as the stimuli. Although static, the presence of an additional irrelevant response key can be expected to induce a lateral representation of the to-be-given response. The other half were seated in front of a table equipped with one response key lateralized on the same side as the responding hand (Reeve & Proctor, 1988). Participants sat on a chair facing a black panel 1.5 m away. Two green/red light-emitting diodes (LEDs), separated by 18 cm, were positioned on either side of a central blue gaze-fixation LED. The response keys were 10-cm plastic tubes equipped with a button on the top and fixed on a table. In the two-button condition, the response keys were right and left lateralized and separated by 20 cm. Participants maintained one hand (active hand) on one response key, the other hand lying on the ipsilateral leg (passive hand). The active hand (left or right) was counterbalanced across participants. In the one-button condition, the set-up was exactly the same except that only the relevant response key was present.

Participants were asked to respond as quickly and accurately as possible. The light could be green or red and could be delivered either to the left or to the right side. The response was given according to the color of the LED (the task-relevant attribute) whatever the location of the LED (the task-irrelevant attribute). Half the participants had to press the key with the thumb when the LED was red and the other half had to press it when the LED was green. There were two types of trials in each block: ipsilateral trials (50%) and contralateral trials (50%). In ipsilateral trials (IPS), the lateral locations of the stimulus and response were on the same side (e.g., left stimulus/left response). In contrast, in contralateral trials (CNT), the lateral locations of the stimulus and response were on opposite sides (e.g., left stimulus/right response).
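The trial typology described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Side`, `correspondence`, and `is_go_trial` are hypothetical and are not taken from the experimental software.

```python
from enum import Enum


class Side(Enum):
    """Lateral location of a stimulus LED or of the active response key."""
    LEFT = "left"
    RIGHT = "right"


def correspondence(stimulus_side: Side, response_side: Side) -> str:
    """Return 'IPS' when stimulus and response key share a side, else 'CNT'."""
    return "IPS" if stimulus_side is response_side else "CNT"


def is_go_trial(stimulus_color: str, go_color: str) -> bool:
    """A trial is a Go trial when the LED shows the participant's target color.

    Stimulus location plays no role here: it is the task-irrelevant attribute.
    """
    return stimulus_color == go_color


# Hypothetical participant: right hand active, responds to red.
label = correspondence(Side.LEFT, Side.RIGHT)  # left stimulus, right key
go = is_go_trial("red", go_color="red")
```

For this hypothetical participant, a red LED on the left is a contralateral Go trial, whereas a green LED on the right is an ipsilateral No-Go trial.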

A trial began with the switching on of the blue central LED for 300 ms. Then the stimulus was displayed for 200 ms. Regardless of the correctness, the delivery of a response turned off the stimulus and the next trial began after a constant 1,500-ms inter-stimulus interval (ISI). If 1 s elapsed without a response, the LED extinguished and the next trial began after the ISI. The design of the experiment was optimized to allow the completion of a large number of trials per participant (ten blocks of 96 trials each). There was a brief break between each block. Participants performed one training block just before the experimental blocks.

Data analysis

Reaction times shorter than 100 ms and longer than 1,000 ms, respectively considered as anticipated responses and omissions, were excluded from further analyses (22 trials). We analysed 23,221 trials (mean = 484 trials per subject, SD = 5.5; mean RT = 338 ms, SD = 43). Responses given in No-Go trials were classified as errors. The errors committed on the same side as the No-Go signal were considered as ipsilateral errors, whereas errors committed on the opposite side to the No-Go signal were considered as contralateral errors (mean accuracy = 98.9%, SD = 1.13%). Errors are, on average, significantly faster (mean RT = 281 ms, SD = 91.7) than correct trials (mean RT = 338 ms, SD = 75.7). This is not surprising, as errors are likely to be impulsive responses.
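The exclusion and scoring rules above can be illustrated with a minimal Python sketch. The trial records are made-up values for illustration only, and the function names are hypothetical; the sketch assumes that RTs exactly equal to 100 ms or 1,000 ms are retained.

```python
RT_MIN_MS = 100.0   # faster responses are treated as anticipations
RT_MAX_MS = 1000.0  # slower responses are treated as omissions


def keep_trial(rt_ms: float) -> bool:
    """Keep a response only if its RT lies within the accepted window."""
    return RT_MIN_MS <= rt_ms <= RT_MAX_MS


def label_response(is_go: bool) -> str:
    """Any response given on a No-Go trial counts as an error."""
    return "correct" if is_go else "error"


# Made-up responses (not data from the study).
trials = [
    {"rt_ms": 85.0, "is_go": True},     # anticipation, excluded
    {"rt_ms": 340.0, "is_go": True},    # correct Go response
    {"rt_ms": 281.0, "is_go": False},   # response on a No-Go trial
    {"rt_ms": 1200.0, "is_go": True},   # too slow, excluded
]

analysed = [
    {**t, "outcome": label_response(t["is_go"])}
    for t in trials
    if keep_trial(t["rt_ms"])
]
```

Only the second and third records survive the RT window; the third is scored as an error because it is a response to a No-Go signal.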

Reaction times were analysed with a Bayesian hierarchical linear model using the brms package (Bürkner, 2016) in R (version 4.2, R Core Team, 2017). To approach a Gaussian distribution, we considered log-transformed RT as the dependent variable. We were primarily interested in the effect of the spatial correspondence between the stimulus location and the response key, as well as the number of response keys (Condition). Furthermore, given the differences between correct trials and errors in terms of RT and cognitive processes, response accuracy should be taken into account. We thus included the Intercept, Accuracy, Spatial correspondence (IPS vs. CNT), Condition (one vs. two response keys), and the Spatial correspondence × Condition interaction as fixed effects, and the Intercept, Accuracy, and Spatial correspondence as random effects at the subject level. Predictors were coded using sum contrasts. Thus, "Accuracy" takes a value of 1 if the trial is correct, and -1 otherwise; "Spatial correspondence" takes a value of 1 for CNT trials and -1 otherwise; "Condition" takes a value of 1 if there is one response key, and -1 otherwise. We used weakly informative priors (Normal(0, 10)) for regression parameters, and performed 10,000 iterations (burn-in period: 1,000).
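For readers who prefer an explicit formulation, the model described above can be written out as follows. This is a sketch under assumptions rather than the exact specification that was fitted: the symbols are introduced here for illustration (i indexes trials, j indexes subjects, A, S, and C are the ±1-coded Accuracy, Spatial correspondence, and Condition predictors), the second argument of each Normal is a standard deviation, the likelihood is assumed Gaussian on the log scale, and the subject-level effects are assumed to be jointly normal with a full covariance matrix. Priors on the residual standard deviation and on the covariance matrix are not stated in the text and are therefore left out.

```latex
\begin{align*}
\log(\mathrm{RT}_{ij}) &\sim \mathrm{Normal}(\mu_{ij}, \sigma) \\
\mu_{ij} &= (\beta_0 + u_{0j}) + (\beta_A + u_{Aj})\,A_{ij} + (\beta_S + u_{Sj})\,S_{ij}
            + \beta_C\,C_j + \beta_{S \times C}\,S_{ij}\,C_j \\
(u_{0j},\, u_{Aj},\, u_{Sj})^{\top} &\sim \mathrm{MVNormal}(\mathbf{0}, \boldsymbol{\Sigma}) \\
\beta_A,\ \beta_S,\ \beta_C,\ \beta_{S \times C} &\sim \mathrm{Normal}(0, 10)
\end{align*}
```

With this coding, the difference in log RT between CNT and IPS trials, averaged over conditions, is 2β_S, because the Spatial correspondence predictor moves from -1 to +1.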

Results

Estimations of the fixed effects parameters for the mixed effect model are reported in Table 2. Gelman-Rubin convergence statistics (\( \widehat{R} \); Gelman & Rubin, 1992) and visual inspection of traces show that the model converged. There is no evidence that RT was affected by the Condition (β = -1.47 × 10⁻², 95% credible interval = [-5.01 × 10⁻², 2.12 × 10⁻²]) or the Condition × Spatial correspondence interaction (β = -4.47 × 10⁻⁶, 95% credible interval = [-3.97 × 10⁻³, 3.91 × 10⁻³]). On the other hand, there is evidence that RT depends on the Spatial correspondence, with a mean value of the posterior parameter (Fig. 1) corresponding to responses 3 ms faster for IPS trials than CNT ones (β = 4.53 × 10⁻³, 95% credible interval = [5.32 × 10⁻⁴, 8.53 × 10⁻³]). Moreover, there is evidence that this effect significantly varies across subjects (standard deviation = 1.15 × 10⁻², 95% credible interval = [3.63 × 10⁻², 1.54 × 10⁻²]).
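The correspondence between the Spatial correspondence coefficient on the log scale and an effect of about 3 ms can be checked with a short calculation. The sketch below is an approximation under assumptions: it takes the overall mean RT of 338 ms reported in the Data analysis section as the reference level at which the ±1-coded predictor equals zero, and the function name `contrast_to_ms` is hypothetical.

```python
import math


def contrast_to_ms(beta: float, reference_rt_ms: float) -> float:
    """Approximate CNT minus IPS difference in ms for a +1/-1 coded predictor.

    On the log scale, CNT trials sit at +beta and IPS trials at -beta around
    the reference, so the difference is ref * (exp(beta) - exp(-beta)).
    """
    return reference_rt_ms * (math.exp(beta) - math.exp(-beta))


MEAN_RT_MS = 338.0  # assumption: overall mean RT used as the reference

effect_ms = contrast_to_ms(4.53e-3, MEAN_RT_MS)   # posterior mean
lower_ms = contrast_to_ms(5.32e-4, MEAN_RT_MS)    # lower credible bound
upper_ms = contrast_to_ms(8.53e-3, MEAN_RT_MS)    # upper credible bound
```

Under this approximation the posterior mean maps to roughly 3 ms, in line with the value reported above, and the credible interval maps to a range from well under 1 ms to just under 6 ms.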

Table 2 Posterior parameters for the hierarchical regression predicting log-transformed RT from Accuracy, Spatial correspondence, and Condition

Fig. 1 Distribution of posterior parameters for fixed effects. Shaded areas represent 95% credible intervals, and dashed lines represent mean values

Discussion

The present study was conducted to determine whether or not a cSE (in reference to Donders’ type c task) can emerge in Go/No-Go tasks performed in isolation. To this aim, a single participant was centrally positioned in front of a device and was required to respond by a hand key-press to one of the two possible colors of a visual stimulus. Half the participants were seated in front of a table equipped with one response key and the other half in front of a table equipped with two response keys (one active and the other one unused). Using a substantial number of subjects and trials, the present study revealed a tiny cSE in a Go/No-Go task performed in isolation, thus replicating the original findings of Callan et al. (1974). The lack of statistical power of previous studies could be the major reason why it has so often been considered to be nil.

In the one-response-key condition, there was no incentive for lateral response coding. The emergence of a difference in performance between IPS and CNT associations can thus hardly be accounted for in terms of referential coding, according to which the cSE is linked to the lateral representation of the response relative to another concurrent event. In contrast, the present results are compatible with the direct activation notion, which predicts that IPS associations are stronger, and thus faster, than CNT ones (Kornblum et al., 1990). In addition, our data failed to find evidence of an effect of a second response key on the cSE. Note that Dolk et al. (2011, 2013, Experiment 3) conjectured that inanimate objects fail to affect the cSE. The dynamic properties of events (such as the waving movement of the Japanese cat or the clicking of the metronome) thus seem to be crucial for increasing the cSE.

This view is supported by electrophysiological studies. In between-hand choice RT tasks, response-locked event-related potentials (ERPs) reveal a component (N-40) that develops over the supplementary motor areas (Vidal et al., 2003). This wave precedes the activation of the primary motor cortex, which reflects the build-up of the motor command, and is closely related to response selection processes. It is notably larger for incongruent than for congruent conditions (Carbonnell et al., 2013). Crucially, the N-40 is completely absent in individually performed Go/No-Go tasks (Vidal et al., 2011), indicating that the only decision performed in those tasks is perceptual in nature. Such psychophysiological results suggest that the locus of the cSE is stimulus discrimination rather than response selection. We suggest that it is linked to an increased attentional focus on the side ipsilateral to the response button.

Methodologically speaking, it must be noted that as opposed to Go/No-Go tasks, simple reaction time tasks are not optimally suited for testing the alternative conceptions. This is because in those tasks, there is no uncertainty regarding the emission of the response and the only problem faced by the participant is how to synchronize his or her response with the presentation of the imperative stimulus. As a consequence, response processes are not necessarily contingent upon stimulus identification; for this reason, the proportion of correct responses results from efficient time estimation, a process unrelated to the processing of the information conveyed by the imperative stimulus. Such responses can be expected to be affected neither by response representations nor by the strength of the stimulus-response association.

Importantly, it should be noted that the present results are relevant for the understanding of the JSE. Currently, in most articles regarding joint action situations, the logic of the argument is based on the premise that the cSE disappears when participants perform an individual Go/No-Go task and reappears when the task is performed alongside another participant or non-social attention-attracting events (e.g., Karlinsky, Lam, Chua, & Hodges, 2017; Puffe, Dittrich, & Klauer, 2017; Saunders, Melcher, & van Zoest, 2017; Stenzel & Liepelt, 2016). The present data speak against this notion since a numerically small but statistically reliable cSE emerged in an individual Go/No-Go task performed in isolation. This suggests that lateral response representation is unnecessary for the cSE to emerge. Instead, ipsilateral stimulus-response associations could lead to shorter RTs than contralateral ones because they are stronger, from an associationist stance. In this context, it must be acknowledged that the effect of spatial correspondence is much larger in choice RT than in Go/No-Go tasks. We conjecture that this may be because performance of choice tasks relies on processes such as response selection that are absent in Go/No-Go tasks and the duration of which depends on the strength of the association to be performed.

The interpretation of the cSE in terms of strength of individual stimulus-response associations opens a new perspective on the interpretation of the JSE found in co-action settings. We conjecture that the JSE could have more to do with a basic social facilitation effect (i.e., an increase in arousal) than with a co-representation phenomenon. If the presence of a conspecific promotes the delivery of the stronger association in the behavioral repertoire of the individual, this would be sufficient to induce an increased cSE. Further studies are needed to test this assumption.