Our expectations about certain attributes of a forthcoming event—such as its likely location, timing, or sensory modality—can guide attention and influence the neural and behavioral responses to that event. Effects of spatial expectations are well studied. In endogenous spatial-cueing studies, expectations about the location of an upcoming target are manipulated to have participants orient their attention to a specific location (Posner, 1980). When a target appears at the expected (attended) location, the neural responses to the target are boosted, and behavioral responses (in detection or discrimination tasks) become faster and/or more accurate than when the target appears at an ignored location (for a review, see Luck, Woodman, & Vogel, 2000). Research has shown that people can also orient attention in time based on their expectations about the timing of a future event (for reviews, see Coull, 2009; Coull & Nobre, 2008; Nobre, Correa, & Coull, 2007; Nobre & van Ede, 2018). Effects of temporal expectations have been studied using a temporal version of the spatial-cueing paradigm (temporal-cueing paradigm; Coull & Nobre, 1998). In a typical temporal-cueing task, participants are informed about the likely timing of a target—either by symbolic cues or instructions—allowing them to voluntarily attend to a specific time interval in anticipation of a target. As in spatial-cueing studies, temporal-cueing studies have demonstrated that when a target appears with the expected timing, the neural responses to the target are modulated in sensory and motor cortices and behavioral responses (detection and/or discrimination) become faster and/or more accurate than when the target appears with an unexpected timing (e.g., Correa, Lupiáñez, Milliken, & Tudela, 2004; Coull & Nobre, 2008; Lange, Krämer, & Röder, 2006; Miniussi, Wilding, Coull, & Nobre, 1999; Nobre et al., 2007). 
In addition to spatial and temporal expectations, research has shown that people can selectively attend to a specific sensory modality based on the knowledge about the likely sensory modality of a future target, as evidenced by the enhanced neural responses and faster and/or more accurate behavioral responses (discrimination) to targets presented through the expected modality compared with those presented through an unexpected modality (e.g., Foxe, Simpson, Ahlfors, & Saron, 2005; Foxe & Snyder, 2011; Spence & Driver, 1997; Spence, Nicholls, & Driver, 2001).

In addition to the abovementioned probability-driven expectation effects, stimulus-driven priming effects can also influence neural and behavioral responses to stimuli. When a given trial is preceded by the same trial type, behavioral responses are typically facilitated (faster and/or more accurate) relative to when it is preceded by a different trial type. Stimulus repetition has been shown to decrease neural responses at various cortical regions and at different stages of stimulus processing depending on the task parameters (for a review, see Grill-Spector, Henson, & Martin, 2006). Although robust priming effects have been reported for repetitions of spatial, temporal, and sensory modality attributes of stimuli (e.g., Capizzi, Correa, Wojtowicz, & Rafal, 2015; Spence et al., 2001; Steinborn, Rolke, Bratzke, & Ulrich, 2008; Woodrow, 1914), studies that focused on the orienting of attention through expectation have mostly ignored the inherent priming effects in their design. Concurrently investigating expectation and priming effects and their potential interaction is informative for understanding the various factors that influence attention orienting in different domains.

As people can orient attention to specific spatial locations, time points, and sensory modalities, is it possible to concurrently orient attention across multiple domains? A number of studies have looked at the association between spatial and temporal orienting of attention (e.g., Coull & Nobre, 1998; Doherty, Rao, Mesulam, & Nobre, 2005) by orthogonally manipulating the probabilities of the target location and target timing in the same experiment protocol using detection or discrimination tasks. They reported mostly additive behavioral effects and both distinct and overlapping neural mechanisms at various stages of information processing for simultaneously orienting attention in space and time. Here, following similar logic, we investigated the independent and/or interdependent nature of orienting attention in temporal and modality domains by orthogonally manipulating the probabilities of target timing and target modality. The knowledge of statistical regularity would generate expectations that would endogenously orient attention to the likely target timing and/or target modality, while each experience with a target stimulus would prime the perceptual system to exogenously orient attention to the same timing and/or modality for the upcoming target. We considered both expectation (endogenous) and priming (exogenous) effects to gain a more thorough understanding of the facilitative effects of statistical regularity and repetition on attentional orienting. On each trial, participants received an audiovisual cue followed by a target letter that was presented either through the auditory modality (spoken) or the visual modality (written). The task was to identify the letter (B or D) as quickly and as accurately as possible, regardless of its modality. We manipulated temporal expectation by varying the relative probability of a target stimulus presented at either a short or a long interval after the cue. 
The temporal expectation manipulation was blocked; in the short-expected block, the target stimulus was frequently presented (80% of trials) following a short interval (400 ms), and infrequently presented (20% of trials) following a long interval (1,200 ms) after the cue. In the long-expected block, the target was frequently presented (80% of trials) following the long interval (1,200 ms) and infrequently presented (20% of trials) following the short interval (400 ms) after the cue. We manipulated modality expectation across participants; half of the participants were in the auditory-expected group who received auditory targets more frequently than visual targets (80% vs. 20% of trials) throughout the experiment. The remaining participants were in the visual-expected group who received visual targets more frequently than auditory targets (80% vs. 20% of trials). Participants were made aware of the exact temporal and modality probabilities before the experiment.

We focused on correct response times (RT) as the primary dependent variable given that accuracy on this task is high (i.e., at or near ceiling). We measured the effect of temporal expectation as the speeding of RTs on the short-interval trials in the short-expected block, when a target was presented at the expected time—the expected-timing condition—relative to the short-interval trials in the long-expected block, when a target was presented at an unexpected time—the unexpected-timing condition. We focused only on the short-interval trials for analyses because the long-interval trials do not have the same expected versus unexpected distinction, because participants can reorient their attention to the long interval upon not receiving the target after the short interval in the short-expected block (Correa et al., 2004; Coull & Nobre, 1998; see the Supplementary Material for the results on the long-interval trials). We measured the effect of modality expectation as the speeding of RTs on the trials where targets were presented through the expected modality—the expected-modality condition—relative to those where targets were presented through the unexpected modality—the unexpected-modality condition. Because expected modality was manipulated as a between-participant factor, for half of the participants, auditory was the expected modality and visual was the unexpected modality—the auditory-expected group—while for the remaining participants, visual was the expected modality and auditory was the unexpected modality—the visual-expected group. As for the priming effects, we measured the effects of temporal priming as the speeding of RTs when a given trial was preceded by a same-timing trial (e.g., a short-interval trial preceded by a short-interval trial)—the repeated-timing condition—relative to when it was preceded by an opposite-timing trial (e.g., a short-interval trial preceded by a long-interval trial)—the switched-timing condition. 
Similarly, we measured the effects of modality priming as the speeding of RTs when a given trial was preceded by a same-modality trial (e.g., an auditory trial preceded by an auditory trial)—the repeated-modality condition—relative to when it was preceded by an opposite-modality trial (e.g., an auditory trial preceded by a visual trial)—the switched-modality condition.
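The condition coding described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' analysis code. The `Trial` record, the `code_conditions` function, and the label strings are hypothetical names introduced here. The sketch assumes that priming is defined relative to the immediately preceding trial in the same block, and that the first trial of a block, having no predecessor, receives no priming label.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trial:
    interval: str  # "short" (400 ms) or "long" (1,200 ms)
    modality: str  # "auditory" or "visual"


def _priming(current: str, previous: Optional[str]) -> Optional[str]:
    """Return 'repeated' or 'switched'; None when there is no preceding trial."""
    if previous is None:
        return None
    return "repeated" if current == previous else "switched"


def code_conditions(
    trials: List[Trial], expected_interval: str, expected_modality: str
) -> List[Dict[str, Optional[str]]]:
    """Label each trial of one block with its expectation and priming conditions."""
    labels: List[Dict[str, Optional[str]]] = []
    previous: Optional[Trial] = None
    for trial in trials:
        labels.append({
            # Expectation: compare the trial with the block's or group's frequent value.
            "timing_expectation": (
                "expected" if trial.interval == expected_interval else "unexpected"
            ),
            "modality_expectation": (
                "expected" if trial.modality == expected_modality else "unexpected"
            ),
            # Priming: compare the trial with the immediately preceding trial (assumption).
            "timing_priming": _priming(
                trial.interval, previous.interval if previous else None
            ),
            "modality_priming": _priming(
                trial.modality, previous.modality if previous else None
            ),
        })
        previous = trial
    return labels
```

For a participant in the auditory-expected group, a call such as `code_conditions(block_trials, "short", "auditory")` would label the short-expected block. Restricting the result to short-interval trials then yields the trials entering the main analyses.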

We expected to find a main effect of temporal expectation and a main effect of modality expectation, both of which have been previously reported (e.g., Coull & Nobre, 1998; Spence & Driver, 1997). We also expected to find a main effect of temporal priming and a main effect of modality priming, extending previous findings on stimulus priming effects (e.g., Los, 2010; Spence et al., 2001). Importantly, we were interested in potential interactions between the temporal and modality domains for expectation and/or priming effects as well as potential interactions between the expectation and priming effects for the temporal and/or modality domains. These interactions would reveal the extent to which the mechanisms that orient attention in the temporal and modality domains interact with each other.

As for the expectation effects, we might observe additive effects between temporal and modality expectation, similar to the previously reported additive effects of temporal and spatial expectation on target detection or discrimination response times (e.g., Coull & Nobre, 1998; Doherty et al., 2005). Such a result would point to relatively independent attentional orienting mechanisms in the temporal and modality domains. Alternatively, temporal and modality-expectation effects might interact in a specific way. First, temporal-expectation effects might be modality specific. If so, the temporal-expectation effect would facilitate responses to visual (or auditory) targets only when targets are expected through the visual (or auditory) modality. Second, temporal-expectation effects might be predominantly associated with either the visual or auditory modality. If temporal-expectation effects are predominantly associated with the auditory modality due to the typical dominance of the auditory modality in temporal processing (e.g., Lukas, Philipp, & Koch, 2014; Ortega, Guzman-Martinez, Grabowecky, & Suzuki, 2014), then temporal expectation should generally facilitate responses to auditory targets regardless of whether auditory or visual targets are expected, whereas temporal expectation might facilitate responses to visual targets only when visual targets are expected. In contrast, temporal-expectation effects may be predominantly associated with the visual modality as suggested by the finding that temporal expectation may prioritize visual over auditory signals in the context of audiovisual competition (Menceloglu, Grabowecky, & Suzuki, 2017) and by other studies showing that attention more strongly prioritizes visual than auditory signals (Colavita, 1974; Lukas, Philipp, & Koch, 2010; Posner, Nissen, & Klein, 1976; Sinnett, Spence, & Soto-Faraco, 2007). 
If so, temporal expectation should generally facilitate responses to visual targets regardless of whether visual or auditory targets are expected, whereas temporal expectation might facilitate responses to auditory targets only when auditory targets are expected.

As for the priming effects, simple additive effects between temporal and modality priming would provide evidence for independent mechanisms of exogenous orienting in the temporal and modality domains. Alternatively, we might observe a specific interaction between temporal-priming and modality-priming effects. For instance, the presence of priming in one domain may reduce or override the effect of priming in another domain (e.g., reduced or no effect of temporal priming in repeated-modality trials or reduced or no effect of modality priming for repeated-timing trials). This outcome would result in a ceiling effect (observed as an RT floor effect) in the optimization of sensory processing via stimulus repetition. In contrast, we may observe a supra-additive effect in which repetition in one domain enhances the priming effect in the other domain (e.g., stronger temporal-priming effects on repeated-modality trials and vice versa). This outcome would suggest that a primed sensory system becomes more conducive to further facilitation. Overall, any dependency between the temporal-priming and modality-priming effects would imply a shared mechanism for exogenously orienting attention in the temporal and modality domains.

Finally, expectation and priming effects may interact differently within the temporal domain or the modality domain. In particular, the modality-expectation and modality-priming effects may interact in a modality-specific manner given the literature on auditory versus visual differences in attention effects (e.g., Menceloglu et al., 2017; Soto-Faraco & Spence, 2002; Talsma & Kok, 2002). The findings will elucidate potential dependencies between probability-driven and stimulus-driven effects in attention orienting within the temporal and modality domains.

To summarize, by orthogonally manipulating attentional orienting in the temporal and modality domains, we aimed to answer the questions of (1) whether there are shared or distinct mechanisms of orienting attention in the temporal domain and the sensory-modality domain, and (2) whether there are dependencies between probability-driven expectation effects and stimulus-driven priming effects within the temporal domain or the sensory-modality domain. We discuss our findings in the context of the relevant literature in the Discussion section.

Method

Participants

Thirty-two Northwestern University undergraduate students were recruited to participate in the study. All had normal or corrected-to-normal vision and normal hearing. Participants gave informed consent and were treated according to the guidelines of the Institutional Review Board at Northwestern University. Participants received partial course credit for their participation, which lasted approximately 1 hour. Our sample included 32 participants (20 female) between the ages of 18 and 21 years (M = 18.63 years, SD = 0.91). Eight participants were included in each of the four counterbalancing cells—half of the auditory-expected and visual-expected group participants received the short-expected block first, while the remaining participants of each group received the long-expected block first.

Stimuli and procedure

Stimuli were presented using a 15-inch, 2.2 GHz MacBook Pro, running MATLAB (Version R2015a) with Psychtoolbox extensions (Version 3.0.12; Brainard, 1997; Kleiner, Brainard, & Pelli, 2007; Pelli, 1997). All visual stimuli were black (1.2 cd/m²) on a light-gray background (73 cd/m²) presented at the center of the display monitor.

The experimental design and sequences of trial events are illustrated in Fig. 1. Each trial began with the audiovisual cue, which served as a warning signal, consisting of a central row of three asterisks (2.6° × 0.7° visual angle, in Calibri font) presented within a black, rectangular frame (22.6° × 17° visual angle) and a concurrent sine-wave tone (440 Hz). This and all other auditory stimuli were binaurally presented over headphones (60 dB SPLA). The cue presentation lasted 200 ms and was then replaced by a central fixation dot (0.5° visual angle diameter) that lasted either 400 ms—the short interval—or 1,200 ms—the long interval. In the short-expected block, the short interval was used 80% of the time and the long interval was used 20% of the time. In the long-expected block, the probabilities were reversed. Following the short or long interval, an auditory or a visual target was presented. For the participants in the auditory-expected group, an auditory target was presented 80% of the time whereas a visual target was presented 20% of the time. For the participants in the visual-expected group, the probabilities were reversed. Half of the participants were assigned to the auditory-expected group, while the remaining participants were assigned to the visual-expected group. The auditory target consisted of a female American-English speaker pronouncing the letter B or D, whereas the visual target was a centrally presented letter B or D (0.7° × 1° visual angle, in Calibri font). Participants were instructed to report the letter as quickly and as accurately as possible regardless of target modality. Using an external number pad, participants pressed the “1” key (marked “B”) upon detecting the B target and the “2” key (marked “D”) upon detecting the D target, with the index and middle fingers, respectively, of their dominant hands. The target presentation lasted 350 ms, followed by the central fixation display. 
Participants were required to respond within the 1,650-ms response window; otherwise, the trial was classified as incorrect. The next trial began following a 1,200-ms intertrial interval during which a blank screen was shown.

Fig. 1

Experimental design and sequences of trial events. To manipulate modality expectation, half of the participants were assigned to the auditory-expected group who frequently (80%) received auditory targets (left column) and infrequently (20%) received visual targets (right column), while the remaining participants were assigned to the visual-expected group who frequently (80%) received visual targets and infrequently (20%) received auditory targets. To orthogonally manipulate temporal expectation, for each group, the cue-to-target intervals were frequently (80%) short and infrequently (20%) long in the short-expected block and frequently (80%) long and infrequently (20%) short in the long-expected block. Participants were instructed to indicate whether the target letter was “B” or “D” regardless of the modality of presentation, spoken (left column) or displayed (right column)

The viewing distance was about 65 cm, and participants were instructed to maintain central fixation throughout the experiment. The experiment was divided into two blocks—a short-expected block and a long-expected block—with a total of 620 trials. The first 10 trials in each block served as practice trials and were excluded from analysis, leaving 600 experimental trials in total. The critical short-interval trials consisted of expected-timing trials (80%, or 240 trials, from the short-expected block) and unexpected-timing trials (20%, or 60 trials, from the long-expected block), with the majority of them being repeated-timing trials (68%, or 204 trials, breaking down to 64%, or 192 trials, where expected timing was followed by expected timing, and 4%, or 12 trials, where unexpected timing was followed by unexpected timing) and the rest being switched-timing trials (32%, or 96 trials, breaking down to 16%, or 48 trials, where either expected timing was followed by unexpected timing or unexpected timing was followed by expected timing). Orthogonally, these trials were expected-modality trials (80%, or 240 trials) or unexpected-modality trials (20%, or 60 trials), with the majority of them being repeated-modality trials (68%, or 204 trials, breaking down to 64%, or 192 trials, where expected modality was followed by expected modality, and 4%, or 12 trials, where unexpected modality was followed by unexpected modality), and the rest being switched-modality trials (32%, or 96 trials, breaking down to 16%, or 48 trials, where either expected modality was followed by unexpected modality, or unexpected modality was followed by expected modality). The order of the short-expected and long-expected blocks was counterbalanced across participants. We gave participants a brief break between the blocks and two brief breaks during each block. 
Participants were informed of the probabilities of the long versus short time intervals and the auditory versus visual target modalities before the experiment started.
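The trial counts above follow from the 80%/20% probabilities, and the arithmetic can be checked with a short sketch. The function name `short_interval_counts` is hypothetical, and the sketch assumes 300 experimental trials per block (600 split evenly over two blocks). It also assumes the counts are expected values under independent sampling of consecutive trials; the text does not say whether sequences were constrained to produce these counts exactly.

```python
from fractions import Fraction


def short_interval_counts(
    trials_per_block: int = 300, p_frequent: Fraction = Fraction(4, 5)
) -> dict:
    """Expected numbers of short-interval trials per timing condition."""
    p = p_frequent   # probability of the frequent interval (80%)
    q = 1 - p        # probability of the infrequent interval (20%)

    expected = trials_per_block * p    # short trials in the short-expected block
    unexpected = trials_per_block * q  # short trials in the long-expected block

    counts = {
        "expected": expected,
        "unexpected": unexpected,
        # Short-expected block: the preceding trial is short with probability p.
        "repeated_expected": expected * p,
        "switched_expected": expected * q,
        # Long-expected block: the preceding trial is short with probability q.
        "repeated_unexpected": unexpected * q,
        "switched_unexpected": unexpected * p,
    }
    counts["repeated"] = counts["repeated_expected"] + counts["repeated_unexpected"]
    counts["switched"] = counts["switched_expected"] + counts["switched_unexpected"]
    return {name: int(value) for name, value in counts.items()}
```

With the default arguments this gives 240 expected-timing and 60 unexpected-timing trials, 204 repeated-timing trials (192 + 12), and 96 switched-timing trials (48 + 48). Because target modality follows the same 80%/20% split, the same arithmetic applied to the 300 short-interval trials gives the modality counts.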

Data acquisition and analysis

We recorded RT and accuracy (measured as proportion correct) for each trial. Only the correct responses were included in the RT analysis. For each participant, RTs that were more than 2.5 standard deviations above or below their mean RT, as well as RTs that were below 200 ms, were identified as outliers and removed. We analyzed RT and accuracy data from the short-interval trials (see the introduction section for the rationale). Data from the long-interval trials are presented in the Supplementary Material. All analyses were performed using R.
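The exclusion rule can be illustrated with a small sketch. It is written in Python for illustration only and is not the authors' R code. The function name `trim_rts` is hypothetical. The sketch assumes that the mean and the sample standard deviation are computed over all of a participant's correct RTs (rather than per condition) and before the 200-ms floor is applied; the text does not specify either detail.

```python
from statistics import mean, stdev
from typing import List


def trim_rts(
    rts_ms: List[float],
    correct: List[bool],
    sd_cutoff: float = 2.5,
    min_rt_ms: float = 200.0,
) -> List[float]:
    """Return one participant's RTs that survive the exclusion criteria."""
    # Keep correct responses only.
    valid = [rt for rt, ok in zip(rts_ms, correct) if ok]
    if len(valid) < 2:
        # A standard deviation needs at least two observations.
        return [rt for rt in valid if rt >= min_rt_ms]

    # Assumption: bounds are based on all correct RTs of this participant.
    center = mean(valid)
    spread = stdev(valid)
    lower = center - sd_cutoff * spread
    upper = center + sd_cutoff * spread

    # Drop RTs outside the +/- 2.5 SD band and RTs below 200 ms.
    return [rt for rt in valid if lower <= rt <= upper and rt >= min_rt_ms]
```

Applying the function separately to each participant's trials, and only then averaging within conditions, matches the per-participant wording of the criterion.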

Results

Overall accuracy (M = 98%, SE = 0.4%) was near ceiling. We report both RT and accuracy results, but we consider RT effects to be more informative.

Across-domain expectation effects

First, we investigated whether expectation effects in the temporal and modality domains interacted. We performed a three-way mixed-design analysis of variance (ANOVA), with Temporal Expectation (expected-timing vs. unexpected-timing) and Modality Expectation (expected-modality vs. unexpected-modality) as the within-participant factors, and Expected Modality (auditory-expected vs. visual-expected) as the between-participant factor. Note that for modality expectation, the coding of the expected versus unexpected modality depends on the between-participant factor of expected modality. That is, for participants assigned to the auditory-expected (or visual-expected) group, the expected modality was auditory (or visual) and the unexpected modality was visual (or auditory). For RTs, the main effect of temporal expectation was significant; participants responded faster on the expected-timing trials (M = 517, SE = 18.71) than on the unexpected-timing trials (M = 544, SE = 16.88), F(1, 30) = 13.40, p < .001, ηp² = .31. This confirmed the effectiveness of our temporal expectation manipulation. The main effect of modality expectation was also significant; participants responded faster on the expected-modality trials (M = 489, SE = 15.90) than on the unexpected-modality trials (M = 573, SE = 20.39), F(1, 30) = 58.61, p < .0001, ηp² = .66. This result confirmed the effectiveness of our modality expectation manipulation. There was no significant interaction between temporal expectation and modality expectation, F(1, 30) = 0.09, p > .70 (see Fig. 2), or a three-way interaction including expected modality, F(1, 30) = 0.01, p > .90, suggesting that the temporal-expectation and modality-expectation effects are additive. Note that the nonsignificance of both the two-way and three-way interactions indicates that temporal expectation and modality expectation were similarly noninteractive for both the auditory-expected and visual-expected groups of participants. 
This indicates that the lines shown in Fig. 2 were statistically parallel for each group of participants. This in turn indicates that the temporal-expectation effects were statistically equivalent on the auditory and visual trials because the expected-modality trials were all auditory (or all visual) trials and the unexpected-modality trials were all visual (or all auditory) trials for the auditory-expected (or visual-expected) group. Assessing whether the modality-expectation effects were equivalent or different on the auditory and visual trials requires additional analyses, which are presented in a later section (Within-Domain Expectation and Priming Effects).
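The effect sizes reported above can be cross-checked against the F statistics. The sketch below assumes that the reported values are the conventional partial eta squared, SS_effect / (SS_effect + SS_error), which can be rewritten in terms of F and its degrees of freedom. The function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values taken from the text: F(1, 30) = 13.40 and F(1, 30) = 58.61.
temporal_expectation = partial_eta_squared(13.40, 1, 30)
modality_expectation = partial_eta_squared(58.61, 1, 30)
```

Rounded to two decimals, the two results are .31 and .66, in agreement with the values reported for the main effects of temporal expectation and modality expectation.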

Fig. 2

The effects of temporal expectation and modality expectation on RT. The error bars represent ±1 standard error of the mean, adjusted for within-participants comparisons (Morey, 2008)
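For readers unfamiliar with the error-bar adjustment cited in the caption, the following sketch shows one common way to compute it: participant-wise normalization followed by the correction factor described by Morey (2008). It is an illustration under stated assumptions, not the authors' plotting code. The function name and the data layout (one row per participant, one mean RT per within-participant condition) are hypothetical, and the sketch treats all participants as a single group, whereas the study also had a between-participant factor.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List


def within_participant_se(data: List[List[float]]) -> List[float]:
    """Standard errors per condition after removing between-participant variability.

    data[i][j] is the mean RT of participant i in within-participant condition j.
    """
    n_participants = len(data)
    n_conditions = len(data[0])
    grand_mean = mean(value for row in data for value in row)

    # Normalization: subtract each participant's own mean, add back the grand mean.
    normalized = [[value - mean(row) + grand_mean for value in row] for row in data]

    # Correction for the bias introduced by the normalization: sqrt(M / (M - 1)),
    # where M is the number of within-participant conditions.
    correction = sqrt(n_conditions / (n_conditions - 1))

    return [
        correction * stdev(column) / sqrt(n_participants)
        for column in zip(*normalized)
    ]
```

If every participant shows exactly the same difference between conditions, the normalized scores are identical within each condition and the adjusted standard errors are zero, even when overall speed differs widely across participants.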

For accuracy, none of the corresponding effects were significant (ps > .30), providing no evidence of a speed–accuracy trade-off.

Across-domain priming effects

Next, we investigated whether priming effects in the temporal and modality domains interacted. We performed a three-way mixed-design ANOVA, with Temporal Priming (repeated-timing vs. switched-timing) and Modality Priming (repeated-modality vs. switched-modality) as the within-participant factors, and Expected Modality (auditory-expected vs. visual-expected) as the between-participant factor. For RTs, the main effect of temporal priming was significant; participants responded faster on the repeated-timing trials (M = 500, SE = 17.48) than on the switched-timing trials (M = 525, SE = 17.14), F(1, 30) = 32.96, p < .0001, ηp² = .52. The main effect of modality priming was also significant; participants responded faster on the repeated-modality trials (M = 483, SE = 16.45) than on the switched-modality trials (M = 543, SE = 18.40), F(1, 30) = 100.15, p < .0001, ηp² = .77. There was no significant interaction between temporal priming and modality priming, F(1, 30) = 0.08, p > .70 (see Fig. 3), or a three-way interaction including expected modality, F(1, 30) = 1.45, p > .20, suggesting that the temporal-priming and modality-priming effects are additive for both the auditory-expected and visual-expected groups of participants. In the previous analysis, we found that the temporal-expectation effects did not depend on target modality. We next asked whether the same holds for the temporal-priming effect. Note that the lack of the three-way interaction here does not suggest equivalent modality-priming effects on the auditory and visual trials; neither could we include Target Modality (auditory-target vs. visual-target) in the main ANOVA because the between-participant factor, Expected Modality (auditory-expected vs. visual-expected), is an alternative way of coding target modality; for example, for the auditory-expected group of participants, the auditory trials always presented the expected modality, while the visual trials always presented the unexpected modality. 
We thus directly evaluated the potential dependence of the temporal-priming effect on target modality by conducting a two-way repeated-measures ANOVA, with Temporal Priming and Target Modality as the two within-participant factors. The interaction was not significant, F(1, 31) = 1.16, p > .60, suggesting that the strength of the temporal priming does not depend on target modality.

Fig. 3

The effects of temporal priming and modality priming on RT. The error bars represent ±1 standard error of the mean, adjusted for within-participants comparisons (Morey, 2008)

For accuracy, none of the corresponding effects were significant (ps > .20) except for the main effect of modality priming; participants were more accurate on the repeated-modality trials (M = 98.5%, SE = 0.4%) than on the switched-modality trials (M = 97.4%, SE = 0.5%), F(1, 30) = 13.42, p < .001, ηp² = .31. Because responses on the repeated-modality trials were both faster and more accurate, these results provide no evidence of a speed–accuracy trade-off.

Given that we reported all possible main effect results for RT and accuracy above, we only report the novel interaction results in the next section, as they answer specific questions.

Within-domain expectation and priming effects

We next examined the relationship between expectation and priming effects separately in each domain to see if expectation and priming effects interacted differently in the temporal and modality domains. For the temporal domain, we performed a two-way repeated-measures ANOVA, with Temporal Expectation (expected-timing vs. unexpected-timing) and Temporal Priming (repeated-timing vs. switched-timing) as the within-participant factors. We found a significant interaction; the effect of Temporal Expectation was larger on the switched-timing trials than on the repeated-timing trials, F(1, 31) = 5.43, p = .027, ηp² = .15 (see Fig. 4). The temporal-expectation effect was significant on the switched-timing trials, t(31) = 2.48, p = .019, d = 0.62, but not on the repeated-timing trials, t(31) = 0.25, p = .80, d = 0.06. These results suggest that the combined effects of expectation and priming in the temporal domain are affected by a floor effect. However, given the relatively small effect size of this interaction with small trial numbers for some conditions (e.g., repeated unexpected-timing trials), this result should be interpreted with caution. For accuracy, the interaction effect was not significant (p = .55).
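In a 2 × 2 design with both factors within participants, the interaction test is equivalent to a one-sample t test on each participant's difference of differences, with F(1, N − 1) = t². The sketch below illustrates this equivalence. It is not the authors' R analysis; the function name and the cell ordering are hypothetical, and any numbers passed to it for demonstration are made up rather than taken from the study.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Tuple

# One tuple of cell-mean RTs per participant, in this (hypothetical) order:
# (expected/repeated, unexpected/repeated, expected/switched, unexpected/switched)
Cells = Tuple[float, float, float, float]


def interaction_test(cell_means: List[Cells]) -> Tuple[float, float, int]:
    """Return (t, F, error df) for the 2 x 2 within-participant interaction."""
    contrasts = []
    for exp_rep, unexp_rep, exp_sw, unexp_sw in cell_means:
        effect_on_switched = unexp_sw - exp_sw     # expectation effect, switched-timing
        effect_on_repeated = unexp_rep - exp_rep   # expectation effect, repeated-timing
        contrasts.append(effect_on_switched - effect_on_repeated)

    n = len(contrasts)
    # One-sample t test of the contrast against zero.
    t_value = mean(contrasts) / (stdev(contrasts) / sqrt(n))
    return t_value, t_value ** 2, n - 1
```

With 32 participants the error degrees of freedom are 31, which matches the F(1, 31) reported for this interaction. A positive mean contrast corresponds to a larger expectation effect on the switched-timing trials than on the repeated-timing trials.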

Fig. 4

The effects of temporal expectation and temporal priming on RT. The error bars represent ±1 standard error of the mean, adjusted for within-participant comparisons (Morey, 2008)

For the modality domain, we performed a three-way mixed-design ANOVA, with Modality Expectation (expected-modality vs. unexpected-modality) and Modality Priming (repeated-modality vs. switched-modality) as the within-participant factors, and Expected Modality (auditory-expected vs. visual-expected) as the between-participant factor. In addition to the significant main effects of Modality Expectation and Modality Priming (reported earlier), the three-way interaction was significant, F(1, 30) = 19.90, p < .0005, ηp² = .40. None of the two-way interactions were significant (ps > .4).

The significant three-way interaction in the absence of the Modality Expectation × Modality Priming interaction indicates that the Modality Expectation × Modality Priming pattern is different for the two groups of participants. To unpack this, we evaluated the Modality Expectation × Modality Priming pattern separately for the auditory-expected and visual-expected groups of participants. The two patterns are shown in Fig. 5. It is difficult to interpret the data when they are organized in this way. As noted above, target modality (auditory-target vs. visual-target) is an alternative way of coding expected modality (auditory-expected vs. visual-expected) in our design. For the auditory-expected group, the auditory targets were expected and the visual targets were unexpected, whereas for the visual-expected group, the visual targets were expected and the auditory targets were unexpected. Thus, we can reorganize Fig. 5 with respect to the auditory-target trials versus the visual-target trials instead of the auditory-expected group versus the visual-expected group. In this alternative organization (see Fig. 6), the modality-expectation effects are between participant; that is, for the auditory-target trials (left panel in Fig. 6), the expected-modality results are from the auditory-expected group and the unexpected-modality results are from the visual-expected group, whereas for the visual-target trials (right panel in Fig. 6), the expected-modality results are from the visual-expected group and the unexpected-modality results are from the auditory-expected group. The modality-priming effects are still within participant. The three-way interaction organized in this way (see Fig. 6) suggests that modality priming and modality expectation had distinct effects on the two sensory modalities. 
We thus followed up the three-way interaction with simple effects analyses on modality expectation and modality priming performed separately on the auditory-target and visual-target trials.

Fig. 5

The effects of modality expectation and modality priming on RT shown separately for the auditory-expected (left panel) and visual-expected (right panel) groups. The error bars represent ±1 standard error of the mean, adjusted for within-participants comparisons (Morey, 2008)

Fig. 6

A replotting of Fig. 5, where the effects of modality expectation and modality priming on RT are shown separately for the auditory-target (left panel) and visual-target (right panel) trials (rather than for the auditory-expected and visual-expected groups). The error bars represent ±1 standard error of the mean, adjusted for within-participants comparisons (Morey, 2008). Note that because we separated the auditory-target and visual-target trials, modality expectation within each panel is a between-participant factor (shapes connected with dashed lines). Thus, the error bars are appropriate for assessing the effect of modality priming and its interaction (or lack thereof) with modality expectation, but are inappropriate for assessing the effect of modality expectation

For RTs on the auditory-target trials (see Fig. 6, left panel), the main effect of modality expectation was not significant, F(1, 30) = 0.14, p = .70; participants expecting auditory targets (M = 518, SE = 24.54) and those expecting visual targets (M = 534, SE = 32.66) did not differ reliably in RT on the auditory-target trials. In contrast, the main effect of modality priming was significant; participants were faster on the repeated-modality trials (M = 500, SE = 21.12) than on the switched-modality trials (M = 553, SE = 19.75), F(1, 30) = 52.86, p < .0001, ηp² = .64. There was no significant interaction between modality expectation and modality priming, F(1, 30) = 0.03, p > .80. For RTs on the visual-target trials (see Fig. 6, right panel), the main effects of modality expectation and modality priming were both significant; participants expecting visual targets (M = 463, SE = 21.61) responded faster than those expecting auditory targets on the visual-target trials (M = 574, SE = 27.26), F(1, 30) = 10.10, p = .003, ηp² = .25; participants also responded faster on the repeated-modality trials (M = 511, SE = 19.34) than on the switched-modality trials (M = 526, SE = 20.55), F(1, 30) = 7.71, p = .009, ηp² = .20. There was no significant interaction between modality expectation and modality priming, F(1, 30) = 0.46, p > .50. Although the modality-expectation and modality-priming effects were both significant on the visual-target trials, the (between-participant) modality-expectation effect (the tilt in the right panel in Fig. 6) was substantially larger numerically than the (within-participant) modality-priming effect (the vertical separation in the right panel in Fig. 6).
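The reported effect sizes can be checked against the reported F ratios using the standard identity for partial eta squared, ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the three significant simple effects above; the function and dictionary names are illustrative and not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F ratios reported in the text, all with (1, 30) degrees of freedom.
reported_f = {
    "auditory targets, modality priming": 52.86,
    "visual targets, modality expectation": 10.10,
    "visual targets, modality priming": 7.71,
}

for label, f_value in reported_f.items():
    print(f"{label}: {partial_eta_squared(f_value, 1, 30):.2f}")
```

Rounded to two decimals, the recovered values agree with the effect sizes given in the text (.64, .25, and .20).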

Taken together, the relative strength of modality expectation and modality priming was opposite for the auditory-target and visual-target trials. Whereas modality priming dominated responses to auditory targets (with a nonsignificant effect of modality expectation), modality expectation dominated responses to visual targets. This inference is backed by the aforementioned significant three-way interaction, contrasting the patterns shown in the left and right panels in Fig. 6. A potential concern is that the trial numbers were relatively small in some of the condition cells (e.g., repeated-modality trials for the unexpected modality) by design. Given the novelty of this particular finding, it needs to be replicated in future research. Nevertheless, we are reasonably confident of the reliability of this finding because the crucial three-way interaction yielded a relatively large effect size (p < .0005, ηp² = .40), and sphericity violations (that could have resulted from unequal numbers of trials) are not a concern here (because each factor had no more than two levels).
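To make the opposite ordering concrete, the sketch below computes the raw expectation and priming differences from the marginal means reported above (in the RT units of the text, presumably milliseconds). The dictionary layout and function name are illustrative only, and the differences are descriptive rather than inferential.

```python
# Marginal mean RTs copied from the simple effects analyses in the text.
mean_rt = {
    "auditory": {"expected": 518, "unexpected": 534, "repeated": 500, "switched": 553},
    "visual": {"expected": 463, "unexpected": 574, "repeated": 511, "switched": 526},
}

def effect_sizes_raw(means):
    """Raw RT differences: unexpected minus expected, switched minus repeated."""
    return {
        "expectation": means["unexpected"] - means["expected"],
        "priming": means["switched"] - means["repeated"],
    }

for modality, means in mean_rt.items():
    print(modality, effect_sizes_raw(means))
```

For auditory targets the priming difference (53) exceeds the expectation difference (16), whereas for visual targets the expectation difference (111) exceeds the priming difference (15).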

Discussion

We investigated potential interactions between mechanisms that orient attention in the temporal and sensory-modality domains by manipulating the target timing and modality within the same behavioral task. We considered both the probability-driven expectation effects and stimulus-driven priming effects in orienting attention. First, we investigated potential across-domain interactions separately for the expectation and priming effects. We observed reliable expectation effects on RT in both domains; participants responded faster on the expected-timing or expected-modality trials compared with the unexpected-timing or unexpected-modality trials. Importantly, the temporal and modality-expectation effects did not interact. We also observed reliable priming effects on RT in both domains; participants responded faster on the repeated-timing or repeated-modality trials compared with the switched-timing or switched-modality trials. Again, the temporal and modality-priming effects did not interact. Taken together, these results suggest that attention allocation in the temporal and modality domains is independently controlled. Next, we investigated potential within-domain interactions between expectation and priming effects separately for the temporal and modality domains. For the temporal domain, we observed a reliable interaction between the expectation and priming effects in which temporal-expectation effects were absent on the repeated-timing trials and temporal-priming effects were reduced on the expected-timing trials, suggesting a potential floor effect on temporal orienting. For the sensory-modality domain, we observed a significant interaction between the modality-expectation effect, the modality-priming effect, and the target modality. This interaction was driven by the result that the priming effect was stronger than the expectation effect on the auditory-target trials, whereas the expectation effect was stronger than the priming effect on the visual-target trials. These results suggest that attention allocation to the auditory modality is dominated by stimulus-driven priming effects whereas attention allocation to the visual modality is dominated by probability-driven expectation effects.

A number of studies have manipulated expectations about the sensory modality of targets concurrently with expectations about either the location or timing of the targets to understand how the mechanisms that orient attention in space and time are coordinated across the senses (e.g., Lange & Röder, 2006; Lloyd, Merat, McGlone, & Spence, 2003; Mühlberg, Oriolo, & Soto-Faraco, 2014; Mühlberg & Soto-Faraco, 2018; Spence & Driver, 1996). It is important to note, however, that in these prior studies, modality expectation manipulations were always yoked to the manipulations of spatial or temporal expectation. Specifically, unexpected (secondary) modality targets were more likely to be presented at unexpected spatial locations or at unexpected time points. The main question asked in these studies was whether the effect of spatial or temporal expectation established through the primary sensory modality would transfer to the processing of stimuli presented through an unexpected secondary modality. For example, Spence and Driver (1996) manipulated the probabilities of target locations (left vs. right) and target modalities (auditory vs. visual). The secondary-modality targets had the opposite expected location (e.g., left frequent) relative to the primary-modality targets (right frequent), and participants were made aware of these opposing probabilities. Using a discrimination task, Spence and Driver (1996) found overall expectation effects; responses were faster for the spatially expected targets than for the spatially unexpected targets and faster for the primary-modality targets than for the secondary-modality targets. They also found that responses to the secondary-modality targets followed the spatial expectation induced by the primary-modality targets even though the location probabilities for the secondary-modality targets were the opposite. 
Nevertheless, the spatial expectation effect was greater for the primary-modality targets than for the secondary-modality targets, suggesting that spatial expectation effects are not fully supramodal. In contrast, using a very similar design with auditory and tactile modalities, Lloyd et al. (2003) found that audition and touch were more independently influenced by spatial expectations in that the secondary modality did not follow the spatial expectation induced by the primary modality. These findings show that the strength of the cross-modal links in spatial expectation can depend on the combinations of specific sensory modalities.

Others have similarly examined the modality specificity of temporal-expectation effects by manipulating the probabilities of cue-to-target intervals (short vs. long) instead of target locations (e.g., Lange & Röder, 2006; Mühlberg et al., 2014; Mühlberg & Soto-Faraco, 2018). For example, Lange and Röder (2006) included auditory and tactile modalities in a discrimination task. Like Spence and Driver (1996), they found that responses to the secondary-modality targets followed the temporal expectation induced by the primary modality, consistent with the interpretation that temporal expectation influences sensory processing in a supramodal manner. However, using a similar experimental design and a discrimination task with visual and tactile stimuli (Mühlberg et al., 2014), as well as auditory and tactile stimuli (Mühlberg & Soto-Faraco, 2018), Mühlberg and colleagues have found that responses to the secondary-modality targets did not follow the temporal expectation induced by the primary modality, suggesting that temporal-expectation effects can be modality specific, more closely aligning with Lloyd et al.’s (2003) findings on cross-modal spatial expectation effects for audition and touch. Given these mixed results, whether temporal expectation is supramodal or modality specific is a question that is still outstanding.

While opposed manipulation of temporal expectations for primary and secondary modalities is suitable for investigating the degree to which temporal-expectation effects are supramodal or modality specific, such manipulation does not address how temporal expectation may interact with modality expectation when the probability distributions in the temporal and modality domains are uncorrelated in the experiment design. We thus independently manipulated temporal expectation and modality expectation to investigate how internal mechanisms of orienting attention to time and sensory modality might interact. We found that the magnitude of the temporal-expectation effects was similar for the expected-modality and unexpected-modality trials, suggesting that the temporal and modality expectation mechanisms are relatively independent. This interpretation is in fact consistent with the results from the prior studies that used competitive probability manipulations between the primary and secondary modalities. If temporal and modality expectation mechanisms were independent, the temporal-expectation effect would have been driven by the overall temporal statistics determined by the primary modality. For example, if the primary-modality targets predominantly appeared with the longer interval and the secondary-modality targets predominantly appeared with the shorter interval, the overall probability would be that targets appeared more frequently with the longer interval because the primary-modality trials outnumbered the secondary-modality trials. The overall temporal statistics would have facilitated responses to both the primary-modality and secondary-modality targets with the expected timing associated with the primary modality. 
If the mechanisms mediating temporal and modality expectation were fully independent, temporal-expectation effects should have been equivalent for the primary-modality and secondary-modality targets in the prior studies that used competitive probability manipulations. Nevertheless, it is reasonable to expect that temporal-expectation effects might be reduced, neutralized, or even reversed for the secondary-modality targets based on the extent to which participants were able to use their knowledge of the opposite temporal statistics for the secondary targets. Thus, the presence of supramodal temporal expectation mechanisms, combined with variable degrees of effectiveness in the top-down control of temporal expectation in a modality-specific manner, could explain the apparent discrepancy in the literature. In a competitive probability manipulation paradigm, less top-down effectiveness would produce results more consistent with supramodal expectation effects whereas greater top-down effectiveness would produce results more consistent with modality-specific expectation effects.
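The argument about overall temporal statistics in a competitive probability manipulation can be illustrated with a small calculation. All probabilities below are hypothetical and are not taken from any of the cited studies: the primary modality is assumed to occur on 75% of trials and to favor the long interval (80%), while the secondary modality favors the short interval (long on only 20% of its trials).

```python
def overall_p_long(p_primary, p_long_given_primary, p_long_given_secondary):
    """Probability of a long interval when modality is ignored (law of total probability)."""
    return (p_primary * p_long_given_primary
            + (1 - p_primary) * p_long_given_secondary)

# Hypothetical design values, chosen only for illustration.
p_long = overall_p_long(p_primary=0.75,
                        p_long_given_primary=0.80,
                        p_long_given_secondary=0.20)
print(f"Overall P(long interval) = {p_long:.2f}")
```

With these assumed values, the long interval remains the more likely one overall (about .65) even though it is the less likely one for secondary-modality targets, so a mechanism tracking only the overall statistics would favor the primary modality's expected timing for both modalities.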

The idea that temporal expectation is primarily supramodal (unless effectively countered by top-down efforts) is consistent with a number of studies that reported the scalp-recorded electroencephalography (EEG) data from an experiment that investigated the neural correlates of orienting attention in the temporal and modality domains (e.g., Keil, Pomper, Feuerbach, & Senkowski, 2017; Keil, Pomper, & Senkowski, 2016; Pomper, Keil, Foxe, & Senkowski, 2015). In their experiments using target detection tasks, to manipulate modality expectation, participants were informed about the forthcoming target modality (visual or tactile) with 100% accuracy as opposed to having a biased probability distribution; thus, there were no unexpected modality conditions. To manipulate temporal expectation, either fixed (1,700 ms) or varied (1,000–2,400 ms) cue-to-target intervals were used across blocks. The modality expectation manipulation modulated ERPs (event-related potentials) in sensory-specific brain areas, whereas the temporal expectation manipulation modulated ERPs in a global, supramodal manner. Further, the temporal expectation manipulation produced earlier ERP effects than did the modality expectation manipulation. While temporal expectation reduced the amplitudes of the associated ERP components, modality expectation increased the amplitudes of the modality-specific ERPs, with no significant correlation between the two effects. These EEG results support the notion that the expectation mechanisms that orient attention in the temporal and modality domains act in parallel and independently of each other, consistent with the current behavioral results. In general, the mechanisms that subserve expectation effects in different domains may be independent as studies have also shown that expectation effects in the spatial and temporal domains generate additive behavioral effects (e.g., Doherty et al., 2005).

In our previous work, we investigated how temporal expectation influenced cross-modal (audiovisual) sensory interactions (Menceloglu et al., 2017). While using the same temporal expectation manipulation and letter stimuli as the present study, we introduced audiovisual conflict by simultaneously presenting auditory (spoken) and visual (written) letters, which were either cross-modally congruent or incongruent. Participants were instructed to respond to either the auditory or visual letter while ignoring the other modality. Temporal expectation facilitated target responses (faster RTs) to the same extent whether the target was auditory or visual, and auditory and visual target RTs were comparable (replicated in the present study). However, at the moment of temporal expectation, auditory distractors produced weaker interference with the visual targets, whereas visual distractors produced stronger interference with the auditory targets, suggesting that visual information is prioritized over auditory information at the moment of temporal expectation. Our current results (no interaction between temporal expectation and modality expectation on responses to a single-modality target) and our previous work (reliable interaction between temporal expectation and modality-selective attention under audiovisual conflict) together may suggest that while the mechanisms that orient attention in the temporal and modality domains work in parallel, orienting in the temporal domain may prioritize information in certain modalities under cross-modal competition.

Our second goal was to understand how probability-driven expectation effects and stimulus-driven priming effects interacted within the temporal and modality domains. We found reliable interactions between the probability-driven and stimulus-driven effects. For the temporal domain, expectation effects were observed only when priming effects were absent, and priming effects were weaker when expectation effects were engaged. This pattern of results may indicate a floor effect in the combined facilitation effects of expectation and priming in the temporal domain; only one facilitation process may be sufficient for the system to reach an optimal performance level.

For the modality domain, we found reliable differences between the auditory and visual trials: priming effects predominated in the auditory modality, whereas expectation effects predominated in the visual modality. Given that the expectation and priming effects did not interact within each modality and given that there were no overall RT or accuracy differences between the auditory-target and visual-target trials, these results cannot be explained by floor effects, modality differences in task difficulty, or any combination of the two. Thus, the results suggest that the auditory modality is more influenced by stimulus-driven processes whereas the visual modality is more influenced by top-down control in the context of orienting attention to a sensory modality. This interpretation appears to be inconsistent with Spence et al.’s (2001) findings. In their task, Spence et al. (2001) presented auditory, visual, and tactile targets while participants indicated the spatial location of each target (left or right) using foot pedals. In different blocks, the target stimulus was either equally likely to be presented through any of the three modalities (33% for all modalities) or was more likely to be presented through one of the modalities (e.g., 75% auditory vs. 12.5% visual vs. 12.5% tactile). While we found stronger modality-expectation effects for visual targets than for auditory targets and stronger priming effects for auditory targets than for visual targets, Spence et al. (2001) did not find any difference in the strength of expectation or priming effects across these modalities. This discrepancy is likely to be due to the methodological differences between the two studies. Spence et al. (2001) intermixed three sensory modalities, presented auditory stimuli via speakers, used a localization task, and had participants respond using their feet, whereas we intermixed two sensory modalities, presented auditory stimuli via headphones, used an identification task, and had participants respond using their fingers. Further, given that Spence et al. (2001) reported null results regarding any differences in the relative strength of modality expectation and modality priming as a function of modality, it is possible that our paradigm was more sensitive for revealing the interaction. For example, we used parallel manipulations of modality uncertainty and temporal uncertainty. The additional temporal uncertainty in our study might have sufficiently stressed the system to reveal the differential susceptibilities of the auditory and visual systems to modality-expectation and modality-priming effects.

Lastly, our findings also inform the relative contributions of the probability-driven expectation and stimulus-driven priming influences in the temporal and modality domains. In particular, our results inform the question of whether expectation effects observed in block-design experiments can be largely explained by priming effects as suggested by Spence et al. (2001). Spence et al. pointed out that in studies of expectation that manipulate probabilities of certain target attributes in a block-wise manner, priming effects may primarily account for the observed expectation effects. For instance, during a block of trials in which auditory targets are more frequent than visual targets, responses are speeded for auditory targets. Because the frequent auditory targets are more frequently repeated than visual targets, the block effect could be due to modality priming instead of modality expectation. However, we have demonstrated temporal-expectation effects on switched-timing trials as well as demonstrated modality-expectation effects on switched-modality trials. Our results thus suggest that expectation effects can be demonstrated over and above priming effects in a blocked design.
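The confound described by Spence et al. (2001) can be made concrete with a small simulation. It assumes, purely for illustration, a two-modality block in which each trial is drawn independently with a 75% chance of an auditory target; the actual trial-ordering constraints of any given study may differ, and the function name and parameters are hypothetical.

```python
import random

def simulate_repeat_rates(p_auditory, n_trials, seed=0):
    """Estimate how often a target's modality repeats the previous trial's modality.

    Trials are sampled independently (an assumption of this sketch).
    Returns the repetition rate separately for auditory ("A") and visual ("V") targets.
    """
    rng = random.Random(seed)
    trials = ["A" if rng.random() < p_auditory else "V" for _ in range(n_trials)]
    counts = {"A": [0, 0], "V": [0, 0]}  # [repetitions, total] per target modality
    for previous, current in zip(trials, trials[1:]):
        counts[current][1] += 1
        if previous == current:
            counts[current][0] += 1
    return {modality: reps / total for modality, (reps, total) in counts.items()}

rates = simulate_repeat_rates(p_auditory=0.75, n_trials=100_000)
print(rates)
```

Under independent sampling, the repetition rate for each modality approaches that modality's probability (about .75 for the frequent and .25 for the infrequent modality), so a block-wise advantage for frequent targets could in principle reflect priming alone. Analyzing expectation effects on switched trials only, as done here, removes this confound.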

In summary, the current results suggest that both the probability-driven effects of temporal and modality expectation and the stimulus-driven effects of temporal and modality priming are additive; that is, expectation and priming effects both operate relatively independently for the temporal and modality domains. In contrast, within each domain, expectation and priming effects interacted; notably, responses to auditory targets were predominantly susceptible to modality-priming effects whereas responses to visual targets were predominantly susceptible to modality-expectation effects. These results are consistent with the idea that the sensory system controls attentional priorities independently in different domains (e.g., temporal and sensory-modality domains here) through the use of both the probability-driven and stimulus-driven mechanisms, but the relative emphasis on the two mechanisms may be optimized within each sensory modality.