Introduction

In everyday life we are presented with a constant stream of sensory and contextual information. In an effort to understand and encode critical components of this stream, we process it in smaller, meaningful units (Kurby & Zacks, 2008). How these units of information are structured and processed may influence how we interpret not only the temporal characteristics of our experience, but also our sense of how certain information should be grouped together or organized. For example, when observing someone making a pot of coffee, we may simultaneously experience the movements and goals of the person making the coffee, the visual features of the coffee pot, the sound of the grinding of the coffee beans, and the smell of the coffee brewing along with many other details. The processes that determine which subset of this information we attend to and encode into memory likely impact our perceptions of how long it takes to make the coffee as well as the temporal relationship between components of the experience.

Event Segmentation Theory (EST; Kurby & Zacks, 2008; Zacks, Speer, Swallow, Braver, & Reynolds, 2007) argues that we automatically break down the continuous stream of information we experience into meaningful spatio-temporal events (Newtson, 1976; Newtson & Engquist, 1976; Zacks et al., 2007). In particular, we maintain a working-memory representation of what is happening in the present moment, called an event model (Zacks et al., 2007), which informs perceptual predictions about what will happen next. At the transitions between events, known as event boundaries, prediction errors increase and the event model in working memory is updated. Event-model updating at boundaries has been described as an attentional control process that guides the processing of new event information to build a new model (Zacks et al., 2007). Event boundaries may be signaled by a number of conceptual or perceptual changes, including changes in spatial location, changes in time, and changes in the actor’s goal, among many others (Zacks, Speer, & Reynolds, 2009).

Event boundary effects on timing

The segmentation of experience into temporally distinct events has been shown to affect people’s perception of the passage of time under retrospective timing conditions, in which participants do not know in advance that they should be attending to the temporal characteristics of their experience (Block & Zakay, 1997). When people are asked to make a temporal estimate about the length of an experience after the fact, they tend to use information in memory about the number and coherence of contextual changes to make their temporal judgments (Block & Reed, 1978; Boltz, 1995; Faber & Gennari, 2015; Poynter, 1983; Zauberman, Levav, Diehl, & Bhargave, 2010). Recent work has shown that retrospective duration estimates of the time between two audio clips are correlated with the degree of pattern change in the fMRI BOLD signal found during encoding of the clips (Lositsky et al., 2016). In particular, these studies have consistently found that experiences with more, or unpredictable, changes are typically judged as longer than experiences with fewer, or highly predictable, changes.

In contrast, the manner in which events influence prospective timing, where people have advance knowledge of the need to attend to the temporal characteristics of a task, is less clear. Most work on prospective timing suggests that temporal estimates are reliant on the amount of attentional resources devoted to temporal versus non-temporal task properties (see Block & Gruber, 2014; Block & Zakay, 1997, for reviews). People’s temporal estimates are shorter when their attention is divided between timing and other task characteristics (Brown, 1997; Macar, Grondin, & Casini, 1994). Oddball stimuli are typically perceived as longer than high-probability stimuli, which has been attributed to attentional orienting and enhanced perceptual processing of the oddball (Tse, Intriligator, Rivest, & Cavanagh, 2004). Attentional Gate Theory (Zakay & Block, 1995) is commonly used to explain these prospective timing effects. This theory argues that an internal pacemaker metes out pulses at regular intervals that are temporarily stored in an accumulator. An attentional gate located between the pacemaker and accumulator opens more widely as more attention is paid to time, allowing more pulses to pass to the accumulator and leading to longer temporal judgments. The gate narrows when attention is diverted away from temporal task features, causing pulses to be missed and resulting in shorter temporal judgments. In this context, events or contextual changes should have a specific impact on prospective temporal estimates if they impact how attention is directed towards time.
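The pacemaker, gate, and accumulator account can be illustrated with a toy calculation. The sketch below is a deterministic illustration only, not a fitted or published model: the pulse rate of 10 pulses per second and the gate-openness values are arbitrary assumptions chosen to show the predicted direction of the effect for a 5-s interval.

```python
def accumulated_pulses(duration_s, attention_to_time, pulse_rate_hz=10.0):
    """Return the number of pulses that reach the accumulator.

    attention_to_time: hypothetical gate openness between 0 (closed) and 1 (fully open).
    pulse_rate_hz: hypothetical pacemaker rate; the value is an assumption.
    """
    if not 0.0 <= attention_to_time <= 1.0:
        raise ValueError("attention_to_time must be between 0 and 1")
    emitted = pulse_rate_hz * duration_s   # pulses produced by the pacemaker
    return emitted * attention_to_time     # pulses that pass through the gate


# Same physical duration, two hypothetical attention levels.
full_attention = accumulated_pulses(5.0, attention_to_time=0.9)
divided_attention = accumulated_pulses(5.0, attention_to_time=0.7)
```

Under these assumed values, fewer pulses accumulate when attention is divided, so the same interval would be judged as shorter.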

Recent work has shown that, in some cases, the number of contextual changes may, indeed, influence prospective temporal estimates (but see Predebon, 1996). However, the research is inconsistent regarding whether these changes result in longer estimates of time or shorter estimates. For example, the more segments that participants perceived in animations, the longer they later judged those animations to be (Faber & Gennari, 2017). However, in contrast, Liverence and Scholl (2012) found that stimuli with more perceived segments were judged as shorter. Poynter and Homa (1983) discovered that in a prospective temporal reproduction task, the number of changes experienced during stimulus encoding led to longer subsequent reproductions for target durations less than approximately 2.5 s, but shorter reproductions for target durations longer than this threshold. These authors also found that in a series of prospective temporal estimation tasks using intervals shorter than 2 s, more changes experienced during these intervals led to longer subsequent estimated durations. Waldum and Sahakyan (2013) discovered that people’s knowledge of the typical length of pop songs influenced both their prospective estimates of the length of time they had spent completing a lexical decision task (LDT) and their time-based prospective memory (TBPM) responses when completing an LDT. In both cases, participants completed the task while listening to background music; the more songs played, the longer they subsequently judged the LDT task to be and the earlier they made their responses indicating that 10 min had passed in the TBPM task. More recently, Bangert et al. (2019) found that when people were asked to reproduce a previously encoded 30-s duration, reproductions made while simultaneously watching a movie of someone engaged in an everyday activity were shorter during more eventful segments of the movie than less eventful segments. 

These discordant findings in the prospective timing literature suggest that events may impact time perception in two different ways. In some cases it appears that events may pull attention away from timing, as shown by Liverence and Scholl (2012), and with longer durations in the study by Poynter and Homa (1983), while in other cases they may operate similarly to how they function in retrospective timing tasks and serve as additional markers of time, as demonstrated in the work by Faber and Gennari (2017), Waldum and Sahakyan (2013), Bangert et al. (2019), and Poynter and Homa (1983, for shorter durations). Although these studies required participants to encode a specific duration for later reproduction or estimation, it is unclear what prospective timing task parameters may have led to the differential effects on time perception. Moreover, all of these studies investigated the effect of an aggregation of multiple events on timing, which may introduce a number of unidentified factors that impact timing performance above and beyond the experience of a single event. Thus, in the current study, we focused on the impact that crossing a single event boundary has on prospective timing to best isolate and understand how the updating process, specifically, impacts this type of timing.

Event boundary effects and temporal associations

How events are segmented may impact not only absolute representations of time but also how information is temporally associated. Components of experience that occur on opposite sides of an event boundary may be perceived as farther apart temporally than those occurring within the same event. Research on events and episodic memory suggests this may be possible (Dubrow & Davachi, 2013, 2014, 2016; Ezzyat & Davachi, 2011, 2014). Namely, event boundaries cued by temporal and contextual changes appear to disrupt people’s memory for episodic information that is spaced out in time; these boundaries limit the ability to form associations between information on either side of an event boundary, while information contained within the same event may be bound together. Ezzyat and Davachi (2011) found that cued recall of subsequent sentences during a text comprehension task was worse when the cued and recalled sentences were part of different events (occurred on either side of an indicated temporal event boundary) than when these sentences were part of the same event. This suggests that in narrative texts, temporal event boundaries influence how information is organized and associated in long-term memory. People also engage in within-event binding of actors and actions when viewing activity (Kersten, Earles, Curtayne, & Lane, 2008).

In addition, a number of studies have demonstrated that contextual shifts in stimulus sets influence people’s temporal order memory (Dubrow & Davachi, 2013, 2014, 2016; Heusser, Ezzyat, Shiff, & Davachi, 2018). For example, memory for the relative order of information was better when that information was located within a single context versus when the information was presented across context shifts (Dubrow & Davachi, 2013). This same study also demonstrated that people’s ability to judge which of two stimuli was more recently presented was poorer when the items occurred across a context switch than when items occurred within the same context, even when the lag between the stimulus items was equated across conditions. This result has been replicated several times (Dubrow & Davachi, 2014, 2016). Importantly, in all of these tasks, participants were explicitly told to use an associative encoding strategy and were either told to attend to temporal order (Dubrow & Davachi, 2014, 2016), or were at least aware of the need to do so after practice and multiple trial repetitions (Dubrow & Davachi, 2013). In an experiment where participants were not given explicit associative encoding instructions and were asked to complete a secondary task at encoding to limit the ability to form associations between successive stimuli, no boundary effects on temporal order memory were observed (Dubrow & Davachi, 2013). These studies argue that boundaries have an impact on the formation of item-item associations spread out over time, likely through shifts in the contextual elements associated with those items. In support of this possibility, activity in areas of the hippocampus and prefrontal cortex has been associated with temporal order recall across context shifts (Dubrow & Davachi, 2016).

A recent study has also argued for a specific role of executive attentional control in temporal order memory performance. Heusser et al. (2018) found that temporal order memory was better for stimuli contained within the same event than for stimuli that were located on either side of an event boundary. Moreover, they demonstrated a trade-off between processing time at event boundaries and subsequent order memory performance: more time spent during encoding at the boundary led to poorer subsequent temporal order memory for across-event stimuli. These results, along with their finding of enhanced item-to-context associations for information presented directly at boundaries, led them to argue that directing attentional resources towards processing salient item or item-context information at event boundaries results in diminished attention to maintaining temporal associations between information spaced out in time. This explanation suggests that, during event-boundary processing, attention may operate on the formation of temporal associations in a way similar to that proposed by Attentional Gate Theory (AGT; Zakay & Block, 1995). When attention is directed to updating non-temporal elements of a task at event boundaries, this pulls attentional resources away from forming temporal associations between items.

Similarly, Ezzyat and Davachi (2014) found that one’s judgment of the temporal distance between a pair of stimulus items was influenced by whether a contextual shift occurred between them at encoding. Participants were aware in advance of the timing task through repeated trials but were not made aware of which pairs of stimuli would be tested. They were simply asked to rate whether the temporal distance between pairs of stimuli was “very far,” “far,” “close,” or “very close.” Pairs of stimuli that contained a contextual shift (contextual instability) between their presentations were more likely to be judged as farther apart than stimulus pairs that were presented within the same context. Contextual instability led to dilated estimates of the temporal distance between stimuli.

These studies suggest that the shifting contexts across event boundaries as well as the need to direct attentional resources to episodic memory updating at event boundaries to incorporate new perceptual and conceptual context information may make it harder to form temporal associations between items across those boundaries. Consequently, participants think objects occurring on different sides of a boundary are temporally farther apart than objects contained within the same event. This suggests attending to event boundaries has a differential perceptual outcome for prospective temporal judgments that do not require participants to directly compare absolute representations of temporal intervals in memory. Indeed, the fact that participants in the study by Ezzyat and Davachi (2014) were unaware of which stimulus pairs would be tested would have made encoding of the direct representations of the relevant durations difficult, so relying on the association or contextual similarity of the two stimulus items presented would have been a more efficient way to make their temporal proximity estimates. This suggests that the attentionally demanding process of updating at event boundaries may have very different influences on subtly different temporal estimates, one where people must compare absolute representations of time, and one where they may simply make judgments based on temporal associations between items.

Context shifts in the Ezzyat and Davachi (2014) study and most of the examples mentioned above were directly manipulated by the experimenter, made fairly explicit, and involved artificial stimuli. Here, we are interested in studying the effect of event segmentation during naturalistic viewing of human activity on temporal grouping. Notably, some temporal proximity tasks could be structured so that participants are not instructed to specifically adopt an associative strategy and could more easily compare encoded duration representations to make their judgments rather than relying on information about the temporal association or contextual similarity of stimuli. It is unclear whether crossing a single naturalistic event boundary in this type of task would cause participants to adopt a duration comparison strategy, such that the results would mirror those of a temporal estimation task involving the same target duration or whether different effects of the event boundary would emerge. Understanding how crossing a naturalistic event boundary impacts prospective temporal estimation and temporal proximity judgments involving the same interval duration is important for gaining a clear picture of how the singular process of updating at a boundary shapes different temporal characteristics of our experience. For example, how does crossing a single event boundary in the process of making coffee influence both how long we think it takes to brew a full pot, and how we determine whether the time when the coffee stopped brewing was closer to when we filled the carafe with water or to when we poured our first cup of coffee?

Current study

In the current study, we were interested in assessing how the specific experience of crossing a single event boundary while watching naturalistic activity impacts people’s temporal perception in two types of prospective temporal estimation tasks involving the same target interval (5 s). Critically, the chosen tasks were designed to be similar – both involved prospective temporal estimation and both could, theoretically, be performed by directly comparing two encoded interval durations. Indeed, the first temporal estimation task emphasized this strategy by asking participants to determine whether a comparison interval matched a previously encoded reference interval. The second task required a temporal proximity judgment and could be performed by either directly comparing the length of durations between stimulus pairs or by judging how closely associated pairs of stimulus items were.

The goal was to investigate how the memory updating process that occurs when crossing a single naturalistic event boundary affects these two subtly different forms of temporal perception. We were interested in whether participants would adopt a similar duration-comparison strategy across the two tasks, in which case we would expect a similar influence of the event boundary (time compression if event boundaries cause temporal pulses to be missed or time dilation if boundaries serve as additional temporal markers). If, however, they adopt a duration-comparison strategy in the first experiment and an associative strategy in the second, they could show different patterns of performance, particularly if updating at the event boundary causes attentional resources to be shifted away from encoding temporal pulses as well as temporal associations across items spaced over time.

In the first experiment, we assessed the influence of crossing an event boundary during a naturalistic movie on people’s prospective estimates of the length of temporal intervals they had to compare to a previously encoded reference interval (5 s in length). We were particularly interested in whether comparison intervals that included an event boundary would be judged to be shorter or longer than the reference interval more often than comparison intervals wholly contained within a single event. This comparison indicates whether the event boundary draws attention away from time, as predicted by the Attentional Gate Theory (Zakay & Block, 1995), or serves as an additional temporal marker. In Experiment 2, we investigated how crossing an event boundary impacts the extent to which equidistant stimuli (5 s apart) are temporally associated with one another. In the experiment, people were presented with a series of three tones while they watched movies of everyday activities; they then judged whether the second tone in the set was closer to the first or the third tone presented. Critically, an event boundary occurred between two of the tones. If event segmentation serves to temporally group information together and participants rely on these associations to make their judgments, then we would expect people to judge tones within an event to be closer to each other than tones separated by an event boundary. However, if participants use a duration comparison strategy to make their judgments, then they should show a pattern of performance similar to that of participants in Experiment 1.

Experiment 1

In this experiment, participants were asked to engage in a two-alternative forced choice (2AFC) prospective temporal estimation task where they compared durations presented during movies of actors engaged in everyday activities to an initial 5,000-ms unfilled reference duration on which they were trained. Key to this experiment is that the critical comparison durations were the same length as the reference duration, but either spanned two events (i.e., contained an event boundary) or were contained within a single event. If the boundary is treated as an additional marker of time, we would expect intervals containing a boundary to be judged as longer than the reference interval more often than intervals not containing a boundary. On the other hand, the process of memory updating at event boundaries has been proposed to direct attentional resources to perceptual and conceptual features of the current experience (Zacks et al., 2007), which may pull attention away from encoding elapsing time. In this case, according to the Attentional Gate Theory, this shifting of attention should cause temporal pulses to be missed, and as such, intervals with a boundary should be judged as shorter than the reference interval more often than intervals with no event boundary.

Method

Participants

Sixty-eight college students who reported having normal vision and hearing participated for course credit. The experiment was approved by the University of Texas at El Paso Institutional Review Board. One participant was eliminated from subsequent analyses due to a computer error that prevented her data from being recorded. An additional three participants were eliminated for using a counting strategy during the task, which they had been instructed not to do. The remaining 64 participants (mean age = 20.94 ± 5.91 years, 30 females) were included in the data analyses reported below. An additional 28 participants (mean age = 20.43 ± 4.21 years, 21 females) were recruited for a pilot study in which they completed the order memory task without having first watched the experimental movies.

Apparatus/stimuli

Movies

Six videos of individuals engaged in everyday activities were used for the experiment. These videos depicted the following activities (movie lengths shown in parentheses): (a) a woman making a sandwich (146 s), (b) a woman washing a car (432 s), (c) a man building a ship out of Duplo blocks (246 s), (d) a woman assembling a tent (379 s), (e) a man doing laundry (300 s), and (f) a man planting a flower box (354 s). See Fig. 1 for screenshots from each movie. The sandwich movie was used for the practice trials and the remaining five movies were used for the experimental trials. Participants were seated approximately 54 cm from the computer screen and each movie was displayed in a box on the screen subtending approximately 49° of visual angle horizontally and 24° of visual angle vertically.

Fig. 1

Screenshots from the everyday movies used in Experiments 1 and 2. Panel A shows an image from the practice sandwich movie. Panel B shows an image from the movie of a woman washing a car. Panel C shows an image from the movie of a man building a ship out of Duplo blocks. Panel D shows an image from the movie of a woman pitching a tent. Panel E shows an image from the movie of a man washing clothes. Panel F shows an image from the movie of a man planting a flower box.

Selection of test locations

Event boundary locations in these movies were determined using the same methodology reported in Zacks, Kurby, Eisenberg, and Haroutunian (2011). We first obtained event segmentation data from a previous study (Kurby & Zacks, 2011) in which participants were instructed to watch movies and press a button to mark off the movie into meaningful units of activity. Then, for each movie, we estimated the probability density of segmentation across time using a 3-s bandwidth Gaussian kernel. We then identified the local maxima and minima. Around each extremum, we defined a 5-s window and computed the proportion of participants who segmented inside that window. For each movie, we selected the four windows with the highest segmentation proportion as the boundary test locations, and the four windows with the lowest segmentation proportion as the within-event test locations.
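The selection steps described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names, the 1-s evaluation grid, treating the bandwidth as the kernel's standard deviation, centering each 5-s window on the extremum, and ranking all extrema together are assumptions, and the button-press data are invented.

```python
import math


def gaussian_density(press_times, grid, bandwidth_s=3.0):
    """Gaussian kernel density estimate of button presses at each grid time."""
    norm = len(press_times) * bandwidth_s * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((t - p) / bandwidth_s) ** 2) for p in press_times) / norm
        for t in grid
    ]


def local_extrema(density):
    """Indices of strict local maxima and minima (interior grid points only)."""
    found = []
    for i in range(1, len(density) - 1):
        left, mid, right = density[i - 1], density[i], density[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            found.append(i)
    return found


def proportion_segmenting(presses_by_participant, center, window_s=5.0):
    """Proportion of participants with at least one press inside the window."""
    half = window_s / 2.0
    hits = sum(
        any(center - half <= p <= center + half for p in presses)
        for presses in presses_by_participant.values()
    )
    return hits / len(presses_by_participant)


def select_test_locations(presses_by_participant, movie_length_s, n_each=4, step_s=1.0):
    """Return (boundary_times, within_times) ranked by segmentation proportion."""
    grid = [i * step_s for i in range(int(movie_length_s / step_s) + 1)]
    all_presses = [p for presses in presses_by_participant.values() for p in presses]
    density = gaussian_density(all_presses, grid)
    scored = sorted(
        (proportion_segmenting(presses_by_participant, grid[i]), grid[i])
        for i in local_extrema(density)
    )
    within = [t for _, t in scored[:n_each]]      # lowest proportions
    boundary = [t for _, t in scored[-n_each:]]   # highest proportions
    return boundary, within


# Invented button-press times (in seconds) for three hypothetical viewers.
example_presses = {
    "p1": [10.0, 30.0, 50.0],
    "p2": [11.0, 29.0, 50.0],
    "p3": [9.0, 31.0],
}
boundary_times, within_times = select_test_locations(example_presses, 60.0, n_each=2)
```

In this sketch, the two selections overlap if a movie yields fewer than `2 * n_each` extrema, so a real analysis would need to check for that case.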

Temporal durations

The training reference duration and the comparison durations were 5,000 ms in length. The training reference duration was an empty interval demarcated by a male voice saying “start” and “stop” while participants watched a fixation cross in the center of the computer screen. Participants were instructed to time the interval from the onset of the word “start” to the onset of the word “stop.” Comparison durations presented during the movies were identical in length (5,000 ms) to the reference duration, and used the same start and stop signal, but were presented either fully within an event (WITHIN condition) or spanning an event boundary (ACROSS condition). The comparison durations were presented such that the local extrema used to identify the within and across test locations occurred exactly in the middle of the comparison duration (i.e., 2,500 ms after the onset of “start”). Across the five experimental movies, there were a total of 20 ACROSS comparison intervals and 20 WITHIN comparison intervals.

We also created two short comparison filler intervals of 4,500 ms in length (SHORT) and two long comparison filler intervals of 5,500 ms in length (LONG) that did not overlap the previously identified segments for each movie. Filler durations were presented at randomly selected times in the movies that would accommodate their length. In total, across the five experimental movies, there were ten SHORT fillers and ten LONG fillers. The experiment was conducted on a Dell Optiplex 780 computer, using E-Prime 2.0 software.

Procedure

Prior to the start of the practice movie, participants were trained on the reference duration during six training trials. Each training trial started with a 2,000-ms fixation cross in the center of the screen, followed by presentation of the “start” and “stop” signal demarcating the 5,000-ms reference duration while the fixation cross remained on screen. The fixation was viewed for an additional 1,000 ms before the start of the next training trial. After the training trials, participants were asked to watch the practice movie (sandwich) and were told that they would periodically hear “start” and “stop” signals marking comparison trials. They were instructed to judge whether each comparison duration was shorter or longer than the reference duration. After a comparison duration was presented during the movie, the movie paused, and while watching a fixation, the participant had to make their judgment. They pressed the “S” key on the keyboard if they thought the comparison was shorter than the reference and the “L” key if they thought the comparison was longer than the reference duration. Once they made their judgment, the movie continued until the next comparison duration was presented. Participants made judgments about five comparison durations (mixed between two 5,000 ms comparison durations and three filler durations) during the practice trials to acclimate them to the task. Participants were informed that the comparisons were very similar to the reference duration and that they would need to attend carefully to the comparison durations in order to make a correct decision. They were also told that they would need to pay attention to what was happening in the movie for a later memory task. Participants completed the practice movie without wearing headphones so that the experimenter could monitor whether they understood the instructions and completed the task correctly. They were instructed not to count during the temporal estimation task.

After they completed the practice movie they completed an order memory task. For this task participants were presented with different screenshots from the movie presented on 3 × 5-in. cards. The cards were presented in a scrambled order in two rows on a table top. Participants were asked to reorder the cards as quickly and accurately as possible to reflect the correct order of actions as they were presented in the movie. For the practice movie, six images had to be reordered. For the five movies presented during the experimental blocks, participants reordered 12 images from each movie. The time it took participants to complete each order memory task was recorded in seconds. Error in ordering performance was also calculated by determining the absolute value of the distance of each card’s sorted numeric position from the correct numeric position according to the actual order in which scenes occurred in the movie (Zacks, Speer, Vettel, & Jacoby, 2006).
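The ordering error measure can be illustrated with a short sketch. It assumes that each card is labeled with its correct position in the movie and that the per-card absolute deviations are averaged into a single score; the function name and the example orderings are hypothetical.

```python
def ordering_deviation(sorted_cards):
    """Mean absolute distance between each card's placed and correct position.

    sorted_cards lists card labels in the order the participant placed them;
    each label is that card's correct 1-based position in the movie.
    """
    deviations = [
        abs(placed - correct)
        for placed, correct in enumerate(sorted_cards, start=1)
    ]
    return sum(deviations) / len(deviations)


perfect = ordering_deviation([1, 2, 3, 4, 5, 6])    # every card in place
one_swap = ordering_deviation([1, 3, 2, 4, 5, 6])   # two adjacent cards swapped
```

Under this scoring, a perfect sort scores 0, and swapping two adjacent cards out of six gives a mean deviation of one third of a position.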

After finishing the practice session, participants repeated the same cycle (reference-duration training, temporal estimation during the movie, and the order memory task) for each of the remaining five experimental movies. For the experimental trials, participants wore headphones so they would not be distracted while performing the task, and they were also reminded not to count during the timing task. The order of experimental movies was counterbalanced across participants. At the very end of the entire experiment, participants also completed an exit questionnaire asking them to describe any strategies they used to remember or reproduce the durations in the experiment. Any participants who reported counting were removed from later analyses.

Data analysis

For each comparison type, we calculated the proportion of trials on which individuals judged the comparison to be longer than the reference duration. We conducted a two-tailed paired-samples t-test between ACROSS and WITHIN comparison trials to determine whether participants showed a difference in their proportion of “longer” judgments for trials that crossed an event boundary versus trials that were contained within a single event. We also used a paired-samples t-test to compare the SHORT and LONG filler trials on this dependent measure to confirm that participants were able to detect actual differences in the length of these durations.
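The dependent measure and the paired comparison can be sketched as follows. This is an illustrative sketch, not the authors' analysis script: the function names are hypothetical, the example values are invented, and only the t statistic and degrees of freedom are computed. A p value would additionally require the t distribution, for example from a statistics package such as SciPy (`scipy.stats.ttest_rel`).

```python
import math
from statistics import mean, stdev


def proportion_longer(responses):
    """Proportion of 'L' (longer) key presses in a list of 'S'/'L' responses."""
    return sum(r == "L" for r in responses) / len(responses)


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x minus y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Invented per-participant proportions of "longer" judgments.
example_across = [0.2, 0.4, 0.3, 0.5]
example_within = [0.3, 0.6, 0.4, 0.5]
t_stat, df = paired_t(example_across, example_within)

# One hypothetical participant's responses on five trials.
example_proportion = proportion_longer(["S", "L", "L", "S", "S"])
```

A negative t statistic here indicates a lower proportion of “longer” judgments in the first condition than in the second.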

Results

Order memory task

On average, participants took 118.89 s to complete the order memory task after each movie. Their mean deviation score was M = .72, showing that on average, when participants made an ordering error, it was by less than one position. These results were significantly better than those of a naïve set of participants who completed the order memory task without first watching the movies. Naïve participants made more sorting errors (M = 1.83, t(90) = -10.38, p < .001) and took longer to complete the task (M = 141.69 s, t(90) = -2.74, p = .007) than participants from this experiment. This suggests that participants who completed the timing task attended to the content of the movies.

Comparison durations

Figure 2 shows the proportion of “longer” judgments for filler trials as well as the ACROSS and WITHIN comparisons. Participants showed a higher proportion of “longer” judgments for comparisons contained within a single event (WITHIN M = .44, SD = .18) than for comparisons that crossed an event boundary (ACROSS M = .39, SD = .19), t(63) = -2.43, p = .018, d = .27.

Fig. 2

Proportion of “longer” judgments from Experiment 1. The top graph shows the data for comparison durations presented ACROSS an event boundary and WITHIN a single event. The bottom graph shows data for the FILLER comparisons that were shorter or longer than the reference duration. Error bars are ± 1 standard error. *p < .05, ***p < .001.

When comparing the filler durations, we found a significant difference in proportion of “longer” judgments, t(63) = -6.40, p < .001, d = -1.32, with people making significantly fewer “longer” judgments for the SHORT filler (M = .26) as compared to the LONG filler duration (M = .51). This suggests that individuals were sensitive to differences in length of the filler intervals.

Discussion

Participants produced a smaller proportion of “longer” judgments for ACROSS compared to WITHIN trials, which shows that crossing an event boundary impacted participants’ perceptions of these comparison durations. Specifically, participants perceived that less time had passed during comparison intervals that crossed a single event boundary than comparison intervals that were wholly contained within an event. This is notable given that these comparison durations were the same length, and equivalent to the initial reference interval. This suggests that processing an event boundary while experiencing the comparison duration may have engaged resources that would normally be devoted to attending to time, consistent with the Attentional Gate Theory (Zakay & Block, 1995). In the case of the current experiment, attending to the event boundary during a comparison interval may have drawn attention away from timing, causing a narrowing of the attentional gate and a loss of some of the temporal markers representing time; this led participants to think that durations that crossed event boundaries were shorter than comparison durations that did not cross a boundary.

The fact that participants showed a high level of performance on the order memory task suggests that participants generally attended to the content of the movies and processed the event boundaries. Participants also demonstrated differences in performance for the filler trials, showing that they were sensitive to actual changes in temporal duration when they existed. One might wonder whether Vierordt’s law (Lejeune & Wearden, 2009), where participants tend to overestimate shorter durations and underestimate longer durations, might explain the pattern of proportion of “longer” judgments for the filler intervals. While there was likely some underestimation of the longer filler, it did not appear that participants were overestimating the shorter filler interval to a great degree, since 74% of the time they accurately judged it as shorter than the comparison interval. Also, the comparison durations that were equivalent in duration to the reference interval were judged as shorter than the reference duration a larger proportion of the time than longer. Therefore, what our data more likely reveal is that participants underestimate durations that they encode when watching movies, so that all judgments, even those for intervals that were equivalent to the reference interval, were more likely to be judged as shorter than they actually were. This seems to explain our pattern of effects more clearly than Vierordt effects.

Experiment 2

In Experiment 2, we examined how crossing a single naturalistic event boundary impacts temporal binding of information in perception. In particular, we wanted to assess whether items contained within a single event tend to be temporally grouped or bound together while items contained in different events experience a reduced temporal association between them. In this experiment, instead of asking participants to compare durations to a reference interval in long-term memory, we asked participants to judge the temporal distance between three tones in working memory. More specifically, participants were asked to judge whether the middle tone was closer to the first tone they heard or the third tone they heard. The three tones were spaced equidistantly, except that an event boundary occurred between the middle tone and either the first or third tone. This meant that the middle tone and the tone on the other side of the boundary were contained within two separate events while the middle tone and the remaining tone were wholly contained within a single event. If event segmentation groups within-event temporal information together, tones presented within a single event should be judged as closer together while tones separated by an event boundary should be perceived as more distant from one another, consistent with previous findings from Ezzyat and Davachi (2014).

Method

Participants

Sixty-nine college students with normal vision and hearing completed the study for course credit. The experiment was approved by the University of Texas at El Paso Institutional Review Board. Two participants were excluded because of technical problems with the computer during the task. An additional seven participants who reported counting during the task were also excluded. The remaining 60 participants (mean age = 20.48 ± 3.19 years, 41 females) were included in the analyses reported below.

Apparatus/stimuli

We presented participants with three 50-ms, 1,000-Hz tones (tones A, B, and C) separated equally by 5,000 ms. These three tones comprised tone sets that were presented during movies of actors engaged in everyday activities. The movies were the same as those used in Experiment 1 (see Fig. 1) with the sandwich movie used for practice and the other five movies used during the experimental trials. Again, participants were seated approximately 54 cm from the computer screen; each movie was presented in a box on the screen subtending approximately 51° horizontally and 28° vertically.

We used the same event boundary and event middle locations that were identified for Experiment 1. For each movie, we selected four to seven boundaries to help construct the experimental tone sets. The number of selected locations varied across movies to accommodate the longer time window (10 s) used in this experiment compared to Experiment 1 (5 s). We then created two stimulus lists, one where for a particular tone set the event boundary was between tones A and B and another where the same boundary was positioned between tones B and C. The event boundary was located either directly in the center of the 5-s segment between tones A and B (A|BC trials) or directly in the center of the 5-s segment between tones B and C (AB|C trials). See Fig. 3 for a visual representation of these two stimulus types. In stimulus set 1, there were 15 tone sets where the event boundary was located between tones A and B (A|BC sets) and 16 tone sets where the event boundary was located between tones B and C (AB|C sets). In stimulus set 2, those boundary locations were reversed so there were 16 A|BC sets and 15 AB|C sets. Because the time point of each boundary within a movie was fixed, we created the two stimulus sets by shifting the onset time of each tone to accommodate the boundary location. For example, if a boundary occurred 2,500 ms after Tone A on an A|BC trial in set 1, then in set 2, where the same boundary was used for an AB|C trial, Tone A began 7,500 ms before the boundary. Half of the participants received the first stimulus set while the other half received the second stimulus set, with assignment to a set counterbalanced across participants. Table 1 shows the number of A|BC and AB|C trials identified from each experimental movie during the experiment.
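The relationship between a fixed boundary time and the three tone onsets can be sketched as follows. This is a minimal illustration rather than the script used to build the stimuli: the function name and the example boundary time of 60 s are hypothetical, and only the 5,000-ms spacing and the centering of the boundary within a 5-s segment come from the description above.

```python
TONE_SPACING_MS = 5_000                  # spacing between successive tones (from the text)
HALF_SPACING_MS = TONE_SPACING_MS // 2   # boundary sits in the center of one 5-s segment


def tone_onsets(boundary_ms: int, trial_type: str) -> dict[str, int]:
    """Return onset times (ms, in movie time) of tones A, B, and C for one tone set."""
    if trial_type == "A|BC":
        # Boundary centered between A and B: tone A begins 2,500 ms before it.
        onset_a = boundary_ms - HALF_SPACING_MS
    elif trial_type == "AB|C":
        # Boundary centered between B and C: tone A begins 7,500 ms before it.
        onset_a = boundary_ms - (TONE_SPACING_MS + HALF_SPACING_MS)
    else:
        raise ValueError(f"unknown trial type: {trial_type!r}")
    return {
        "A": onset_a,
        "B": onset_a + TONE_SPACING_MS,
        "C": onset_a + 2 * TONE_SPACING_MS,
    }


# Hypothetical boundary located 60 s into a movie.
print(tone_onsets(60_000, "A|BC"))  # {'A': 57500, 'B': 62500, 'C': 67500}
print(tone_onsets(60_000, "AB|C"))  # {'A': 52500, 'B': 57500, 'C': 62500}
```

In both cases the tones remain equally spaced; only the position of the whole tone set relative to the boundary changes between the two stimulus sets.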

Fig. 3

Illustration of the tone sets for Experiment 2. Different events are shown by different gray-scale shading over the course of time to help illustrate which tones are located within each event. Tones A, B, and C were equally spaced 5,000 ms apart from one another. An event boundary either occurred in the middle of Tones A and B (A|BC trials), as shown in the top portion of the figure or between Tones B and C (AB|C), as shown in the bottom portion of the figure

Table 1 Selection of tone sets for Experiment 2

We also created filler tone sets where tone B was actually closer to tone A (AB fillers) by 500 ms and sets where tone B was actually closer to Tone C (BC fillers) by 500 ms. For AB fillers, this produced a spacing of 4,500 ms between tones A and B, and a spacing of 5,500 ms between tones B and C. The opposite spacing was used for BC fillers. For each movie, we presented participants with one each of an AB filler and a BC filler for a total of ten filler trials, five of each type, across all five experiment movies.
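The filler spacing described above amounts to shifting tone B by 500 ms toward one of its neighbors while leaving tones A and C in place. A minimal sketch, with hypothetical names, is:

```python
BASE_SPACING_MS = 5_000   # equal spacing used in the experimental tone sets
FILLER_SHIFT_MS = 500     # distance tone B is moved toward tone A or tone C


def filler_gaps(filler_type: str) -> tuple[int, int]:
    """Return (A-to-B gap, B-to-C gap) in ms for an 'AB' or a 'BC' filler."""
    if filler_type == "AB":
        # Tone B is moved toward tone A.
        return BASE_SPACING_MS - FILLER_SHIFT_MS, BASE_SPACING_MS + FILLER_SHIFT_MS
    if filler_type == "BC":
        # Tone B is moved toward tone C.
        return BASE_SPACING_MS + FILLER_SHIFT_MS, BASE_SPACING_MS - FILLER_SHIFT_MS
    raise ValueError(f"unknown filler type: {filler_type!r}")


print(filler_gaps("AB"))  # (4500, 5500)
print(filler_gaps("BC"))  # (5500, 4500)
```

Because only tone B moves, the total span from tone A to tone C stays at 10 s for fillers, as it does for the experimental tone sets.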

The experiment was conducted on a Dell Optiplex 780 computer, utilizing the E-Prime 2.0 software.

Procedure

Participants were asked to complete the consent form and a demographics form before starting the experiment. The experiment involved a two-alternative forced-choice (2AFC) procedure in which participants were presented with three tones (tones A, B, and C) periodically while watching movies and had to decide whether the middle tone, tone B, was closer to tone A or tone C. Immediately after completion of the third tone, the movie paused and participants pressed the “q” key if they thought the middle tone, tone B, was closer to the first tone, tone A, than to the last tone in the set, tone C. They pressed the “p” key if they thought tone B was closer to tone C than to tone A. After they made their response, the movie resumed. Participants first completed the task during a practice movie of a woman making a sandwich. They then completed the task during each experimental movie. Movie order was partially counterbalanced across participants so that each movie appeared once in each possible order position.
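One way to obtain a set of movie orders in which each movie appears once in each order position is a cyclic Latin square. The sketch below is an assumption about how such orders could be generated, not a description of the procedure actually used, and the movie labels are placeholders.

```python
def cyclic_latin_square(items: list[str]) -> list[list[str]]:
    """Build n presentation orders by rotating the item list one step per row.

    Each item then appears exactly once in every order position across rows.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Placeholder labels for the five experimental movies.
movies = ["movie_1", "movie_2", "movie_3", "movie_4", "movie_5"]
for order in cyclic_latin_square(movies):
    print(order)
```

Participants could then be assigned to the five orders in rotation. A cyclic square balances order position but not which movie precedes which, which is in keeping with a partial rather than a full counterbalancing scheme.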

After each movie, participants completed an order memory task, in the same fashion as in Experiment 1. At the very end of the entire experiment, participants also completed an exit questionnaire asking them to describe any strategies they used to remember or reproduce the durations in the experiment. Any participants who reported counting were removed from later analyses.

Data analysis

For all of the types of tone sets, we calculated the probability with which individuals judged tone B to be closer to tone A. We then conducted a paired-samples t-test to compare the A|BC and AB|C trials to see whether the proportion of judgments that B was closer to A differed depending on the location of the event boundary. We also used a paired-samples t-test to compare the filler trials where tone B actually was closer to tone A or tone C to confirm that participants were able to detect actual differences in the temporal distances between tones.
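The analysis steps described above can be sketched with the Python standard library. The function names are hypothetical and the numbers are made-up illustrative values, not data from the experiment. The sketch computes the proportion of “B closer to A” responses and the paired-samples t statistic, and omits the p value, which requires the t distribution.

```python
import math
import statistics


def proportion_b_closer_to_a(responses: list[str]) -> float:
    """Proportion of trials on which tone B was judged closer to tone A.

    `responses` holds one entry per trial, 'A' or 'C'.
    """
    return sum(r == "A" for r in responses) / len(responses)


def paired_t(x: list[float], y: list[float]) -> tuple[float, int]:
    """Paired-samples t statistic and degrees of freedom for x versus y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# One hypothetical participant's responses on four trials.
print(proportion_b_closer_to_a(["A", "C", "A", "A"]))  # 0.75

# Hypothetical per-participant proportions for four participants.
ab_c = [0.60, 0.50, 0.70, 0.55]   # boundary between tones B and C
a_bc = [0.50, 0.45, 0.60, 0.50]   # boundary between tones A and B
t_value, df = paired_t(ab_c, a_bc)
print(f"t({df}) = {t_value:.2f}")
```

The same two functions apply unchanged to the filler comparison, with the AB and BC filler proportions taking the place of the two boundary conditions.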

Results

Order memory task

On average, participants took 104.81 s to complete the order memory task after each movie, which was significantly less time than naïve participants took (see Experiment 1), t(86) = -4.04, p < .001. Their mean deviation score was M = .79, which was also better than that of participants who sorted the pictures without first watching the movies, t(86) = -9.64, p < .001. This suggests that participants who completed the timing task attended to the content of the movies in this experiment.

Tone sets

Figure 4 presents the mean judgment proportions by condition. Participants were significantly more likely to judge tone B to be closer to A for AB|C tone sets (M = .54, SD = .16) than A|BC tone sets (M = .47, SD = .15), t(59) = 3.66, p = .001, d = .43.

Fig. 4

Proportion of “B closer to A” judgments from Experiment 2. The top graph shows the data for tone sets where the event boundary was between tones A and B (A|BC trials) or between tones B and C (AB|C trials). The bottom graph shows the data for the FILLER tone sets where tone B was closer to tone A than tone C (Filler AB) or where tone B was closer to tone C than tone A (Filler BC). Error bars are ± 1 standard error. **p < .01, ***p < .001

We found a significant difference in the proportion of tone B closer to tone A judgments for AB fillers versus BC fillers, t(59) = 7.02, p < .001, d = 1.43, with people making significantly more B closer to A judgments (M = .71) for the AB fillers than the BC fillers (M = .37) (see Fig. 4).

Discussion

The results of this study showed that when asked to judge tone sets where the three tones were equidistant but an event boundary was located between one pair of tones within each set, the location of the event boundary influenced participants’ proximity judgments of these tones. Specifically, when an event boundary occurred after tone B, such that Tones A and B were contained within the same event but Tone C was located in a different event (AB|C trials), participants perceived Tone B to be closer to Tone A more often than when the event boundary occurred between Tones A and B (A|BC trials). This suggests that event boundaries impact temporal grouping, such that stimuli contained within the same event are judged as closer together while crossing an event boundary impacts people’s ability to form associations between tones that are located in separate events. This is also consistent with the findings by Ezzyat and Davachi (2014) that a context shift may influence people’s judgment of temporal distance between pairs of previously presented stimuli. Therefore, crossing a single naturalistic event boundary in a movie disrupts temporal grouping of stimuli across events in a similar fashion to an artificially created context shift in a stream of stimuli. The results also showed that participants attended to the content of the movies and were sensitive to actual differences in distance between the three tones when asked to make judgments about filler tone sets.

The design of this temporal proximity task could have allowed participants to use a strategy where they directly compared representations of durations between each pair of tones to make their judgment. If participants had done this, then we would have expected the results to mirror those of Experiment 1 where attending to the event boundary pulled attention away from time, causing shorter distance judgments for tone pairs whose tones were on opposite sides of an event boundary. Instead, the results suggest the use of the temporal association strategy mentioned above and supported by findings seen in prior studies investigating temporal order and temporal proximity (Ezzyat & Davachi, 2014; Heusser et al., 2018).

General discussion

Both experiments demonstrate that crossing a single naturalistic event boundary has an impact on people’s perceptions of the temporal characteristics of their experience of a target 5-s interval. In the first experiment involving a prospective temporal estimation task, comparison intervals that included an event boundary were deemed shorter than a previously encoded reference interval held in long-term memory relative to comparison intervals that contained no event boundary. This suggests that devoting attention to memory updating at an event boundary may cause individuals to miss pulses when it comes to timing with a pacemaker. In the second experiment, when presented with a series of three equidistant tones and asked to judge whether the second tone was closer to the first or the third tone, participants tended to group together tones contained within the same event and judged tones on either side of an event boundary as being farther apart. Thus, event boundaries appear to play an important role in how we temporally group information together.

The results of Experiment 1 are consistent with the Attentional Gate Theory (Zakay & Block, 1995) and the view that during event boundary updating, attention was directed to task features other than the temporal estimation task, and temporal pulses were missed. This adds to an extensive body of work supporting the role of attention in prospective timing tasks. Two issues are important to consider with regard to this finding, however. The first is that in this experiment, the timing task that we asked people to perform was separate from the actual temporal event structure portrayed in the movie. We asked people to time an arbitrary interval that was not linked to any specific goal of the actor in the movie. We did not, for example, cue people to begin timing when the actor was performing a meaningful action that lasted 5 s. Instead, the cued temporal intervals were not linked in any meaningful way to temporal properties of the naturalistic movie. Given that the arbitrary timing task we assigned was not useful for predicting what the actor might do next, it may have been assigned lower priority by the attentional control updating processes that operate at an event boundary than information about perceptual or conceptual features of the experience that would enable participants to better make predictions about future behaviors of the actor in the movie.

The second issue is that there is evidence that temporal processing can be disrupted by concurrent tasks that require executive control. Indeed, a number of studies have found that in dual-task scenarios, when individuals were asked to engage in serial temporal production while concurrently completing tasks requiring executive attention processes, such as inhibition, shifting, and updating, the tasks showed bidirectional interference; timing interfered with performance on the executive control tasks and the executive control tasks interfered with timing performance (Brown, Collier, & Night, 2013; Brown, Johnson, Sohl, & Dumas, 2015; Brown & Perreault, 2017). These findings support the idea that executive control processes and timing may rely on a similar pool of attentional resources. Event Segmentation Theory (Zacks et al., 2007) argues that updating at event boundaries is an attentional control process that allocates attentional resources where needed. This would suggest that the updating that occurs at event boundaries and performance of a concurrent, arbitrary timing task may also utilize similar resource pools. Therefore, in situations where individuals must complete a concurrent timing task while experiencing an event boundary, performance on the timing task may suffer as the updating process directs attentional resources away from this task in order to refresh the critical features of the event model. Note that temporal features of the event model may be updated during this process in addition to perceptual and conceptual features of the event structure if they are relevant for prediction. This means that if, for example, the event model contains timing-relevant information that is related to the actor’s goals, this information may be included in the updating of the event model. Thus, while temporal details of the everyday experience that are relevant for prediction may receive additional processing, temporal information from an arbitrary but concurrent timing task may not. This idea is consistent with the expectations of Event Segmentation Theory. Perhaps event segmentation and the executive control process of boundary updating interact with the timing system in such a way that event boundaries occupy resources that might normally be directed towards a concurrent but separate temporal task. Future work should explore this relationship further and determine whether concurrent timing may similarly disrupt updating of event models in working memory.

Our finding for the prospective temporal estimation task used in Experiment 1, however, runs counter to a number of recent studies demonstrating that during prospective timing tasks event boundaries may serve as additional markers of time (Bangert et al., 2019; Faber & Gennari, 2017; Waldum & Sahakyan, 2013). It is unclear whether this difference arose because the current experiment assessed the influence of crossing only a single event boundary, whereas the prior studies investigated the influence of multiple event boundaries on timing, or because of other features of the concurrent timing tasks used in these experiments. There are differences across these studies regarding when event boundaries were experienced in the task (i.e., during encoding of a reference interval, during the estimation/reproduction phase, or both) and also differences in the nature of the temporal decision being made (i.e., via reproduction or estimation). The differences across these studies call for further investigation to determine what circumstances or task characteristics may drive whether event boundaries pull attention away from a concurrent timing task or serve as additional temporal markers. It will also be important to assess how event boundaries impact timing in a task that is inherently a part of the ongoing eventful experience.

The results of Experiment 2, using a temporal proximity task, suggest participants adopt an associative encoding strategy, even without explicit instructions to do so, and that the updating that occurs at event boundaries directs attentional control in such a way that it impacts how temporal associations are formed between items in perception, consistent with prior findings by Ezzyat and Davachi (2014) and Heusser et al. (2018). These results are also similar to studies showing that context shifts, such as a shift in the stimulus category from faces to objects at encoding, make it harder for people to later judge which of two stimuli they experienced was more recent. If a category shift occurred between the probed pair of items during their initial encoding, then participants had a harder time determining recency of presentation than if no category shift occurred between the items at the encoding stage (Dubrow & Davachi, 2013, 2014, 2016). We found that when stimulus items occurred within the same event, they were judged as closer together than when a boundary separated them. Had participants used a duration comparison strategy we would have expected tone pairs with a boundary between them to be judged as closer together, as they were in Experiment 1. Therefore, even in a task where a comparison strategy could have been used for making the judgment, participants adopted a more efficient strategy where they relied on information about the temporal associations to make their judgments of temporal proximity.

While these results may seem discrepant with the results from Experiment 1, we think that, in fact, the results are complementary and arise out of the same attentionally effortful processes that are engaged during working memory updating at event boundaries. When an event boundary is experienced, attentional resources are directed to updating perceptual and conceptual information about the new event and are less available for processing both the temporal pulses that would be required for encoding an absolute representation of a duration and the temporal associations between items from the prior event and the current one. This is supported by findings from Heusser et al. (2018) that while item-to-context associations appear to be enhanced for information presented directly at event boundaries, the more time people spent processing information presented at event boundaries, the poorer their temporal order memory for items spanning those boundaries. This suggests a trade-off between the attentional resources devoted to updating new information that needs to be incorporated into the event model and those dedicated to encoding temporal associations between items from different events. Thus, in both experiments reported here, the attentional resources required for updating at event boundaries reduce the pool of resources available for the temporal processes required in each task.

It is noteworthy that in Experiment 2 the impact of the event boundary occurred within a very short timescale in our task (a 10-s time span) and that it influenced the grouping of simple, equidistant stimuli that were identical in their perceptual characteristics. This suggests that the experience of a single event boundary can impact temporal binding of three highly similar stimuli all concurrently held within working memory. Therefore, event boundaries may influence what information is grouped together as part of an individual’s experience of what is “present” or “happening now” above and beyond the similarity of perceptual characteristics of the stimuli. This may have implications for the concept of the “psychological present” in the timing literature, which is the time period during which a set of stimuli are thought to be occurring during the same moment (see Grondin, 2010, for a review). Typically, the upper range of this period is thought to occur around 2–3 s (Pöppel, 2004), which is much shorter than the timespan utilized in our task. Even though the distance between each of our tone stimuli was longer than this upper limit, it is important to determine to what degree the experience of crossing an event boundary might impact the concept of what is considered “past” and what is considered “present.” For example, is it possible that an event boundary might disrupt the grouping of items presented within the timescale of the psychological present? While our study does not address this question, it is an intriguing direction for future investigations. Certainly, in our task, we show evidence of differential temporal binding when identical tones are separated only by 5-s intervals. This provides further support for the organizing effects of event boundaries and the fact that within-event perceptual information is considered as part of a unit that is bound together, while information across different events is not. It will be important to reconcile how the impact of event boundaries on the structure of temporal associations influences people’s perception of the “present,” both in the context of traditional timing paradigms and in their experience of everyday activities.

Across two different experiments, we have demonstrated that crossing a single event boundary influences not only how long we perceive certain experiences to be but also how we bind information together and organize it in perception. Although we believe the effects we found are best explained by how attention is directed to different components of one’s experience, another possibility may be that when segmentation occurs at an event boundary, the working memory updating that occurs may cause temporally relevant information held in memory from the event prior to the boundary (i.e., start signal of the comparison interval in Experiment 1 and first tone in Experiment 2) to be lost or degraded. A number of studies have revealed that when memory for information presented prior to an event boundary is probed after the boundary occurs, that information is often remembered more poorly than when memory is probed within a single event (Bailey, Kurby, Sargent, & Zacks, 2017; Ben-Yakov, Eshel, & Dudai, 2013; Radvansky & Copeland, 2006; Radvansky, Krawietz, & Tamplin, 2011; Radvansky, Tamplin, & Krawietz, 2010). For example, Bailey et al. (2017) found that when asked to attend to spatial shifts in a text comprehension task, individuals showed poorer memory for both spatial and character features that were relevant prior to the shift. These findings suggest that through the process of updating at event boundaries, irrelevant information or information not currently perceptually available about the prior event may be discarded from memory so that new episodic memories of the next event may be formed.

While this memory interpretation is worth consideration, we do not think this is the most likely explanation for our current findings. First, it implies that weaker item memory relates to changes in both perceived temporal duration and temporal distance. However, there is no straightforward reason why such a reduced accessibility would lead to shorter across-event estimates in one case (Experiment 1) and longer across-event estimates in the other (Experiment 2). This is also complicated by the fact that, as suggested by our results, participants may have used different strategies in the two experiments. Second, even if one concedes that this memory account could explain a loss of temporal pulses and the resultant temporal compression for durations containing an event boundary in Experiment 1, it is not clear why weakened item memory would lead to reduced item-item temporal associations in Experiment 2. Traditional work on recognition memory has drawn a distinction between item and associative memory processes with evidence for differential involvement of medial temporal lobe structures during encoding of each type of information (Cohen, Poldrack, & Eichenbaum, 1997; Davachi, 2006). In addition, many studies have shown that aging leads to specific declines in associative recognition while item recognition is relatively preserved (Old & Naveh-Benjamin, 2008). There is further evidence of a dissociation between these two types of memory from studies showing that increases in emotional arousal and negative content of materials at encoding may impair associative memory while simultaneously enhancing item memory (Bisby & Burgess, 2014; see Bisby & Burgess, 2017, for a review).

A more parsimonious explanation that more easily accounts for the different temporal perception effects in our two experiments is that the relevant temporal information may simply receive less attention as individuals engage in boundary updating. Thus, events likely have their influence on the temporal estimation and temporal proximity judgment tasks through attentional mechanisms that may be strategically adjusted to attend to aspects of the current task conditions that are most relevant to event boundary updating and the organization of one’s ongoing experience. In both experiments, devoting attention to updating perceptual and conceptual features of the viewed activity left fewer attentional resources for accumulating temporal pulses to represent the relevant duration in Experiment 1 and for temporally binding information across contexts in Experiment 2. Remarkably, we have demonstrated that updating at even a single naturalistic event boundary is an attentionally demanding process that takes away attentional resources from concurrent temporal estimation at a relatively short timescale, even when participants know in advance they need to pay attention to time or temporal relationships.

Overall, our results demonstrate that event segmentation in naturalistic circumstances shapes our perceptions of the temporal characteristics of our experience through episodic memory updating that captures timing-relevant attentional resources. Referring to our coffee example, one may perceive that the coffee brewing time is faster if one experienced a concurrent event boundary, such as retrieving a mug from the cupboard, than if a concurrent boundary was not perceived (e.g., the mug was sitting on the counter already). Likewise, this concurrent event boundary might make someone think that the moment the coffee stopped brewing was closer to when she finally poured the coffee into her mug than when she initially filled the carafe with water, even if the timing between those sets of events was the same.