Focusing completely on a task can be difficult. So difficult, in fact, that people regularly report lapses in attention, in which one’s mind wanders away from the current situation. To investigate this phenomenon, researchers have employed a range of different laboratory situations, including visual search, reading, and sustained-attention tasks (Forster & Lavie, 2009; Foulsham, Farley, & Kingstone, 2013; Jackson & Balota, 2012; McVay & Kane, 2009; Reichle, Reineberg, & Schooler, 2010; Smilek, Carriere, & Cheyne, 2010; Uzzaman & Joordens, 2011). Common to these types of settings is the relatively invasive method of measuring mind wandering (MW) by interrupting individuals to ask whether, at the current moment, they are “on task” or “off task” (i.e., engaged in MW). This probe-caught methodology has revealed that in lab-based situations individuals are frequently off task, with MW rates climbing to 50% (Kane et al., 2007; Mooneyham & Schooler, 2013; Risko, Anderson, Sarwal, Engelhardt, & Kingstone, 2012; Thomson, Seli, Besner, & Smilek, 2014). Moreover, the effect of MW on task performance is substantial, negatively impacting response times and accuracy, as well as memory for the test materials (Antrobus, 1968; Esterman, Noonan, Rosenberg, & DeGutis, 2012; Giambra, 1995; Kam, Dao, Stanciulescu, Tildesley, & Handy, 2013; McVay & Kane, 2012; Storm & Bui, 2016; Varao Sousa, Carriere, & Smilek, 2013). These and comparable findings across lab and more complex, real-life situations (Galéra et al., 2012; Lindquist & McLean, 2011; Wammes, Boucher, Seli, Cheyne, & Smilek, 2016) support the general notion that MW rates are stable, reflecting the diversion of attention away from the task at hand, which in turn interferes with the encoding and/or integration of task-relevant information (Kam et al., 2012; Smallwood & Schooler, 2006).

Though the consistency of probe-caught MW results speaks to both the stability of MW and the conclusion that the lab results scale up to real-life situations, one might ask whether this stability is due in part to the use of the thought-probe methodology itself. In short, does interrupting individuals and asking them to reflect on their state of mind create a common, artificial situation at all levels of investigation, and is this the reason for the reliable MW results? Or, to put it even more forcefully, the robust MW results may be anything but evidence that MW is relatively stable across different settings; rather, they may be evidence that the probe-based methodology is inserting a singular common artificial event into different situations. Accordingly, the stability of MW may be merely an illusion that reflects the common practice of using the probe-based method to measure MW. If this is true, then probes influence not only the specific moment of query within a task, but also how participants view, and in turn perform during, the entire experiment. The aim of the present study was to address this issue by examining MW using the ecologically valid self-caught methodology and then by testing whether self-caught MW rates change when the probe-based methodology is inserted into the experiment.

The self-caught method, as the name suggests, requires individuals to report an MW event whenever they notice that their mind has strayed from the task at hand (Giambra, 1993; Schooler, Reichle, & Halpern, 2004). Phenomenologically, this seems more akin to the actual way in which individuals become aware of their MW: That is, they catch themselves mind wandering rather than responding to an external probe that demands that they reflect immediately on their state of mind. Thus, a primary attraction of the self-caught method is that it introduces a high degree of ecological validity into the testing situation, which can carry with it numerous advantages (Chisholm et al., 2014; Kingstone, Smilek, & Eastwood, 2008; Risko, Richardson, & Kingstone, 2016). However, the lack of experimental control that accompanies the self-caught method is a double-edged sword, as it means that when using this method a researcher can do little else but stand back and receive whatever MW reports are logged by participants. One cannot be certain how many MW reports one will collect, nor that those MW reports will coincide with task events that are of particular research interest. Critically, the probe method addresses both of these limitations. Moreover, on the rare occasions when probe-caught and self-caught methods have been combined, the evidence suggests that the probe-caught method is more likely to detect MW events than self-caught reports (Jackson & Balota, 2012; Sayette, Reichle, & Schooler, 2009; Schooler et al., 2004). These limitations with the self-caught methodology notwithstanding, because of its ecological validity and lack of experimental control, the self-caught method is ideally suited for our present purpose, which is to test whether the probe-caught methodology is creating an artificial situation that alters MW performance across an experimental session. By using the self-caught results as our dependent variable, we can manipulate the presence of probes to determine whether self-caught MW reports are influenced by the introduction of probes.

To create a strong test of this issue, we chose to measure MW in the classroom as this is a nonartificial, natural environment for university students, where MW frequently occurs (Lindquist & McLean, 2011; Risko et al., 2012; Unsworth, McMillan, Brewer, & Spillers, 2012; Varao-Sousa & Kingstone, 2015; Wammes, Seli, Cheyne, Boucher, & Smilek, 2016b; Young, Robinson, & Alberts, 2009). The key question was whether self-caught MW would be impacted by the introduction of thought probes. In the present study, we examined students’ ability to self-catch MW during three classroom sessions. Thought probes were introduced in the middle session only. By comparing MW rates in the self-caught sessions (Sessions 1 and 3) with the MW rates in the self-caught/probe-caught session (Session 2), we could examine whether self-caught MW rates are altered by the introduction of thought probes.

If our hypothesis is correct, and the probe-caught method is creating an artificial testing environment, then the self-caught MW rates in Sessions 1 and 3 would be significantly different from those in Session 2, when MW probes were included. Alternatively, if the probe-caught method is providing a “pure measure” of MW (as the field assumes), then the self-caught MW rates should be relatively stable across all three sessions and, most crucially, should be unaffected by the introduction of thought probes in Session 2. On the basis of a vast wealth of past work, we expected that the overall probe-caught MW rate would fall in the range of 30%–50% (e.g., Lindquist & McLean, 2011; Risko et al., 2012; Risko, Buchanan, Medimorec, & Kingstone, 2013; Unsworth et al., 2012; Varao-Sousa & Kingstone, 2015; Wammes, Seli, et al., 2016a).

Method

Participants

The participants in this study were students enrolled in an Introductory Psychology course at the University of British Columbia (UBC). The students (N = 259) were informed of the testing dates and general task protocol via the UBC Learning Management System (Connect); however, only the data for participants who completed all three sessions of the study are included in the analyses. Three additional participants were removed from analyses for reporting self-caught MW rates greater than three SDs above the mean (thereby influencing the kurtosis and skew of the data). Of the included participants (n = 86), 63 were female, and their ages ranged from 17 to 28 years (M = 19.07, SD = 1.72). Participants received course credit for each session they completed. Participants provided informed consent before taking part in each session. This study was approved by the University of British Columbia Ethics Board.
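The exclusion rule described above (removing participants whose self-caught MW rate lies more than three SDs above the mean) can be sketched as follows. This is only an illustration under stated assumptions: the function name `exclude_high_outliers` is hypothetical, the sample SD (n - 1 denominator) is assumed, and the data in any call would be the reader's own, not the study's.

```python
from statistics import mean, stdev


def exclude_high_outliers(rates, n_sd=3.0):
    """Split rates into (kept, removed), removing values more than
    n_sd standard deviations above the mean.

    Assumption: the sample SD (n - 1 denominator) is used; the source
    does not state which SD formula was applied.
    """
    cutoff = mean(rates) + n_sd * stdev(rates)
    kept = [r for r in rates if r <= cutoff]
    removed = [r for r in rates if r > cutoff]
    return kept, removed
```

Note that the rule is one-sided: only unusually high rates are removed, which matches the description of rates "above the mean."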

Materials and measures

Course details

The course was offered from 9:30 to 10:50 a.m. on Tuesdays and Thursdays. The testing sessions covered content related to neurons and neurotransmitters (Session 1), genetics and twin studies (Session 2), and heredity and evolution (Session 3). Participants were provided a response sheet at the start of each class to record their answers.

Self-caught mind wandering

In all three sessions, participants were asked to self-catch MW and to report any instance by placing a check mark in the 10-min interval that corresponded with the current time (e.g., 10:10–10:19) on their response sheet. Laboratory studies investigating self-caught reports tend to have participants respond via keypress whenever MW is noticed. Because students were not provided with electronic devices as part of this class, the “paper-and-pencil” method was a natural solution.

Probe-caught mind wandering

In Session 2 only, participants responded to six visual probes in addition to reporting self-caught MW. The probe frequency was selected on the basis of prior research in a live lecture environment (Lindquist & McLean, 2011; Varao-Sousa & Kingstone, 2015; Wammes, Seli, et al., 2016b). The probes were presented on PowerPoint slides that were incorporated into the class lecture. The probes asked students to indicate whether they were on or off task by circling their response on their response sheet. Participants were given roughly 5 s to answer each probe.

Memory test

At the end of each lecture, six multiple choice test questions were displayed to the class via PowerPoint. Test questions were created after previewing the lecture slides from the course instructor. Participants indicated their answer by circling the option they felt to be correct (i.e., A/B/C/D) on the response sheet. Participants were provided 10 min to record their answers and were asked not to use their class notes or consult with classmates during the testing period.

Interest and motivation ratings

Once the memory test was completed, participants reported their interest and motivation in the lecture (“How interesting did you find the material presented in today’s lecture?” and “How motivated were you to attend to the lecture?”). A 5-point Likert scale was used, where 1 = low interest/motivation and 5 = high interest/motivation. Participants indicated their responses by circling the number that they felt corresponded to their experience. These measures were collected in order to replicate past findings that both factors relate negatively to probe-caught MW rates (Hollis & Was, 2014; Lindquist & McLean, 2011; Unsworth & McMillan, 2013) and to examine whether this relationship extends to self-caught MW.

Prior experience

Finally, participants recorded how much experience they had with the content presented in the lecture. Introductory psychology classes at UBC tend to draw in many upper-year students from other areas, with these students often having had prior exposure to similar content (e.g., neuroscience or biology majors). Participants were asked to indicate experience with the material by circling one of three options: “I’ve taken multiple courses on the topic”; “I’ve taken 1–2 courses on the topic”; “This was my first lecture on the topic.” We suspected that this factor might influence memory test performance.

Procedure

At the start of each class period, a link to an online consent form was made available to students. Any student wishing to participate was asked to go to the e-form to provide consent. Students were provided response sheets on which all responses were made. On the response sheet, students recorded a numeric identifier that allowed researchers to anonymously track performance over the three sessions. The following instructions were provided verbally at the start of each session:

During the lecture I’d like you to note any mind wandering that you experience by putting a checkmark in the box that corresponds to the current time. This means that you might have some time block (for example: 10:00–10:09) where you have 8 checkmarks, some where you have 2 or 3 and some where you have none. Simply put a checkmark each time you notice mind wandering, in the corresponding time block. Mind wandering is any thought that is not related to the course material being presented. Examples of mind wandering include thinking about what you are going to have for lunch, thinking about something you did on the weekend, other course work, etc.

In Session 2, participants were also provided with the following instructions:

Additionally, you will be asked to indicate your attentional focus at a number of specific time points during the lecture. During the lecture you will see a PowerPoint slide that asks you to indicate whether you were mind wandering or on task. When you see this slide, again on the sheet provided, simply circle your response to indicate your thoughts in the moments before you saw that slide. Remember you provide reports both anytime you notice mind wandering and at the specific time points where the question is on the slide.

Students were reminded that participation was optional and that their instructor would not access individual responses or know whether students chose to participate. Participants were then given the opportunity to ask questions, after which the lecture began. The lectures ran roughly 60 min, with 10 min provided at the end of each session for participants to complete the memory test and follow-up questions.

Results

Across all three sessions, a small majority of participants (54%) reported having taken one to two courses on the topic presented (Footnote 1). Descriptive statistics are presented in Table 1. In addition to standard null hypothesis significance testing, we conducted Bayesian analyses using the BayesFactor package in R (Morey & Rouder, 2015; R Core Team, 2015). Bayesian analyses allow researchers to move beyond simply failing to reject the null by computing a Bayes factor, which expresses how much more likely the observed data are under one hypothesis than under the other. To interpret a Bayes factor, the subscript 01 or 10 is used to indicate whether the value is in support of the null (BF01) or the alternative (BF10) hypothesis. The greater the Bayes factor, the stronger the evidence in support of the indicated hypothesis.
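As a brief illustration of the subscript convention, the two Bayes factors can be written as reciprocal likelihood ratios. Here D denotes the observed data, and H_0 and H_1 denote the null and alternative hypotheses; this notation is introduced for the example only and is not taken from the original text.

```latex
\mathrm{BF}_{01} = \frac{p(D \mid H_0)}{p(D \mid H_1)},
\qquad
\mathrm{BF}_{10} = \frac{p(D \mid H_1)}{p(D \mid H_0)} = \frac{1}{\mathrm{BF}_{01}}
```

For example, a value of BF01 = 10 would mean that the data are 10 times more likely under the null than under the alternative, which is equivalent to BF10 = 0.1.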

Table 1 Descriptive statistics for dependent variable measures

Self-caught MW reports

A repeated measures analysis of variance (ANOVA) suggested that self-caught rates of MW did not differ significantly across the three sessions, F(2, 170) = 1.44, p = .24, BF01 = 13.56. This BF indicates that the data are 13.56 times more likely under the null hypothesis (that there is no difference across sessions) than under the alternative hypothesis (Footnote 2).
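For readers unfamiliar with the test, the F ratio of a one-way repeated measures ANOVA can be sketched as below. This is a minimal illustration, not the analysis code used in the study (which is not given in the source); the function name `rm_anova_oneway` is hypothetical, and the sketch assumes complete data with one row per participant and one column per session.

```python
def rm_anova_oneway(data):
    """One-way repeated measures ANOVA.

    data: list of rows, one per participant, each with one score per
    condition (e.g., session). Returns (F, df_effect, df_error).
    """
    n = len(data)        # participants
    k = len(data[0])     # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total sum of squares into condition, participant,
    # and residual (condition x participant) components.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error
```

With 86 participants and three sessions, the degrees of freedom from this formula are 2 and (86 - 1) × (3 - 1) = 170, matching the values reported above.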

Probe-caught MW reports

In Session 2, participants indicated that they were mind wandering in response to thought probes 40% of the time. This rate falls squarely in the center of the expected 30%–50% range (e.g., Lindquist & McLean, 2011; Varao-Sousa & Kingstone, 2015; Wammes, Boucher, et al., 2016a).

MW time course

As can be seen in Fig. 1, self-caught MW reports were blocked by the 10-min interval in which participants responded (e.g., 9:40–9:49). Analyses indicated a significant difference in self-caught reports for Session 1, F(4, 340) = 3.42, p = .009, BF10 = 0.10, but not for Session 2, F(4, 340) = 1.48, p = .21, BF01 = 56.96, or Session 3, F(4, 340) = 0.78, p = .54, BF01 = 84.42. For Session 1, a linear trend analysis was conducted to determine whether the self-caught reports increased over time; however, this result was nonsignificant, F(1, 85) = 0.37, p = .55.

Fig. 1 Time course of self-caught mind wandering reports, split by session. The colored bands represent standard errors of the means

Figure 2 presents the time course for probe-caught MW responses. A repeated measures ANOVA revealed a significant effect of time on task, F(5, 425) = 3.97, p = .002, BF10 = 3.81. A linear trend analysis revealed a significant trend, F(1, 85) = 5.77, p = .018. Follow-up analyses indicated that, after Bonferroni corrections, only Time 2 was significantly different, such that less MW occurred then than at Time 3 or Time 6, ps < .04.
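One common way to test a linear trend across repeated time points is to compute a contrast score per participant and test whether the mean score differs from zero. The sketch below assumes this approach and equally spaced time points; the source does not state how the trend analysis was computed, and the function name `linear_trend_f` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def linear_trend_f(data):
    """Linear trend test for repeated measures.

    data: list of rows, one per participant, each with one score per
    equally spaced time point. Returns (F, df_effect, df_error).
    """
    n = len(data)
    k = len(data[0])
    # Centered, equally spaced weights (e.g., -2.5 ... 2.5 for k = 6).
    weights = [j - (k - 1) / 2 for j in range(k)]
    # One contrast score per participant: positive if scores rise over time.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in data]
    # One-sample t test of the contrast scores against zero; F = t squared.
    t_value = mean(scores) / (stdev(scores) / sqrt(n))
    return t_value ** 2, 1, n - 1
```

Under this approach the error degrees of freedom are n - 1, which for 86 participants gives 85, consistent with the F(1, 85) reported above.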

Fig. 2 Time course of probe-caught mind wandering reports in Session 2. The colored bands represent standard errors of the means

Memory test

Figure 3 displays the mean performance across sessions. A repeated measures ANOVA revealed that memory test performance differed significantly across the three testing sessions, F(2, 170) = 33.80, p < .001, BF10 > 150. Post-hoc analyses using a Bonferroni correction for multiple comparisons (corrected p value = .05/3 = .017) revealed that memory test performance differed significantly between each of the conditions (all ps < .002), with the best performance in Session 2 and the worst performance in Session 1.
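The Bonferroni threshold used above is simply the nominal alpha divided by the number of comparisons. A minimal sketch follows; the function names are hypothetical, and the p values in any call are illustrative rather than the study's.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Return the per-comparison significance threshold."""
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p value as significant (True) or not (False)
    against the Bonferroni-corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

With three pairwise comparisons among the sessions, the threshold is .05/3, or approximately .017, so any comparison with p < .002 clears it comfortably.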

Fig. 3 Memory test performance across the three sessions. Error bars represent standard errors of the means

Correlations

MW rates

Table 2 displays the statistical relationships between MW reports across all sessions. Individuals’ MW rates were significantly and positively correlated across all sessions, with the exception of the Session 1 self-caught reports with the Session 2 probe-caught reports (p = .07), although the relationship was still positive. This suggests that the ability to report MW remained consistent at a participant level across sessions.
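For reference, the Pearson correlation between two sets of per-participant MW rates can be computed as in the sketch below. The function name `pearson_r` is hypothetical, and the inputs would be paired lists of equal length (e.g., one participant's rate in two different sessions at each position).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

A positive value indicates that participants who reported more MW in one session also tended to report more in the other.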

Table 2 Pearson’s r correlations for mind wandering measures across all three testing sessions (N = 86)

MW and other measures

Table 3 displays statistical summaries for the correlations between MW and the other measures collected. Across all three sessions, no relationship between MW and memory test performance was found. This was the case for both self-caught and probe-caught MW reports. Both self-caught and probe-caught MW reports were significantly negatively correlated with interest and motivation ratings. This relationship suggests that as interest and motivation in the course content decreases, MW reports increase.

Table 3 Pearson’s r correlations of mind wandering with other measures across all three testing sessions

Discussion

In this study we investigated whether the relatively stable MW rates found across past investigations were an artifact of the prevalent use of the thought-probe method. Our data allow us to reject this hypothesis. In the present study, the probe-caught MW rate was 40%, which dovetails with the rates reported during other lecture-based studies (Lindquist & McLean, 2011; Varao-Sousa & Kingstone, 2015; Wammes, Boucher, et al., 2016b). If the inclusion of thought probes had altered the testing situation, the self-caught MW rates in Session 2, in which thought probes were also present, should have differed from those in Sessions 1 and 3, in which only self-caught reports were collected. However, neither the self-caught MW rates nor their time courses across the three lectures differed. Students caught themselves mind wandering roughly six or seven times per lecture, with these reports being evenly distributed over time. The thought probes in Session 2 had no impact on self-caught performance. Thus, these data indicate that the probe-caught method is a valid sampling method and that the stability of MW that this method has yielded is valid and not an artifact of the experimental design.

In addition to recording MW, we measured memory test performance and both interest and motivation ratings. Because each session covered a different topic, it is not surprising that we found variation in terms of these factors. Perhaps more interesting is that although these subjective ratings varied across the different sessions, the self-caught MW rates did not. This suggests that individual differences in attention are more stable than subjective ratings of interest or motivation. MW rates were also highly and positively correlated, speaking further to the stability of individual MW variation.

We also replicated prior work reporting relationships between probe-caught MW and both interest and motivation ratings, and we found evidence that this finding extends to self-caught MW rates (Forster & Lavie, 2014; Lindquist & McLean, 2011; Phillips, Mills, D’Mello, & Risko, 2016; Unsworth & McMillan, 2013). We also replicated the recent finding that MW may not correlate with memory test performance in live lectures (Varao-Sousa & Kingstone, 2015; Wammes, Seli, et al., 2016a), though it is worth noting that since overall memory performance was quite high, it is possible that ceiling effects reduced the ability to detect a correlation.

Collectively, these data converge on the conclusion that both the self-caught and probe-caught methods are valid techniques for measuring MW. What should be made of our finding that self-caught and probe-caught MW methods can be combined seamlessly, without one method influencing the other? At one extreme, one could conclude that these data indicate that each method is capturing a similar type of attentional lapse. This would follow if the self-caught method provides participants the opportunity to indicate MW that occurs between probes, notwithstanding the fact that instances might still be “caught” at times when participants have not yet realized that they are mind wandering (Jackson & Balota, 2012; Sayette et al., 2009; Schooler et al., 2004). At the other extreme, it is conceivable that these two MW methodologies do not interact because they are capturing distinct types of MW (e.g., probes are measuring lapses in attention that operate outside of conscious awareness). Some tentative support for this idea is found in the different time course data for the two MW methods. Probe-caught MW showed a linear trend, increasing over the course of the lecture; however, no linear trend was found for self-caught MW. It is worth highlighting that although probe-caught MW showed a linear trend, this was driven by only two time points, and thus the stability of the pattern is unclear. The lack of strong support here is consistent with research suggesting that in live lectures, attention may wax and wane to a different rhythm than in laboratory settings, and thus our findings warrant further exploration (Wammes, Boucher, et al., 2016b; Wammes & Smilek, 2017). Another point of support for self-caught and probe-caught MW being distinct measures of MW comes from the correlations between MW reports reported in Table 2. The correlations between self-caught reports are strong across all sessions, but when one compares self-caught and probe-caught reports, the correlations between sessions are much weaker. Note that the higher correlation between self-caught and probe-caught MW in Session 2 (r = .50) is likely due to the reduced variance arising from both measures being derived on the same day.

Beyond our own data, previous research by Schad, Nuthmann, and Engbert (2012) suggests that MW is not simply an all-or-none cognitive state, and that attention may decouple in a more graded manner. Different MW measures may sample qualitatively different parts of this gradient. Distinguishing between these alternatives seems a fruitful avenue for future research. Such research might examine this issue by using objective behavioral measures that could be correlated with both types of MW reports—for example, blink rates (Smilek et al., 2010), pupil dilation (Franklin, Broadway, Mrazek, Smallwood, & Schooler, 2013), or fidgeting (Carriere, Seli, & Smilek, 2013; Farley, Risko, & Kingstone, 2013; Seli et al., 2014). In addition to looking for objective measures that relate reliably with MW measures, it would be fruitful to carefully examine covariates that might influence MW reports and the relationship between measurement methodologies—for example, time on task (Farley et al., 2013; Thomson et al., 2014; Wammes, Boucher, et al., 2016a), task difficulty (Feng, D’Mello, & Graesser, 2013), or probe frequency (Seli, Carriere, Levene, & Smilek, 2013).

Prior MW research has been dominated by probe-caught measurements. Although researchers have proposed that individuals lack enough meta-awareness to reliably report MW without a thought probe (Jackson & Balota, 2012; Sayette et al., 2009; Schooler et al., 2004), the findings of this study suggest that individuals are quite capable of noticing and reporting MW during a lecture. The use of the self-caught method provides stable rates of MW in a live lecture setting, indicating that this is a viable method for future research. Furthermore, as we demonstrated in the present study, self-caught MW rates are unaffected by the introduction of the probe-caught method, meaning that the ecologically valid self-caught method can be used alongside the more controlled probe-caught method without any negative effects, and with the benefit of collecting additional data. In short, these methods appear to be complementary, and by availing themselves of both, MW researchers can have the best of both worlds.