Introduction

Dietary behavior in humans is complex and controlled by a variety of factors (Drewnowski, 1997a, b). More than two-thirds of adults in the United States are overweight or obese (Ogden, Carroll, Kit, & Flegal, 2014). A much smaller number suffer from binge-eating disorder, bulimia nervosa, or anorexia nervosa and are severely emaciated (Hoek, 2006). A growing body of research on these issues considers the role of cognition in understanding eating (Hall, 2016; Shafran, Lee, Cooper, Palmer, & Fairburn, 2007). One focus of such research is the attentional bias that we have for energy-dense, typically high-fat and/or high-sugar, food.

Using variations of some standard cognitive tasks, such as the dot-probe task and various eye-tracking paradigms, previous research has shown an attentional bias for food over non-food objects and a bias for energy-dense food over food with little energy value (Mogg, Bradley, Hyare, & Lee, 1998; Van Dillen, Papies, & Hofmann, 2013; Werthmann et al., 2011). For example, Mogg et al. (1998) used a dot-probe task to demonstrate that people have an attentional bias for food words compared with non-food words. In their task, they presented subjects with two words, one above a fixation cross and one below. In one case, one of the words was a food-related word (e.g., sandwich) paired with a neutral, unrelated non-food word (e.g., chair). In another case, they presented a transport-related word (e.g., airplane) and an unrelated neutral word. They presented the word pairs for a brief period of time, followed by a dot probe that appeared in the place of one of the words. The subjects were told to respond as quickly as possible to the location of the dot probe. They found that when subjects were hungry, they responded faster to dot probes that were in the same previous position as food words compared with dot probes in the same previous position as non-food words.

Recently, other researchers have used distraction paradigms as a way to better understand how food stimuli might distract us in the presence of other tasks. Critically, these studies attempt to determine whether food-related stimuli are more able to interfere with performance on an ongoing task than are non-food stimuli. Distraction paradigms test the ability of a subject to remain on task in the face of interference, that is, they involve executive control. The goal of that work has been to better understand the link between certain cognitive abilities and predilections and dietary outcomes (e.g., high body mass index).

Lessons about distraction from the attentional capture literature

Some ideas relevant to the design of distraction experiments can be gleaned from the literature on attentional capture. Psychologists have been interested in how attention is controlled for well over a century (James, 1890/1950). Early research was concerned with how internal factors, such as expectation and intention, controlled where attention was directed. Recently, external factors, such as stimulus salience or the suddenness of onset of a stimulus, have been the subject of intensive investigation. These two lines of research are now described as pertaining to “goal-directed” and “stimulus-driven” attention. Research on stimulus-driven attention suggests that attention may be captured by particular external stimuli and that this capture may be unintentional and directly contrary to the subject’s intentions (Bacon & Egeth, 1994; Graves & Egeth, 2016; Leber & Egeth, 2006; Theeuwes, 1994, 2010; Yantis & Jonides, 1984).

A major debate in the field of attention is concerned with the “automaticity” of attentional capture. An early critical example was provided by Folk, Remington, and Johnston (1992). At the time they conducted their experiments, it had been well-established that a suddenly onset stimulus could attract attention, as could a salient singleton (e.g., the only red in a field of green elements) (Theeuwes, 1990; Yantis & Jonides, 1984). Folk et al. (1992) were interested in whether attentional capture was modulated by the feature similarity between the salient distractor (the captor) and the target item. Their studies used a variant of the spatial cuing paradigm. In their first experiment, abrupt-onset cues produced a validity effect for sudden onset targets. That is, reaction time was faster when the target appeared in the same location that the cue had just appeared in than when it appeared in a different location. However, sudden onset cues had no such effect when the target was based on a color discontinuity. In their second experiment, the cue was a color discontinuity (a red stimulus presented among several white stimuli). In this case, the cue produced a validity effect for color-defined targets but not for abrupt-onset targets. Note that the cues in their experiments were statistically nonpredictive of the location of the target or the correct response. For this reason, it seems reasonable to think of them as irrelevant distractors. (We used the Folk et al. [1992] terminology of “cue” to avoid confusion for readers familiar with their research.)

The double dissociation in the results of Folk et al. (1992) has been described as an example of contingent capture: attention is oriented toward (and perhaps captured by) nonpredictive cues/distractors only when those cues/distractors are defined by features that match the defining features of the target. What this and subsequent studies made clear was that to determine whether capture by distractors is independent of attentional control settings, it is necessary to eliminate features from the potential distractor display that match the defining features of the target (Bacon & Egeth, 1994; Folk et al., 1992; Folk, Remington, & Johnston, 1993).

Forster and Lavie (2008) made a serious effort to determine if attentional capture could be observed when target-defining features were eliminated from potential distractors. In their study, subjects were presented with visual search displays near the center of a monitor. On each trial, they had to indicate whether an N or an X was present in a background of several other letters. Subjects were requested to focus attention on this task. It was already known that letters presented in the periphery (and thus, due to their location, were task irrelevant) would interfere with task performance (Lavie, 1995). However, letters are of the same category as targets and thus, arguably, not totally irrelevant (see Lleras, Buetti, & Mordkoff, 2013 for an insightful discussion of this issue.) Therefore, Forster and Lavie (2008) presented on some trials a picture of an irrelevant cartoon character in the periphery. This stimulus slowed performance on the search task in a variety of conditions across three experiments. This result was taken to support the idea that attention could be captured by entirely irrelevant distractors.

Forster and Lavie (2008) had eliminated several target-defining features from their stimuli (e.g., the target was central, the distractor was peripheral; the target was alphanumeric, the distractor was pictorial). However, Gibson and Kelsey (1998) demonstrated that subjects also monitor displays for features other than just local, target-specific features. In particular, subjects may be set to detect features that signal the onset of the task-relevant target display as a whole. For the Forster and Lavie (2008) experiment, this means that, because the search display and the distractor onset at the same time, the distractor cannot be considered totally irrelevant to the task. To circumvent this problem, Forster and Lavie (2011) modified the procedure of their earlier paper. The central task involved successive judgments about a central matrix of letters and digits that remained static and visible for several seconds. During this time, dynamic (i.e., suddenly onset) peripheral distractor pictures were shown. They still interfered with performance, thus establishing that totally irrelevant distractors can capture attention, even when their onset is differentiated from that of the relevant stimuli.

Bridging the gap: combining attentional capture paradigms with studies on food distraction

The experiments described point to critical methodological concerns that need to be addressed when we try to determine whether some stimulus class is acting as a distractor or is more distracting than some other stimulus class. It would appear that existing studies of distraction by food-related stimuli have not fully addressed these concerns. As an example, consider an adaptation of a method previously used to examine the distracting power of emotional stimuli, the attentional blink paradigm (Most, Chun, Widders, & Zald, 2005; Most & Wang, 2011). Specifically, in studies using the attentional blink paradigm in which the potentially interfering stimuli included pictures of food, accuracy of detection of the target picture was lower when the distractor stimulus was food than when it was a neutral stimulus (Neimeijer, de Jong, & Roefs, 2013; Piech, Pastorino, & Zald, 2010). Furthermore, the magnitude of the effect depended on the subject’s state of hunger (Piech et al., 2010). Although their results are both interesting and plausible, these studies use methods that create a task situation where the distracting stimuli share critical features with the target stimuli and thus are relevant to the current goals of the participant.

Putting this more generally, in the attentional blink paradigm all stimuli (including the targets and distractors) are typically of the same type (e.g., colored pictures), characterized by sudden onsets, and share the same spatial location. It may not be possible to remove all such shared features (all stimuli, for example, are being shown in a laboratory), but one can at least try to maximize the distinctiveness of targets and distractors. To provide a strong test, experimenters should make the irrelevant stimuli as unrelated to the task as possible.

Our goal for the present experiments is twofold. First, by adapting the previous paradigm from Forster and Lavie (2011), we presented participants with a novel task where the food images were truly irrelevant to the task at hand. Second, by varying the nutritional content of these irrelevant distractors, we could address a gap in the literature that has failed to investigate whether all food that is irrelevant to the current task captures attention or only foods that are the most desirable (again, something that would increase our understanding of food behavior in the real world).

To accomplish this, we assessed differences in the distracting power of images of energy-dense foods (e.g., high fat, high calorie), low-energy foods, and ordinary objects (Figure 1) by adapting a variant of the task used by Forster and Lavie (2011). Specifically, subjects were shown a set of four alphanumeric characters that they were to classify, one at a time, as digits or letters by pressing one of two response keys. At some time during the execution of this task, on a subset of the trials, a picture was shown; the picture was irrelevant to the classification task and no response of any kind was required to it. Note three key features of the task. First, by their natures the pictures and the alphanumeric characters were distinct, both semantically and perceptually. Second, they appeared in distinctive locations; the characters were central and the pictures peripheral. Third, the picture was not presented until the subject had already responded to the first of the characters. At that moment, the suddenly onset picture had dynamic properties, whereas the array of alphanumeric characters was static (Folk et al., 1993; Gibson & Kelsey, 1998). It has been shown that even when these three features are in place pictorial stimuli can capture attention (Forster & Lavie, 2011). The question we ask in Experiment 1 is whether the magnitude of attentional capture varies with the energy value of the stimuli depicted in the pictures. Finding that it does so vary, in Experiment 2 we explored the lability of the real-world goals that drive attentional capture and their dynamic relationship with attentional capture by giving subjects a small amount of energy-dense food to eat before they performed the experimental task. In Experiment 3, we aimed to replicate the key findings from Experiments 1 and 2; thus, half of the subjects received a small amount of energy-dense food before the experimental task and half did not. Furthermore, we wanted to extend those findings by testing whether consuming energy-dense foods decreases attentional capture by other attractive stimuli or whether these effects were stimulus-specific.

Fig. 1

Design for Distractor Task for Experiments 1 and 2. Study design and sample stimuli. Observers were instructed to respond to each of the 4 symbols indicating, in order, whether each is a letter or number as quickly as possible. On 50% of trials, a distractor image randomly appeared prior to the 2nd, 3rd, or 4th response (never with the onset of the display). There were three types of distractor images: energy-dense food images, low-energy food images, and non-food object images. Every participant completed 600 trials. Note: Images are not to scale.

Experiment 1

Participants

We ran 18 Johns Hopkins University undergraduate students and community members (mean age = 19.4 years; 7 males, 11 females) with normal or corrected-to-normal visual acuity and normal color vision. We based our sample size on two criteria: (1) previous literature and (2) a power analysis. Previous work investigating attention and bias toward food images has used groups of 18 subjects to look at effects similar to ours (Castellanos et al., 2009). In addition, we conducted a power analysis using G*Power (Faul, Erdfelder, Lang, & Buchner, 2007), which revealed that, given an effect size of ηp² = 0.14, based on what related studies have previously found (e.g., Veenstra, de Jong, Koster, & Roefs, 2010), at least 18 participants would be required to have 95% power to detect the effect in our design. Research was performed with the approval of and in accordance with the Johns Hopkins Homewood Institutional Review Board. Informed consent was obtained from all participants in all experiments.

Apparatus

Experimental sessions were performed on a Dell Precision T-3400 2.33-GHz computer. Stimuli were presented on a Dell 1708 FP monitor. Stimulus presentation was performed using programs written in MATLAB and using the PsychToolbox software (Brainard, 1997). The screen had a refresh rate of 60 Hz, and the resolution of the screen was 1,280 × 1,024 pixels. Participants sat at a viewing distance of approximately 60 cm.

Stimuli

Each trial contained a matrix of four symbols surrounding a central fixation cross. The fixation cross subtended 0.5° of visual angle vertically and horizontally. Each of the four symbols was positioned in one of four quadrants around the fixation cross (Figure 1). The distance from fixation to the nearest part of a symbol was 1.2° of visual angle. Symbols subtended 0.5° by 0.8°. Finally, on a subset of trials distractor images would appear randomly above or below the central matrix for a brief time (125 milliseconds). These images subtended 5° to 9° horizontally and 6° to 9° vertically (with at least 1.5° between the distractor image and the nearest symbol).
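The stimulus sizes above are given in degrees of visual angle. As a rough illustration of what those values mean physically at the stated 60 cm viewing distance, the standard relation size = 2 · d · tan(θ/2) can be sketched as follows. The function name is hypothetical, and no conversion to pixels is attempted because the physical pixel pitch of the monitor is not reported here.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float = 60.0) -> float:
    """Physical extent (cm) that subtends angle_deg at distance_cm.

    Uses size = 2 * d * tan(theta / 2); the 60 cm default is the
    approximate viewing distance stated in the Apparatus section.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # Angles taken from the Stimuli section; labels are for display only.
    for label, deg in [
        ("fixation cross", 0.5),
        ("fixation-to-symbol gap", 1.2),
        ("largest distractor extent", 9.0),
    ]:
        print(f"{label}: {deg} deg is about {visual_angle_to_cm(deg):.2f} cm")
```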

Design and Procedure

On each of the 600 trials, participants were presented with a centralized discrimination task, adapted from Forster and Lavie (2011). Each of the four items in the matrices shown in Figure 1 was equally likely to be a letter or a number (chosen from 1-9 or X, R, T, J, L, P, N, F, B). Participants were shown one matrix at a time and were instructed to report, in ordinary reading order, whether each symbol in the matrix was a letter or a number, responding as quickly as possible while also trying to be accurate. When a distractor image was presented, it appeared before the participants’ response to the second, third, or fourth item. Thus, the matrix on which the task was being performed was static while the distractor image had the dynamic property of sudden onset.

Distractors appeared on a random 50% of trials. Of the distractor-present trials, a third contained an image of an energy-dense food, a third contained an image of a low-energy food, and a third contained an image of a non-food object. The order of presentation of all images was randomized. Images of food and objects were taken from the Food.pics database (Blechert, Meule, Busch, & Ohla, 2015). The database provides the macronutrient information for each image, as well as physical image characteristics such as color composition, contrast, brightness, size, and complexity; this allowed us to control for any physical differences between distractor category sets. Images taken from this database were categorized into energy-dense foods, low-energy foods, and non-food objects, which were then compared on each of these physical characteristics using t-tests. All t-tests resulted in p values > 0.05, indicating that these image categories did not differ from one another on these physical image characteristics. Altogether there were 100 images of energy-dense foods, 100 images of low-energy foods, and 176 images of non-food objects available for the study. For each participant, we used all 100 images from each of the two food categories and 100 images randomly selected from the non-food category. Thus, on every distractor-present trial, a novel distractor image was presented (avoiding any effects of familiarity). Participants completed 600 trials. Reaction times and accuracy on the task were recorded.
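The trial structure described above can be sketched as a schedule generator. This is only an illustration and not the authors' MATLAB/PsychToolbox code: the function name, the dictionary keys, and the image indices are hypothetical, and the sketch assumes that the onset slot (before the 2nd, 3rd, or 4th response) and the location (above or below) were drawn independently at random, which the text does not specify.

```python
import random


def build_schedule(n_trials=600, n_nonfood_pool=176, seed=None):
    """Return a shuffled list of trial dicts (hypothetical layout).

    Half of the trials contain a distractor; those are split evenly
    across three categories, and each image is used at most once.
    """
    rng = random.Random(seed)
    per_type = n_trials // 2 // 3  # 100 images per category for 600 trials

    # All food images are used; non-food images are sampled without
    # replacement from the larger pool.
    nonfood_ids = rng.sample(range(n_nonfood_pool), per_type)
    images = (
        [("energy_dense", i) for i in range(per_type)]
        + [("low_energy", i) for i in range(per_type)]
        + [("non_food", i) for i in nonfood_ids]
    )

    trials = [
        {
            "distractor": category,
            "image_id": image_id,
            # Assumption: onset slot and location drawn independently.
            "before_response": rng.choice([2, 3, 4]),
            "side": rng.choice(["above", "below"]),
        }
        for category, image_id in images
    ]
    n_absent = n_trials - len(trials)
    trials += [
        {"distractor": None, "image_id": None,
         "before_response": None, "side": None}
        for _ in range(n_absent)
    ]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    schedule = build_schedule(seed=1)
    print(len(schedule), "trials;",
          sum(t["distractor"] is not None for t in schedule),
          "with a distractor")
```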

Data Analysis

We removed trials on which reaction times were more than 2 standard deviations above or below the subject’s mean. These accounted for 4% of trials on average across all experiments. We also removed trials on which subjects made errors on the digit-letter discrimination task; rates of such errors across all experiments were 4% on average. In addition, we removed responses to symbols (within the four-item matrix) when subjects had made an error on the immediately preceding symbol. In this paradigm, participants respond quickly to items within the matrix, and if they responded incorrectly to an item because of a motor error (e.g., pressed the letter button when they meant to press the number button), this might delay the subsequent button press. If a distractor was presented, the resulting slowing would appear to reflect distractor processing but in reality would reflect distractor processing plus the cost of processing the previous motor error. Therefore, we removed these trials, which accounted for 3% of trials. The first ten trials of every session were considered practice and were removed.
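The exclusion rules above can be illustrated with a minimal sketch. The function names are hypothetical. The sketch assumes the sample standard deviation and trimming over all of a subject's reaction times at once; the text does not say whether trimming was done within condition or which standard deviation estimator was used.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=2.0):
    """Keep reaction times within n_sd sample SDs of this subject's mean."""
    m, s = mean(rts), stdev(rts)
    lower, upper = m - n_sd * s, m + n_sd * s
    return [rt for rt in rts if lower <= rt <= upper]


def keep_flags(correct):
    """For one four-item matrix, flag which responses are retained.

    A response is dropped if it was an error or if the immediately
    preceding response in the same matrix was an error.
    """
    flags = []
    for i, ok in enumerate(correct):
        previous_ok = True if i == 0 else correct[i - 1]
        flags.append(ok and previous_ok)
    return flags


if __name__ == "__main__":
    # Made-up reaction times (ms), for illustration only.
    print(trim_rts([500, 510, 490, 505, 495, 500, 502, 498, 1500]))
    print(keep_flags([True, False, True, True]))
```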

We created an “attentional capture” score to assess the amount of distraction during the task. Specifically, we categorized trials into four types: energy-dense food distractor present, low-energy food distractor present, non-food distractor present, and distractor absent. We used response times on distractor absent trials as a reference point and subtracted those response times from the response times for the equivalent lag and position (recall that the distractor was never presented prior to the response to the first item, similar to Forster and Lavie, 2011) in each of the distractor present conditions. The remaining value is the reaction time cost of distraction for each of the distractor image types.
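As a sketch of this subtraction for a single subject, the code below computes a capture score per distractor type. The record layout and names are hypothetical. It assumes that each record is a retained response with its item position in the matrix, that distractor-present records are the responses immediately following distractor onset, and that matching "lag and position" can be approximated by matching on item position alone.

```python
from collections import defaultdict
from statistics import mean


def capture_scores(responses):
    """Return {distractor type: mean RT cost in ms} for one subject.

    responses: iterable of dicts with keys
      'condition' ('absent' or a distractor type),
      'position'  (item position in the matrix, 2 to 4),
      'rt'        (reaction time in ms).
    """
    by_cell = defaultdict(list)
    for r in responses:
        by_cell[(r["condition"], r["position"])].append(r["rt"])
    cell_means = {cell: mean(rts) for cell, rts in by_cell.items()}

    costs = defaultdict(list)
    for (condition, position), present_mean in cell_means.items():
        if condition == "absent":
            continue
        baseline = cell_means.get(("absent", position))
        if baseline is not None:
            # Cost = distractor-present RT minus position-matched baseline.
            costs[condition].append(present_mean - baseline)
    return {condition: mean(v) for condition, v in costs.items()}


if __name__ == "__main__":
    # Made-up reaction times, for illustration only.
    demo = [
        {"condition": "absent", "position": 2, "rt": 500},
        {"condition": "absent", "position": 2, "rt": 520},
        {"condition": "absent", "position": 3, "rt": 480},
        {"condition": "energy_dense", "position": 2, "rt": 560},
        {"condition": "energy_dense", "position": 3, "rt": 520},
        {"condition": "non_food", "position": 2, "rt": 515},
    ]
    print(capture_scores(demo))
```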

Results

As shown in Figure 2 (black bars), participants in Experiment 1 were more distracted by images of energy-dense foods than by images of non-food objects and low-energy foods, suggesting that participants rapidly and implicitly assessed the nutritional value of the distractor images (Toepel, Knebel, Hudry, le Coutre, & Murray, 2009), even when the images were entirely irrelevant.

Fig. 2

Attentional Capture by Irrelevant Images of Energy-dense Foods is Flexible. Results from Experiment 1 (black bars; without eating energy-dense foods) and Experiment 2 (gray bars; with eating energy-dense foods). The dependent variable (y-axis) is the amount of attentional capture for a given distractor type. This measure was calculated by taking the average reaction times when a distractor image was presented prior to a response (for each subject) and subtracting the baseline distractor absent reaction times (for each subject). Our results demonstrate that images of energy-dense foods are much more distracting than images of low-energy foods and images of non-food objects (ps < 0.05). However, when we had observers consume an energy-dense food prior to the study, the level of distraction of energy-dense food images was greatly reduced and this resulted in a significant interaction of image type and experiment (p < 0.05). Error bars are +/- 1 Standard Error of the Mean (SEM).

The differences evident in Figure 2 from Experiment 1 (see black bars) are supported by the results of statistical analysis. We first conducted a one-way within-subjects ANOVA on the data from Experiment 1. We found a main effect of image type, F(2,34) = 6.786, p = 0.003, ηp² = 0.29, Geisser–Greenhouse corrected for nonsphericity. Additional post-hoc comparisons revealed that there was significantly more distraction by images of energy-dense foods compared to low-energy foods and non-food objects (ps < 0.05, Bonferroni corrected for multiple comparisons). Distraction by images of low-energy food and images of non-food objects did not differ (p > 0.05).
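As a consistency check on the reported statistics, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below also derives the uncorrected degrees of freedom for a design with one within-subjects factor. The function names and the `n_groups` parameter are hypothetical helpers, not part of any analysis package.

```python
def within_factor_df(n_subjects, n_levels, n_groups=1):
    """Uncorrected df for a within-subjects factor.

    With n_groups > 1 (a mixed design) the error term loses one
    additional degree of freedom per extra group.
    """
    df_effect = n_levels - 1
    df_error = (n_subjects - n_groups) * (n_levels - 1)
    return df_effect, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its df."""
    ss_ratio = f_value * df_effect
    return ss_ratio / (ss_ratio + df_error)


if __name__ == "__main__":
    # Experiment 1: 18 subjects, 3 image types, F = 6.786 as reported.
    df1, df2 = within_factor_df(n_subjects=18, n_levels=3)
    print(df1, df2, round(partial_eta_squared(6.786, df1, df2), 2))
```

The reported degrees of freedom are the uncorrected ones; the Geisser–Greenhouse correction changes the p value, not the identity used here.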

It is reassuring that the Food.pics database included physical characteristics for the images we used and that we were able to ascertain that there were no significant differences in color composition, contrast, brightness, size, or complexity among the three categories of stimuli compared. However, that is not an exhaustive listing of visual features, and so there was a lingering question about whether the striking difference we found in attentional capture scores might not be due to some other perceptual feature and not the “meaning” of the pictures in terms of the fat content/energy density of the depicted stimuli. It struck us that it would be useful to manipulate some factor unrelated to the perceptual features of the stimuli that might be expected to influence the degree to which the energy-dense stimuli captured attention. Therefore, we decided to replicate Experiment 1 but with the addition that all subjects would be given a small snack immediately before participating. If it is the case that our difference in distraction is due to the hedonic value of energy-dense foods, we may be able to modulate that by changing the goal state related to these items by giving them some energy-dense food before the task.

Experiment 2

Participants

Similar to Experiment 1, we ran 18 Johns Hopkins University undergraduate students and community members (mean age = 19.2 years; 5 males, 13 females) with normal or corrected-to-normal visual acuity and normal color vision.

Methods

Experiment 2 was essentially identical to Experiment 1 with the exception that the 18 new subjects consumed their choice of two “fun-sized” candy bars (each bar weighed about 17.3 g). The following candy bar types were used due to their high fat content: Snickers, Kit Kat, and Reese’s Peanut Butter cups. The average nutritional content of two “fun-sized” candy bars was 173.3 calories, 9.1 g of total fat, and 17.7 g of sugar. Candy bars were given to participants before the task, while they were filling out forms, and were completely consumed before the task began.

Results

As shown in Figure 2 (light gray bars), this separate set of participants in Experiment 2 was not more distracted by images of energy-dense foods than by images of non-food objects and low-energy foods. A subsequent one-way within-subjects ANOVA revealed no main effect of image type, F(2,34) = 0.062, p = 0.908, Geisser–Greenhouse corrected for nonsphericity.

Experiment 2 was run after Experiment 1, and therefore, subjects were not assigned randomly to receive a snack or no snack before the test. Nevertheless, we were not aware of any obvious reasons why the subjects in the two experiments would have differed systematically (e.g., all subjects signed up for the studies in our university subject pool), so we performed an overall analysis of variance combining the results of the two experiments.

We conducted a mixed-design two-way ANOVA that included the results of both Experiments 1 and 2. There was a main effect of image type, F(2,68) = 3.414, p = 0.04, ηp² = 0.09, Geisser–Greenhouse corrected for nonsphericity, and, critically, a significant interaction between image type and experiment, F(2,68) = 4.517, p = 0.015, ηp² = 0.12, Geisser–Greenhouse corrected for nonsphericity. Additional post-hoc comparisons on the interaction revealed that consuming the energy-dense food (Experiment 2) significantly decreased the amount of capture by images of energy-dense foods compared to not consuming the energy-dense food (Experiment 1). Specifically, we found that there was a statistically significant difference in the attentional capture scores for distraction by images of energy-dense foods when comparing across experiments (see the “Energy-dense Food” black and gray bars in Figure 2, p = 0.029, Bonferroni corrected for multiple comparisons). This was supported by comparing the three image types within each experiment. For Experiment 1 (black bars in Figure 2), there was significantly more distraction by images of energy-dense foods compared with low-energy foods and non-food objects (ps < 0.05, corrected for multiple comparisons). For Experiment 2 (gray bars), there were no statistically significant differences.

Experiment 3

In Experiment 3, we wanted to explore the limits of the effects found in Experiments 1 and 2 and replicate the key findings. In particular, we wanted to determine whether consuming energy-dense foods decreases attentional capture to other attractive stimuli, or whether these effects were specific to food-related stimuli. Previous research has demonstrated that emotional faces can capture attention (Hodsoll, Viding, & Lavie, 2011; Theeuwes & Van der Stigchel, 2006; Vuilleumier & Schwartz, 2001). Therefore, if consuming energy-dense foods generally reduces attentional capture by any interesting stimuli, we should see a differential effect for distractor images of emotional faces as well as energy-dense foods compared with non-food objects. Experiment 3 included distractor images of faces depicting fear and disgust, which were taken from the Warsaw set of emotional facial expression pictures (Olszanowski et al., 2015). To keep the experiment acceptably short, we eliminated the low-energy food stimuli.

Participants

In Experiment 3, we ran two groups of 32 Johns Hopkins University undergraduate students and community members (mean age = 19.5 years; 15 males, 49 females) with normal or corrected-to-normal visual acuity and normal color vision. Using the results from the mixed analysis in Experiments 1 and 2, a power analysis via G*Power (Faul et al., 2007) revealed that, given an effect size of ηp² = 0.09, a minimum of 28 participants per group would be required to have 95% power to detect the effect in our design. As before, participants received extra credit in undergraduate courses or monetary payment as compensation and all gave informed consent. The Johns Hopkins Homewood Institutional Review Board approved the protocol.

Methods

As the combination of Experiments 1 and 2 in a single analysis was not ideal, we ran the snack and no-snack conditions as a true experiment with subjects randomly assigned to the two conditions. Experiment 3 was similar to both Experiments 1 and 2 except the low-energy food image category was replaced with images of emotional faces. Additionally, there were several small changes in procedure from the preceding experiments; to enhance distraction, the distracting pictures were shown on 33% of the trials instead of 50% and the number of trials was reduced to 540.

Results

As shown in Figure 3 (black bars), participants in Experiment 3 who did not consume the energy-dense food before the task were more distracted by images of energy-dense foods than by images of non-food objects and emotional faces. In contrast, when participants consumed the energy-dense food before the task (Figure 3, gray bars), participants in Experiment 3 were not more distracted by images of energy-dense foods than by images of non-food objects and emotional faces. The differences evident in Figure 3 are supported by the results of statistical analysis. We first conducted a mixed-design two-way ANOVA. There was a main effect of image type, F(2,124) = 3.281, p = 0.043, ηp² = 0.05, Geisser–Greenhouse corrected for nonsphericity, and, again, a significant interaction between image type and whether subjects had an energy-dense snack before the test, F(2,124) = 4.143, p = 0.019, ηp² = 0.06, Geisser–Greenhouse corrected for nonsphericity. Additional post-hoc comparisons on the interaction revealed that consuming the energy-dense food (gray bars) significantly decreased the amount of capture by images of energy-dense foods compared to not consuming the energy-dense food (black bars). Specifically, we found that there was a statistically significant difference in the attentional capture scores for distraction by images of energy-dense food when comparing across the two groups of subjects (see the “Energy-dense Food” black and gray bars in Figure 3, p = 0.02, Bonferroni corrected for multiple comparisons). This was supported by comparing the three image types within each group of subjects. For subjects who did not receive an energy-dense snack (black bars in Figure 3), there was significantly more distraction by images of energy-dense foods compared to non-food objects and emotional faces (ps < 0.05, corrected for multiple comparisons). While it was surprising that emotional faces were not any more distracting than everyday objects, this may be due to the effort that was made to make our pictures entirely irrelevant to the task at hand. Finally, for subjects who had such a snack (gray bars), there were no statistically significant differences (ps > 0.05).

Fig. 3

Naturally occurring goal-states related to energy-dense foods are flexible and stimulus-specific. Results from Experiment 3. We replicated the findings from Experiments 1 and 2. Additionally, the results demonstrate that images of energy-dense foods are even more distracting than images of emotional faces (ps < 0.05). When we had observers consume an energy-dense food before the study, the effect of manipulating the goal-state associated with a particular stimulus was stimulus-specific; consuming energy-dense foods affected only distraction by images of energy-dense foods, resulting in a significant interaction (p < 0.05). Error bars are ±1 SEM.

General Discussion

Across all of our experiments, we have implied that the reaction time differences seen in our distraction task are due to “attentional capture.” That is, attention is drawn to a distracting picture when it appears, and this interferes with performance on the central task. However, it has been argued that a shift of spatial attention is just one way to account for reaction time costs due to distraction. Alternatively, some reaction time differences due to distraction may result from “filtering costs,” which do not require a shift of attention to a distracting item, but rather, only a delay of the shift of attention to the relevant item, i.e., a disruption and delay of attention (Folk & Remington, 1998). One could argue that these disruption costs are not precisely attentional capture and that terminology should be reserved for spatially driven attentional effects. In any event, determining whether our effect is better described as capture or filtering cost is well beyond the scope of this paper (see Folk [2013] for the current state of this debate). Rather, the goal for this work has been from the beginning to determine whether when these interesting food stimuli are entirely irrelevant to the task at hand, they do cause some kind of disruption, whether spatially driven or some kind of delay and disruption. Importantly, these results demonstrate, for the first time, evidence for distraction by foods that have a higher energy density, even when they are entirely irrelevant to the current task. Critically, they are irrelevant in three distinct ways: (1) their semantic content (i.e., the main task involved responding to simple letters, not images), (2) their temporal properties (i.e., they showed up after participants had already begun responding the static central task), and (3) their spatial location (i.e., they never overlapped with any location that contained task relevant information). 
Furthermore, we demonstrate that even though these images are entirely irrelevant, these distraction effects are still sensitive to recent changes in goal-states (i.e., only distraction by the images of energy-dense foods was influenced by consuming candy).

It seems unlikely that the difference between energy-dense food and our other stimuli (food of low caloric density, objects, and faces) is due to bottom-up salience. For one thing, we were able to take advantage of the stimulus characterizations provided in the Food.pics database to control for many low-level differences among distractor categories. For another, if perceptual salience were the key factor in the difference between calorie-dense food and the other stimuli, it is difficult to see why ingesting a candy bar would eliminate that effect. Also, although we have no data on this point, it is worth considering the following thought experiment. Imagine a Kalahari bushman who has not had any contact with Western foods. Is there any reason to think that a picture of a pizza would be any more salient than a picture of a baseball, or a picture of a candy bar any more salient than a picture of a computer chip? These items are unfamiliar and, in the case of the two foods, there has been no opportunity for the development of motivational salience through the rewarding experience of eating them (Nijs & Franken, 2012; Werthmann, Roefs, Nederkoorn, & Jansen, 2013).

While it seems reasonable to dismiss perceptual salience as a key factor, with real-world stimuli like foods it is harder to figure out the relative roles of intentional top-down processes and reward history. Indeed, it seems likely that both may be operative (Lynn & Shin, 2015; Ristic & Landry, 2015). In this connection it is interesting to contrast the performance of our subjects with those in a study in which monetarily rewarded stimuli maintained their ability to serve as potent distractors for several months following the cessation of reward (Anderson, Laurent, & Yantis, 2013). Why did ingesting a candy bar so quickly eliminate the greater attention-capturing power of energy-dense foods? Eating a candy bar amounts to just another “trial” in our life-long “training” on the rewarding effects of energy-dense food. Why should it decrease rather than increase attentional capture? The answer, obviously, has to do with motivational state (Piech et al., 2010; Robinson & Berridge, 2013). Recent research has shown that when an ordinarily rewarding stimulus (chocolate) is devalued, attention is no longer oriented toward reward-associated stimuli (Pool, Brosch, Delplanque, & Sander, 2014). Note, however, that devaluation in that case was accomplished by having subjects eat chocolate until they were satiated, which, on average, involved ingestion of 62.5 g of chocolate. This is substantially more than the two small candy bars our subjects ate (approximately 30 g).

More broadly, there is a substantial literature in which subjects are given “preloads” of food prior to behavioral testing to assess the effect on attention to food-related stimuli or to actual consumption of food (Herman & Mack, 1975; Branton et al., 2014). It is especially interesting that some investigations of the effects of preloads do not involve the actual consumption of food but are essentially cognitive manipulations such as arranging for subjects to see, smell, or think about palatable foods (Rogers & Hill, 1989; Papies, Stroebe, & Aarts, 2008; Jansen & van den Hout, 1991; Fishbach, Friedman, & Kruglanski, 2003). While the amount of palatable food that we gave our subjects is on the low side for actual to-be-consumed preloads, it is obviously more than the amount given in the studies that used strictly cognitive manipulations.

It seems clear that it would be fruitful in future experiments to systematically manipulate motivational state (e.g., hours since eating), serving size, and the nature of the “serving” (e.g., normal food, low-calorie food, or mere visual or olfactory presence of food). Similarly, it would be useful to compare the effects of such manipulations on the bias towards foods (e.g., with dot-probe or eye-tracking studies), the distracting power of food-related stimuli (e.g., with the task presented in this paper), and the actual consumption of food, as these may not be controlled by the same mechanisms. Finally, individual differences, in particular the extent to which subjects may be categorized as normal or restrained eaters, should be considered, as the effects of preloads have been shown to be markedly different for these groups (Herman & Mack, 1975; Rogers & Hill, 1989).