Introduction

Many of us have encountered a situation where, after returning home from a long day at work, we are unable to locate where we have placed our wallet or car keys. This ability (or inability) to remember the identity and location of items is a form of visuospatial memory. Successful visuospatial memory is dependent on the accurate binding of the “what” and “where” features of an item. That is, it is not sufficient to remember what your wallet looks like (visual information) or its potential locations (spatial information); one must also remember the link between the item and its location (e.g., my wallet is on top of my night stand). As in other forms of episodic memory, errors in visuospatial memory (e.g., a misplaced wallet) may become more frequent with advancing age (Brockmole & Logie, 2013; Park et al., 2002), the presence of neurodegenerative disorders like Alzheimer’s disease (Iachini, Iavarone, Senese, Ruotolo, & Ruggiero, 2009; Sahakian et al., 1988), and in situations in which we are distracted (Feng, Pratt, & Spence, 2012; Fougnie & Marois, 2009).

Of particular interest in the current study was how visuospatial memory ability may be affected by divided attention at encoding, especially when task-relevant goals must be pursued. Prior work demonstrates that, in the presence of an abundance of information, participants are able to selectively attend to and later remember what is most important (e.g., Castel, Benjamin, Craik, & Watkins, 2002; Castel, 2008; Stefanidi, Ellis, & Brewer, 2018). Given limitations on memory capacity, this represents an efficient strategy to remember information that may be the most useful during recall. Importantly, the ability to be selective (that is, to remember what is valuable) may be dependent upon how attention is allocated during encoding (Castel, 2008). Thus, during the encoding period, participants must deliberately focus on high-value (and away from low-value) information to increase the likelihood of later remembering the valuable information. This is supported by research demonstrating that participants with attentional impairments like attention-deficit/hyperactivity disorder and Alzheimer’s disease exhibit suboptimal selectivity relative to healthy controls (Castel, Balota, & McCabe, 2009; Castel, Lee, Humphreys, & Moore, 2011). As such, the availability of attentional resources during encoding likely influences subsequent memory selectivity during retrieval.

Attention may also be a crucial factor in forming and maintaining visuospatial associations in memory. Attention plays a critical role when searching for complex stimuli during visual search, and it is likely that the binding of multiple visual features into an integrated representation in memory requires visual attention, consistent with the feature integration theory (FIT; Treisman & Gelade, 1980; Treisman & Sato, 1990). Unlike search for single visual features, search for stimuli defined by conjunctions of features requires the serial and effortful allocation of attention, which aids in binding the individual characteristics of a stimulus into a coherent unit. Empirical work has attempted to extend this theory to the domain of visuospatial memory with mixed results. Generally, in these tasks, attentional resources are taxed at encoding by the presence of a secondary task unrelated to the visuospatial binding task (e.g., backwards counting). While some studies have demonstrated that the introduction of a secondary task to divide attention during encoding leads to less accurate visuospatial binding (e.g., Brown & Brockmole, 2010; Elsley & Parmentier, 2009), other work has not found a disproportionate effect of increased attentional load on binding as compared to memory for single visual features (e.g., Allen, Baddeley, & Hitch, 2006, 2014; Allen, Hitch, Mate, & Baddeley, 2012; Baddeley, Allen, & Hitch, 2011; Johnson, Hollingworth, & Luck, 2008; Ueno, Allen, Baddeley, Hitch, & Saito, 2011). Currently, there is no definitive evidence as to whether diminished attentional resources during encoding influence later memory by disrupting feature binding, memory for individual features, or both. What does seem clear, however, is that taxing attentional resources during encoding results in less accurate visuospatial memory, indicating that attention is involved in this process, at least in some capacity.

The ability to selectively allocate attention during encoding may also depend on the format in which information is encountered. Various studies have found that participants are less effective in executing goal-relevant study strategies for sequentially-presented, as compared to simultaneously-presented information (Ariel, Dunlosky, & Bailey, 2009; Dunlosky & Thiede, 2004; Middlebrooks & Castel, 2017; Robison & Unsworth, 2017). For sequentially-presented information, participants must maintain information in working memory while making item-by-item decisions in line with the task goal. For simultaneously-presented information, no such maintenance of information in working memory is necessary as all information is available for the duration of the encoding period. Participants may be more effective in strategically allocating attention during encoding for simultaneously-presented information, as they may have more cognitive resources available. Prior work utilizing the same paradigm as the current study has found that both younger and older adults were able to selectively attend to and remember high-value visuospatial associations, regardless of presentation format (Siegel & Castel, 2018). Both younger and older adults became more selective with continued task experience when information was sequentially-presented, while they were consistently selective throughout the task for simultaneously-presented information. These results further supported the notion that the execution of value-based study strategies may be inherently more difficult for sequentially, relative to simultaneously, presented information, especially when items and locations must be associated in memory.

As it currently stands, the literature suggests that participants are able to engage in selective study-strategies to optimize their performance related to task goals in both a verbal (Castel, 2008; Castel et al., 2002; Stefanidi et al., 2018) and a visuospatial (Siegel & Castel, 2018) memory context. These value-based study strategies appear to be more effectively implemented when information is encountered in a simultaneous presentation format which may be due to decreased attentional load and strain on working memory resources during encoding relative to a sequential presentation of information (Ariel et al., 2009; Dunlosky & Thiede, 2004; Middlebrooks & Castel, 2017; Robison & Unsworth, 2017; Siegel & Castel, 2018). While some work has shown that an increase in attentional load during encoding may disrupt the binding of visual and spatial information (Brown & Brockmole, 2010; Elsley & Parmentier, 2009), it may not affect participants’ ability to execute value-based study strategies (Middlebrooks, Kerr, & Castel, 2017), although these factors have not been studied in conjunction. Further, while the division of attentional resources during encoding may not hinder the implementation of value-based study strategies for single pieces of verbal information (Middlebrooks et al., 2017), it may have differential effects when the cognitive load is already high, as in the case of visuospatial binding. Deficits in strategy execution in a visuospatial binding context are most likely to be present when information is sequentially-presented, as this represents an additional stressor on cognitive resources. Thus, the current study sought to examine how these factors (presentation format and secondary tasks during encoding) may interact to affect value-directed remembering in a cognitively-demanding visuospatial binding paradigm.

The current study

The purpose of the current study was to examine how visuospatial memory and selectivity may vary under conditions that differentially strain attentional resources. In a 2 (presentation format: simultaneous, sequential) × 2 (attention: full, divided) between-subjects design, we examined memory and selectivity using a visuospatial value-directed remembering paradigm (Castel et al., 2002; Castel, 2008; Siegel & Castel, 2018) while manipulating encoding conditions through differing presentation formats and the presence or absence of a secondary tone discrimination task. The current study addresses an important theoretical issue as to whether these factors interact to produce a compounded effect on attentional resources, or whether they would independently influence participants’ visuospatial memory and selectivity.

Participants may be more selective under conditions that tax attentional resources less (i.e., simultaneous presentation, full attention) than those that may have a greater strain on attentional resources (i.e., sequential presentation, divided attention), consistent with prior findings (Ariel et al., 2009; Dunlosky, Ariel, & Thiede, 2011; Dunlosky & Thiede, 2004; Middlebrooks & Castel, 2017; Robison & Unsworth, 2017; Siegel & Castel, 2018). It may also be the case that the combination of these factors produces compounded effects. That is, participants in the condition with the greatest hypothesized strain on attentional resources (i.e., sequential-divided attention) may exhibit the poorest memory performance and selectivity, while the condition with the least hypothesized strain (i.e., simultaneous-full attention) may exhibit the best memory performance and selectivity. Some work has found that participants are able to maintain selectivity in a variety of divided attention conditions (Middlebrooks et al., 2017). In this case, we would expect consistent selectivity regardless of the level of strain on attentional resources during encoding. However, in contrast to the present study, Middlebrooks et al. (2017) used verbal materials (i.e., word lists) and did not require any association of information in memory. In the context of the current task where stimuli are item-location associations, attentional resources may be stressed to a greater degree, which may lead to lower subsequent selectivity, especially in the sequential presentation format. In fact, prior work in the visual working memory domain examining how numerical reward values may influence participants’ visual working memory has found that participants were able to prioritize high- over low-priority visual objects (Hu, Hitch, Baddeley, Zhang, & Allen, 2014). 
However, the addition of increasingly demanding concurrent secondary tasks reduced or eliminated this ability to prioritize the encoding of high-priority information, highlighting the important role of the executive control of attention during the encoding of visual information of differing importance (Hu, Allen, Baddeley, & Hitch, 2016). As such, there may exist differences in how participants encode visuospatial and verbal information of differing value due to differential strain on attentional resources, which may be especially pronounced in the presence of a cognitively demanding secondary task.

A unique benefit of the current design is that it allows for the investigation of a spatial resolution measure by examining the pattern of errors produced by participants. By analyzing participants’ spatial relocation errors (i.e., how far participants misplaced an item from its target location), we were able to examine gist-based visuospatial memory in the absence of an exact memory trace. The binding of items with a wider range of locations (i.e., not exclusively the target location) represented a coarser measure of visuospatial memory in the current task. Using this measure, we investigated whether gist-based visuospatial memory accuracy differed as a function of information importance between these different encoding conditions. Prior work has found that younger adults’ gist-based visuospatial memory was influenced by information importance only when adequate attentional resources were available during encoding (i.e., for simultaneously-presented information), but not under more taxing encoding conditions (i.e., sequentially-presented information; Siegel & Castel, 2018). In the context of the current study, we expected that gist-based visuospatial memory may be moderated by the value of information under less demanding encoding conditions, whereas this may not have been the case when attentional resources were more strained during the study period.
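As a minimal sketch of how a spatial relocation error might be quantified, the following computes the distance between the studied cell and the cell chosen at test. The Euclidean metric, the zero-based (row, column) indexing, and the function name `relocation_error` are assumptions made for illustration; the text above does not specify how distance was measured.

```python
import math


def relocation_error(target, placed):
    """Distance, in cell units, between the studied cell and the chosen cell.

    Cells are (row, column) pairs on the 5 x 5 grid, indexed 0-4.
    Euclidean distance is an assumption made for illustration only.
    """
    (target_row, target_col), (placed_row, placed_col) = target, placed
    return math.hypot(target_row - placed_row, target_col - placed_col)


# Hypothetical placements, not data from the study
exact = relocation_error((1, 3), (1, 3))      # item returned to its studied cell
near_miss = relocation_error((1, 3), (1, 4))  # one cell to the right
far_miss = relocation_error((0, 0), (3, 4))   # three rows and four columns away
```

Under this sketch, smaller non-zero errors would correspond to placements that miss the target but preserve its approximate location, which is the sense of gist-based memory described above.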

Finally, it is important to note that we examined memory selectivity across a series of eight trials (referred to as “grid numbers” in the current study). The inclusion of multiple trials was motivated by prior research that has consistently demonstrated that participants may not optimally execute a value-based study strategy on the first trial, but increase their selectivity towards high-value information with continued task experience (Castel, 2008; Middlebrooks et al., 2017; Siegel & Castel, 2018). So, if only one trial is completed, it may appear as if participants are not selective in their memory performance, remembering a similar proportion of low- and high-value information. However, with repeated trials and feedback, participants are able to assess their own performance and modify their strategy use in order to improve on the task. As such, the utilization of multiple trials is critical when examining how participants may optimize their strategies on goal-directed tasks (Ariel, 2013; Castel, 2008; Nelson & Narens, 1990; Wong et al., 2018).

Method

Participants

The participants in this study were 96 University of California, Los Angeles undergraduate students ranging in age from 18 to 25 years (73 females, Mage = 20.13 years, SDage = 1.43). Participants had completed an average of 14.10 years of education (SD = 1.02) when they completed the study and received partial course credit for participation. For demographic information for each encoding condition, see Table 1.

Table 1 Demographic information for each encoding condition

Materials

The materials in this study consisted of eight unique 5 × 5 grids containing ten items each presented on a computer screen (see Fig. 1 for an example grid). The grids were approximately 15 × 15 cm on the screen (17.06° visual angle) and contained 25 cells, each of which was approximately 3 × 3 cm in size (3.44° visual angle). Within each of ten randomly chosen cells was an item selected from a normed picture database (Snodgrass & Vanderwart, 1980). The items used were 80 black and white line drawings of everyday household items (e.g., a key, a camera, and an iron). On the computer screen, items were approximately 2 × 2 cm in size (2.29° visual angle). To form a grid, ten items were randomly selected from the pool and randomly placed in the cells of the grid with the constraint that no more than two items be present in any row or column of the grid (to reduce the likelihood of the items arbitrarily forming spatial patterns that may aid memory). Items were then randomly paired with point values ranging from 1 point (lowest value) to 10 points (highest value), indicated by the numerical value placed in the top left portion of each item-containing cell. Each value was used once per grid. This process was repeated to form eight unique grids for each participant. While one participant may have been presented with an iron paired with the 7-point value in the top left cell of the second grid, a different participant could encounter that same item paired with the 4-point value in the bottom right cell of the sixth grid. As such, each participant was presented with a different set of eight completely randomized grids.
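The grid construction described above can be sketched as follows. This is an illustrative reconstruction, not the software used in the study: the function names, the placeholder item labels, and the assumption that each of the 80 items appears in exactly one of a participant's eight grids are ours. Note that with ten items in five rows, a limit of two items per row and column implies exactly two items in every row and every column, which the sampler exploits.

```python
import random
from collections import Counter

GRID_SIZE = 5
ITEMS_PER_GRID = 10
MAX_PER_LINE = 2  # no more than two items in any row or column


def sample_cells(rng):
    """Pick two columns per row, redrawing until no column holds more than two items."""
    while True:
        cells = [
            (row, col)
            for row in range(GRID_SIZE)
            for col in rng.sample(range(GRID_SIZE), MAX_PER_LINE)
        ]
        col_counts = Counter(col for _, col in cells)
        if all(col_counts[col] <= MAX_PER_LINE for col in range(GRID_SIZE)):
            return cells


def build_session(item_pool, n_grids=8, seed=None):
    """Build one participant's grids; each item is used once (an assumption)."""
    rng = random.Random(seed)
    pool = list(item_pool)
    rng.shuffle(pool)
    grids = []
    for g in range(n_grids):
        items = pool[g * ITEMS_PER_GRID:(g + 1) * ITEMS_PER_GRID]
        cells = sample_cells(rng)
        # Each point value from 1 to 10 is used exactly once per grid
        values = rng.sample(range(1, ITEMS_PER_GRID + 1), ITEMS_PER_GRID)
        grids.append(
            [{"item": i, "cell": c, "value": v} for i, c, v in zip(items, cells, values)]
        )
    return grids


# Hypothetical item labels standing in for the 80 line drawings
session = build_session([f"item_{i:02d}" for i in range(80)], seed=1)
```

Because the items are shuffled before being paired with cells and values, the item, location, and value assignments are independent of one another, matching the description that any item could appear with any value in any cell of any grid.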

Fig. 1

An example grid that participants may have been presented with during the study phase. Items were everyday household objects taken from a normed picture database. Information importance was indicated by the numerical value in the top left corner of each item-containing cell. In the simultaneous conditions, information was presented as shown in the figure. In the sequential conditions, items were presented one at a time, with only one item present in the grid at any point during the study phase

Procedure

Participants were randomly assigned to one of four between-subjects encoding conditions: simultaneous presentation format/full attention (Sim-FA), simultaneous presentation format/divided attention (Sim-DA), sequential presentation format/full attention (Seq-FA), or sequential presentation format/divided attention (Seq-DA). All participants were instructed that they would be presented with ten items placed within a 5 × 5 grid and would be later tested on that information. Participants were further instructed that each item would be paired with a point value from 1 to 10 indicated by a number in the top left portion of each item-containing cell. The participants’ goal was to maximize their point score (a summation of the points associated with correctly remembered information) on each grid. Participants in the simultaneous conditions were shown all ten items concurrently for a total of 30 s. Participants in the sequential conditions were shown items one at a time, each for 3 s (totaling 30 s for the ten items), in an order that was random with regard to the items’ locations in the grid and their associated point values. Participants were told that after they studied the information within the grid, they would immediately be shown the items underneath a blank grid and be asked to place each item in its previously presented location by first clicking on the item and then the cell in which they wanted to place it. If participants were unsure of an item’s location, they were asked to guess, as they would not be penalized for incorrectly placed items. Participants were given an unlimited duration to complete this testing phase and were required to place all ten items before advancing to the next grid. After participants placed all ten items, they were given feedback on their performance in terms of the items that they correctly placed, the number of points they received (out of 55 possible), and the percentage of points they received.
After receiving feedback, participants repeated this procedure with unique grids for seven further study-test cycles (for a total of eight grids).
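The point score and feedback described in the procedure can be illustrated with a short sketch. The data structures, the function name `score_grid`, and the example placements are hypothetical; only the scoring rule (points are summed over correctly placed items, with no penalty for errors, out of 1 + 2 + ... + 10 = 55) comes from the text.

```python
def score_grid(studied, placed):
    """Return (points earned, points possible, percentage of points).

    studied: dict mapping item -> ((row, col), value) shown at study
    placed:  dict mapping item -> (row, col) chosen at test
    Incorrect placements earn nothing and are not penalized.
    """
    earned = sum(
        value for item, (cell, value) in studied.items() if placed.get(item) == cell
    )
    possible = sum(value for _, value in studied.values())
    return earned, possible, 100 * earned / possible


# Hypothetical grid: one item per point value, each in a distinct cell
studied = {f"item_{v}": (((v - 1) // 2, (v - 1) % 5), v) for v in range(1, 11)}

# Hypothetical test phase: only the 8-, 9-, and 10-point items are placed correctly;
# every other item is placed one row below its studied cell
placed = {
    item: cell if value >= 8 else ((cell[0] + 1) % 5, cell[1])
    for item, (cell, value) in studied.items()
}

earned, possible, percent = score_grid(studied, placed)
```

In this example a participant who correctly places only the three highest-value items earns 27 of 55 points, nearly half the available points from 30% of the items, which illustrates why prioritizing high-value items is an effective strategy under this scoring rule.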

Participants in the divided attention conditions also completed a tone discrimination task during the study period. Participants were instructed that during the study phase they would hear a series of tones. Tones were presented through headphones and were one of two pitches: low pitch (400 Hz) or high pitch (900 Hz). Each tone was played for a duration of 1 s and the order of presentation was random for each participant with the constraint that no pitch was played more than three times consecutively. Participants completed a one-back tone discrimination task such that they were required to determine whether the most recent tone they heard was the “same” as or “different” from the tone immediately preceding it. The corresponding keys were labeled as such on the keyboard. Before each study-test cycle, a blank grid appeared on the screen and the first tone was played. Participants were instructed that they were not required to respond to this first tone. After 3 s, the first item (in the Seq-DA condition) or all ten items (in the Sim-DA condition) appeared along with the second tone. Participants then had to make their first decision (“same” as or “different” from the first tone). After that, the remaining tones were played at 3-s intervals, totaling 11 tones by the end of the study period (one preceding the presentation of items and ten during item presentation). In the Seq-DA condition, tones were played for the first second of each item’s 3-s presentation duration. For both conditions, participants were required to make their tone discrimination response within a 3-s window before the following tone was played. Participants were able to change their response within that 3-s interval and their final response was used in later analyses.
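A minimal sketch of the tone sequence and the one-back answer key is given below. The function names and the sequential construction are assumptions for illustration; the tone frequencies, the count of 11 tones, and the limit of three consecutive repetitions come from the text. Building the sequence tone by tone is one simple way to satisfy the constraint and is not necessarily how the original sequences were randomized.

```python
import random

LOW_HZ, HIGH_HZ = 400, 900
N_TONES = 11   # one tone before the items appear, ten during item presentation
MAX_RUN = 3    # no pitch played more than three times consecutively


def make_tone_sequence(rng):
    """Draw tones one at a time, forcing a switch after a run of three."""
    tones = []
    for _ in range(N_TONES):
        options = [LOW_HZ, HIGH_HZ]
        if len(tones) >= MAX_RUN and len(set(tones[-MAX_RUN:])) == 1:
            options.remove(tones[-1])
        tones.append(rng.choice(options))
    return tones


def one_back_answers(tones):
    """Correct response for every tone after the first, which needs no response."""
    return ["same" if cur == prev else "different" for prev, cur in zip(tones, tones[1:])]


tones = make_tone_sequence(random.Random(7))
answers = one_back_answers(tones)  # ten scored decisions per study phase
```

Each study phase therefore yields ten scored decisions, one for each tone played during item presentation.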

After finishing the experimental task, participants also completed a modified operation span (OSpan) task (Oswald, McAbee, Redick, & Hambrick, 2015) as a measure of working memory capacity. This measure was included to examine whether participants’ visuospatial memory performance and/or selectivity may vary with working memory capacity. However, we found no significant differences in the amount of information recalled or in participants’ selectivity as a function of OSpan, consistent with prior studies examining memory selectivity (Castel et al., 2009; Cohen, Rissman, Suthana, Castel, & Knowlton, 2014; Middlebrooks et al., 2017). These results are therefore not discussed further in the current study.

Results

Given the nature of the data, we first analyzed tone discrimination, overall item-location recall accuracy, and spatial relocation errors using analyses of variance (ANOVA). Then, in order to examine the effects of item value and task experience on these measures, we used hierarchical linear modeling (HLM). Explained in more detail at the beginning of the Memory Selectivity section, HLM is a powerful technique that allowed us to examine the relationship between our variables (i.e., the relationship between item value and recall probability for any given item, and how each encoding condition and task experience may have changed this probability). This technique has been used in prior work as a useful analytical approach (Middlebrooks, Murayama, & Castel, 2016; Middlebrooks et al., 2017; Siegel & Castel, 2018). However, it does not provide any comparison directly examining mean condition differences (e.g., differences in the overall averages between encoding conditions). In contrast, a mean-based analytic technique (e.g., ANOVA) is unable to detect any direct relationships between item value and recall probability, but is able to determine whether there were differences between encoding conditions on average. As such, the utilization of these analyses in conjunction allowed us to appropriately examine differences in overall recall (using analyses of variance) and differences in selectivity between conditions (using HLM).

Tone discrimination

Tone discrimination performance for the two divided attention conditions is depicted in Fig. 2. Tone discrimination performance was analyzed to ensure that participants’ attention was adequately divided during encoding. First, we examined each participant’s tone discrimination performance individually to ensure that participants were not simply ignoring the auditory task in order to focus on the visuospatial memory task. We initially set an inclusion criterion such that, to be included in the analyses, participants had to (a) have responded on at least 50% of tones and (b) have tone discrimination accuracy greater than 50% averaged across all eight grids. This criterion excluded two participants (one from the Sim-DA condition and one from the Seq-DA condition), resulting in 94 participants across the four conditions. However, excluding these two participants did not change the pattern of results obtained with all 96 participants. We therefore retained all 96 participants in the following analyses.
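The two-part inclusion criterion can be expressed as a simple check, sketched below. The data layout and the function name `meets_inclusion` are hypothetical, and the text does not state whether missed tones counted as incorrect when computing accuracy; here they do, which is an assumption.

```python
def meets_inclusion(grids):
    """Apply the two-part criterion to one participant's tone responses.

    grids: one list per grid, each holding one entry per scored tone:
           True (correct), False (incorrect), or None (no response).
    Missed tones are scored as incorrect here, which is an assumption.
    """
    n_tones = sum(len(grid) for grid in grids)
    n_responded = sum(r is not None for grid in grids for r in grid)
    response_rate = n_responded / n_tones

    grid_accuracy = [sum(r is True for r in grid) / len(grid) for grid in grids]
    mean_accuracy = sum(grid_accuracy) / len(grid_accuracy)

    # (a) responded on at least 50% of tones, (b) accuracy greater than 50%
    return response_rate >= 0.5 and mean_accuracy > 0.5


# Hypothetical participants, eight grids of ten scored tones each
attentive = [[True] * 7 + [False] * 2 + [None]] * 8
ignoring = [[None] * 8 + [True, False]] * 8
guessing = [[True] * 5 + [False] * 5] * 8

included = [meets_inclusion(p) for p in (attentive, ignoring, guessing)]
```

Under these assumptions the first hypothetical participant passes, the second fails the response-rate requirement, and the third fails because accuracy of exactly 50% is not greater than chance.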

Fig. 2

Tone discrimination performance in the divided attention conditions across grids. The dotted line indicates chance performance of 50%. Error bars represent ±1 standard error. Sim = simultaneous presentation format; Seq = sequential presentation format

To determine whether tone discrimination accuracy during encoding varied as a function of presentation format or across grids, we conducted a 2 (Presentation format: simultaneous, sequential) × 8 (Grid: 1, 2, ..., 8) repeated-measures analysis of variance (ANOVA), which revealed a significant main effect of grid, F(7, 322) = 12.72, p < .001, η² = .21. Follow-up comparisons using Tukey’s HSD tests indicated that tone discrimination accuracy was significantly lower on Grid 1 (M = .47, SD = .30) than on Grids 2–8 (M = .69, SD = .22), adjusted ps < .001. There were no other significant comparisons between grids. There was also no main effect of presentation format, F(1, 46) = 0.10, p = .75, η² = .002, and no interaction, F(7, 322) = 1.24, p = .28, η² = .02. Finally, to determine whether performance differed from chance (i.e., 50%) throughout the task, we conducted one-sample t-tests on tone discrimination performance for Grid 1 and Grids 2–8, collapsing across presentation format conditions. The analyses revealed that while tone discrimination performance on Grid 1 was not significantly different from chance, t(47) = 0.72, p = .48, it was significantly greater than chance on Grids 2–8, t(47) = 7.47, p < .001. These results suggest that there was no difference in tone accuracy between presentation conditions and that participants’ performance was consistently above chance after the first grid.

Overall item-location recall

Item-location recall accuracy (i.e., the proportion of items correctly placed) for each presentation format and attention condition across grids is depicted in Fig. 3. To analyze these data, we first examined item-location recall accuracy without regard to item value across the task using a 2 (Presentation format: simultaneous, sequential) × 2 (Attention: full, divided) × 8 (Grid: 1, 2, ..., 8) repeated-measures ANOVA on item-location recall accuracy. There was a main effect of presentation format, F(1, 92) = 17.60, p < .001, η² = .11, such that participants had higher item-location recall accuracy in the simultaneous (M = .56, SD = .27) relative to the sequential presentation format (M = .44, SD = .24). There was also a significant main effect of attention, F(1, 92) = 54.06, p < .001, η² = .33, such that participants had higher item-location recall accuracy in the full (M = .61, SD = .25) relative to the divided attention condition (M = .39, SD = .23). In addition, the ANOVA revealed a main effect of grid, F(7, 644) = 4.22, p < .001, η² = .04. Follow-up comparisons using Tukey’s HSD tests indicated that participants had higher item-location recall accuracy on Grid 8 (M = .57, SD = .26) relative to Grid 1 (M = .46, SD = .32), t(94) = 4.38, p < .001, and Grid 2 (M = .45, SD = .26), t(94) = 4.53, p < .001. No other follow-up comparisons were significant.

Fig. 3

Participants’ item-location recall memory as a function of presentation format and attention conditions across grids. Error bars represent ±1 standard error. Sim = simultaneous presentation format; Seq = sequential presentation format; FA = full attention; DA = divided attention

Finally, we found a significant interaction between attention and grid, F(7, 644) = 11.96, p < .001, η² = .11. To decompose this interaction, we conducted one-way ANOVAs analyzing item-location recall across grids for each attention condition. For the full attention conditions, we found a main effect of grid, F(7, 329) = 2.80, p < .01. Follow-up comparisons using Tukey’s HSD tests indicated that participants in the full attention conditions had significantly higher item-location accuracy on Grid 1 (M = .70, SD = .21) relative to Grid 5 (M = .59, SD = .25), t(94) = 3.17, p = .04, Grid 6 (M = .56, SD = .27), t(94) = 3.80, p = .004, and Grid 7 (M = .57, SD = .24), t(94) = 3.40, p = .02. No other follow-up comparisons were significant. For the divided attention conditions, we also found a main effect of grid, F(7, 329) = 12.62, p < .001. Follow-up comparisons using Tukey’s HSD tests indicated that item-location recall accuracy was lower on Grid 1 (M = .21, SD = .20) than on Grids 3–8 (M = .44, SD = .23), ps < .001, and lower on Grid 2 (M = .30, SD = .17) than on Grids 5–8 (M = .47, SD = .23), ps < .01. No other follow-up comparisons were significant. There were no other significant interactions.

Memory selectivity

Item-location recall accuracy as a function of item-value, encoding condition, and grid number is depicted in Fig. 4. In order to compare selectivity between groups and across grids, we used hierarchical linear modeling (HLM) to analyze item-location recall accuracy as a function of item value. HLM has been used in previous studies investigating memory selectivity (Castel, Murayama, Friedman, McGillivray, & Link, 2013; Middlebrooks & Castel, 2017; Middlebrooks et al., 2017; Middlebrooks, McGillivray, Murayama, & Castel, 2016; Middlebrooks, Murayama, et al., 2016; Raudenbush & Bryk, 2002). The post-hoc binning of items into low, medium, and high-value groups may not accurately reflect participants’ valuations of to-be-learned stimuli (e.g., Participant 1 may consider items with values of 6–10 to be of “high” value, while Participant 2 may only consider items with values of 8–10 as such). In contrast, HLM treats item value as a continuous variable, allowing for a more precise investigation of the relationship between item-location recall accuracy and item value. Further, by first clustering data within each participant and then examining possible condition differences, HLM accounts for both within- and between-subject differences in strategy use, the latter of which would not be evident when conducting standard analyses of variance. Thus, HLM allows for a more fine-grained analysis of participants’ value-based strategies.

Fig. 4

Item-location recall accuracy as a function of presentation format, attention, and item value averaged across grids. Error bars represent ±1 standard error. Sim = simultaneous presentation format; Seq = sequential presentation format; FA = full attention; DA = divided attention

In a two-level HLM, item-location recall accuracy (using a Bernoulli distribution, 0 = not recalled, 1 = recalled; level 1 = items; level 2 = participants) was modeled as a function of item value, grid number, and the interaction between those two variables. Item value and grid number were entered into the model as group-mean centered variables (with item value anchored at the mean value of 5.5 and grid number anchored at the mean value of 4.5). The encoding conditions (0 = Sim-FA, 1 = Sim-DA, 2 = Seq-FA, 3 = Seq-DA) were included as level-2 predictors. In this analysis, participants in the Sim-FA condition were treated as the comparison group, while Comparison 1 compared Sim-FA and Sim-DA, Comparison 2 compared Sim-FA and Seq-FA, and Comparison 3 compared Sim-FA and Seq-DA.

Table 2 presents the tested model and estimated regression coefficients in the current study. Regression coefficients (β) obtained from HLM can be interpreted via their exponential (Raudenbush & Bryk, 2002); that is, Exp(β) is the odds ratio associated with a one-unit increase in the independent variable, where the odds of correct item placement are the probability of successful item-location recall divided by the probability of unsuccessful recall. An Exp(β) value greater than one indicates a positive effect of a predictor, while an Exp(β) value less than one indicates a negative effect of a predictor.
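For orientation, the two-level model described above can be written out as a sketch. This form is reconstructed from the verbal description and from the coefficient labels used in the text (β10, β20, β30, β33); the dummy-variable names Comp1 to Comp3 and the participant-level residuals r are assumptions, and the exact random-effects structure is not stated in this section.

```latex
\begin{align*}
\text{Level 1 (items):}\quad
\log\frac{p_{ij}}{1-p_{ij}} &= \pi_{0j} + \pi_{1j}\,\mathrm{Value}_{ij}
  + \pi_{2j}\,\mathrm{Grid}_{ij}
  + \pi_{3j}\,(\mathrm{Value}_{ij}\times\mathrm{Grid}_{ij}) \\
\text{Level 2 (participants):}\quad
\pi_{kj} &= \beta_{k0} + \beta_{k1}\,\mathrm{Comp1}_{j}
  + \beta_{k2}\,\mathrm{Comp2}_{j} + \beta_{k3}\,\mathrm{Comp3}_{j} + r_{kj},
  \qquad k = 0, 1, 2, 3
\end{align*}
```

Here p_ij is the probability that participant j correctly places item i, Value and Grid are the centered predictors, and each Comp variable equals 1 for participants in the corresponding comparison condition and 0 otherwise. Under this coding, β_k0 is the effect in the Sim-FA condition and β_k1 to β_k3 are differences from Sim-FA.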

Table 2 Two-level hierarchical linear model of item-location memory performance, relocation error, and bimodal error predicted by item value, grid number, and encoding condition

First, there was a significant effect of value on item-location recall accuracy for participants in the Sim-FA condition, β10 = 0.10, p = .001. This effect was consistent across the other encoding conditions (ps > .19). This indicates that for each one-point increase in item value, the odds of correctly placing that item increased by a factor of e^0.10 = 1.11. Further, participants were e^(0.10 × 10) = 2.84 times more likely to successfully place a 10-point as compared to a 1-point item. Thus, as item value increased, participants in all conditions were more likely to recall item locations accurately.

Secondly, there was no significant effect of grid number for participants in the Sim-FA condition, β20 = -0.04, p = .34. While this lack of a grid number effect was consistent for Comparison 2 comparing the Sim-FA and Seq-FA conditions (p = .83), there was a significant difference for Comparison 1 comparing Sim-FA and Sim-DA and for Comparison 3 comparing Sim-FA and Seq-DA (ps < .001). To calculate the simple slopes for the Sim-DA and Seq-DA conditions, the β20 coefficient was added to the β21 and β23 coefficients, respectively (βSim-DA = 0.19, βSeq-DA = 0.15). To determine the significance of these slopes, the model was adjusted to treat Sim-DA as the comparison group and then adjusted again to treat Seq-DA as the comparison group. This method was used throughout the rest of this study to calculate the significance of simple slopes. For the Sim-DA condition, grid number was a significant predictor of item-location recall accuracy, βSim-DA = 0.19, p < .001, such that for each increase in grid number, participants were e^0.19 = 1.21 times more likely to successfully place an item and were e^(0.19*8) = 4.57 times more likely to successfully place an item on Grid 8, as compared to Grid 1. Similarly, for participants in the Seq-DA condition, grid number was also a significant predictor of item-location recall accuracy, βSeq-DA = 0.15, p < .001, such that for each increase in grid number, participants were e^0.15 = 1.16 times more likely to successfully place an item and were e^(0.15*8) = 3.32 times more likely to successfully place an item on Grid 8, as compared to Grid 1. Taken together, these results indicate that participants in the divided attention conditions had higher item-location recall accuracy with continued task experience, while those in the full attention conditions maintained a consistent level of accuracy throughout the task.
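The conversion from log-odds coefficients to odds ratios used in this section can be checked with a few lines of Python. This is a minimal sketch that relies only on the rounded grid-number slopes given in the text; the function name `odds_ratio` is illustrative and not part of the original analysis. Values computed from rounded coefficients may differ slightly from figures that were derived from unrounded estimates.

```python
import math

def odds_ratio(beta, units=1):
    """Multiplicative change in the odds for a `units`-step increase in a predictor."""
    return math.exp(beta * units)

# Rounded grid-number simple slopes reported in the text
beta_sim_da = 0.19
beta_seq_da = 0.15

# Per-grid odds ratio and the eight-unit multiplier for each divided attention condition
print(round(odds_ratio(beta_sim_da), 2), round(odds_ratio(beta_sim_da, 8), 2))
print(round(odds_ratio(beta_seq_da), 2), round(odds_ratio(beta_seq_da, 8), 2))
```

Because the model is linear on the log-odds scale, the effect of several steps is obtained by multiplying the coefficient by the number of steps before exponentiating, not by multiplying the per-step odds ratio by that number.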

Finally, for the Sim-FA condition, there was not a significant value × grid number interaction, β30 = -0.003, p = .72. This was not significantly different for either the Sim-DA or Seq-FA conditions (ps > .82). However, there was a marginally significant difference between Sim-FA and Seq-DA as indicated by Comparison 3, β33 = 0.02, p = .08. Analyzing the simple slope of the Seq-DA condition revealed that there was in fact a significant value × grid number interaction for that group, βSeq-DA = 0.02, p = .04. This indicates that while participants in the other three conditions were consistently selective throughout the task, participants in the Seq-DA condition became more selective with continued task experience.

Bayesian analysis

We conducted a Bayesian analysis to address potential issues of statistical power related to the lack of value-based differences in precise item-location recall found between encoding conditions. Bayesian null hypothesis testing has been used to determine the likelihood of null effects in previous value-directed remembering research (e.g., Middlebrooks, Murayama, et al., 2016; Middlebrooks et al., 2017; Siegel & Castel, 2018). We computed a Bayes factor (BF10) to determine the likelihood of the null effect of value on memory performance between encoding conditions. Computing Bayes factors allows one to compare the probability of obtaining the results under the null hypothesis (i.e., no difference between encoding conditions) with the probability of obtaining the results under the alternative hypothesis (i.e., true differences in the effect of value on memory performance between encoding conditions; Jarosz & Wiley, 2014).

Comparing Bayes factors within the HLM framework can be difficult (Lorch & Myers, 1990; Murayama, Sakaki, Yan, & Smith, 2014). We therefore used a simpler two-step procedure that has been used in previous value-directed remembering studies (Middlebrooks, Murayama, et al., 2016; Middlebrooks et al., 2017; Siegel & Castel, 2018). First, using logistic regression, item-location recall accuracy was regressed on item value within each grid for each participant. Then, a 4 (Encoding condition: Sim-DA, Sim-FA, Seq-DA, Seq-FA) × 8 (Grids: 1, 2, ..., 8) repeated-measures Bayesian ANOVA was conducted on the obtained slopes using default priors. The computed Bayes factor (BF10 = .059) for encoding condition indicated that the null hypothesis was 1/.059 = 16.95 times as likely to be true as the alternative hypothesis. This represents “strong” evidence (as determined by norms set by Kass and Raftery, 1995) that the lack of difference between encoding conditions likely reflects a similar effect of value on memory performance for these groups, rather than a lack of statistical power to detect an existing difference.
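The arithmetic behind the reported ratio can be written out as a short worked equation. The notation (D for the observed data, H0 and H1 for the null and alternative hypotheses) is introduced here for illustration and follows the standard definition of a Bayes factor; it is not taken from the source.

```latex
\[
\mathrm{BF}_{10} = \frac{P(D \mid H_1)}{P(D \mid H_0)} = 0.059
\qquad\Longrightarrow\qquad
\mathrm{BF}_{01} = \frac{1}{\mathrm{BF}_{10}} = \frac{1}{0.059} \approx 16.95
\]
```

A BF10 below 1 therefore favors the null hypothesis, and its reciprocal expresses how many times more probable the data are under the null than under the alternative.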

Spatial resolution

A unique benefit of the current design is that, in addition to correctly recalled information, we were able to analyze the pattern of errors produced by participants in each condition and determine whether these errors varied systematically as a function of item value or grid number. The usage of items placed within grids enabled us to examine participants’ spatial resolution (i.e., not only if a participant misplaced an item, but the magnitude of that error) by calculating the distance between a participant’s erroneous placement of an item and the item’s previously presented (target) location. The inclusion of this spatial resolution measure allowed us to draw conclusions about participants’ visuospatial gist memory, which may be influenced in different manners by varying degrees of attentional resources and presentation formats. Further, gist-based visuospatial memory may be influenced by information importance in that participants may have smaller errors for higher value information, which would represent another form of memory selectivity that is not apparent when solely examining correct and incorrect item placement.

Spatial resolution was analyzed using spatial relocation error (SRE) scores. A visual depiction of SREs is shown in Fig. 5. SREs were calculated in the following manner. For each incorrectly placed item, the coordinates of the erroneous placement were compared to the coordinates of that item’s previously presented location. In the context of the 5 × 5 grids used in the current study, coordinates were of the form (row, column) and ranged from (1, 1) indicating the cell in the top left corner of the grid to (5, 5) indicating the cell in the bottom right corner. Row and column differences were calculated by subtracting the incorrect row value from the correct row value and the incorrect column value from the correct column value. The absolute values of the row and column differences were then calculated, and the SRE was defined as the larger of these two values. Essentially, SREs represent the minimum number of “steps” (either vertical, horizontal, or diagonal) between an incorrectly placed item and the target location. Dependent upon an item’s previously presented location, SREs could range from 1 (directly adjacent to the correct cell) to 4 (four steps away from the correct cell). While certain locations had a maximum SRE of 3 (e.g., a cell near the center of the grid) and others a maximum of 4 (e.g., a cell in the corner of a grid), these differences were likely evenly distributed across item value and grid number due to the random assignment of value to items and random placement of items within grids for each participant. SREs were used as the dependent variable in the following analyses.
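The SRE computation described above amounts to the Chebyshev (chessboard) distance between the placed cell and the target cell. The following is a minimal sketch of that calculation, assuming (row, column) tuples with 1-based indices on a 5 × 5 grid; the function names are hypothetical and do not come from the original study.

```python
def spatial_relocation_error(placed, target):
    """Larger of the absolute row and column differences between two (row, column) cells."""
    row_diff = abs(target[0] - placed[0])
    col_diff = abs(target[1] - placed[1])
    return max(row_diff, col_diff)

def max_possible_sre(target, size=5):
    """Largest SRE that any misplacement can produce for a given target cell."""
    row, col = target
    return max(row - 1, size - row, col - 1, size - col)

# An item shown in the top left corner but placed one row down and three columns over
print(spatial_relocation_error(placed=(2, 4), target=(1, 1)))  # 3 steps

# The ceiling depends on where the target sits in the grid
print(max_possible_sre((1, 1)))  # corner cell: 4
print(max_possible_sre((2, 2)))  # cell one step in from the corner: 3
```

Taking the maximum of the two differences, rather than their sum, is what allows a diagonal move to count as a single step.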

Fig. 5

An example of relocation error scores relative to an item’s correct location. Relocation error represents the number of “steps” from an incorrectly placed item to the previously presented location. Depending on the target location, the relocation error score ranged from 1 (directly adjacent to the previously presented location) to 4 (distance of four steps from correct placement). Lighter shades indicate a misplaced item closer to the target cell resulting in a small relocation error score. Darker shades indicate a misplaced item farther from the target cell resulting in a large relocation error score

First, we compared SREs across grids and between conditions, without regard to item value. In order to avoid excluding participants from analyses who did not receive an SRE score on at least one grid (due to perfect item-location recall accuracy), we averaged participants’ data into grid quartiles resulting in four SREs for each participant (Grids 1–2, Grids 3–4, Grids 5–6, and Grids 7–8). After averaging, six participants were still excluded from the following analyses due to perfect item-location recall accuracy on at least one grid quartile (after exclusion, nSim-FA = 20, nSim-DA = 22, nSeq-FA = 24, nSeq-DA = 24). We conducted a 2 (Presentation format: sequential, simultaneous) × 2 (Attention: full, divided) × 4 (Grid quartiles: 1-2, 3-4, 5-6, 7-8) repeated-measures ANOVA on participants’ SREs and found a main effect of presentation format, F(1, 86) = 12.07, p = .001, η2 = .10, such that participants in the sequential conditions had significantly higher SREs (M = 1.91, SD = 0.29), as compared to the simultaneous conditions (M = 1.70, SD = 0.29). There was also a main effect of attention, F(1, 86) = 16.50, p < .001, η2 = .16, such that participants in the divided attention conditions (M = 1.93, SD = 0.29) had significantly higher SREs than those in the full attention conditions (M = 1.68, SD = 0.29).
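The grid-quartile averaging and the resulting exclusion rule can be illustrated with a short sketch. It assumes that errors from the two grids in each quartile are pooled before averaging, which is one plausible reading of the description above; the source does not specify the exact procedure. The function names and the example data are hypothetical.

```python
from statistics import mean

QUARTILES = [(1, 2), (3, 4), (5, 6), (7, 8)]

def quartile_sres(errors_by_grid):
    """Average SREs within each grid quartile.

    errors_by_grid maps a grid number (1-8) to the list of SREs for items
    misplaced on that grid. A quartile with no errors yields None.
    """
    result = []
    for grids in QUARTILES:
        pooled = [sre for g in grids for sre in errors_by_grid.get(g, [])]
        result.append(mean(pooled) if pooled else None)
    return result

def has_complete_data(quartile_means):
    """A participant enters the ANOVA only if every quartile has a score."""
    return all(m is not None for m in quartile_means)

# Hypothetical participant with perfect recall on Grids 7 and 8
example = {1: [1, 2], 2: [3], 3: [], 4: [2], 5: [1], 6: [1, 1], 7: [], 8: []}
scores = quartile_sres(example)
print(scores, has_complete_data(scores))
```

In this example the participant has no errors in the final quartile and so would be excluded, which mirrors the reason given for the six exclusions.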

In addition to main effects, we also observed several interactions. There was a significant interaction between presentation format and attention, F(1, 86) = 4.94, p = .03, η2 = .04. To decompose this interaction, for each attention condition, we conducted independent samples t-tests to compare SREs between presentation formats. For the full attention condition, participants in the sequential presentation format (M = 1.84, SD = 0.35) had significantly higher SREs than those in the simultaneous presentation format (M = 1.51, SD = 0.32), t(42) = 3.28, p = .002. However, for the divided attention condition, there was no difference in SREs between the sequential (M = 1.97, SD = 0.20) and simultaneous (M = 1.89, SD = 0.23) presentation formats, t(44) = 1.30, p = .20. There was also an interaction between presentation format and grid quartiles, F(3, 84) = 3.11, p = .03, η2 = .03. Follow-up one-way ANOVAs comparing SREs across grid quartiles for each presentation format revealed no main effect of grid for either sequentially or simultaneously-presented information (ps > .12).

To examine spatial resolution as a function of the value of information, we conducted a two-level HLM using SREs as the dependent variable. We applied the same model used on item-location recall accuracy by modeling SREs as a function of item value, grid number, and the interaction of these two variables (the output variable, however, was not coded as a Bernoulli distribution, but rather a continuous one from 1 to 4 to reflect the range of SRE scores). The obtained regression coefficients are presented in Table 2 and participants’ SREs with regard to item value and grid number are shown in Fig. 6. There was a significant effect of value on SREs for the Sim-FA group, β10 = -0.04, p < .001. However, the regression coefficients from the other conditions revealed that there were significant differences between the Sim-FA and Sim-DA conditions, β11 = 0.03, p = .05, the Sim-FA and Seq-FA conditions, β12 = 0.05, p = .01, and the Sim-FA and Seq-DA conditions, β13 = 0.03, p = .04. Further analyses confirmed there was no significant effect of value on SREs for the Sim-DA (βSim-DA = -0.01, p = .18), Seq-FA (βSeq-FA = 0.0003, p = .98) or Seq-DA (βSeq-DA = -0.01, p = .14) conditions. These results indicate that participants in the Sim-FA condition placed higher value items closer to the correct location, whereas the magnitude of misplacement in the other three conditions did not vary with item value.
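The simple slopes for the non-reference conditions follow from the additive approach described earlier: the Sim-FA slope plus the relevant comparison coefficient. The sketch below applies this to the rounded value coefficients reported in this paragraph; the variable names are illustrative. Because the inputs are rounded to two decimals, the sums reproduce the reported slopes only approximately (for Seq-FA, the rounded sum is 0.01 against a reported 0.0003), and significance cannot be obtained this way since it requires re-fitting the model with a different reference group.

```python
# Rounded coefficients for the effect of item value on SREs, as reported in the text
beta_10 = -0.04  # slope for the Sim-FA reference group
differences = {  # beta_11, beta_12, beta_13: difference from the Sim-FA slope
    "Sim-DA": 0.03,
    "Seq-FA": 0.05,
    "Seq-DA": 0.03,
}

# Simple slope for each condition = reference slope + comparison coefficient
simple_slopes = {cond: round(beta_10 + diff, 2) for cond, diff in differences.items()}
print(simple_slopes)
```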

Fig. 6

Mean relocation error as a function of presentation format, attention, and item value averaged across grids. Error bars represent ±1 standard error. Sim simultaneous presentation format, Seq sequential presentation format, FA full attention, DA divided attention

There was no significant effect of grid number on SREs for the Sim-FA condition, β20 = 0.004, p = .85. This was consistent for both the Seq-FA and Seq-DA conditions (ps > .20). However, there was a marginal difference between Sim-FA and Sim-DA conditions, β21 = -0.04, p = .05. A follow-up analysis revealed that there was a significant effect of grid number on SREs for the Sim-DA condition (βSim-DA = -0.03, p < .001). For the Sim-DA condition, the magnitude of participants’ SREs decreased with task experience. For the other three conditions, the SREs produced by participants were of a similar magnitude throughout the task.

Finally, there was no interaction between value and grid number on SREs for the Sim-FA condition, β30 = -0.004, p = .38. This was also consistent for the three other conditions (ps > .14). Thus, the previously described effects of value on SREs for each encoding condition were consistent throughout the task.

Bimodal errors

In addition to examining errors as a function of the number of “steps” the item was placed from the target cell, we also conducted analyses of bimodal error with regard to grid number, item value, and encoding condition. In these analyses, bimodal error was defined such that 0 indicated an error that was greater than one horizontal, vertical, or diagonal “step” from the target cell (SRE > 1) and 1 indicated an error that was within one “step” from the target cell (SRE = 1). As such, this measure represents the proportion of participants’ errors that were adjacent to the target cell. These analyses allowed us to determine whether participants’ errors were a misplacement in a location neighboring the target cell or a more random guess around the grid. We first examined bimodal error without regard to item value using a 2 (Presentation format: sequential, simultaneous) × 2 (Attention: full, divided) × 4 (Grid quartiles: 1–2, 3–4, 5–6, 7–8) repeated-measures ANOVA on the bimodal error measure. Similar to analyses conducted on SREs, we averaged participants’ data into grid quartiles resulting in four bimodal error scores for each participant (Grids 1–2, Grids 3–4, Grids 5–6, and Grids 7–8). After averaging, the same six participants who were excluded from the analyses on SREs were also excluded from the following analyses due to perfect item-location recall accuracy on at least one grid quartile, thus resulting in no bimodal error score (after exclusion, nSim-FA = 20, nSim-DA = 22, nSeq-FA = 24, nSeq-DA = 24).

Firstly, there was a main effect of presentation format such that participants in the simultaneous presentation formats placed a higher proportion of errors adjacent to the target cell (M = .58, SD = .18) as compared to those in the sequential presentation formats (M = .46, SD = .15), F(1, 86) = 14.62, p < .001, η2 = .11. There was also a significant main effect of attention such that participants in the full attention conditions placed a higher proportion of errors adjacent to the target cell (M = .60, SD = .19) than those in the divided attention conditions (M = .45, SD = .12), F(1, 86) = 24.76, p < .001, η2 = .19.
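The bimodal error coding defined above can be sketched in a few lines. This is an illustrative implementation under the stated definition (1 for SRE = 1, 0 for SRE > 1); the function names and the example values are hypothetical.

```python
def bimodal_error(sre):
    """Return 1 if a misplaced item is adjacent to the target (SRE == 1), else 0."""
    if sre < 1:
        raise ValueError("bimodal error is defined only for misplaced items (SRE >= 1)")
    return 1 if sre == 1 else 0

def proportion_adjacent(sres):
    """Proportion of errors adjacent to the target cell, or None if there were no errors."""
    if not sres:
        return None
    return sum(bimodal_error(s) for s in sres) / len(sres)

# Hypothetical set of four errors, two of which are adjacent to the target
print(proportion_adjacent([1, 1, 2, 4]))  # 0.5
```

Returning None for an empty list reflects the fact that a participant with perfect recall has no errors to classify and therefore no bimodal error score.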

There was also a significant interaction between presentation format and attention, F(1, 86) = 4.88, p = .03, η2 = .04. To decompose this interaction, for each attention condition, we conducted independent samples t-tests to compare bimodal errors between presentation formats. For the full attention conditions, participants in the simultaneous presentation format had a significantly higher proportion of errors adjacent to the target cell (M = .69, SD = .16) than those in the sequential presentation format (M = .50, SD = .17), t(42) = 3.51, p = .001. However, in the divided attention conditions, there was only a marginally significant difference in bimodal errors between the simultaneous presentation format (M = .47, SD = .12) and those in the sequential presentation format (M = .42, SD = .15), t(43) = 1.74, p = .09. This analysis produced no other significant main effects or interactions (ps > .29).

To examine bimodal errors as a function of the value of information, we conducted a two-level HLM using bimodal error as the dependent variable. We applied the same model used on item-location recall accuracy and SREs by modeling bimodal errors as a function of item value, grid number, and the interaction of these two variables (the output variable was coded on a Bernoulli distribution with 0 representing an error not placed adjacent to the target cell and 1 representing an error placed adjacent to the target cell). The obtained regression coefficients are presented in Table 2 and participants’ bimodal errors as a function of item value and grid number are shown in Fig. 7. With regard to item value, we found that in the Sim-FA condition there was a significant positive effect of value on bimodal error, β10 = 0.08, p = .001, indicating that with each increase in item value, the odds of an error being adjacent to the target cell increased by a factor of e^0.08 = 1.08. Further, a 10-point item was e^(0.08*10) = 2.12 times more likely to be placed next to the target cell than a 1-point item. This effect of value did not differ for participants in the Sim-DA condition, β11 = -0.04, p = .13, but was significantly different for those in the Seq-FA condition, β12 = -0.09, p = .001, and marginally different for those in the Seq-DA condition, β13 = -0.05, p = .07. Additional analyses conducted using the other three encoding conditions as the comparison groups revealed that there was no effect of value on bimodal error for participants in the Sim-DA, Seq-FA, or Seq-DA conditions (ps > .28).

Fig. 7

Bimodal errors (proportion of errors adjacent to the target cell) as a function of presentation format, attention, and item value averaged across grids. Error bars represent ±1 standard error. Sim simultaneous presentation format, Seq sequential presentation format, FA full attention, DA divided attention

The HLM also revealed no significant effect of grid number on bimodal errors for the Sim-FA condition, β20 = 0.01, p = .72, which did not differ significantly for the other encoding conditions (ps > .14). Finally, there was no significant interaction between item value and grid number for the Sim-FA condition, β30 = 0.01, p = .38, which also did not differ significantly for the other three encoding conditions (ps > .24). Taken together, these results suggest that item value influenced bimodal errors in the Sim-FA condition such that participants were more likely to place high-value than low-value items in a cell adjacent to the target cell, while bimodal errors did not vary significantly as a function of item value in the other encoding conditions.

In sum, the results demonstrate that participants in the sequential and divided attention conditions were less accurate in their item-location recall than those in the simultaneous and full attention conditions, respectively. While participants in all four encoding conditions were equally selective in terms of correctly recalled information, only those in the condition with the lowest cognitive load (i.e., Sim-FA) exhibited errors that were sensitive to item value, misplacing high-value items closer to the target location than low-value items. So, while no differences in selectivity were present between encoding conditions in terms of precise item-location memory, analyses of the errors produced by participants did indicate an interaction between the availability of attentional resources during encoding and participants’ gist-based visuospatial memory.

Discussion

The current study examined how participants’ visuospatial memory and selectivity would be affected by differentially stressing attentional demands through varying presentation formats and the presence or absence of a secondary task during encoding. We found that both sequentially-presented information and divided attention led to less accurate visuospatial memory than simultaneously-presented information and full attention, respectively. This was reflected in not only the items that were correctly placed by participants, but also in the distance by which items were misplaced; that is, when participants in the simultaneous and full attention conditions inaccurately placed an item, it was placed closer to the target location. Further, all participants were equally selective in terms of the information they correctly remembered, despite overall deficits for sequential and divided attention conditions. Differences emerged, however, when examining gist-based visuospatial memory. Only in the Sim-FA condition were errors (both spatial resolution and bimodal) influenced by the value of information, with high-value items placed closer to the target location, while the other conditions exhibited a more random pattern of errors.

The results obtained in the current study demonstrate greater visuospatial memory accuracy for simultaneously, as compared to sequentially, presented information (e.g., Blalock & Clegg, 2010; Lecerf & de Ribaupierre, 2005; Siegel & Castel, 2018) and full, as compared to divided, attention at encoding (e.g., Brown & Brockmole, 2010; Feng et al., 2012; Fougnie & Marois, 2009). Further, participants in the divided attention conditions recalled more information overall with increased task experience, consistent with prior findings (Middlebrooks et al., 2017), suggesting that, as they received feedback, participants in those conditions refined their strategy in order to recall more information on later grids. We also found further evidence that participants can selectively engage in value-based study strategies related to task goals even under attention-demanding conditions, such as when information is presented sequentially (Middlebrooks & Castel, 2017; Siegel & Castel, 2018) and when a secondary task is present during encoding (Middlebrooks et al., 2017). This was particularly notable for participants in the Seq-DA condition, whose attentional resources were thought to be the most depleted due to the necessity of binding sequentially-presented items and locations while performing the tone discrimination task. Participants in this condition required adequate task experience to reach maximum selectivity, consistent with prior findings (Castel, McGillivray, & Friedman, 2012; Middlebrooks & Castel, 2017; Siegel & Castel, 2018).

On the surface, this lack of detrimental effects of divided attention on selectivity (at least in terms of precise item-location recall) may appear to be inconsistent with prior work that has found that the ability to prioritize information in visual working memory is impaired by cognitively demanding secondary tasks (Hu et al., 2014, 2016). However, it is important to note that the prior research did not use the same value structure (i.e., a continuous series of point values) as the current study; rather, participants were instructed to prioritize the first or last item presented in a series of items. Taxing attentional resources may have a more detrimental effect on high-value information in Hu et al.’s (2014, 2016) paradigm, where the value structure is dichotomous (i.e., a single item is prioritized over other items). If that single high-value item is not remembered, then participants’ ability to selectively encode high-priority information is considered impaired. In the current study, where the value structure is continuous, the effects of a secondary task during encoding may be more dispersed over a range of values, rather than one high-value item in particular. As such, these apparent differences in the effects of attentional load on visuospatial memory may be due to the differences in value structure of the task, rather than participants’ ability to remember visuospatial information of differing importance.

When examining the current results, there appears to be little evidence that presentation format and attention during encoding interact to influence visuospatial memory and selectivity. However, the inclusion of analyses examining the spatial resolution of errors produced by participants suggests there may in fact be a combined effect of these factors. As previously described, only the errors of participants in the Sim-FA condition were influenced by information importance, while the other conditions’ errors in visuospatial memory did not vary as a function of item value. These results are consistent with prior findings investigating visuospatial memory and selectivity, such that gist-based visuospatial memory was only influenced by the value of information when adequate resources were available during encoding (Siegel & Castel, 2018).

One potential explanation for the superiority of the Sim-FA condition in this regard is the ability to engage in relational processing. Prior research investigating the representation of information in visuospatial memory suggests that visuospatial information is organized based on a global spatial configuration when encoding in a simultaneous manner (Jiang, Olson, & Chun, 2000). That is, each item is encoded and represented relative to the other items in the array, which has been shown to later enhance visuospatial memory (Lilienthal, Hale, & Myerson, 2014; Taylor, Thomas, Artuso, & Eastman, 2014). In contrast, when information is encoded in a sequential manner in which items are presented in isolation, visuospatial representations may shift to a more local, item-specific organization (Blalock & Clegg, 2010; Jaswal & Logie, 2011). In the context of the current study, participants in the Sim-FA condition may have been able to rely upon relational processing during encoding to enhance visuospatial memory. This may have been especially true for information of high value, as participants likely allocated a significant amount of study time toward such items. This may have enhanced these participants’ precise item-location (e.g., remembering that the key is in the top left corner) and gist-based (e.g., remembering that the iron is somewhere below the key in the left side of the grid) visuospatial memory. On the other hand, the presence of a secondary task during encoding may have attenuated Sim-DA participants’ ability to engage in relational processing, leading to less accurate visuospatial memory overall and errors that were not sensitive to item value. Similarly, participants in both sequential conditions may not have engaged in relational processing at all, which may have led to lower precise item-location memory and gist-based visuospatial memory that was not affected by item value.

So, the ability to engage in global/relational processing during encoding may explain the observed differences in precise and gist-based visuospatial memory. It is important to note, however, that the results are not direct evidence of relational processing during encoding as this represents only one potential explanation for the obtained results. It is entirely possible that the errors produced by participants were individual item-location errors reflecting a lack of spatial precision for particular items not dependent upon any form of relational processing. Future research should consider systematically (rather than randomly, as in the current study) varying the location of items in order to determine whether the pattern of errors produced by participants was due to relational processing or more random item-location errors.

Finally, these results also help to clarify the role of attention in visuospatial binding. Currently, a debate in the literature exists as to whether attention is particularly crucial when binding multiple visual features of an object (e.g., Brown & Brockmole, 2010; Feng et al., 2012; Wheeler & Treisman, 2002) or whether increasing attentional load equally affects individual component memory for single features and memory for feature bindings (e.g., Allen et al., 2006, 2014; Baddeley et al., 2011; Ueno et al., 2011). Given the design of the current study, we cannot make any direct comparison between item (individual identity or location feature memory) and associative memory. As we were specifically interested in the binding mechanism underlying visuospatial memory and the effect of information importance and cognitive load on this mechanism, the current design only tested memory for item-location associations. As such, we cannot determine whether value directly (i.e., an exclusive memory “boost” to high-value item-location pairs) or indirectly (i.e., a “boost” to individual visual or spatial component memory leading to better overall memory for high-value item-location pairs) affects visuospatial binding. However, as performance in the current study was dependent upon associative memory for item-location pairs, the observed effects of value demonstrate that information importance is influencing visuospatial binding in some manner.

With this limitation in mind, the results suggest that attentional control is a crucial aspect of the feature binding process in visuospatial memory, at least when the maintenance and execution of goal-related strategies is required. It is likely that successful performance on this task required two different forms of attention. First, a bottom-up form of visual attention was necessary in order to bind the visual and spatial features of items within the grid array (i.e., associating a particular item to a particular location). Second, a top-down form of strategic attention was required for participants to maintain and execute task-related goals (i.e., maximizing their point score by attending to high-value information). This bottom-up attention was disrupted when attention was divided (resulting in lower overall visuospatial memory accuracy), adding further support for the view that bottom-up attention is crucial in the binding of multiple visual features of an object, consistent with predictions made by the FIT (Treisman & Gelade, 1980; Treisman & Sato, 1990). However, results also suggest that these deficits in visuospatial binding may be reduced when participants are given multiple trials to optimize their study strategies. This secondary top-down attention may have facilitated the bottom-up attention needed to bind visuospatial features by guiding participants’ focus towards high-value information. By learning to strategically allocate attention, participants were able to successfully bind visuospatial information in the event that this bottom-up attention failed to accurately do so. As such, it is likely that the role of attention in this visuospatial memory selectivity paradigm is twofold: (1) facilitating the binding of multiple visual features into a coherent unit and (2) enabling the execution of goal-related strategies in order to optimize performance.

The current study examined how differentially stressing attentional resources during encoding would affect performance on an attention-demanding visuospatial memory and selectivity task. Despite lower overall memory accuracy, participants in the most cognitively-demanding conditions maintained their selectivity towards high-value information, suggesting that factors that influence attentional resources may not impair participants’ ability to implement value-based study strategies. When adequate attentional resources were available during encoding, participants may have been able to rely on relational processing to form gist-based item-location memory traces that were moderated by information importance. When attentional resources were stressed to a greater degree, however, engagement in relational processing may have been attenuated or eliminated and participants’ gist-based visuospatial memory was no longer influenced by the value of information. In sum, while participants were able to compensate for overall memory deficits by selectively focusing on high-value information when attentional resources were taxed, impairments in gist-based visuospatial memory were still observed, highlighting the role of attention in visuospatial binding and the execution of optimal value-based study strategies during encoding.