Introduction

In real life, the most powerful episodic memories tend to be emotional (e.g., events of a wedding day or funeral). Likewise, in the laboratory, participants remember emotional information more often than neutral information (i.e., emotion-enhanced memory [EEM]; Ack Baraly, Hot, Davidson, & Talmi, 2017). However, young and older adults generally differ in their emotional memory biases: Whereas young adults often preferentially remember negative (vs. positive) information, older adults often preferentially remember positive (vs. negative) information (Carstensen & DeLiema, 2018; Reed, Chan, & Mikels, 2014). Researchers have attributed this “positivity effect” in aging to lifespan changes in motivation (Carstensen, Isaacowitz, & Charles, 1999; Mather & Carstensen, 2005), in cognitive–affective complexity (Labouvie-Vief, 2003; Labouvie-Vief, Grühn, & Studer, 2010), and/or in neural anatomy/function (Cacioppo, Berntson, Bechara, Tranel, & Hawkley, 2011; Dolcos, Rice, & Cabeza, 2002; St Jacques, Bessette-Symons, & Cabeza, 2009). The positivity effect in aging memory continues to be controversial, however (Grühn, Sharifian, & Chu, 2016; Kan, Garrison, Drummey, Emmert, & Rogers, 2017). The positivity effect is usually found using brief study–test delays of less than 1 h (e.g., Charles, Mather, & Carstensen, 2003). At such brief delays, several cognitive factors might influence EEM by altering encoding and retrieval processes (Ack Baraly et al., 2017; Bennion, Ford, Murray, & Kensinger, 2013; Hamann, 2001; Talmi, 2013). Semantic relatedness and relative distinctiveness are two cognitive factors that could influence early EEM (i.e., EEM tested within a brief delay; see Footnote 1), but these factors have rarely been addressed in the aging positivity-effect literature. In the present article, we examine whether semantic relatedness and distinctive processing can explain emotional memory biases in young and older adults.

Semantic relatedness

If not chosen carefully, the emotional stimuli in a memory study can be more interrelated and easier to organize semantically than the neutral stimuli (Talmi & Moscovitch, 2004). That is, participants might identify thematic links among the emotional stimuli (e.g., pictures of a shark, bandage, and ambulance) more readily than among the neutral stimuli (e.g., pictures of a dolphin, handkerchief, and bus). This will render the emotional stimuli easier to organize within a given schema, ultimately leading to more elaborative encoding and easier retrieval (Einstein & Hunt, 1980; R. R. Hunt & McDaniel, 1993). Consequently, sets of emotional stimuli that are highly interrelated could result in an immediate EEM effect that would otherwise not be present if the neutral stimuli were also highly interrelated (C. Hunt, Trammel, & Krumrei-Mancuso, 2015; Talmi & Moscovitch, 2004). Indeed, many EEM studies with young adults have used two sets of neutral stimuli: a randomly selected “unrelated-neutral” set, in which item interrelatedness is generally low, and a “related-neutral” set, in which item interrelatedness is high and equal to that of the emotional stimulus set(s). Usually, the young adults remember a greater number of neutral stimuli from the high- than from the low-relatedness sets (Buchanan, Etzel, Adolphs, & Tranel, 2006; Talmi, Luk, McGarry, & Moscovitch, 2007; Talmi, Schimmack, Paterson, & Moscovitch, 2007), sometimes remembering just as many related-neutral as emotional items (Talmi & Moscovitch, 2004). In older adults, memory for neutral word pairs improves when the pairs are related as compared to when they are unrelated. In fact, semantic relatedness can be so helpful to older adults that this can attenuate the typical age-related memory decreases seen with unrelated-neutral stimuli (Naveh-Benjamin, Craik, Guez, & Kreuger, 2005; Naveh-Benjamin, Hussain, Guez, & Bar-On, 2003).

Older adults’ EEM may therefore result in part from their ability to automatically utilize preexisting semantic associations (Naveh-Benjamin et al., 2005) between emotional stimuli, which would provide an encoding and/or retrieval advantage over unrelated-neutral stimuli. But inherent differences could also exist between the interrelatedness of positive and negative stimulus sets that, when uncontrolled, lead to the positivity bias in older adults. For instance, positive information might be more tightly clustered and interrelated in memory than negative information (Koch, Alves, Krüger, & Unkelbach, 2016; Unkelbach, Fiedler, Bayer, Stegmüller, & Danner, 2008) because negative information has a greater (Baumeister, Bratslavsky, Finkenauer, & Vohs, 2001), yet more diverse, representation in memory. Older adults could more easily discern the semantic organization of positive stimuli, and ultimately remember them better, when the semantic relatedness of the stimulus sets is not controlled. This might affect young adults to a lesser extent because they likely have sufficient resources to organize both types of information and/or to adopt appropriate strategies (Naveh-Benjamin, Brav, & Levy, 2007). Consequently, the variability across existing findings on the aging positivity effect in memory might partly reflect study-to-study differences in stimulus selection across the emotional and neutral item sets. For instance, studies showing a particularly strong positivity effect in memory might have included positive stimuli that were highly interrelated, or negative stimuli that were slightly less so. Yet, the interrelatedness of emotional and neutral stimuli remains generally uncontrolled in the aging positivity-effect literature.

Distinctiveness

The relative distinctiveness of emotional stimuli might also contribute to EEM. Emotional stimuli are inherently more salient than neutral stimuli, in the sense that they have a greater “absolute” significance because of their unique attributes stored in long-term memory (e.g., compare a facial expression of pain to a neutral expression; Schmidt, 1991). But emotional stimuli are also distinct in a “relative” sense, because they are often more salient than other, neutral stimuli presented close in time. Relative distinctiveness might influence memory to a greater extent than absolute distinctiveness (Schmidt, 1991, 2002). The EEM effect is commonly explained by the properties inherent to emotional stimuli (e.g., their high arousal; McGaugh, 2004), but it may also be due in part to the use of study designs that increase the relative distinctiveness of emotional items. Indeed, young adults’ EEM seems greater when emotional and neutral items are studied/tested together in mixed (i.e., emotion-heterogeneous) lists than when each emotion category is studied/tested separately in unmixed (i.e., emotion-homogeneous) lists (Dewhurst & Parry, 2000; Hadley & MacKay, 2006; McDaniel, Dornburg, & Guynn, 2005; Schmidt & Saari, 2007; Talmi, Luk, et al., 2007; Talmi & McGarry, 2012). The unmixed study lists reduce the relative distinctiveness of the emotional stimuli, by presenting emotional and neutral items in isolation from one another, which subsequently reduces EEM.

Few aging studies have used unmixed designs (e.g., Emery & Hess, 2011) or have directly contrasted unmixed and mixed sets of stimuli (e.g., Grühn, Scheibe, & Baltes, 2007; Grühn, Smith, & Baltes, 2005). Interestingly, none of the authors just listed reported a positivity bias in older adults. In fact, one of these studies (Grühn et al., 2005) showed that increasing relative distinctiveness improved older adults’ memory for negative words. To our knowledge, currently no study has considered both item interrelatedness and distinctiveness when examining older adults’ positivity bias. The effects of relative distinctiveness on semantically matched negative, positive, and neutral pictures remain unclear. Although several behavioral (Isaacowitz, Allard, Murphy, & Schlangel, 2009; Isaacowitz, Wadlinger, Goren, & Wilson, 2006) and neural (Cacioppo et al., 2011; Dolcos et al., 2002; Mather et al., 2004; St Jacques et al., 2009) studies have suggested that older adults prioritize positive stimuli, it is possible that these differences are only present when the positive stimuli are relatively distinct as compared to the other stimuli. If distinctive processing underlies the positivity effect in older adults, then positive stimuli should receive particularly high processing priority when test items are presented together in mixed sets. In contrast, all emotional and neutral stimuli should be processed equally well when they are presented in separate, unmixed sets, thus attenuating or even abolishing older adults’ positive memory bias. In other words, the positivity bias in older adults might result from a temporary, contextual advantage given to positive information when it is processed in relation to other information, rather than from a permanent and absolute memory decrease for negative information.

Present study

The aim of this study was to examine whether semantic relatedness and relative distinctiveness can explain emotion-enhanced memory in young adults, and more specifically, the positivity bias in older adults. In Experiment 1, we performed a conceptual replication of Talmi, Luk, et al. (2007), who found that both semantic relatedness and relative distinctiveness influenced young adults’ memory for negative and neutral pictures. To build on their work, we also tested memory for positive pictures (which were absent from their original study) in a sample of young adults in Canada. In Experiment 2, we used the same experimental design with young and older adults in France.

To examine item interrelatedness, neutral pictures were either low (unrelated neutral) or high (related neutral) in semantic relatedness—that is, interrelated to an extent similar to that in emotional pictures. In addition, participants processed the emotional pictures in either a distinctive manner (mixed condition), with all items studied together, or a nondistinctive manner (unmixed condition), in which each picture category was studied and recalled separately (similar to Talmi, Luk, et al., 2007). We expected participants to recall more emotional pictures when these were more highly interrelated or more relatively distinct than the neutral pictures (Talmi, 2013; Talmi & McGarry, 2012). Furthermore, we predicted that relative distinctiveness would influence the presence of young adults’ EEM and older adults’ positivity bias. More specifically, we expected that young adults (Exps. 1 and 2) would recall more emotional pictures than related-neutral pictures in the mixed condition, but not in the unmixed condition, when distinctiveness was controlled. We also expected young adults to remember more negative than positive pictures in the mixed condition only. In contrast, we expected older adults (Exp. 2) to show a positive memory bias in the mixed condition, which would disappear or become weaker in the unmixed condition. Both age groups were always expected to remember more emotional pictures than unrelated-neutral pictures (i.e., the classic EEM pattern), regardless of distinctive processing.

A final consideration was the influence of recall delay, which occurred 1 min and 45 min after picture presentation. The 1-min delay (replicating Talmi, Luk, et al., 2007) was long enough to test for early EEM (Talmi, Grady, Goshen-Gottstein, & Moscovitch, 2005) and all of our research hypotheses, while ensuring high recall rates in the older adults. The 45-min delay was more exploratory. Some research has suggested that EEM is stronger when it is tested over a delay (Yonelinas & Ritchey, 2015) and that older adults’ positivity bias could be strengthened by repeated testing (Mather & Knight, 2005). We included two relatively brief intervals in order to examine these effects on early EEM in young and older adults.

Experiment 1

In Experiment 1 we examined whether Canadian university students’ early EEM could be accounted for by semantic relatedness and distinctive processing. The experimental conditions and procedures were similar to those used in Talmi, Luk, et al. (2007, Exp. 1), with the addition of a positive picture category. We expected students to remember more emotional than neutral pictures, but only when the emotional pictures were more interrelated and/or were relatively distinct. Furthermore, we expected students to remember more negative than positive pictures when the items were processed in mixed sets.

Method

Participants

Forty-seven young adults (under 35 years old) were randomly assigned to either the mixed (n = 24; 20 women, four men; mean age = 19.67 years) or unmixed (n = 23; 20 women, three men; mean age = 19.57 years) condition. The participants were University of Ottawa students who received course credit for participating. All provided written informed consent and completed the tasks in their choice of English or French. Data from an additional four participants were excluded because of a visual memory impairment, current drug abuse, incomplete study session, and incomplete data due to microphone error. Participants were further screened for high levels of depressive symptomatology based on the z-score distribution of the Centre for Epidemiologic Studies Depression scale (Radloff, 1977), which resulted in no additional exclusions. This study was approved by the University of Ottawa Research Ethics Board (#H12-14-14).

Stimuli

The target images consisted of 16 positive, 16 negative, 16 related-neutral, and 16 unrelated-neutral pictures. The related-neutral pictures depicted domestic scenes of people, objects, or scenes around the house (e.g., man painting a room, ironing board, or backyard), whereas the unrelated-neutral pictures had no obvious thematic link (e.g., blue mug, buffalo, or outdoor staircase). Approximately one-third of the pictures in each category portrayed people, and the remaining pictures illustrated objects, animals, or outdoor landscapes. An additional 16 pictures (four per category) were chosen as buffer images. Some of the negative, related-neutral, and unrelated-neutral pictures were drawn from Talmi and McGarry’s (2012) collection, but others, in addition to the positive pictures, were selected from the International Affective Picture System (Lang, Bradley, & Cuthbert, 2008), the Geneva Affective Picture Database (Dan-Glauser & Scherer, 2011), and the internet.

We conducted a pilot study with 12 university students (nine women, three men; mean age = 19.08 years) to determine the average valence, arousal, and semantic interrelatedness of the pictures (Table 1). One additional participant was excluded because of brief response times (< 10 ms). During the pilot study, participants saw each of the 64 pictures (i.e., 16 pictures/category) one at a time in a randomized order and reported their feelings of valence, from 1 (happy) to 9 (unhappy), and arousal, from 1 (excited) to 9 (calm), using the Self-Assessment Manikins from Lang et al. (2008; see Footnote 2). Next, participants rated, in random order, the semantic interrelatedness of all possible pairs of pictures from the same category (i.e., 120 negative pairs, 120 positive pairs, etc.) from 1 (not at all related) to 7 (extremely related), concentrating on picture content rather than physical similarity, as per Talmi and McGarry (2012). The mean relatedness was calculated for each picture by averaging all relatedness scores between that picture and the 15 other pictures from the same category. The mean valence, arousal, and semantic relatedness of each picture were averaged across all participants and analyzed with separate univariate analyses of variance (ANOVAs), with picture type (negative, positive, related neutral, unrelated neutral) as the between-item variable. The ANOVAs showed that the four picture types differed significantly in valence [F(3, 60) = 240, p < .0001], arousal [F(3, 60) = 61, p < .0001], and semantic relatedness [F(3, 60) = 128, p < .0001]. Planned contrasts with Bonferroni corrections showed that all pictures differed in valence (ps < .0001), except for the two neutral categories (p > .999). Negative pictures were more arousing than positive pictures (p = .044), and both were more arousing than neutral pictures (ps < .0001). Related-neutral and unrelated-neutral pictures were matched in arousal (ps > .999).
Each picture category therefore represented the expected emotional valence, and the emotional pictures were more arousing than the neutral pictures. Crucially, the negative, positive, and related-neutral pictures were all more highly interrelated than the unrelated-neutral pictures (ps < .0001). The negative pictures were matched in relatedness with the positive (p = .102) and related-neutral (p = .593) pictures, but the related-neutral pictures were more interrelated than the positive pictures (p = .001).
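The per-picture relatedness score described above can be illustrated with a minimal sketch. It assumes that the pairwise ratings have already been averaged across raters and stored in a dictionary keyed by picture pair; the function name, the data layout, and the example ratings are hypothetical and are not taken from the study materials.

```python
from itertools import combinations
from statistics import mean


def mean_relatedness(pair_ratings, n_items):
    """Average, for each picture, its ratings with every other picture in the set.

    pair_ratings maps an unordered pair (i, j) with i < j to a rating
    on the 1 (not at all related) to 7 (extremely related) scale.
    """
    per_item = {}
    for item in range(n_items):
        # Collect every rating in which this picture is one of the two members.
        scores = [r for (i, j), r in pair_ratings.items() if item in (i, j)]
        per_item[item] = mean(scores)
    return per_item


# A category of 16 pictures yields 16 * 15 / 2 = 120 unique pairs.
n_pairs_per_category = len(list(combinations(range(16), 2)))

# Hypothetical three-picture example (illustrative ratings only).
toy_ratings = {(0, 1): 2, (0, 2): 4, (1, 2): 6}
toy_scores = mean_relatedness(toy_ratings, n_items=3)
```

With 16 pictures per category, each picture's score is the mean of its 15 pairings, which matches the 120 pairs per category reported for the pilot study.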

Table 1 Mean (SD) ratings for pictures used in Experiments 1 and 2

Procedures

Each session began with the written informed consent, followed by the memory task, based on that of Talmi, Luk, et al. (2007), which included three parts: intentional encoding, arithmetic questions, and free recall. During the encoding task, participants studied pictures that appeared in a random order on a computer screen. Each picture appeared for 2 s, followed by a blank screen for 4 s. We instructed participants to memorize as many pictures as possible and included no interfering task, in order to minimize the effects of unequal attention allocation on memory (Talmi & McGarry, 2012). Once all pictures of that block had been presented, participants then completed short arithmetic problems involving addition, subtraction, multiplication, or division, for 1 min (e.g., which equation produces the higher value: “15 + 39” or “25 + 18”?). This distraction task ensured that memory performance would reflect early long-term memory, by displacing items from working memory (Talmi et al., 2005). Immediately after, participants described the pictures they could remember from the previous study block, in any order and with enough detail so that the experimenter could identify the picture. The experimenter recorded the participants’ responses for 3 min using an audio recorder. Once recall was done, participants started over again with a new set of pictures. There were four blocks in total, each containing 16 targets and four buffers (two before and two after the targets). The buffers minimized the effects of primacy and recency on memory. In the unmixed condition, the targets and buffers in each block were from the same category of pictures. In the mixed condition, four targets were randomly selected from each category, and the buffers were chosen randomly. Participants familiarized themselves with the task procedures by completing a practice session with four neutral pictures at the start of the experiment. The memory task was run using E-Prime 2.0 software.
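The composition of the four study blocks in each condition can be sketched as follows. This is an illustration under stated assumptions rather than the E-Prime script used in the study: buffers are omitted, pictures are represented as hypothetical (category, index) tuples, and the order of the unmixed blocks simply follows the category list because block order is not specified here.

```python
import random

CATEGORIES = ["negative", "positive", "related_neutral", "unrelated_neutral"]
TARGETS_PER_CATEGORY = 16
N_BLOCKS = 4


def build_blocks(condition, rng):
    """Return four study blocks of 16 targets each as (category, index) tuples."""
    # Shuffle each category's 16 targets independently.
    pools = {}
    for category in CATEGORIES:
        items = [(category, i) for i in range(TARGETS_PER_CATEGORY)]
        rng.shuffle(items)
        pools[category] = items

    if condition == "unmixed":
        # Each block contains targets from a single picture category.
        return [pools[category] for category in CATEGORIES]

    if condition == "mixed":
        per_block = TARGETS_PER_CATEGORY // N_BLOCKS  # four targets per category
        blocks = []
        for b in range(N_BLOCKS):
            block = []
            for category in CATEGORIES:
                block.extend(pools[category][b * per_block:(b + 1) * per_block])
            rng.shuffle(block)  # random presentation order within the block
            blocks.append(block)
        return blocks

    raise ValueError(f"unknown condition: {condition}")


mixed_blocks = build_blocks("mixed", random.Random(1))
unmixed_blocks = build_blocks("unmixed", random.Random(1))
```

In both conditions every participant sees all 64 targets once; only the grouping differs, which is what manipulates the relative distinctiveness of the emotional pictures.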

After the memory task, participants completed a demographics form, health questionnaire, and the Centre for Epidemiologic Studies Depression (CES-D) scale, which was used to screen the participants for depressive symptomatology (Radloff, 1977). Then they took a 5-min break before completing the Montreal Cognitive Assessment (MOCA), a measure of general cognitive function (Nasreddine et al., 2005). The participants completed brief cognitive tasks until the 45-min delay had elapsed (e.g., Wisconsin Card Sorting Test, verbal fluency, and/or digit span). After the 45-min delay, participants were given as much time as they needed to describe, once again, as many of the pictures as they could remember from any of the four presentation blocks. The experimenter recorded their responses with an audio device. The entire session lasted up to 2 h. Participants received a written and oral debriefing at the end of the study.

Statistical analyses

The first author (K.T.A.B.) scored all of the recall data, and author L.F. double scored the data from 14 participants (i.e., 30%). Each picture description was considered a correct match if the rater could identify which picture was being described (without specific elements needing to be recalled). Correct matches were only given to pictures recalled in the correct study block (i.e., if a picture was recalled during a subsequent block, it would not count). We calculated the interrater reliability between the two raters using Pearson’s correlations and Cohen’s kappa.
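For readers unfamiliar with the two reliability indices, the following sketch shows how they can be computed in plain Python. It assumes, hypothetically, that Pearson's correlation is computed on recall counts and that Cohen's kappa is computed on per-picture scoring decisions (1 = recalled, 0 = not recalled); the unit of analysis actually used is not specified here, and the example values are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters' categorical codes."""
    n = len(rater1)
    labels = set(rater1) | set(rater2)
    # Observed proportion of agreement.
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    p_expected = sum(
        (rater1.count(label) / n) * (rater2.count(label) / n) for label in labels
    )
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical scoring decisions for eight pictures (illustrative only).
rater_a = [1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 1, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(rater_a, rater_b)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is typically somewhat lower than the corresponding correlation.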

We then performed a 2 × 4 × 2 repeated measures ANOVA with distinctiveness (mixed, unmixed) as the between-subjects factor and picture type (negative, positive, related neutral, unrelated neutral) and recall delay (immediate, delayed) as within-subjects factors, on the number of correctly recalled pictures. Alpha was .05, and a Bonferroni-corrected alpha was used for post-hoc comparisons. Statistical analyses were performed using the SPSS Statistics 24 software.

Results

The correlations between the two raters were high for all responses (r = .99), picture types (r ≥ .97), and recall delays (r ≥ .98). Cohen’s kappa also indicated high agreement for all responses (κ = .95, p < .0001), picture types (κ ≥ .90, p < .0001), and recall delays (κ ≥ .94, p < .0001). Fewer than 1% of descriptions were ambiguous and not matched to a target or buffer image.

The data were normally distributed (based on the kurtosis and skewness indices in SPSS), and there were no extreme outliers. The repeated measures ANOVA revealed main effects of picture type [F(3, 135) = 77.60, p < .0001, ηp2 = .63] and recall delay [F(1, 45) = 71.24, p < .0001, ηp2 = .61], as well as an interaction between picture type and recall delay [F(3, 135) = 6.97, p < .0001, ηp2 = .13]. There was no main effect of distinctiveness, nor any interaction involving it (Fig. 1). To examine the main effect of picture type, we calculated a total recall score summing the immediate and delayed totals for each participant. The paired t tests revealed that participants recalled more positive and negative pictures than related-neutral and unrelated-neutral pictures (ps < .0001), and more related-neutral pictures than unrelated-neutral pictures (p < .0001). No difference was found between the negative and positive pictures (p = .35). Finally, the main effect of recall delay resulted from higher recall at immediate testing (M = 36.74 pictures, SD = 5.40) than at delayed testing (M = 31.38 pictures, SD = 6.23).
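As a reading aid, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity ηp2 = (F × df effect) / (F × df effect + df error). The sketch below applies it to two of the effects reported above; the function name is hypothetical, and the calculation is a consistency check rather than part of the original analysis, which was run in SPSS.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of picture type: F(3, 135) = 77.60
eta_picture_type = partial_eta_squared(77.60, 3, 135)

# Picture type x recall delay interaction: F(3, 135) = 6.97
eta_interaction = partial_eta_squared(6.97, 3, 135)
```

Rounded to two decimals, these values reproduce the reported effect sizes of .63 and .13.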

Fig. 1 Mean numbers of pictures correctly recalled by young adults after a 1-min (solid-colored bars) and after a 45-min (striped bars) delay, based on picture type and study condition

To examine the two-way interaction, we first compared the recall of each picture type for immediate and delayed tests separately, using paired t tests (Bonferroni-corrected alpha = .05/12 = .004). The same pattern of results was found as above: At both test delays, participants recalled more positive and negative pictures than related-neutral and unrelated-neutral pictures (ps ≤ .001). They also recalled more related-neutral than unrelated-neutral pictures (ps ≤ .001), but there was no difference between positive and negative pictures (immediate, p = .690; delayed, p = .064). This did not explain the two-way interaction, so we performed additional exploratory analyses comparing the immediate and delayed recall scores for each picture type separately, using paired t tests (Bonferroni-corrected alpha = .05/4 = .0125). Recall was higher at immediate than at delayed testing for negative [t(46) = 5.93, p < .0001], related-neutral [t(46) = 5.71, p < .0001], and unrelated-neutral pictures [t(46) = 7.77, p < .0001], but not for positive pictures [t(46) = 2.03, p = .05]. The majority of pictures (92%) recalled in the delayed test were the same as those recalled in the immediate test; the numbers of novel pictures recalled during the delayed test were equal for each of the four picture types.
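The Bonferroni-corrected thresholds used above follow from dividing alpha by the number of comparisons. The short sketch below makes the arithmetic explicit; it presumes that the 12 comparisons correspond to the six pairwise contrasts among the four picture types at each of the two delays, which is an inference from the design rather than something stated explicitly.

```python
from itertools import combinations

PICTURE_TYPES = ["negative", "positive", "related_neutral", "unrelated_neutral"]
DELAYS = ["immediate", "delayed"]
ALPHA = 0.05


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha that keeps the familywise error rate at alpha."""
    return alpha / n_comparisons


# Six pairwise contrasts among four picture types, at each of two delays.
n_pairwise = len(list(combinations(PICTURE_TYPES, 2)))
alpha_between_types = bonferroni_alpha(ALPHA, n_pairwise * len(DELAYS))

# One immediate-versus-delayed contrast per picture type.
alpha_between_delays = bonferroni_alpha(ALPHA, len(PICTURE_TYPES))
```

Under the second threshold, the positive-picture contrast (p = .05) does not reach significance, whereas the other three contrasts (ps < .0001) do.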

Discussion

In Experiment 1 we assessed whether EEM in young adults results from the greater semantic interrelatedness and distinctiveness of the emotional stimuli. We compared the immediate and delayed recall of emotional pictures to two sets of neutral pictures (one high and one low in semantic interrelatedness) when the relative distinctiveness of the emotional stimuli was high (i.e., mixed condition) or low (i.e., unmixed condition).

Semantic relatedness contributed in part to EEM, but distinctive processing did not. The greater semantic relatedness of the related-neutral pictures improved recall relative to the unrelated-neutral pictures. This showed that without modifying the emotion of the pictures, increasing their semantic cohesion could itself improve both immediate and delayed recall (Buchanan et al., 2006; C. Hunt et al., 2015; Talmi & Moscovitch, 2004). Contrary to our predictions (Schmidt & Saari, 2007; Talmi & McGarry, 2012), distinctive processing did not influence EEM. Recall was greater for the emotional than for the related-neutral items when distinctive processing was uncontrolled in the mixed condition, but also when it was controlled in the unmixed condition. Although the EEM pattern was observed at both recall delays, we observed an unexpected interaction with picture type. Whereas most pictures (i.e., negative, related neutral, unrelated neutral) were recalled better in the immediate than in the delayed test, positive pictures maintained a stable rate of recall after the delay. This suggests that positive pictures were “forgotten” at a slower rate than the other pictures. Although there was no relative difference in recall for positive and negative pictures at either immediate or delayed testing, the decelerated forgetting of positive pictures could be interpreted as a positivity advantage that appeared over time. This was contrary to our prediction that young adults would show a negative memory bias.

In the following experiment, we sought to reexamine the roles of semantic relatedness and distinctive processing on young adults’ EEM, this time also including older adults. Using the same methods and design, we wanted specifically to examine the positive memory biases of older adults and how they compare to the emotional biases of young adults.

Experiment 2

In Experiment 2, we recruited French young and older adults to complete the same emotional memory paradigm used in Experiment 1 (Talmi, Luk, et al., 2007). The main focus of this study was to determine whether semantic relatedness and distinctiveness account for EEM in young and older adults (Talmi & McGarry, 2012). More specifically, we expected distinctive processing to underlie older adults’ positive memory bias.

Method

Participants

In Experiment 2, we aimed to test 60 young adults and 60 older adults in order to obtain power of .90 for the within–between interaction (determined a priori using the G*Power software; Faul, Erdfelder, Buchner, & Lang, 2009). The final sample included 61 young adults (under 35 years old) and 59 older adults (over 60 years old; see Table 2), randomly assigned to the mixed or unmixed condition. The young adults attended the University of Grenoble or the University of Savoie Mont Blanc, and they received course credit for participating. The older adults resided in Grenoble, Chambéry, or Lyon, and they received no compensation. Participants provided their written informed consent and completed the study in French. This study was approved by the University of Ottawa (#H12-14-14) and University of Savoie Mont Blanc (#20158) Research Ethics Boards.

Table 2 Demographic information for the young and older adults in Experiment 2

Participants reported that they were in good health, with no psychiatric or neurological condition. They were further screened for possible cognitive impairment using the Montreal Cognitive Assessment (MOCA; Nasreddine et al., 2005) and the Frontal Assessment Battery (FAB; Dubois, Slachevsky, Litvan, & Pillon, 2000), and for depressive symptomatology using the CES-D (Radloff, 1977). Five young adults were excluded because of experimenter error (n = 2), age (53 years old), extreme CES-D score (z score = 3.62), and outlying delayed-recall scores (kurtosis = 2.68; this participant did not understand the delayed-recall instructions). Three older adults were excluded because of an incomplete study session, a low MOCA score of 18 (z score = – 3.82), and low MOCA (17) and FAB (10) scores (z scores of – 4.26 and – 5.81, respectively). The young adults in both conditions were matched in age, education, and MOCA, FAB, and CES-D scores. The older adults were matched in age, education, MOCA, and FAB, although those assigned to the unmixed condition may have had greater depressive symptomatology (i.e., higher CES-D scores) than those assigned to the mixed condition [t(56) = 2.00, p = .050]. The CES-D scores were therefore entered as a covariate, to control for this difference at baseline.

Stimuli

Experiment 2 included the same number of pictures (64 targets and 16 buffers) and the same picture categories (negative, positive, related neutral, unrelated neutral) as Experiment 1. Most of the same pictures were used, except for one unrelated-neutral, two related-neutral, two positive, and seven negative pictures.

Ratings of valence, arousal, and semantic interrelatedness (Table 1) were obtained from 13 students from the University of Savoie Mont Blanc and 14 older adults (62–83 years old) from the wider community. The students completed the ratings as per the procedures described in Experiment 1. The older adults completed the study online and saw only a portion of all trials, to ensure that the study lasted less than 1 h. By-item univariate ANOVAs conducted separately for the young and older adults showed that the four picture types differed significantly in valence [young, F(3, 60) = 280, p < .0001; older, F(3, 60) = 157, p < .0001], arousal [young, F(3, 60) = 94, p < .0001; older, F(3, 60) = 24, p < .0001], and semantic relatedness [young, F(3, 60) = 84, p < .0001; older, F(3, 60) = 65, p < .0001]. Planned contrasts with Bonferroni correction showed that all pictures differed in valence (ps < .0001), except for the two neutral categories (p > .999). Young adults rated the emotional pictures as being more arousing than the neutral pictures (ps < .0001), but they did not rate the negative and positive pictures differently (p = .110), nor did they rate the related-neutral and unrelated-neutral pictures differently (p = .237). In contrast, older adults rated the negative pictures as being more arousing than the rest (ps < .0001) and reported no differences in arousal between positive, related-neutral, and unrelated-neutral pictures (ps > .999). Importantly, for both age groups the negative, positive, and related-neutral pictures were more highly interrelated than the unrelated-neutral pictures (ps < .0001). Young adults rated the related-neutral pictures as being more interrelated than the negative (p = .038) and positive (p < .0001) pictures, but they rated the positive and negative pictures equally (p = .922). 
In contrast, older adults rated the negative pictures as being more interrelated than the positive (p = .011) and related-neutral (p = .044) pictures, which they rated as being equally interrelated (p > .999).

Procedures

The procedures were identical to those of Experiment 1, except that the Frontal Assessment Battery replaced the Wisconsin Card Sorting Test in Experiment 2.

Statistical analyses

The primary rater (K.T.A.B.) scored all of the recall data, and the secondary rater (nonauthor K.I.-N.) double scored nearly 25% of the data (immediate recall from 15 young and 15 older adults, and delayed recall from 12 young and 15 older adults). The secondary rater in this experiment was completely blind to the research hypotheses and did not test any of the participants. The picture descriptions were scored in accordance with the rules outlined in Experiment 1. The interrater reliability between the primary and secondary raters was calculated with Pearson’s correlations and Cohen’s kappa indices.

We performed a 2 × 2 × 4 × 2 repeated measures ANOVA with age (young, older) and distinctiveness (mixed, unmixed) as between-subjects factors, and picture type (negative, positive, related neutral, unrelated neutral) and recall delay (immediate, delayed) as within-subjects factors, on the total number of correctly recalled pictures. CES-D scores were included as a covariate because of the older adults’ difference at baseline (see Footnote 3). Alpha was set to .05. Given the large number of post-hoc comparisons, we used a Holm–Bonferroni correction (Holm, 1979), which is a sequentially rejective procedure useful for performing multiple contrasts without inflating the familywise Type I error rate. The rank order (from smallest to largest) is reported for each p value. Statistical analyses were performed using SPSS Statistics 24.
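As an illustration of the sequentially rejective logic, the following minimal sketch implements the standard Holm (1979) step-down rule. The function name and the example p values are hypothetical and do not correspond to the contrasts reported in this article; the analyses themselves were run in SPSS.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a reject/retain decision for each p value under Holm's procedure.

    The p values are visited from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k); testing stops at the
    first p value that exceeds its threshold, and all remaining hypotheses
    are retained.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        threshold = alpha / (m - rank)
        if p_values[idx] > threshold:
            break  # this and every larger p value are not significant
        reject[idx] = True
    return reject


# Hypothetical p values (assumption), listed in their original order.
example_p = [0.001, 0.04, 0.03, 0.005]
decisions = holm_bonferroni(example_p)
print(decisions)
```

Because the threshold relaxes at each step, the procedure is never less powerful than a plain Bonferroni correction while still controlling the familywise error rate.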

Results

Interrater reliability and assumptions

The correlations between the two raters were high for all responses (r = .99), and equally high for the two age groups (r ≥ .98), four picture types (r ≥ .96), and two recall delays (r ≥ .98). Cohen’s kappa also indicated high agreement for all responses (κ = .92, p < .0001), age groups (κ ≥ .89, p < .0001), picture types (κ ≥ .90, p < .0001), and recall delays (κ ≥ .90, p < .0001). Approximately 1.55% of descriptions were ambiguous and could not be matched to an image, which occurred more frequently for older than for young adults [older adults, M = 1.07 ambiguities; young adults, M = 0.44 ambiguities; t(118) = 3.38, p = .001]. We excluded one young-adult outlier that was negatively skewing the delayed recall of positive pictures (kurtosis = 2.68, skewness = −1.03). This participant did not understand the delayed-recall instructions and only recalled pictures from one picture category. After excluding this participant, the data were normally distributed.

Repeated measures ANOVA and post-hoc tests

We used a Huynh–Feldt correction because of a violation of sphericity. The repeated measures ANOVA revealed main effects of age [F(1, 114) = 41.69, p < .0001, ηp² = .268], picture type [F(3, 342) = 52.12, p < .0001, ηp² = .314], and recall delay [F(1, 114) = 77.04, p < .001, ηp² = .403], but no main effect of distinctiveness (p = .255; Fig. 2). These main effects were qualified by the following interactions: Distinctiveness × Age × Picture Type [F(3, 342) = 3.19, p = .024, ηp² = .027], Age × Picture Type [F(3, 342) = 3.75, p = .011, ηp² = .032], Distinctiveness × Recall Delay [F(1, 114) = 11.84, p = .001, ηp² = .094], and Recall Delay × Picture Type [F(2.91, 332) = 3.07, p = .029, ηp² = .026]. The CES-D covariate did not significantly affect these results.
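The partial eta squared values can be cross-checked against the reported F ratios and degrees of freedom using the standard identity: partial eta squared equals F × df_effect / (F × df_effect + df_error). The sketch below is only a consistency check under that identity, with a hypothetical function name; it is not a description of how SPSS produced the values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) for several effects reported in the text above.
reported_effects = {
    "age": (41.69, 1, 114),
    "picture type": (52.12, 3, 342),
    "recall delay": (77.04, 1, 114),
    "distinctiveness x recall delay": (11.84, 1, 114),
}

for name, (f_value, df1, df2) in reported_effects.items():
    print(f"{name}: {partial_eta_squared(f_value, df1, df2):.3f}")
```

Each recomputed value rounds to the effect size given in the text for that effect.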

Fig. 2. Mean numbers of pictures correctly recalled by young adults (a) and older adults (b) after a 1-min delay (solid-colored bars) and after a 45-min delay (striped bars), based on picture type and study condition

To further examine the three-way interaction, we calculated the total recall for each picture type summed across recall delays. We performed a series of paired t tests contrasting the levels of picture type (i.e., negative vs. positive, negative vs. related neutral, etc.) for each age and distinctiveness condition separately. Young adults’ EEM was consistent in both the mixed and unmixed conditions: They recalled more positive and negative pictures than related-neutral pictures (ps < .0001), and more related-neutral than unrelated-neutral pictures (ps < .0001), with no difference between the positive and negative pictures. EEM was therefore always present in young adults, with no emotional bias toward either negative or positive pictures. In contrast, the older adults in the mixed condition remembered more pictures from the positive category than from any other category, showing evidence of a positivity bias [positive vs. negative, t(27) = 3.24, 18th ranked p = .003; positive vs. related-neutral, t(27) = 4.70, p < .0001; positive vs. unrelated neutral, t(27) = 11.04, p < .0001]. Older adults’ positivity bias disappeared in the unmixed condition: Their recall of positive, negative, and related-neutral pictures did not differ significantly after correction [positive vs. negative, t(30) = 2.54, 19th ranked p = .017; positive vs. related neutral, t(30) = 1.37, 21st ranked p = .181; negative vs. related neutral, t(30) = 0.91, 23rd ranked p = .373]. In both distinctiveness conditions, young as well as older adults always recalled fewer unrelated-neutral pictures than all other types of pictures (ps < .0001), thus demonstrating the classic EEM effect.
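The structure of these follow-up contrasts (sum recall across delays, then compare two picture types within the same participants) can be sketched as follows. All numbers and names are hypothetical and chosen only for illustration; they are not the study's data, and the resulting p values would still need the Holm–Bonferroni correction described in the Statistical analyses section.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical recall counts (assumption) for four participants in one
# age x distinctiveness cell, at each delay and for two picture types.
positive_immediate = [3, 3, 4, 5]
positive_delayed = [2, 3, 3, 3]
negative_immediate = [2, 3, 3, 3]
negative_delayed = [2, 1, 3, 2]

# Total recall per participant, summed across the two recall delays.
positive_total = [i + d for i, d in zip(positive_immediate, positive_delayed)]
negative_total = [i + d for i, d in zip(negative_immediate, negative_delayed)]

t_value, df = paired_t(positive_total, negative_total)
print(f"t({df}) = {t_value:.2f}")
```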

The two-way interaction between age and picture type therefore resulted from a positivity bias that was present in older but not in young adults. Another interaction existed between distinctiveness and recall delay: Participants’ total recall on the delayed test (summed across all picture types) was higher in the mixed than in the unmixed condition [t(118) = 2.30, 1st ranked p = .023]. Recall delay also interacted with picture type: The memory advantage for positive over negative pictures was significant for the delayed test [t(119) = 5.10, p < .0001] but not for the immediate test [t(119) = 1.92, 12th ranked p = .057]. We observed no differential rate of forgetting: All pictures were recalled better on the immediate than on the delayed test [ps < .0001 for all picture types]. We further explored this interaction by examining the number of novel pictures recalled during the delayed test that were not recalled during the immediate test (8% of the total delayed recall). Indeed, more novel pictures were recalled from the positive category than from the negative [t(31) = 3.82, p = .001] or unrelated-neutral [t(31) = 3.69, p = .001] categories.

Discussion

Using the same methods and procedures as in Experiment 1, we assessed whether semantic relatedness and distinctiveness explain EEM in young adults, and more specifically the positivity bias in older adults. We compared immediate and delayed free recall of negative, positive, related-neutral, and unrelated-neutral pictures when the relative distinctiveness of emotional stimuli was high (i.e., mixed condition) or low (i.e., unmixed condition).

In this experiment, older adults recalled more positive than negative or neutral pictures when these items were processed in a distinctive manner. But this “positivity effect” disappeared when both distinctive processing and semantic interrelatedness were controlled in the unmixed condition. On the other hand, young adults showed no emotional bias toward either positive or negative pictures, and their recall was not influenced by distinctiveness, although higher semantic interrelatedness enhanced memory for neutral pictures. The classic EEM effect was observed in all conditions and age groups when comparing the recall of emotional pictures to that of unrelated-neutral pictures.

General discussion

In healthy older adults, positive information seems easier to remember than negative information, even when it is retrieved after a short delay. But does this age-related positive memory bias result from an absolute memory decrease for negative information, or simply from a contextual advantage that appears when positive information is processed in relation to negative information? The findings of this study support the latter possibility: Older adults remembered more positive than negative pictures when studying them together, at the same time (i.e., mixed condition), but not when studying/recalling them separately (i.e., unmixed condition). The study context did not affect young adults, who consistently recalled equal proportions of positive and negative pictures. Enhanced emotional memory in young and older adults was further attributed to the higher semantic relatedness of the emotional pictures. These findings were consistent across 1-min and 45-min test delays.

Semantic relatedness

The present experiments build on previous work (Talmi & McGarry, 2012) that had identified semantic relatedness and distinctive processing as two cognitive factors underlying emotion-enhanced memory in young adults (when tested shortly after study; i.e., early EEM). In the present study, neutral pictures were either (a) low in semantic relatedness (i.e., they were selected randomly) or (b) high in semantic relatedness (i.e., they were selected according to a general theme), at levels comparable to the emotional pictures. Young and older adults consistently remembered more items from the related-neutral than from the unrelated-neutral category, demonstrating that increased organization improved memory for the neutral items. This extends previous findings with young adults (Buchanan et al., 2006; Talmi, Luk, et al., 2007; Talmi, Schimmack, et al., 2007) by showing that older adults’ memory also improves when the stimuli are more interrelated. This is in line with reports of older adults utilizing preexisting associations between study items to improve their associative memory (Naveh-Benjamin et al., 2005; Naveh-Benjamin et al., 2003). Here, no active elaboration was required during study, because the related-neutral stimuli were already organized around a common theme (i.e., the house), so older adults’ memory could likely improve without their needing to exert additional cognitive resources (Craik, 1983, 1986).

Importantly, when controlling distinctive processing (i.e., in the unmixed condition), older adults only demonstrated EEM when the emotional items were more interrelated than the neutral items (i.e., positive/negative vs. unrelated-neutral), but not when the neutral items were also more closely interrelated (i.e., positive/negative vs. related-neutral), suggesting that older adults’ EEM depended in part on semantic relatedness. Furthermore, in the unmixed condition, older adults did not show a positivity bias, perhaps because the high interrelatedness of negative pictures facilitated their encoding and/or retrieval when processed separately from the positive pictures. The large variability across existing findings on the positivity effect thus might result in part from the uncontrolled effects of semantic relatedness. Future work should carefully consider item interrelatedness and its effects on EEM and the positivity effect in aging.

Distinctiveness

In the present experiments, participants processed emotional pictures in either a relatively distinct manner (mixed condition), by studying emotional and neutral pictures at the same time, or in a nondistinctive manner (unmixed condition), by studying and recalling each picture category separately. Relative distinctiveness influenced memory in older but not in young adults. Older adults showed a positive memory bias when positive pictures were relatively distinct in the mixed condition, but they recalled equal numbers of positive, negative, and related-neutral pictures when distinctive processing was minimized in the unmixed condition. This is in line with previous reports that have failed to show a positivity bias when using unmixed study designs (Emery & Hess, 2011; Grühn et al., 2007; Grühn et al., 2005). To our knowledge, this is the first study to have shown that older adults’ positivity bias selectively appeared when positive stimuli were relatively distinct as compared to the negative and neutral stimuli, while also controlling item interrelatedness.

Previous reports on the positivity effect in aging may have overestimated the size and/or robustness of the effect by using predominantly mixed (vs. unmixed) study designs that enhance older adults’ processing of positive stimuli. The present findings suggest that older adults can recall positive and negative information equally well, provided that the two information types are studied independently. In other words, older adults’ memory for negative stimuli might simply decrease when it competes for resources with positive stimuli. This is particularly relevant for older (but not younger) adults because of the perceptual and cognitive reductions common in normal aging (e.g., poorer vision, slower processing speed, reduced attention). Given these limitations, older adults might be unable to successfully attend to and memorize all stimuli; therefore, some stimuli will be favored over others. Positive stimuli may be prioritized for a number of reasons. First, positive information helps older adults fulfill their current goals of life satisfaction and emotional well-being (the socioemotional selectivity theory; Carstensen et al., 1999; Charles et al., 2003). Second, positive information may be less complex (visually and/or semantically) than negative information, rendering it easier to process (Labouvie-Vief, 2003; Labouvie-Vief et al., 2010). Third, alterations in fronto-amygdalar brain activity may selectively reduce the perceptual processing of negative stimuli and increase emotion regulation, which can facilitate the processing of positive over negative stimuli (Leclerc & Kensinger, 2010, 2011; Mather et al., 2004; St Jacques, Dolcos, & Cabeza, 2010). For these reasons, among others, positive stimuli may be easier to remember than negative stimuli when both types are processed at the same time. 
In real life, older adults certainly might experience a mix of positive and negative events close in time, or a single event could even elicit mixed emotions (see Footnote 4). In these cases, older adults would tend to remember the positive events better than the rest. This would explain why the positivity bias is frequently observed in real life, because positive events are often processed relative to other events. Nonetheless, the results of the present study suggest that older adults would still maintain the ability to remember negative events well, provided the events are experienced in isolation from other emotional events.

Contrary to the older adults in our study, the young adults’ EEM was not influenced by distinctiveness. In both experiments, young adults recalled more positive and negative pictures than related-neutral or unrelated-neutral pictures. This is contrary to previous findings, in which young adults remembered equal numbers of negative and neutral words (Dewhurst & Parry, 2000; Hadley & MacKay, 2006; Schmidt & Saari, 2007) and pictures (Talmi, Luk, et al., 2007; Talmi & McGarry, 2012) when studying each category individually. The purpose of controlling distinctiveness is to minimize the processing advantage of emotional relative to neutral information. Yet, even when stimuli are processed in isolation from one another, emotional stimuli might still engage more attention than neutral stimuli, perhaps due to their high salience, goal relevance, or ability to induce arousal (Barnacle, Montaldi, Talmi, & Sommer, 2016; Murphy & Isaacowitz, 2008; Pourtois, Schettino, & Vuilleumier, 2013; Vuilleumier, 2005). Therefore, many factors beyond relative distinctiveness might lead emotional stimuli to capture more attention than neutral stimuli. According to mediation theory (Talmi, 2013; Talmi & McGarry, 2012; Talmi et al., 2013), increased attention, semantic relatedness, and distinctiveness can fully account for the immediate emotional enhancement of memory. In the present experiments, we attempted to reduce potential differences in attention allocation by using full-attention, intentional-encoding instructions with slow presentation times, but there was no direct measure of attention. It is possible that some uncontrolled characteristic in the pictures (e.g., arousal or visual complexity) led the emotional ones to capture more attention than the neutral ones, regardless of the distinctiveness condition. 
In future work, it will be necessary to measure and/or manipulate attention directly, to determine the extent to which distinctive processing alters attention allocation and subsequent EEM effects.

Test delay

A final consideration was whether testing memory 1 min or 45 min after study would influence the effects of relatedness and distinctiveness. Overall, memory was better the sooner it was tested. In Experiment 1, EEM in young adults was present at both delays. Exploratory analyses suggested that young adults forgot positive pictures at a slower rate than all the other pictures, but this was not replicated in Experiment 2. Nonetheless, this differential forgetting rate for positive pictures in young adults is surprising, given our expectation that they would prioritize negative information. In Experiment 2, the test delay directly affected EEM: Participants recalled more positive than negative pictures when tested after 45 min, but not when tested after 1 min, perhaps due to participants’ remembering novel positive pictures during the delayed testing that they had forgotten during the immediate testing. This result did not further interact with age, although it was likely driven by the positivity bias in older adults. This may suggest that older adults’ positivity bias becomes stronger over time, similar to young adults’ EEM becoming stronger over time (Yonelinas & Ritchey, 2015). A between-subjects design testing recall at two or more delays would be useful to disentangle the effects of repeated testing from those of delayed testing.

Conclusion

When examining the aging positivity effect, it is important to consider basic cognitive factors that can improve encoding and/or retrieval. Semantic relatedness partly explained EEM in young adults, and relatedness together with distinctiveness entirely explained memory in older adults. Distinctive processing was necessary for producing older adults’ positivity bias, which disappeared when distinctiveness was controlled. This suggests that the positivity effect reflects a temporary contextual advantage for positive information that can be eliminated by controlling item interrelatedness and distinctiveness.

The present findings are consistent with the few previous aging studies that have used unmixed study designs (Emery & Hess, 2011) or have directly contrasted unmixed and mixed sets (Grühn et al., 2007; Grühn et al., 2005), all of which failed to find a positivity bias in aging memory. On the one hand, this might have been due to specific stimulus characteristics (Grühn et al., 2005) or memory assessment procedures (Grühn et al., 2007) that might have attenuated the positivity effect in aging (Reed et al., 2014). On the other hand, it might be that the majority of previous reports, which have used mixed designs and paid little attention to item interrelatedness, have overestimated the size and/or robustness of the aging positivity effect. Note that we do not claim here that this effect does not exist. However, we would urge care moving forward in studying the aging positivity effect. Item interrelatedness and distinctiveness are but two cognitive factors that can influence emotion-enhanced memory, and they should be accounted for carefully in future work.