Carroll and White (1973) first discovered that the age at which we learn words influences their processing speed, independent of other language-processing determinants. They found shorter latencies for picture naming when words had an earlier age of acquisition (AoA). Since then, AoA effects have been reported in various tasks and language modalities, including picture naming (e.g., Belke, Brysbaert, Meyer, & Ghyselinck, 2005), word naming (e.g., Gerhand & Barry, 1999b), masked priming (e.g., Brysbaert, Lange, & Van Wijnendaele, 2000), semantic categorization (e.g., Brysbaert, Van Wijnendaele, & De Deyne, 2000), and lexical decision (e.g., Gerhand & Barry, 1999a). For reviews, see Johnston & Barry (2006) or Juhasz (2005).

Age-of-acquisition hypotheses

Two hypotheses try to explain the mechanism behind the AoA effect. The semantic hypothesis claims that AoA effects do not primarily originate from learning lexical word forms, but from their semantic representations. AoA effects then reflect the speed with which these are accessed, as a function of the organization of the representational network (Brysbaert et al., 2000; Steyvers & Tenenbaum, 2005). When new concepts are learned, they are linked to the ones already in the network. Early-learned words will be more central and better connected in the network, making them more easily accessible. Evidence for this hypothesis comes from the observation that AoA effects become larger when semantic activation of stimuli is necessary; that is, they are larger in object-naming tasks than in lexical decision (Barry, Johnston, & Wood, 2006), and larger in lexical decision than in word naming (Cortese & Khanna, 2007). More direct evidence has come from semantic categorization tasks in which AoA effects were found (Brysbaert et al., 2000), and from a semantic Simon task (Ghyselinck, Custers, & Brysbaert, 2004). In this last paradigm, participants judged whether words were presented in upper- or lowercase by responding verbally with labels that could be semantically congruent or incongruent with the (irrelevant) meaning of the target (“living” or “nonliving”). The semantic congruency effect was stronger for early-acquired words, showing that the meaning of the early-learned words was activated faster. The authors conclude that semantics play an important role in the AoA effect.

The second hypothesis is the mapping or connectionist hypothesis. It originates from simulations with connectionist networks (Ellis & Lambon Ralph, 2000; Monaghan & Ellis, 2010): Items that were trained first always had an advantage over later-trained items because the early items were learned better. The researchers argued that the information that enters a network first benefits more from the plasticity of the network and alters its connections, or weights, to a greater extent. As new information keeps entering the network, the network loses plasticity, making weight changes smaller. Early items thus have a larger impact on the network’s final structure. In contrast to the semantic hypothesis, the mapping hypothesis does not situate AoA effects on a single processing level; AoA could play roles at the lexical, semantic, and/or phonological levels. Evidence for this hypothesis has come from tasks in which it was shown that learning completely new information (e.g., nonwords, complex patterns, etc.) in several stages resulted in an order-of-acquisition effect, analogous to the AoA effect (Joseph, Wonnacott, Forbes, & Nation, 2014; Stewart & Ellis, 2008).

AoA in eyetracking

AoA effects emerge across language modalities, and are therefore also of interest for visual word recognition research, which often uses lexical decision tasks (Brysbaert et al., 2000). Next to these studies with single word presentations, eyetracking has also been used to investigate AoA effects in a small number of sentence-reading studies (Joseph et al., 2014; Juhasz & Rayner, 2003, 2006). This is highly relevant, given that most words are encountered in a sentence context. It is therefore important to generalize findings from experimental, isolated-word recognition to natural language processing.

One of the advantages of investigating eye movements is that they can be monitored with high spatial and temporal resolution. They reveal large amounts of information about the underlying word recognition processes (Rayner, 1998, 2009). Also, multiple dependent variables are available in eyetracking. Single fixations are the durations of the fixations of words that were fixated only once. First fixations are the durations of the first fixation on a word, regardless of later refixations. Gaze durations are the sum of all fixation durations on a word before the eyes move on to a new word. These measures are “early” measures of eyetracking because they reflect the initial stages of word recognition. Finally, total reading times are a “late” measure of eyetracking, since they constitute the sum of all fixations on each target word, including refixations. Because participants only have to read the presented text, another advantage of eyetracking is the minimal amount of interference by task demands, in contrast to, for example, lexical decision, which includes a decision component that may introduce strategic biases. Eyetracking therefore seems to be a promising technique to investigate AoA effects in visual word recognition.
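
The four measures defined above can be illustrated with a minimal sketch. The input layout (a chronological list of word-index and duration pairs) and the function name `reading_measures` are hypothetical and are not taken from any eyetracking corpus or software. The sketch also assumes that "fixated only once" means exactly one fixation on the word in total.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input format: fixations in chronological order as
# (word_index, duration_ms) pairs. Illustrative only.
Fixation = Tuple[int, int]


def reading_measures(fixations: List[Fixation], target: int) -> Dict[str, Optional[int]]:
    """Compute the four fixation-time measures for one target word."""
    on_target = [d for w, d in fixations if w == target]
    if not on_target:
        # The word was skipped, so none of the measures is defined.
        return {"single_fixation": None, "first_fixation": None,
                "gaze_duration": None, "total_reading_time": None}

    # First pass: the run of consecutive fixations on the target that starts
    # with its first fixation and ends when the eyes move to another word.
    start = next(i for i, (w, _) in enumerate(fixations) if w == target)
    first_pass = []
    for w, d in fixations[start:]:
        if w != target:
            break
        first_pass.append(d)

    return {
        # Assumption: defined only if the word was fixated exactly once overall.
        "single_fixation": on_target[0] if len(on_target) == 1 else None,
        "first_fixation": first_pass[0],
        "gaze_duration": sum(first_pass),
        "total_reading_time": sum(on_target),
    }


# Made-up fixation sequence: word 1 is refixated and later revisited.
example = [(0, 210), (1, 180), (1, 120), (2, 200), (1, 150), (3, 190)]
word_1 = reading_measures(example, target=1)
word_2 = reading_measures(example, target=2)
```

In this made-up sequence, word 1 has no single-fixation duration because it was fixated three times; its gaze duration covers only the two first-pass fixations, whereas its total reading time also includes the later revisit.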

Juhasz and Rayner (2003, 2006) found that the AoA of target words influenced reading times in eyetracking: Earlier AoAs led to shorter fixations. In the 2003 study, this was found in early measures (single-fixation and gaze durations); in the 2006 study, it was also found in additional early (first fixations) and late (total reading time) measures. The authors argued that this difference was due to the designs of the studies: In the 2006 study, an orthogonal design with early and late AoA values was applied, whereas in the 2003 study, AoA was treated as a continuous variable. The effects were more pronounced when only extreme AoA values were presented. Because both studies presented the target stimuli in sentences, and because semantic activation (i.e., the meaning) of these words is necessary to understand the sentence, Juhasz and Rayner interpreted their results as evidence for the semantic hypothesis.

These pioneering eyetracking studies on AoA effects have been very informative and now require assessments of their generalizability. First, the total numbers of target sentences (and words) tested by Juhasz and Rayner (2003, 2006) were limited to, respectively, 72 and 108. These numbers are typical for an eyetracking paradigm, but rather small as compared to the megastudy approach that we adopted here. Second, the researchers operationalized “natural reading,” their extension of isolated-word recognition, as single-sentence reading, whereas in daily life we also tend to read longer chunks of text that make a coherent whole. Finally, although the 2003 study, with continuous AoA, yielded significant effects, the most convincing results of AoA effects have come from orthogonal designs. Balota, Cortese, Sergent-Marshall, Spieler, and Yap (2004) argued that a factorial approach could entail several flaws, such as implicit biases of experimenters and participants, and a reduction of power and reliability when continuous variables are converted to categorical ones. They proposed a megastudy approach as a valuable alternative, with large samples of stimuli varying on a broad range of characteristics. For isolated-word recognition, this approach has been successfully applied in two studies (Cortese & Khanna, 2007; Cortese & Schock, 2012) that assessed AoA effects in lexical decision data from the English Lexicon Project (ELP; Balota et al., 2007). Both studies revealed an AoA effect (faster reaction times for earlier AoA) above and beyond other predictors, such as word frequency and length. In line with these studies, we assessed AoA effects using megastudy data from natural story reading.

Present study

We investigated AoA effects in the Ghent Eye-tracking Corpus (GECO; Cop et al., in press). This corpus is an eyetracking database of participants reading an entire novel. GECO has previously been used successfully to investigate, for example, the effects of word frequency (Cop, Keuleers, Drieghe, & Duyck, 2015) and orthographic neighborhood (Dirix et al., in press). Here we used the corpus to investigate the importance of AoA, in addition to other lexical variables, when participants were reading a large body of text, rather than single words or sentences. The corpus contains a monolingual (English) and a bilingual (Dutch and English) part. For the present study, we focused on the monolingual data, since we wanted to investigate the AoA effect without potential influences of second-language knowledge. The monolingual dataset contains about 760,000 words read in total: 14 participants read 54,364 words (5,012 unique), embedded in 5,300 sentences. This dataset provides a large variety in target words and a broad range of word characteristics.

We analyzed both early (single-fixation, first-fixation, and gaze durations) and late (total reading time) measures of eyetracking. The AoA ratings for our stimuli were taken from the database of Kuperman, Stadthagen-Gonzalez, and Brysbaert (2012). Such ratings are commonly used in AoA experiments and score well on validity (Brysbaert, in press). Aside from AoA, we included other (sometimes correlated) important word recognition predictors in the analysis: word frequency (SUBTLEX-UK; van Heuven, Mandera, Keuleers, & Brysbaert, 2013), length, and neighborhood density (CLEARPOND; Marian, Bartolotti, Chabal, & Shook, 2012). Several target words were presented more than once throughout the novel, so we included the predictor “rank of occurrence” to account for repetition effects.

We expected that reading times, on all measures, would be shorter for earlier-learned words, in accordance with Juhasz and Rayner (2003, 2006). We did not apply an orthogonal design, but we included interactions between the predictors in the base models. This allowed the interaction of AoA with word frequency, as in Gerhand and Barry (1999a), who found that the AoA effect was larger for low-frequency words.

Method

Participants and materials

The stimuli and data of this study were taken from the monolingual part of GECO (Cop et al., in press), in which participants read the entire novel The Mysterious Affair at Styles by Agatha Christie. We included all nouns for which an AoA rating was available in Kuperman et al. (2012), but only if at least 75% of the raters made an AoA estimation (to ensure a reliable AoA rating). In all, 7,158 nouns (1,487 unique) remained in the final selection (see Table 1).

Table 1 Descriptive statistics for the nouns of the monolingual portion of GECO used in the present study, averaged over stimuli (standard deviations are between parentheses)

The monolingual participants were 14 undergraduate students at the University of Southampton (eight females, six males; M age = 21.8, SD age = 5.6). Their language proficiency was tested with the LexTALE (Lemhöfer & Broersma, 2012; M = 91.07, SD = 8.92, range = [71.25–100]).

Procedure

The eye movements of the participants were monitored while they read the novel in four separate sessions. The number of chapters was fixed for each session, but the reading tempo within the sessions was self-paced. To ensure that participants were reading for comprehension, multiple-choice questions were presented after each chapter. For a detailed overview of the procedure, see Cop et al. (in press).

Eye movement analysis

Each dependent variable was fitted in a linear mixed model using the lme4 package (version 1.1-10) in R (version 3.1.1; R Development Core Team, 2013). All p values were calculated with lmerTest (version 2.0-30). Initial models included the fixed factors AoA, Word Frequency, Word Length, Neighborhood Density, Language Proficiency, and Rank of Occurrence (all continuous), as well as random intercepts for subjects and words. The random intercepts for subjects were included to ensure that individual differences in genetic, developmental, or social factors between subjects were modeled (Baayen, Davidson, & Bates, 2008). The random intercept for words was included so we could generalize to other nouns, since the present stimulus set is not an exhaustive list of all English nouns. Word frequency was log-transformed (base 10) to normalize its distribution. All continuous variables were centered.
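
The two preprocessing steps mentioned above (base-10 log transformation and centering) can be sketched as follows. This is an illustration only, not the original analysis code, which was run in R; the function names and the frequency values are made up.

```python
import math
from statistics import mean
from typing import List


def log10_transform(values: List[float]) -> List[float]:
    """Base-10 log transform; every value must be strictly positive."""
    return [math.log10(v) for v in values]


def center(values: List[float]) -> List[float]:
    """Subtract the mean so that the centered variable has mean zero."""
    m = mean(values)
    return [v - m for v in values]


# Made-up raw frequency counts, for illustration only.
raw_frequency = [1.0, 10.0, 100.0, 1000.0]
log_frequency = log10_transform(raw_frequency)
centered_log_frequency = center(log_frequency)
```

Centering leaves the slopes of the main effects unchanged in a model without interactions. In a model with interactions, it makes each lower-order coefficient interpretable as the effect at the mean of the other predictors.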

Each dependent variable was also log-transformed (base 10). The following procedure was applied to discover the optimal model (Barr, Levy, Scheepers, & Tily, 2013): First, a full model including all interactions between the fixed effects (up to three-way) was fitted. Then the model was backward-fitted by excluding the interaction with the smallest t value. An interaction term was excluded if a model comparison chi-square test turned out not to be significant, meaning that it did not contribute to the fit. Next, the random effects were forward-fitted and were kept in the model if they contributed to the fit. Finally, the fixed effects were again backward-fitted.
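
The model comparison step in this procedure can be sketched as a likelihood-ratio test between two nested models that differ by a single interaction term. The sketch below assumes that the log-likelihoods of both fitted models are already available; the function names and the numeric values are hypothetical, and the .05 threshold is an assumption for illustration.

```python
import math
from typing import Tuple


def likelihood_ratio_test_df1(loglik_full: float, loglik_reduced: float) -> Tuple[float, float]:
    """Chi-square likelihood-ratio test for nested models differing by one parameter."""
    # The statistic is twice the log-likelihood difference; clamp tiny
    # negative values that can arise from numerical noise.
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # With 1 degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def keep_term(loglik_full: float, loglik_reduced: float, alpha: float = 0.05) -> bool:
    """Keep the tested term only if removing it significantly worsens the fit."""
    _, p_value = likelihood_ratio_test_df1(loglik_full, loglik_reduced)
    return p_value < alpha


# Hypothetical log-likelihoods of a model with and without one interaction.
chi_square, p = likelihood_ratio_test_df1(loglik_full=-1000.0, loglik_reduced=-1001.92)
```

The closed-form tail probability holds only for one degree of freedom, which matches the removal of one interaction term at a time; comparisons involving more parameters would need a general chi-square distribution function.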

Results

The average fixation times are presented in Table 2. We median-split the data by AoA and word frequency, just to give an indication of the effect sizes of these crucial predictors. The descriptive statistics indicated that their independent effects were comparable in size.

Table 2 Average single-fixation durations, first-fixation durations, gaze durations, and total reading times for early [2.4–7.8] and late [7.9–19] AoA and low [0.01–3.44] and high [3.45–5.85] word frequency, in milliseconds

Outliers were determined as fixation times more than 2.5 SDs away from the subject means, and were removed from the dataset (2.16% for single fixations, 2.37% for gaze durations, 2.80% for total reading times). All final models are presented in Table 3. See the supplementary materials for the first-fixation analysis.
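
The trimming rule described above can be sketched as follows. The sketch assumes that both the mean and the standard deviation are computed per subject and that the sample standard deviation is used; neither detail is specified in the text. The data layout and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Tuple

Observation = Tuple[str, float]  # (subject_id, fixation_time_ms), hypothetical layout


def trim_outliers(observations: List[Observation], criterion: float = 2.5) -> List[Observation]:
    """Drop observations more than `criterion` SDs away from their subject's mean."""
    by_subject: Dict[str, List[float]] = defaultdict(list)
    for subject, value in observations:
        by_subject[subject].append(value)

    bounds: Dict[str, Tuple[float, float]] = {}
    for subject, values in by_subject.items():
        m = mean(values)
        # Assumption: sample SD per subject; a lone observation gets SD 0.
        sd = stdev(values) if len(values) > 1 else 0.0
        bounds[subject] = (m - criterion * sd, m + criterion * sd)

    return [(s, v) for s, v in observations if bounds[s][0] <= v <= bounds[s][1]]


# Made-up data: subject s1 has one extreme fixation time.
data = [("s1", 200.0)] * 9 + [("s1", 1000.0)] + [("s2", 240.0), ("s2", 250.0), ("s2", 260.0)]
trimmed = trim_outliers(data)
removed_percentage = 100.0 * (len(data) - len(trimmed)) / len(data)
```

Because the bounds are computed within each subject, a fixation time that is extreme for a fast reader can be removed even if it would be unremarkable for a slower reader.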

Table 3 Estimates, standard errors, t values, and p values for the fixed and random effects of the final general linear mixed-effect model for the dependent measures

Single-fixation duration

Only nouns that received a single fixation were selected for this analysis (56.35%). We observed a main effect of AoA: Single fixations were shorter for words with an earlier AoA. The main effects of word frequency and word length were significant, as was their interaction. Single fixations were shorter for more-frequent words, but only for nouns of four or more letters (χ² = 6.17, df = 1, p < .05). The interaction between word length and language proficiency was also significant: Fixations became longer with increasing word length, but this effect diminished for participants who scored 92.65 or higher on the LexTALE (χ² = 3.84, df = 1, p < .05).

Gaze duration

The main effect of AoA was significant: Gaze durations were shorter for earlier-learned words. The main effects of word frequency and word length were also significant, as was their interaction. Gaze durations were shorter for higher-frequency nouns; post-hoc contrasts showed that the effect was significant for even the shortest words (three letters, χ² = 5.27, df = 1, p < .05), but it became larger as word length increased.

Total reading time

The main effects of AoA and word frequency were significant, as was their interaction (see Fig. 1): Total reading times were shorter for earlier AoA, but only for words with a word frequency up to 4.290 (χ² = 3.86, df = 1, p < .05). The main effects of word length and rank of occurrence were significant. Reading times were longer with increasing word length, but shorter for repeated presentations of a noun.

Fig. 1 The interaction between AoA (x-axis) and word frequency (lines) for total reading times (y-axis)

Discussion

We investigated AoA effects in the monolingual data of an eyetracking corpus (GECO; Cop et al., in press). In accordance with the few earlier eyetracking investigations (Juhasz & Rayner, 2003, 2006), we expected shorter reading times for earlier-learned words, and indeed we found that AoA had the expected effect on reading times for all four dependent eyetracking measures: Earlier-learned words were read faster, independent of other lexical variables. Furthermore, we hypothesized that word frequency and AoA could interact. For total reading times, this interaction was indeed significant and in line with previous results (Gerhand & Barry, 1999a): The AoA effect was larger for low-frequency words.

This study was the first to investigate AoA effects in natural reading. Our results show that the age at which we learn words influences the reading process not only when we encounter single words (e.g., Brysbaert et al., 2000) or sentences (Juhasz & Rayner, 2003, 2006), but also when we read longer pieces of coherent text. The results are also in line with other megastudy investigations of AoA effects on isolated-word recognition (e.g., Cortese & Khanna, 2007).

Following the reasoning of Juhasz and Rayner (2003, 2006), semantic activation is needed to understand the words embedded in sentences, and AoA effects emerged during such reading. AoA effects were found in measures such as single fixations (in which the word is read and recognized on a single fixation) and total reading times, for which we assume that semantic activation of the word was then completed. Indeed, the present results could be considered evidence for the semantic hypothesis (Brysbaert et al., 2000), in which the semantic network’s organization plays a central role in AoA effects. However, the present results could also be framed according to the mapping hypothesis (Ellis & Lambon Ralph, 2000). This hypothesis does not specify which processing level AoA influences, but applies a “first-come, first-served” principle: Network weights are altered in favor of items that entered the network earlier. We also observed AoA effects on measures for which semantic access to words is not yet assumed to be complete (i.e., first fixations and gaze duration).

Furthermore, this hypothesis predicts that AoA effects should be the strongest in tasks in which input–output mappings are arbitrary, such as in picture naming, in which there is no systematic mapping between the meaning of a picture and the phonology of the word it represents. On the other hand, AoA effects should be smaller in tasks in which input–output mappings are consistent, as in word-naming tasks, which usually have a reasonably consistent relationship between the orthography and phonology of a word. Evidence for this prediction was provided in a combined computational and experimental study by Lambon Ralph and Ehsan (2006), in which the AoA effect was indeed larger for arbitrary than for systematic mappings. In the present study, we found a significant AoA effect in all timed measures of reading, but the averages in Table 3 indicate that the effect is smaller in early measures (which are supposed to reflect early word recognition) than in late measures (which involve semantic processing of the words, and thus rely on the arbitrary orthography–semantic mappings). In addition, the mapping hypothesis predicts that AoA effects will be present in opaque languages (with arbitrary orthography-to-phonology mappings). Since English is considered an opaque language, our present results are also in line with this prediction.

A third option is that the AoA effect originates from systems that occur in both the semantic and mapping hypotheses, since they are not mutually exclusive. Indeed, whereas the mapping hypothesis describes a functional mechanism, the semantic hypothesis provides a structural explanation. In the data, early-learned words have an overall advantage over later-learned words, even in early word recognition stages. This can be explained by the mapping hypothesis. However, the meaning of early-learned words is also activated faster, possibly because they have a more central place in the lexicon. Since our data point toward evidence for both hypotheses, it is likely that they both have a share in the etiology of the AoA effect.

Next to theoretical accounts of the AoA effect, these results are also of importance to eye movement models. An example is the E-Z Reader model (Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Pollatsek, & Rayner, 2006). According to this model, the lexical processing of words occurs in two serial stages. In the familiarity check, lexical candidates become active. After completion of this stage, the oculo-motor system starts programming a saccade toward the next word. In the verification stage, full lexical identification of the target word is accomplished. After the completion of this stage, attention is shifted toward the next word. This model thus decouples saccade programming from the attention shift. The determining factors for the durations of the two stages are assumed to be word frequency and the predictability of the target. However, the present results suggest that AoA also determines the durations of fixations. For example, the familiarity check might be faster for words that are more easily accessible, because they have a more central place in the network (semantic hypothesis) or because the network weights are shifted to their advantage (mapping hypothesis), leading to shorter fixations. Future versions of E-Z Reader could introduce AoA as a determining factor for fixation times, thereby possibly increasing the explained variance in observed reading times.

In conclusion, we found clear AoA effects in the eyetracking patterns of monolinguals reading an entire novel, independent of and above the influences of other lexical variables. These results generalize the large body of evidence that has shown that earlier-learned words are processed faster to the domain of natural reading of running text.