The Effect of Word Predictability on Phonological Activation in Cantonese Reading: A Study of Eye-Fixations and Pupillary Response


This study aimed to investigate the effects of contextual predictability on orthographic and phonological activation during Chinese sentence reading by Cantonese-speaking readers using the error disruption paradigm. Participants’ eye fixations and pupil sizes were recorded while they silently read Chinese sentences containing homophonic, orthographic, and unrelated errors. Sentences had varying amounts of contextual information leading up to target words such that some targets were more predictable than others. Results of the fixation time analysis indicated that orthographic effects were significant in first fixation and gaze duration, while phonological effects emerged later in total reading time. However, interactions between predictability and the homophonic condition were found in gaze duration. These results suggest that, while Cantonese readers activate word meanings primarily through orthography in early processing, early phonological activation can occur when facilitated by semantics in high-constraint sentence contexts. Analysis of pupillary response measurements revealed that participants’ pupil sizes became larger when they read words containing orthographic errors, suggesting that orthographic error recovery processes significantly increase cognitive load.


A major goal among researchers investigating the cognitive processes involved in silent reading has been to untangle the roles of orthography and phonology in relation to semantic access. One view is that word identification first involves the activation of a mental representation of the word’s phonological form, which in turn allows access to the word’s meaning (Frost 1998; Van Orden 1987). Another view, referred to as the dual-route model (Coltheart et al. 2001), proposes that, during reading, the semantic system can be accessed either directly through orthography or via an indirect route by which orthography first activates phonology, which in turn activates meaning. According to this model, the more efficient direct route from orthography to meaning is predominant in skilled reading, but both routes can receive varying degrees of simultaneous activation. Whether phonological activation occurs prior to lexical access has been found to depend upon a complex interplay of factors including the skill of the reader, the frequency of the word, and the predictability of the word in context (see Leinenger 2014 for a review of relevant studies). However, as the development of theories of reading has primarily relied on experimental data from studies on readers of English and other alphabetic languages, the issue of how factors such as word frequency and context affect reading in non-alphabetic languages, such as Chinese, remains underexplored.

Chinese and Alphabetic Writing Systems

In alphabetic writing systems, letters map to individual phonemes, which are combined to form words. Even unfamiliar words in alphabetic languages like English can be assembled letter-by-letter via a successive grapheme-phoneme mapping procedure and pronounced aloud. A word like “wug,” for example, can be broken down into /w/, /ʌ/, and /g/. In contrast, no element within a Chinese character corresponds to an individual phoneme. Rather, the character maps to a single syllable. Most Chinese characters are composed of semantic and phonetic radicals, which give information related to the meaning and pronunciation of the character. For example, the character 估/gu2/ (Footnote 1) [estimate] consists of the semantic radical 亻[person] and the phonetic radical 古/gu2/[ancient]. Importantly, nothing in the characters 估/gu2/[estimate] or 古/gu2/[ancient] represents the phonemes /g/ and /u/ or the tone.

There are two important points to note regarding the phonetic information contained in Chinese characters. The first is that while many Chinese characters contain phonetic radicals (e.g., 估/gu2/, discussed above), the information conveyed by these radicals is often unreliable or inconsistent. For example, in the character 猜/caai1/[guess], the radical 青/cing1/[green] has a different pronunciation than the character as a whole. Only around 26% of Chinese characters have a pronunciation that corresponds to their phonetic radicals, and only between 14% and 46% have meanings that are clearly related to their semantic radical (Gao et al. 1993)—e.g., 蟻 [ant] containing the semantic radical 虫 [insect]. Secondly, many characters have the same pronunciation, even though they might be orthographically unrelated (e.g., 股/gu2/[a share] and 古/gu2/[ancient]). There is an average of approximately 17 homophonic characters per base syllable in Cantonese (total of 625 base syllables—1761 with tones; The Linguistic Society of Hong Kong 1997) and 31 characters per syllable in Mandarin (420 base syllables—1471 with tones; Denisowski 2005).

To summarize the points above, there are three different facts regarding Chinese that distinguish it from alphabetic systems and that are relevant to the current study: (1) each Chinese character represents a full syllable and cannot be broken down into individual phonemes; (2) orthographic cues indicating a character’s pronunciation are inconsistent, and characters with entirely different orthographic structures might share the same pronunciation; (3) there are a large number of homophonic characters in Chinese, though these homophones might vary in terms of orthographic relatedness. In light of these unique characteristics associated with Chinese script, Perfetti et al. (1992) proposed the universal phonological principle in order to address the question of how these characteristics might affect phonological processing during Chinese reading. This view purports that, across scripts, phonological activation automatically proceeds from the visual identification of words, and the unique properties of individual writing systems determine how phonological activation unfolds. While not necessarily used for lexical access, the activated phonological representations are important for reading comprehension and retaining words in memory. Reading processes in Chinese only differ from alphabetic languages in that Chinese word activation is necessarily a two-step process: once orthographic identification is complete, the character’s phonology is activated at the syllable level. Alphabetic languages, in contrast, allow for earlier phonological activation, beginning at the phonemic level with grapheme units, before full orthographic recognition at the word-level is achieved.

Homophones and the Error Disruption Paradigm

In Chinese, many characters are homophonic in that they share a pronunciation but may be entirely dissimilar in orthographic structure. For example, the characters 估 [estimate], 古 [ancient], 股 [a share], and 鼓 [drum] all share the Cantonese pronunciation /gu2/ but vary in terms of shared visual similarity. In the case of English, homophones necessarily overlap in orthographic structure. For example, “bare” and “bear” share the same letters in addition to their identical pronunciation. Even homophone pairs that share fewer orthographic similarities, such as “chute” and “shoot,” share some of the same letters. Therefore, while homophones exist in both languages, Chinese homophones differ from English ones in that many Chinese homophone pairs are orthographically unrelated.

In the error disruption paradigm, participants silently read sentences containing either orthographic errors or homophonic errors while their eye-movements are recorded by an eye-tracker to investigate the time course of phonological and orthographic activation during sentence or passage reading. When analyzing eye-tracking data obtained through such studies, researchers are interested in the time readers spend reading each homophone or orthography error. Three commonly used measures of reading time include first fixation duration (FFD; the duration, in milliseconds, of the first fixation on the target word), gaze duration (GD; the duration of the first fixation in addition to any refixations made before the reader moves on to the next word), and total reading time (TRT; the total duration of all fixations made on the target word, including regressive fixations made after the reader has already fixated on any word outside the target region). FFD and GD are generally considered similar in that they both reflect early stages of processing during which words are retrieved from the mental lexicon (Rayner 2009; Rayner 1998), though some researchers have argued that GD is more closely tied to semantic integration processes (Carpenter and Daneman 1981; Inhoff 1984). TRT is considered a measure of late stages of word processing and integration (Conklin et al. 2018) as well as error decoding and recovery (Daneman and Reingold 1993).
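The three reading-time measures defined above can be illustrated with a minimal Python sketch. It assumes a simplified, hypothetical input format (an ordered list of word-index and duration pairs) and does not handle cases in which the target is skipped on the first pass; it is not the analysis pipeline used in any of the studies cited.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input format: (index of the fixated word, fixation duration in ms),
# listed in the order in which the fixations occurred.
Fixation = Tuple[int, int]


def reading_measures(fixations: List[Fixation], target: int) -> Dict[str, Optional[int]]:
    """Return FFD, GD and TRT (in ms) for the word at index `target`."""
    ffd: Optional[int] = None   # first fixation duration
    gd = 0                      # gaze duration (first-pass fixations only)
    trt = 0                     # total reading time (all fixations on the target)
    in_first_pass = False
    first_pass_done = False

    for word, duration in fixations:
        if word == target:
            trt += duration
            if not first_pass_done:
                if ffd is None:
                    ffd = duration
                gd += duration
                in_first_pass = True
        elif in_first_pass:
            # The reader has left the target, so the first pass is over.
            first_pass_done = True
            in_first_pass = False

    if ffd is None:             # the target was never fixated
        return {"FFD": None, "GD": None, "TRT": None}
    return {"FFD": ffd, "GD": gd, "TRT": trt}


# Word 2 is fixated twice, left, and then revisited after a regression.
example = [(1, 200), (2, 220), (2, 180), (3, 250), (2, 300), (4, 210)]
print(reading_measures(example, target=2))  # {'FFD': 220, 'GD': 400, 'TRT': 700}
```

In this example the regressive fixation of 300 ms contributes to TRT but not to GD, which is why TRT is treated as the later measure.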

In English, when homophone errors (e.g., “They decided that they would meat at twelve o’clock for lunch”) are fixated on for shorter periods of time than orthographic controls (e.g., “They decided that they would mean at twelve o’clock for lunch”), it is taken to indicate that phonology is involved in activating the meaning of the correct target word (see Daneman and Reingold 2000; Rayner et al. 1998). Chinese error disruption paradigm experiments have generally included, at minimum, a correct condition (e.g., 自/zi6/[self] in 自己/zi6 gei2/[self]), an orthographic error condition (e.g., 目/muk6/[eye]), a homophonic error condition (e.g., 字/zi6/[word]), and an unrelated control error (e.g., 分/fen1/[divide]) (see Zhou et al. 2018). Shorter fixations on orthographically unrelated homophonic errors than unrelated errors (e.g., shorter fixations on 字/zi6/[word] than 分/fen1/[divide]) would indicate phonological involvement in lexical access. Shorter fixations on orthographic errors, but not on homophonic errors, relative to unrelated controls, would suggest that meaning is being activated primarily through orthography.

Daneman and Reingold (1993) and Daneman et al. (1995) conducted two error disruption experiments in English to look at how homophonic and orthographic errors affected participants’ eye-movements while reading a 1100-word passage. They found that both types of error caused similar amounts of disruption to reading speed in GD compared to the correct word. The analysis of later processing measures (e.g., TRT) showed that readers spent significantly less time fixating homophonic errors than orthographic controls. However, this only held true for homophone pairs of the same length (e.g., blue and blew as opposed to wade and weighed). The authors concluded that the role of phonology is more delayed or limited than earlier single-word reaction time studies had suggested (for review of these studies, see Leinenger 2014).

Rayner et al. (1998) carried out a study using a similar paradigm to that of Daneman and Reingold (1993) and Daneman et al. (1995). They hypothesized that high contextual predictability might make it more likely for phonological activation to occur, so they varied the degree of contextual predictability (or “constraint”) to include high-constraint sentences, in which the target was highly predictable (e.g., “She realized that she was going too fast and needed to slam on the brake to slow down”), and low-constraint sentences, in which the target was less predictable. They also varied the degree of orthographic overlap between homophone errors and orthographic controls (e.g., same initial two letters: brake/break; same initial one letter: bear/bare; different initial and final letters: chute/shoot). When the variables of constraint and orthographic similarity were collapsed, their results reflected those of Daneman and Reingold (1993) and Daneman et al. (1995), who had found only later involvement of phonology. However, they did find evidence of early phonological activation in FFD when both predictability and orthographic similarity were high. They interpreted their data as supporting early and dominant involvement of phonology in lexical access.

In a later study, Daneman and Reingold (2000) responded to Rayner et al.’s (1998) conclusion that their findings supported pre-lexical phonological activation, pointing out that early phonological activation was only found in high-constraint contexts. Furthermore, they replicated their original study with an additional group of participants and again found evidence of phonological activation only in later measures, suggesting that phonology did not play a role in early lexical access. They concluded that phonological codes play only a limited role in regular reading—mainly in accessing the meanings of low frequency words. In this replication study, to test Rayner et al.’s (1998) finding that early phonological activation could occur when contextual constraint was high, Daneman and Reingold (2000) conducted a norming study to check the predictability of target words in the texts used in their own experiments, inviting a new group of participants to read each target sentence up to but not including the target word and then guess what the next word would be. Based on the results, each sentence was assigned a contextual predictability value (i.e., the proportion of correct guesses). Among the three texts used in their experiment, the text with the highest number of predictable targets had a mean predictability value of 0.28 (SD = 0.33) with a wide range of scores from 0.00 (no correct predictions) to 1.0 (all participants guessed correctly). Like Rayner et al. (1998), they also found that higher predictability values correlated with shorter fixation times on homophone errors. However, as high-constraint sentence contexts are relatively uncommon, they argued that this phonological involvement could not be said to generalize to most reading situations. 
Daneman and Reingold (2000) proposed the following model for reading in these predictable contexts: if a given context is highly constrained, a semantic representation of the predictable target will be activated by context prior to its being fixated. This semantic representation will then, in turn, activate its associated phonological and orthographic representations. If these pre-activated representations match the target, lexical access will be quicker, leading to shorter fixation durations on homophone and orthographic errors. Thus, in high-constraint contexts, phonology is also activated top-down via semantics rather than only bottom-up through orthography.

The error disruption paradigm was extended to examine orthographic and phonological processing during silent reading in Chinese by Wong and Chen (1999), who conducted an error disruption experiment with Cantonese-speaking readers of traditional Chinese characters (Footnote 2) from Hong Kong. The experimenters manipulated the first characters of two-character words and embedded them in short passages. Their stimuli contained five conditions: a correct target condition (e.g., 栽培/zoi1 pui4/[to grow]), an orthographic condition (e.g., 截培/zit6 pui4/; similar orthography + dissimilar pronunciation to the correct target), an orthographically similar homophone condition (e.g., 哉培/zoi1 pui4/; similar orthography + identical pronunciation to the correct target), an orthographically dissimilar homophone condition (e.g., 災培/zoi1 pui4/; dissimilar orthography + identical pronunciation to the correct target), and an unrelated control condition (e.g., 紗培/saa1 pui4/; dissimilar orthography + dissimilar pronunciation to the correct target). In FFD, only orthographic errors benefited (i.e., caused less disruption to) fixation times relative to controls. In GD, they found a strong effect of orthography and a weak (i.e., significant across items but not subjects) phonological effect. In TRT, both orthographic and phonological effects were significant. When erroneous characters were both homophonic and orthographically similar, there was an additive effect leading to faster reading times than either homophonic or orthographic similarity alone. The authors concluded that these findings point to a dominant role of orthography in the early stages of processing during Chinese reading and that phonology was activated during the late stage of error recovery. This conclusion was similar to that of Daneman and Reingold (2000), namely that meaning is accessed mainly through orthography in normal reading.

To compare the roles of orthography and phonology during silent reading in Chinese and English, Feng et al. (2001) conducted two error disruption experiments, one with native speakers of English and another with native speakers of Mandarin Chinese. In English, phonological benefit was found in FFD, but only when the target was highly predictable from context and the homophone was orthographically similar to the target (e.g., creek and creak). Word frequency was also found to play a role in that phonological benefit was found for high-frequency words but not low. The authors concluded that early effects of phonology are found in English but highly constrained by factors such as orthographic similarity, frequency, and contextual predictability. In the Chinese data, they only found phonological effects in the late processing measure of TRT, and this lack of early phonological effects in Chinese held true regardless of orthographic similarity, predictability, and frequency.

While the studies discussed above mainly support direct access to meaning via orthography for skilled readers, studies on developing readers support a more central role for phonological mediation. Among English readers, weaker readers have been found to rely more on phonology for lexical access than more skilled readers (Jared et al. 1999). In their developmental study of Chinese readers, Zhou et al. (2018) conducted an experiment using the error disruption paradigm to compare child (M = 9.1 years, SD = 0.3) and adult (M = 22.8 years, SD = 2.0) readers and found that children fixated on homophonic errors for shorter durations than unrelated errors in GD, suggesting that phonological activation was helping them access word meanings. As with other studies, phonological activation only occurred among adult readers in late processing measures.

In sum, these studies suggest that in normal reading conditions, skilled readers of Chinese and English access meaning directly through orthography. However, manipulating certain conditions, such as contextual predictability and word frequency, might facilitate very early phonological activation in English. Less skilled readers of both languages also appear to rely more on phonology in earlier stages of reading development.

Pupillary Response

Eye fixation measurements have been widely used in studies on reading, but less attention has been given to pupillary response, which can also be measured by many widely used eye-tracking devices. While most changes in pupil size relate to light or accommodation response, tiny pupil diameter increases (less than 0.5 mm) have been shown to reflect increases in cognitive processing demands (Beatty and Lucero-Wagoner 2000). Pupil size changes measured during experimental tasks have been found to be associated with memory load (Peysakhovich et al. 2015), attentional processing (Ariel and Castel 2014), uncertainty (Friedman et al. 1973; Steinhauer and Zubin 1982), lexical ambiguity (Ben-Nun 1986), and contextually unexpected stimuli (Raisig et al. 2012).

Only a relatively small number of studies have used pupil measurements as a measure of cognitive load during reading tasks, and to our knowledge, none have investigated Chinese reading. Briesemeister et al. (2009) recorded pupil size measurements in addition to EEG and response time during a lexical decision task that included real words, pseudohomophones (e.g., brane for brain), and orthographic controls (e.g., brine). After being presented with each stimulus item, participants had to indicate whether it was a real word or a non-word using designated response keys and then rate how confident they were in their judgment on a 7-point scale. Analysis of pupil size found greater mean peak pupil diameter for pseudohomophones compared to real words and spelling controls. Furthermore, pupil size inversely correlated with confidence ratings, such that pupil diameters were greater when participants were less confident in their word judgments, suggesting that pupil size can reflect conflict monitoring processes. They concluded that the evidence for conflict processing in behavior and pupillometric data is partially the result of phonological activation, as evidenced by the EEG data, which showed a reduced N400 (Footnote 3) for pseudohomophones relative to spelling controls.

In one of the few studies on pupillary response during sentence reading, Hyönä and Pollatsek (2000) conducted two Finnish reading experiments in which they analyzed pupil diameter changes during first and second fixations on compound words embedded in sentences. For each compound, they varied the frequency and length of the first and second words (overall length was held constant). Contrary to expectation, they did not find that less frequent words evoke increases in pupil diameter. However, they did find consistent significant differences between first and second fixation pupil sizes, suggesting that more processing effort is exerted on the first fixation during the stage of word identification.

In single-word reading tasks, pupil dilations in response to stimulus onset tend to occur with a delay of about 300 ms, while single fixations tend to last only around 250 ms (Hoeks and Levelt 1993). Some researchers might, therefore, question whether analyzing pupillary response on word fixations in sentence reading tasks is appropriate given that the reader might move on to the next word before pupil diameter changes can develop. Hyönä and Pollatsek (2000) argued that because the processing of words begins before the word is directly fixated (i.e., while it is visible in the parafoveal range of vision; see Footnote 4), changes in pupil diameter will begin to develop earlier in sentence reading tasks than they would in single-word tasks. The early development of pupillary response due to parafoveal processing effects has been supported by a study by Snell et al. (2018), which found that pupil size is affected by the brightness of parafoveally presented stimuli at around 200 ms after onset and that an effect of orthographically related stimuli, also presented in the parafoveal range of vision, developed around 500 ms after onset.

In sum, measuring changes in pupil diameter can tell us about processing load during reading tasks. Furthermore, orthographic and phonological relatedness, as well as contextual predictability, all have been found to influence pupillary response, suggesting it could be a useful addition to more common eye-tracking measures in experiments that employ the error disruption paradigm.

The Present Study

In the present study, we use an error disruption paradigm experiment to test Daneman and Reingold’s (2000) model of reading in predictable contexts, which proposed that when a target word is highly predictable, its associated orthographic and phonological representations can be activated before the word is fixated. Chinese is different from English in that two orthographically similar characters may have entirely different phonological forms, and two phonologically similar characters may not share any similarities in orthography. Because Chinese allows us to decouple orthography and phonology in this way, it is an optimal script for testing Daneman and Reingold’s (2000) model and for investigating whether both phonological and orthographic representations can be activated via semantics.

While Feng et al. (2001) did include contextual predictability as a variable in their study and found no significant effects among the Chinese readers, there are a couple of methodological issues that have led us to reinvestigate this topic. The first is that, in addition to contextual predictability, the authors also manipulated the orthographic similarity and frequency of target words. These two variables were collapsed together when analyzing the effects of predictability, making their findings somewhat difficult to interpret. Secondly, in Feng et al.’s (2001) norming tests, participants were given the passage up to, but not including, the target and asked to make five guesses of what they thought the next word would be. Because predictability values were the probability of correctly guessing the target within five guesses, the high predictability contexts were most likely not as constrained as those used in Daneman and Reingold (2000) and Rayner et al. (1998), whose norming tests only allowed for one guess.

From a theoretical standpoint, if Daneman and Reingold (2000) were correct that contextual predictability allows for the top-down activation of orthographic and phonological representations via semantics—occurring even before the target is fixated—we should expect to see early effects of predictability on orthographic and homophonic errors in both Chinese and English scripts. However, if our results, like Feng et al.’s (2001), show no effect of contextual predictability on the reading of homophonic and orthographic errors in Chinese, it would suggest that the phonological effects found in studies on English reading might be the result of a mechanism other than that described by Daneman and Reingold’s (2000) model. For these reasons, we believe this question is worth re-examining.

Finally, while several eye-tracking studies have used an error disruption paradigm to examine how eye fixations reflect orthographic and phonological activation during Chinese reading, none has incorporated pupil diameter measurements to examine changes in cognitive demand. We hypothesize that fixation times will reflect the findings of earlier studies on Chinese, wherein orthographic errors will provide processing benefit (evidenced by shorter reading times) over unrelated controls in the earlier processing measures of FFD and GD. In contrast, both homophone errors and orthographic errors will provide benefit over controls in TRT. Based on the findings of Briesemeister et al. (2009), we expect that changes in pupil diameter may follow a similar pattern of increased pupil size for character errors relative to correct targets and that unrelated errors might induce greater pupil diameter increases than homophone and orthographic errors.



Participants were twenty-eight native Cantonese speakers (age M = 21.21, SD = 2.88) who had attended primary and secondary school in Hong Kong or Macau and were residing in Australia at the time of the study. On average, participants had been in Australia for around 2.5 years (SD = 2.2), and all reported returning to their home cities regularly. All participants had normal or corrected-to-normal vision and were paid for their participation. Ethical clearance was obtained from the University of Melbourne’s Human Ethics Sub-Committee (HESC) before carrying out the study.


We selected twenty sets of traditional Chinese characters, each set containing a correct character (e.g., 木/muk6/[wood] in 木頭/muk6 tau4/[wood]), an orthographically similar character (e.g., 本/bun2/[origin]), a homophonic character (e.g., 目/muk6/[eye]) and an unrelated control character (e.g., 充/cung1/[to fill]). Orthographic, homophonic, and unrelated character substitutions were not semantically related to their correct mates. To confirm this, we calculated semantic relatedness scores on a 0–1 scale (low–high relatedness) using HowNet (Dong et al. 2006), a Chinese semantic database for natural language processing which can produce similarity ratings that approximate those given by human judges (Dai et al. 2008). Semantic relatedness was low overall (M = .3, SD = .29), and an analysis of variance showed no interaction between semantic relatedness and condition [F(2, 57) = 2.1, p > .1].

None of the characters were phonograms (i.e., characters that contain a phonetic radical on the right side, such as 估/gu2/which contains the phonetic radical 古/gu2/). Wherever possible, we matched orthographic, homophonic, and unrelated characters with the correct character for strokes (M = 6.69, SD = 2.42) and frequency (M = 166, SD = 118 per corpus unit in Ho and Kwan 2001). There were no significant differences across conditions for either [strokes: F(3, 76) = 1.16, p > .1; frequency: F(3, 76) = 1.80, p > .1]. Due to the limited number of traditional Chinese character pairs with high orthographic similarity, we included four less-common characters in the orthographic condition, each with a frequency of seven or fewer per 660,000 characters, such as 戍/syu3/as the orthographic mate to 成/sing4/. To confirm that none of our findings could be attributed to the presence of these less-common characters, we ran a second statistical analysis on a subset of data that excluded trials associated with these characters and found they did not affect patterns of significance.

We selected characters for the orthographic condition based on visual similarity to correct targets, which differed in orthographic structure by no more than two strokes. To confirm that readers would perceive these characters as orthographically similar to their correct mates, we conducted a norming study with twenty native Cantonese speaking participants who did not take part in the experiment. These participants were asked to judge how similar the orthographic, homophonic, and unrelated characters were to the corresponding correct targets in terms of appearance on a scale of one (highly dissimilar) to five (highly similar). As expected, there was a significant effect of condition on orthographic similarity [F(1, 19) = 110.77, p < .001]. The orthographically similar characters (M = 3.6, SD = 1.3) were rated as significantly more similar to targets than homophonic (M = 1.3, SD = 0.8; p < .001) and unrelated characters (M = 1.3, SD = 0.08; p < .001) while homophonic and unrelated characters did not differ significantly in their overall similarity rating (p > .1). A summary of character properties is presented in Table 1.

Table 1 Character properties

As eye fixation measures were based on the two-character word level (following Feng et al. 2001; Wong and Chen 1999; Zhou et al. 2018), each correct target character was embedded as the first character in three to four different two-character word-frames (e.g., 自 [self] was embedded into 自己 [oneself], 自由 [freedom], 自然 [natural], and 自殺 [suicide]). Due to the limited number of words available for some characters, six word-frames were used twice, though never in the same sentence frame or condition within a single stimulus list. All words were common, with most taken from the Hong Kong Primary School Vocabulary Learning List (Chan 2007) to ensure that all participants would easily recognize them.

Eighty sentence frames, each containing varying amounts of contextual information before the target, were collected from online sources. Each sentence was 20–25 characters in length with target words appearing at least two words in distance from the beginning of the sentence or any punctuation marks. Alternative versions of each sentence were created for the three error conditions by replacing the first character of the target word with an orthographic, homophonic, or unrelated substitution. The resulting character pairs never formed a legal word. From these sentences, we created four stimulus lists, each containing the same 80 sentence frames. Within each list, there were 20 correct sentences, 20 with a homophonic error, 20 with an orthographic error, and 20 with an unrelated character error. We used a within-items design such that the second character of the target word, as well as sentence frames, were identical across conditions (see Table 2 for examples), and the pairing of sentence frames and targets was counterbalanced across lists so that no participant was exposed to the same sentence frame or target character more than once. A native Cantonese-speaking research assistant from Hong Kong assisted in stimuli design and editing.
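The rotation of sentence frames through conditions across the four stimulus lists can be sketched as a Latin-square assignment. The sketch below is an illustration only: the function name and the integer frame indices are hypothetical, and it covers the frame-by-condition rotation but not the additional constraint on target characters described above.

```python
from collections import Counter
from typing import Dict, List

CONDITIONS = ["correct", "homophonic", "orthographic", "unrelated"]


def build_stimulus_lists(n_frames: int = 80,
                         conditions: List[str] = CONDITIONS) -> List[Dict[int, str]]:
    """Assign each sentence frame to one condition per list (Latin-square rotation).

    Frame indices (0..n_frames-1) are placeholders for the actual sentences.
    """
    n_conditions = len(conditions)
    lists = []
    for list_index in range(n_conditions):
        # Shifting the condition by the list index means a frame appears in a
        # different condition in every list, and exactly once within each list.
        assignment = {
            frame: conditions[(frame + list_index) % n_conditions]
            for frame in range(n_frames)
        }
        lists.append(assignment)
    return lists


stimulus_lists = build_stimulus_lists()
for number, assignment in enumerate(stimulus_lists, start=1):
    print(number, dict(Counter(assignment.values())))
```

With 80 frames and four conditions, each list contains 20 frames per condition, and each frame occurs in every condition exactly once across the four lists.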

Table 2 Set of example sentences

Because this study examines the effects of contextual predictability, we conducted a second norming study with 20 participants who did not take part in the experiment. They were presented with each sentence frame up to but not including the target word and given one open-ended guess as to which word was most likely to come next. Scores ranged from 0 (no correct guesses) to .95 (the target character was correctly guessed by 19 out of 20 participants, or 95 percent of the time). Twenty-eight sentences were neutral, with 0 correct guesses (targets included 4 adjectives; 7 adverbs; 11 nouns; 6 verbs); twenty-six had predictability values between .05 and .25 (7 adjectives; 2 adverbs; 4 nouns; 2 prepositions; 11 verbs); and twenty-six had predictability values between .3 and .95 (3 adjectives; 14 nouns; 9 verbs). The overall mean predictability of targets for all sentences was .22 (SD = 0.27). Predictability values did not correlate with sentence length, target frequency, or target stroke count (ps > .1). Table 3 shows four example sentences, along with their corresponding predictability values.
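The predictability score described above is a simple proportion of correct guesses. A minimal sketch follows; the function name `cloze_predictability` and the example responses are hypothetical and are not taken from the norming data.

```python
def cloze_predictability(guesses, target):
    """Proportion of norming participants whose single guess matched the target."""
    if not guesses:
        raise ValueError("at least one guess is required")
    return sum(1 for guess in guesses if guess == target) / len(guesses)


# Hypothetical responses from 20 norming participants for one sentence frame:
# 19 guessed the target and 1 guessed a different word.
guesses = ["自己"] * 19 + ["自由"]
score = cloze_predictability(guesses, "自己")
```

In this invented example, 19 matching guesses out of 20 give a score of .95, the highest value reported for the stimulus set.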

Table 3 Set of example sentences of different predictability values


Eye movements were recorded using an SMI RED-n Scientific eye-tracking device with a sampling rate of 60 Hz. The manufacturer reports the device’s spatial resolution as 0.05° and gaze position accuracy as 0.4°. This model was designed to be used without head restriction to allow for greater ecological validity and uses dynamic algorithms to compensate for head movements when calculating fixation times and pupil sizes. Lower-frequency eye-tracking systems, such as the 60 Hz RED-n Scientific used in the current study, tend to have a larger temporal error than the high-frequency eye-trackers more commonly used in reading research. For instance, a 60 Hz eye-tracker would have an average temporal error of around 8 ms, compared to only around 0.5 ms for a 1000 Hz eye-tracker (Raney et al. 2014). While this makes 60 Hz eye-trackers inadequate for research on saccadic eye-movements, they are sufficient for research on fixations (Dalmaijer 2014; Karn 2000; Leube et al. 2017; Raney et al. 2014), which are the primary measure used in the current study. Furthermore, lower-frequency remote eye-trackers perform comparably to higher-frequency ones in their pupil size measurement capabilities (Titz et al. 2018) and have been used successfully even in pupillometric studies that allow some degree of head movement (Coyne and Sibley 2016; Klingner et al. 2008). In contrast to EyeLink eye-tracking systems, which output pupil size data in either arbitrary units or pixels, the RED-n Scientific system can output pupil size data in millimeter units, which facilitates pupillometric data analysis. The experiment was run on an HP EliteBook 850 laptop PC (processor, Intel Core i7 running at 2.6 GHz; operating system, Windows 7) with a 15.6-inch LED monitor (frame rate, 60 Hz; resolution, 1366 by 768 pixels). Participants were positioned 65 cm away from the monitor.
Stimuli were single-line sentences in traditional Chinese displayed in black on white using the font KaiTi, size 36, with each character equaling 1.1° of visual angle.
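Two of the quantities in this section can be approximated with short calculations. The sketch below assumes that an event falls, on average, halfway between two consecutive samples, so that the mean temporal error is half the sampling interval; this assumption reproduces the figures of around 8 ms and 0.5 ms quoted above. It also uses the standard visual-angle formula to estimate the on-screen size of a character. Both function names are hypothetical.

```python
import math


def mean_temporal_error_ms(sampling_rate_hz):
    """Half the sampling interval, in milliseconds (an assumed error model)."""
    return 1000.0 / sampling_rate_hz / 2.0


def stimulus_size_cm(visual_angle_deg, viewing_distance_cm):
    """Physical size subtending a given visual angle at a given distance."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle)


error_60hz = mean_temporal_error_ms(60)      # roughly 8.3 ms
error_1000hz = mean_temporal_error_ms(1000)  # 0.5 ms
char_size = stimulus_size_cm(1.1, 65)        # roughly 1.25 cm per character
```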


Each participant performed a 9-point calibration of the eye tracker, followed by a validation of calibration accuracy before beginning the experiment. While the remote eye-tracking system used in our study was designed to compensate to some extent for head movement, participants were asked to avoid overt head or body movements as much as possible when reading the stimulus sentences in order to improve tracking accuracy. Before each sentence, a fixation point appeared on the center-left of the screen, marking the approximate location of the beginning of the sentence. Participants were instructed to fixate on the point and press a keyboard key to present the sentence. They were asked to silently read each sentence at a natural pace and try to understand its meaning regardless of any errors they encountered. Sentence order was randomized, and participants were allowed to rest after 40 sentences. The calibration and validation processes were repeated after the rest period. Participants were asked to rate their comprehension of the previously read sentence on a scale of one to five (1 = unclear in meaning, 5 = clear in meaning) at eight random points during the experiment. These prompts were intended to encourage participants to read for comprehension, and responses were not recorded for analysis. A short practice block of five sentences and a comprehension rating item was included at the beginning to ensure that all participants understood the procedures of the experiment. The entire experiment lasted approximately 30 min.

Data Analysis

One participant was discarded from analysis due to technical problems with eye-tracker calibration. For fixation time and pupil size data, we discarded trials with missing observations due to blinks, movements, or equipment failure (12% of trials). For all fixation time measures, trials with FFDs over 1000 ms (8% of data) were discarded. For early processing measures of FFD and GD, we discarded trials with GDs over 2000 ms (2% of data), and for the late processing measure of TRT, we discarded trials with TRTs over 3000 ms (2% of data).
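The exclusion rules above can be summarized in a short sketch. It is an illustration only: the trial dictionaries, their keys (`ffd`, `gd`, `trt`), and the example values are hypothetical, and the associated R scripts may implement the trimming differently.

```python
FFD_MAX_MS = 1000  # applied to all fixation time measures
GD_MAX_MS = 2000   # applied to the early measures (FFD and GD)
TRT_MAX_MS = 3000  # applied to the late measure (TRT)


def trim_trials(trials):
    """Return (early, late): trials kept for FFD/GD analyses and for TRT analysis."""
    # Drop trials with missing observations (blinks, movements, equipment failure).
    complete = [t for t in trials if None not in (t["ffd"], t["gd"], t["trt"])]
    # Drop trials whose first fixation duration exceeds the cutoff.
    plausible = [t for t in complete if t["ffd"] <= FFD_MAX_MS]
    early = [t for t in plausible if t["gd"] <= GD_MAX_MS]
    late = [t for t in plausible if t["trt"] <= TRT_MAX_MS]
    return early, late


# Invented example trials, one per exclusion rule plus one clean trial.
trials = [
    {"id": 0, "ffd": 250, "gd": 400, "trt": 900},
    {"id": 1, "ffd": 1200, "gd": 1500, "trt": 1800},   # FFD over 1000 ms
    {"id": 2, "ffd": 300, "gd": 2400, "trt": 2800},    # GD over 2000 ms
    {"id": 3, "ffd": 280, "gd": 600, "trt": 3500},     # TRT over 3000 ms
    {"id": 4, "ffd": None, "gd": None, "trt": None},   # missing observation
]
early, late = trim_trials(trials)
```

Keeping separate early and late subsets mirrors the text: a trial with an extreme GD is removed from the FFD and GD analyses but can still contribute to TRT, and vice versa.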

Inferential statistics for fixation times were based on planned comparisons between each related condition and the unrelated condition, which served as the reference condition. Fixation time analyses looked at three dependent variables: FFD, GD, and TRT. As pupil size has not previously been examined in an error disruption study, we could not assume that the unrelated condition would elicit the largest change in pupil diameter. Therefore, the correct condition was used as the baseline rather than the unrelated condition.
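For readers less familiar with the three fixation time measures, the sketch below shows how they could be derived from an ordered fixation sequence. It assumes the standard definitions: FFD is the duration of the first first-pass fixation on the target, GD is the sum of first-pass fixations before the eyes leave the target, and TRT is the sum of all fixations on the target, including regressions. The data layout and function name are hypothetical.

```python
def fixation_measures(fixations, target):
    """Compute FFD, GD, and TRT for one trial.

    fixations: list of (region_index, duration_ms) tuples in temporal order,
    where region indices increase in reading direction.
    """
    trt = sum(duration for region, duration in fixations if region == target)
    first = next(
        (i for i, (region, _) in enumerate(fixations) if region == target), None
    )
    # No first-pass measures if the target was never fixated or was skipped
    # (i.e., a later region was fixated before the target).
    if first is None or any(region > target for region, _ in fixations[:first]):
        return {"ffd": None, "gd": None, "trt": trt or None}
    ffd = fixations[first][1]
    gd = 0
    for region, duration in fixations[first:]:
        if region != target:
            break  # the eyes left the target: first pass is over
        gd += duration
    return {"ffd": ffd, "gd": gd, "trt": trt}


# Invented trial: two first-pass fixations on region 2, then a regression back.
example = fixation_measures(
    [(1, 200), (2, 250), (2, 180), (3, 220), (2, 300)], target=2
)
```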

All data were analyzed in R (R Development Core Team 2014) with the lme4 package (Bates et al. 2012). Absolute t-values at or above 1.96 indicate statistical significance at the 0.05 alpha level. Because fixation time data were positively skewed, generalized linear mixed-effects models (GLMMs) with a Gamma distribution and identity link between the predictors and dependent variables were used, which allowed us to avoid log-transforming the fixation times (Lo and Andrews 2015). For pupil diameter analysis, standard linear mixed models (LMMs) were used.

The best fits for LMM and GLMM structures were determined by model comparisons using the Akaike information criterion (AIC; Akaike 1998). Due to overparameterization, models with maximal random structures that included random slopes would not converge. Therefore, random structure was reduced to by-participant and by-item random intercepts (Baayen et al. 2008) with condition and predictability value as fixed effects. As dichotomizing quantitative variables can introduce a number of problems to analysis, such as spurious statistical significance and loss of effect size (MacCallum et al. 2002), predictability values were standardized to facilitate model convergence and incorporated into models as a continuous predictor. The formula used for the FFD, GD, and TRT GLMMs was: (Fixation Time Measure) ~ Condition * Predictability Value + (1 | Participant) + (1 | Item).
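Standardizing a continuous predictor, as described above, typically means centering it on its mean and dividing by its standard deviation. The sketch below assumes this z-score form with the sample standard deviation; the exact scaling used in the associated R scripts is not stated in the text, and the input values here are invented.

```python
import statistics


def standardize(values):
    """Center values on their mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [(value - mean) / sd for value in values]


# Hypothetical raw predictability values for four sentence frames.
z_scores = standardize([0.0, 0.05, 0.25, 0.95])
```

After this transformation the predictor has a mean of 0 and a standard deviation of 1, so each model coefficient for predictability describes the change in fixation time per standard deviation of predictability.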

While pupillary diameter changes begin prior to the initial fixation on a target when it is visible in the parafoveal range (Snell et al. 2018), the slow emergence and decay of pupil size changes makes it likely that these effects continue to develop for several fixations. Thus, we examined the development of pupil size over the first fixation on the target as well as four consecutive fixations after. While this includes fixations that occur after leaving the target region, significant differences in pupil size between conditions can be attributed to the manipulated target words as sentence frames were otherwise identical (see Table 2). Fixation number (i.e., first fixation–fifth fixation) was included in models to account for the changes in pupil size over time. As predictability was not found to interact with pupil size and did not improve model fit, this variable was excluded from the final model, which was specified as follows: (Pupil Diameter) ~ Condition * Fixation Number + (1 | Participant) + (1 | Item).
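The five-fixation window used for the pupil analysis can be sketched as follows. This is an illustrative reconstruction: the data layout, the function name `pupil_window`, and the example diameters are hypothetical.

```python
def pupil_window(fixations, target, n_following=4):
    """Return (fixation_number, pupil_mm) pairs for the analysis window.

    fixations: list of (region_index, pupil_diameter_mm) in temporal order.
    The window starts at the first fixation on the target and includes the
    next n_following fixations, wherever they land.
    """
    for i, (region, _) in enumerate(fixations):
        if region == target:
            window = fixations[i:i + 1 + n_following]
            return [(k + 1, pupil) for k, (_, pupil) in enumerate(window)]
    return []  # target never fixated


# Invented trial in which region 2 is the target.
window = pupil_window(
    [(1, 3.10), (2, 3.14), (2, 3.16), (3, 3.18), (4, 3.17), (5, 3.12), (6, 3.11)],
    target=2,
)
```

The fixation number returned here (1 to 5) corresponds to the Fixation Number term in the model formula above.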

In the error disruption paradigm, upon noticing patterns in the kinds of errors present in the stimuli, participants might change their reading strategies and become more adept at resolving erroneous characters as the experiment progresses. To check whether participants in our study displayed any such adaptive reading patterns, we built additional models that included trial number as a fixed effect. Predictability was excluded from these models to facilitate model convergence. For conciseness, only significant values are reported for these additional models.


A full summary of normed means from the raw data is presented for fixations and pupil diameters in Table 4. Fixations were measured in milliseconds (ms), while pupil diameters were measured in millimeters (mm). All GLMMs and LMMs can be found in the appendix.

Table 4 Eye-tracking measures

FFDs were significantly shorter on correct targets than on unrelated errors (b = − 100.36, SE = 8.63, t = − 11.64). Orthographic errors also had significantly shorter first fixations than unrelated errors (b = − 41.31, SE = 10.23, t = − 4.04). First fixations on homophonic errors were not significantly different from those on unrelated errors (b = 3.28, SE = 8.25, t = 0.40). Interactions between predictability and the homophonic condition approached significance (b = − 18.88, SE = 9.77, t = − 1.93), suggesting that fixations on homophonic errors became shorter as predictability values increased. Other interactions were far from reaching significance [all abs(t-values) ≤ 1.34].

As in FFD, GDs were shorter on correct targets than on unrelated errors (b = − 206.44, SE = 11.82, t = − 17.46). Orthographic errors had significantly shorter GDs than unrelated errors (b = − 103.05, SE = 11.72, t = − 8.79), whereas GDs on homophonic errors did not differ significantly from those on unrelated errors (b = 2.57, SE = 12.39, t = 0.21). There was a significant effect of predictability on GDs in the homophonic condition (b = − 33.55, SE = 11.22, t = − 2.99), indicating that fixation times on homophonic errors relative to controls tended to be shorter when predictability values were higher. There was no effect of predictability in the correct (b = − 4.74, SE = 12.07, t = − 0.39) or orthographic conditions (b = 5.71, SE = 10.67, t = 0.53). Predictability effects in GD are visualized in Fig. 1a.

Fig. 1

Predictability effects in gaze duration and total reading time. Note Readers’ gaze durations/GDs (a) and total reading times/TRTs (b) as a function of predictability for unrelated (solid), correct (short-dashed), orthographic (long-dashed), and homophonic (long-dashed, spaced) conditions. The y-axes show fixation times in milliseconds, and the x-axes show predictability values, which have been centered and scaled, from − 1 (low-predictable) to 2 (high-predictable). Error bands show 95% confidence intervals

TRTs were shorter on correct targets than on unrelated errors (b = − 494.82, SE = 10.69, t = − 46.27). TRTs were also significantly shorter on both orthographic errors and homophone errors than on unrelated errors (for orthographic errors, b = − 246.91, SE = 10.23, t = − 24.14; for homophone errors, b = − 185.58, SE = 10.70, t = − 17.34). There was a significant effect of predictability in the homophone condition (b = − 35.60, SE = 13.35, t = − 2.67), indicating shorter TRTs on homophonic errors when predictability was higher, but not in the correct (b = 14.22, SE = 10.02, t = 1.42) or orthographic conditions (b = 10.50, SE = 12.46, t = 0.84). Predictability effects in TRT are visualized in Fig. 1b.

Average pupil diameters were significantly larger, by approximately 0.03 mm, during fixations on orthographic errors than during fixations on correct targets (b = 0.03, SE = 0.01, t = 2.57). The interaction between the experimental condition and fixation number was also significant (b = 0.005, SE = 0.003, t = 1.99). Pupil diameters in the homophonic and unrelated conditions were not significantly different from those in the correct condition (homophonic: b = 0.01, SE = 0.01, t = 0.86; unrelated: b = 0.01, SE = 0.01, t = 0.62). Overall, the data suggest that the pupil dilates and then contracts more strongly in the orthographic condition than in the other experimental conditions (see data visualization in Fig. 2).

Fig. 2

Pupil diameters. Note Average pupil diameters during the first target fixation and four consecutive fixations after. Error bars show 95% confidence intervals

In our follow-up analysis looking at the effects of trial, trial number was found to interact significantly with the reference condition, showing that baseline fixation times became progressively shorter as trials progressed (FFD: b = − 2.16, SE = 0.48, t = − 4.51; GD: b = − 2.75, SE = 0.72, t = − 3.80). Importantly, significant interactions were found between trial number and the orthographic condition in FFD (b = 3.53, SE = 0.67, t = 5.27) and GD (b = 4.31, SE = 0.98, t = 4.38), indicating that fixation times on orthographic errors became longer by about 4 ms per trial relative to the unrelated baseline condition. For pupil diameter, interactions between trial number and the orthographic condition were significant (b = 0.001, SE = 0.0004, t = 3.08), indicating that pupil size increases associated with the orthographic condition were greater in later trials.


The primary goal of our study was to test Daneman and Reingold’s (2000) model of meaning activation in predictable texts for Chinese. This model posits that in high-constraint contexts, the phonological and orthographic representations of a predictable word can be activated in early processing via semantics. To test this model, we ran an eye-tracking experiment using an error disruption paradigm. In the experiment, participants silently read sentences that had varying levels of contextual constraint and contained a target word belonging to one of four conditions—correct, orthographic, homophonic, or unrelated. Shorter fixations on orthographic errors (relative to unrelated controls) were taken to indicate lexical access via orthography, and shorter fixations on homophonic errors were taken to indicate lexical access via phonology.

In the early processing measures of FFD and GD, we found significant effects in the orthographic condition, but none in the homophonic condition, suggesting that early lexical access was primarily facilitated by orthography. In the late processing measure of TRT, fixation times were shorter in both orthographic and homophonic conditions, suggesting that both orthographic and phonological information were facilitating error recovery and integration in late processing. These findings are in line with those of Feng et al. (2001), Wong and Chen (1999), and Zhou et al. (2018) who found that early stages of lexical access were primarily facilitated by the orthographic route in Chinese while the role of phonology was limited to late processing.

Importantly, we found significant interactions between the homophonic condition and predictability values, an effect that emerged in GD and continued into TRT. This finding indicates that readers could more quickly resolve homophonic errors relative to unrelated controls in contexts with high predictability targets. Given that there was no evidence in our data of a robust overall phonological effect in GD when controlling for varying levels of constraint across sentence frames, these results suggest that phonological effects only begin to emerge at higher levels of constraint. This pattern is apparent in Fig. 1a, where we can see that fixation times on homophonic errors only begin to diverge from those on unrelated errors at the higher end of the predictability range. The linear relationship between higher predictability values and shorter fixation times most likely reflects the fact that higher constraint contexts increase the probability that readers will quickly activate the target phonological form, which then, in turn, facilitates faster lexical access and semantic integration. The pattern of results in our data is in line with the findings of Daneman and Reingold’s (2000) study on English reading and thus lends support to their proposal that phonology can be activated early in processing via semantics rather than via bottom-up decoding processes.

While some studies on English reading have found phonological effects emerging in high-constraint sentences in the early processing measure of FFD (e.g., Feng et al. 2001; Rayner et al. 1998), in our study, interactions between predictability and the homophonic condition reached significance slightly later in GD. This might indicate that contextually facilitated early phonological activation in Chinese mainly benefits lexical integration processes rather than initial lexical retrieval. It is also possible that such phonological effects can emerge slightly earlier in English because alphabetic scripts allow phonological activation to begin as soon as individual phonemes are identified. In contrast, Chinese script necessitates that the full character is identified before its associated phonological form can be activated at the syllable level (Perfetti et al. 2005).

Diverging from the predictions of Daneman and Reingold’s (2000) model, it is not clear from our findings that orthographic representations are pre-activated by semantics to the same degree as phonological representations as we did not find a significant correlation between predictability values and fixation times on orthographic errors. The finding that phonological activation is closely tied to semantic context and appears to play a role in facilitating lexical access also supports one of the claims of Perfetti et al.’s (1992) universal phonological principle—that phonology plays an important role across scripts and is involved in facilitating comprehension and lexical integration processes during reading. Based on our results, we suggest that the following process is unfolding when readers resolve homophonic character errors in high-constraint contexts: (1) the target phonological representation is activated via semantics; (2) the reader encounters the homophonic character error, and its component orthographic and phonological representations are immediately activated; (3) the reader matches the phonological form activated bottom-up via the text with the target representation activated top-down by context, thus quickly resolving the error.

It is worth noting that, contrary to what some might expect, we did not find an effect of predictability on correct targets. Two studies on readers of English, Rayner and Well (1996) and Sereno et al. (2017), found that fixation times on low-predictable targets were longer than on medium- or high-predictable targets, but they also found that medium- and high-predictable targets were not significantly different from each other. A study on Chinese readers (Yan et al. 2006) found the same pattern of results: significant differences between low-predictable targets and medium- and high-predictable targets, but no significant differences between medium- and high-predictable targets. This suggests that while correct targets are read faster in more predictable contexts, this facilitatory effect seems to reach a maximum when predictability values are somewhere between .11 and .67 (M = .36), the range for medium-predictability Chinese sentence items in Yan et al. (2006), and fixation times do not appear to decrease further as predictability increases beyond this point. Because 43 out of 80 sentences in our study had predictability values ranging between .1 and .95 (M = .22), this maximum facilitatory effect may have already been reached for most items, which might explain why we did not see continually decreasing fixation times on correct targets coinciding with higher predictability values. To test this, we dichotomized the predictability variable, putting sentences with predictability values of 0 into the low-predictable category and those with values above 0 into the high-predictable category. With this change, while we still saw an interaction between predictability and the homophonic condition in GD (b = 42.60, SE = 10.28, t = 4.14), a different pattern of results emerged for the correct and orthographic conditions, showing significant differences between high- and low-predictable conditions for both (correct: b = 37.58, SE = 9.29, t = 4.05; orthographic: b = 30.26, SE = 11.04, t = 2.74). This supports our hypothesis that reading speeds on correct words reach a maximum when predictability values are around .1 and increase little beyond that point. As the practice of dichotomizing variables has been shown in many cases to yield misleading results (MacCallum et al. 2002), and as outcomes in the current study differed depending on whether we treated predictability as a continuous or a dichotomous predictor, researchers looking at predictability effects should consider carefully which approach to follow.

In addition to the standard fixation time measurements, we examined pupil diameter changes as a measure of cognitive effort and found that orthographic errors elicited more substantial pupillary response effects than the other experimental conditions. On the first fixation, there was a significant effect of the orthographic condition on pupil diameter, indicating that readers’ pupil sizes increased when they fixated on orthographic errors. Pupil diameters continued to increase from the second fixation to the fourth and then sharply decreased on the fifth. To check whether pupil diameter increases were a by-product of longer fixation times, we tested for a correlation between pupil diameters and fixation times in the orthographic condition and found no statistically significant correlation [r(1620) = .02, p > .1], suggesting that the significant pupillary response effects we found were not merely due to longer fixations on orthographic errors. The dominant pattern observed across pupil diameter measurements suggested that readers were recruiting increased cognitive resources when resolving errors in which the characters were contextually incorrect but retained strong orthographic likeness to correct targets. This could be evidence that subtle orthographic errors in Chinese induce uncertainty in readers, leading to pupillary response effects similar to those that have been associated with exposure to lexical ambiguities (Ben-Nun 1986) or pseudohomophones (Briesemeister et al. 2009) in experimental tasks. Future research on alphabetic languages could investigate whether we see pupil size increases associated with orthographic or homophonic errors in sentence contexts. It is possible, however, that the need for increased cognitive resources when reading words with orthographic errors is unique to Chinese, for which reading recruits complex visual processes specially tuned to configural information, similar to those involved in facial recognition (Perfetti et al. 2013).
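The correlation check reported above is a Pearson correlation, for which the degrees of freedom are the number of paired observations minus two (so r(1620) corresponds to 1622 pairs). A minimal sketch of the computation, with a hypothetical function name and invented data, might look like this:

```python
import math


def pearson_r(xs, ys):
    """Return (r, degrees_of_freedom) for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy), n - 2


# Invented fixation times (ms) and pupil diameters (mm) for four fixations.
r, df = pearson_r([210, 250, 300, 340], [3.12, 3.15, 3.11, 3.16])
```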

One limitation of the error disruption paradigm used in this study is that, as participants encounter different error types multiple times, they are likely to adapt and adjust their reading style. As part of our analysis, we looked at interactions between our dependent variables and trial number. Importantly, we found that early fixation times (in FFD and GD) and pupil diameters on orthographic errors increased progressively with trial number, suggesting that readers spent more time and exerted more cognitive effort in deciphering orthographic errors in later trials. Conversely, readers spent progressively less time fixating on unrelated errors in later trials. These results suggest that, as participants became more adept at identifying which manipulated characters contained useful cues for error recovery, they began to allocate more time and cognitive effort toward deciphering these errors while more readily dismissing those identified as indecipherable (i.e., not containing useful orthographic cues).

In conclusion, our results point to a primary role of orthography in early processing. However, phonological effects were also found to play a role in early processing when predictable sentence contexts facilitated top-down activation of phonology. These findings look much like those of Daneman and Reingold (2000), whose studies were on English readers. This suggests that reading processes in both English and Chinese are highly similar under normal reading conditions, and that early activation of phonology via semantics is a phenomenon present in the reading of both scripts. Our study has also shown that increases in pupil size reflect the high degree of cognitive demand recruited by orthographic recovery processes during Chinese reading. Finally, we also found that readers adapt their reading strategies during the error disruption paradigm, allocating more time and cognitive effort toward deciphering orthographic information as the experiment progresses. Follow-up studies investigating the relationship between pupillary response and orthographic processing using different methodologies, such as single-word reading experiments, as well as studies looking more deeply into adaptive reading strategies, would be interesting directions for future research.

Supplementary Material

Data and R scripts associated with this article can be found at


  1. Chinese pronunciations are shown in the Cantonese romanization system, Jyutping.

  2. Traditional characters are those most widely used in Taiwan, Hong Kong, and Macau, in contrast to the simplified characters used in mainland China.

  3. The N400 is a negative-going potential that peaks around 400 ms after the onset of the stimulus. Priming studies have shown its amplitude to be affected by various properties of the prime, such as orthographic, phonological, and semantic relatedness.

  4. During reading, the parafoveal range encompasses 1–2 words beyond the word being directly fixated (Vasilev & Angele, 2017).


  1. Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In E. Parzen, K. Tanabe, & G. Kitagawa (Eds.), Selected papers of Hirotugu Akaike. Springer series in statistics (pp. 199–213). New York, NY: Springer.

  2. Ariel, R., & Castel, A. D. (2014). Eyes wide open: Enhanced pupil dilation when selectively studying important information. Experimental Brain Research,232(1), 337–344.

  3. Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language,59(4), 390–412.

  4. Bates, D., Maechler, M., & Bolker, B. (2012). lme4: Linear mixed-effects models using S4 classes (2011). R package version 0.999375-42.

  5. Beatty, J., & Lucero-Wagoner, B. (2000). The pupillary system. Handbook of Psychophysiology,2, 142–162.

  6. Ben-Nun, Y. (1986). The use of pupillometry in the study of on-line verbal processing: Evidence for depths of processing. Brain and Language,28(1), 1–11.

  7. Briesemeister, B. B., Hofmann, M. J., Tamm, S., Kuchinke, L., Braun, M., & Jacobs, A. M. (2009). The pseudohomophone effect: Evidence for an orthography–phonology-conflict. Neuroscience Letters,455(2), 124–128.

  8. Carpenter, P. A., & Daneman, M. (1981). Lexical retrieval and error recovery in reading: A model based on eye fixations. Journal of Verbal Learning and Verbal Behavior,20(2), 137–160.

  9. Chan, S. D. (2007). Hong Kong primary school vocabulary learning list. Wan Chai: Hong Kong Education Bureau.

  10. Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review,108(1), 204–256.

  11. Conklin, K., Pellicer-Sánchez, A., & Carrol, G. (2018). Eye-tracking: A guide for applied linguistics research. Cambridge: Cambridge University Press.

  12. Coyne, J., & Sibley, C. (2016). Investigating the use of two low cost eye tracking systems for detecting pupillary response to changes in mental workload. Proceedings of the Human Factors and Ergonomics Society Annual Meeting,60(1), 37–41.

  13. Dai, L., Liu, B., Xia, Y., & Wu, S. (2008). Measuring semantic similarity between words using HowNet. International Conference on Computer Science and Information Technology,2008, 601–605.

  14. Dalmaijer, E. (2014). Is the low-cost EyeTribe eye tracker any good for research? PeerJ PrePrints.

  15. Daneman, M., & Reingold, E. (1993). What eye fixations tell us about phonological recoding during reading. Canadian Journal of Experimental Psychology,2, 153.

  16. Daneman, M., & Reingold, E. M. (2000). Do readers use phonological codes to activate word meanings? Evidence from eye movements. In A. Kennedy, D. Heller, J. Pynte, & R. Radach (Eds.), Reading as a perceptual process (pp. 447–474). Amsterdam: North-Holland.

  17. Daneman, M., Reingold, E. M., & Davidson, M. (1995). Time course of phonological activation during reading: Evidence from eye fixations. Journal of Experimental Psychology. Learning, Memory, and Cognition,21(4), 884.

  18. Denisowski, P. (2005). CEDICT: Chinese-English dictionary. Retrieved February 13, 2019, from

  19. Dong, Z., Dong, Q., & Hao, C. (2006). HowNet and the computation of meaning.

  20. Feng, G., Miller, K., Shu, H., & Zhang, H. (2001). Rowed to recovery: The use of phonological and orthographic information in reading Chinese and English. Journal of Experimental Psychology. Learning, Memory, and Cognition,27(4), 1079.

  21. Friedman, D., Hakerem, G., Sutton, S., & Fleiss, J. L. (1973). Effect of stimulus uncertainty on the pupillary dilation response and the vertex evoked potential. Electroencephalography and Clinical Neurophysiology,34(5), 475–484.

  22. Frost, R. (1998). Toward a strong phonological theory of visual word recognition: True issues and false trails. Psychological Bulletin,123(1), 71.

  23. Gao, J., Fan, K., & Fei, J. (1993). Xiandai hanzi xue [The study of modern Chinese characters]. Beijing: Higher Education Press.

  24. Ho, H. H., & Kwan, T. W. (2001). Hong Kong, Mainland China & Taiwan: Chinese character frequency-A trans-regional, diachronic survey. Retrieved February 11, 2017, from

  25. Hoeks, B., & Levelt, W. J. M. (1993). Pupillary dilation as a measure of attention: A quantitative system analysis. Behavior Research Methods, Instruments, & Computers,25(1), 16–26.

  26. Hyönä, J., & Pollatsek, A. (2000). Processing of Finnish compound words in reading. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a perceptual process (pp. 65–87). Amsterdam: North-Holland.

  27. Inhoff, A. W. (1984). Two stages of word processing during eye fixations in the reading of prose. Journal of Verbal Learning and Verbal Behavior,23(5), 612–624.

  28. Jared, D., Levy, B. A., & Rayner, K. (1999). The role of phonology in the activation of word meanings during reading: Evidence from proofreading and eye movements. Journal of Experimental Psychology: General,128(3), 219–264.

  29. Karn, K. S. (2000). “Saccade pickers” vs. “fixation pickers”: The effect of eye tracking instrumentation on research. Proceedings of the 2000 symposium on eye tracking research & applications.

  30. Klingner, J., Kumar, R., & Hanrahan, P. (2008). Measuring the task-evoked pupillary response with a remote eye tracker. Proceedings of the 2008 symposium on eye tracking research & applications - ETRA’08, 69.

  31. Leinenger, M. (2014). Phonological coding during reading. Psychological Bulletin, 140(6), 1534–1555.

  32. Leube, A., Rifai, K., & Wahl, S. (2017). Sampling rate influences saccade detection in mobile eye tracking of a reading task. Journal of Eye Movement Research, 10, 3.

  33. Linguistic Society of Hong Kong. (1997). Hong Kong jyutping character table. Hong Kong: Linguistic Society of Hong Kong Press.

  34. Lo, S., & Andrews, S. (2015). To transform or not to transform: Using generalized linear mixed models to analyse reaction time data. Frontiers in Psychology.

  35. MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7(1), 19.

  36. Perfetti, C., Cao, F., & Booth, J. (2013). Specialization and universals in the development of reading skill: How Chinese research informs a universal science of reading. Scientific Studies of Reading: The Official Journal of the Society for the Scientific Study of Reading, 17(1), 5–21.

  37. Perfetti, C., Liu, Y., & Tan, L. H. (2005). The lexical constituency model: Some implications of research on Chinese for general theories of reading. Psychological Review, 112(1), 43.

  38. Perfetti, C. A., Zhang, S., & Berent, I. (1992). Chapter 13 reading in English and Chinese: Evidence for a “universal” phonological principle. Advances in Psychology, 94, 227–248.

  39. Peysakhovich, V., Causse, M., Scannella, S., & Dehais, F. (2015). Frequency analysis of a task-evoked pupillary response: Luminance-independent measure of mental effort. International Journal of Psychophysiology, 97(1), 30–37.

  40. R Development Core Team. (2014). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.

  41. Raisig, S., Hagendorf, H., & van der Meer, E. (2012). The role of temporal properties on the detection of temporal violations: Insights from pupillometry. Cognitive Processing, 13(1), 83–91.

  42. Raney, G. E., Campbell, S. J., & Bovee, J. C. (2014). Using eye movements to evaluate the cognitive processes involved in text comprehension. Journal of Visualized Experiments.

  43. Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372.

  44. Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62(8), 1457–1506.

  45. Rayner, K., Pollatsek, A., & Binder, K. S. (1998). Phonological codes and eye movements in reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(2), 476.

  46. Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review, 3(4), 504–509.

  47. Sereno, S. C., Hand, C. J., Shahid, A., Yao, B., & O’Donnell, P. J. (2017). Testing the limits of contextual constraint: Interactions with word frequency and parafoveal preview during fluent reading. The Quarterly Journal of Experimental Psychology, 71(1), 1–24.

  48. Snell, J., Mathôt, S., Mirault, J., & Grainger, J. (2018). Parallel graded attention in reading: A pupillometric study. Scientific Reports, 8(1), 3743.

  49. Steinhauer, S., & Zubin, J. (1982). Vulnerability to schizophrenia: Information processing in the pupil and event-related potential. In E. Usdin, & I. Hanin (Eds.), Biological markers in psychiatry and neurology (pp. 371–385). Elsevier.

  50. Titz, J., Scholz, A., & Sedlmeier, P. (2018). Comparing eye trackers by correlating their eye-metric data. Behavior Research Methods, 50(5), 1853–1863.

  51. Van Orden, G. C. (1987). A ROWS is a ROSE: Spelling, sound, and reading. Memory & Cognition, 15(3), 181.

  52. Vasilev, M., & Angele, B. (2017). Parafoveal preview effects from word N + 1 and word N + 2 during reading: A critical review and Bayesian meta-analysis. Psychonomic Bulletin & Review, 24(3), 666–689.

  53. Wong, K. F. E., & Chen, H.-C. (1999). Orthographic and phonological processing in reading Chinese text: Evidence from eye fixations. Language and Cognitive Processes, 14(5–6), 461–480.

  54. Yan, G., Tian, H., Bai, X., & Rayner, K. (2006). The effect of word and character frequency on the eye movements of Chinese readers. British Journal of Psychology, 97(2), 259–268.

  55. Zhou, W., Shu, H., Miller, K., & Yan, M. (2018). Reliance on orthography and phonology in reading of Chinese: A developmental study. Journal of Research in Reading.

Author information



Corresponding author

Correspondence to Philip Thierfelder.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Tables 5, 6, 7 and 8.

Table 5 Full fixation time GLMMs
Table 6 Full fixation time GLMMs—trial effects
Table 7 Full pupil diameter LMM
Table 8 Full pupil diameter LMM—trial effects

About this article

Cite this article

Thierfelder, P., Durantin, G. & Wigglesworth, G. The Effect of Word Predictability on Phonological Activation in Cantonese Reading: A Study of Eye-Fixations and Pupillary Response. J Psycholinguist Res (2020).

  • Chinese reading
  • Cantonese
  • Phonological activation
  • Eye-tracking
  • Pupil size