Introduction

Higher cognition is supported by a complex network of interacting brain regions. A great deal of neuropsychological and neuroimaging research has been devoted to isolation of individual task-specific aspects of this system. Yet, in an efficient system, much of the processing for any given task will be shared across more domain-general areas. To fully understand how the distributed neural system supports higher cognition, we need to consider how these domain-general areas intersect in the service of a range of different tasks. Current methods based on subtraction logic, however, tend to focus on revealing dissociated, task-specific subcomponents. We outline here a new multi-dimensional correlation-based approach that allows the identification of the primary domain-general cognitive components that underpin a range of tasks, as well as their neural correlates. We apply this generalizable approach to the test case of reading, because it is a domain where the idea of shared processing across language activities has been explicitly formulated and computationally implemented as the primary systems hypothesis (Seidenberg and McClelland 1989; Plaut et al. 1996; Patterson and Lambon Ralph 1999).

Reading is a fundamental human capacity that is supported by a distributed network of neural regions. Although most adults are able to read fluently with little effort, it is, nevertheless, a late-acquired ability both phylogenetically and ontogenetically. Reading therefore builds upon the foundations of more basic, long-established neurocognitive functions. This idea forms the basis for the primary systems account of acquired reading disorders (Patterson and Lambon Ralph 1999), which proposes that different types of acquired dyslexia result from disruption to phonological, semantic, or visual processing (Crisp and Lambon Ralph 2006; Woollams et al. 2007; Roberts et al. 2013). Previous investigations of acquired dyslexia have supported this account by focussing on the relationship between reading and primary systems functions at the behavioural level. For the first time, this study simultaneously explored both neural and behavioural correlates through the application of sophisticated multi-dimensional neuropsychological and lesion-symptom mapping approaches to data collected from a large case-series of chronic post-stroke aphasic patients. These patients have persistent problems in understanding and/or producing speech due to deficits in phonological and/or semantic processing (Lambon Ralph et al. 2002; Butler et al. 2014). If the primary systems account is correct, then there should be a strong convergence and triangulation between reading performance, primary cognitive systems, and their neural bases.

Functional neuroimaging of healthy participants has implicated a dorsal phonologically-related ‘direct’ pathway and a ventral semantically-mediated pathway in reading (Cattinelli et al. 2013; Taylor et al. 2013; Hoffman et al. 2015a), yet these data do not speak to the necessity of the regions involved. Lesion data from patients provide unique insights into the cognitive and neural bases of reading (Woollams 2014). In one of the first case-series studies to directly assess the primary systems account of reading deficits, Crisp and Lambon Ralph (2006) recruited 12 stroke aphasic patients on the basis of their reading behaviour, namely the presence of phonological-deep dyslexic symptoms (enhanced lexicality effects due to poor reading of nonwords, imageability effects, or semantic errors in reading). As predicted by the primary systems account, patients’ nonword reading was related to phonological processing ability, as measured by phoneme manipulation performance. In addition, the advantage for words over nonwords (the lexicality effect) in reading was related to semantic processing ability, as measured by synonym judgement performance. Hence, word reading performance benefitted from residual semantic processing. The advantage for concrete over abstract words in reading was larger for patients with poor phoneme manipulation and poor synonym judgement, indicating that the processing of difficult abstract words draws on both semantic and phonological processing capacities. Although this study provided strong behavioural evidence to support the primary systems account, it did not consider how these effects related to the location of the patients’ brain lesions.

Ripamonti et al. (2014) identified 33 cases of phonological dyslexia amongst 59 individuals recruited for post-stroke reading problems in the subacute stage. Using a lesion overlap approach, they compared phonological dyslexia (defined as better reading of concrete nouns than nonwords and a low incidence of stress assignment errors) with undifferentiated dyslexia (defined either as better reading of concrete nouns than nonwords with a high incidence of stress assignment errors, or as equivalent reading accuracy for concrete nouns and nonwords with a low incidence of stress assignment errors). Areas of damage specific to phonological dyslexia consisted of the left posterior and superior insula and pars opercularis, with an additional VLSM analysis of nonword reading accuracy identifying these areas plus a broader network of perisylvian regions, including the pars triangularis, anterior superior temporal gyrus, temporal pole, middle frontal gyrus, and post-central gyrus. While this study did highlight lesion sites associated with nonword reading deficits, it was not able to directly test the primary systems account, because phonological processing was only minimally assessed by word and nonword repetition, and the relationship of these scores to either reading accuracy or neural integrity was not explored.

Fiez et al. (2006) took a lesion-based approach to participant selection in their study of the neural correlates of reading. They selected 11 patients with circumscribed lesions to the left frontal operculum on the basis of the consistent activation of this region in studies of healthy individuals when reading nonwords. As expected according to the primary systems account, these patients showed relatively greater deficits for nonwords than words as compared to both brain damaged and matched healthy control groups. They also had difficulty in reading aloud low-frequency words with inconsistent spelling-sound correspondences, and, indeed, stronger frontal opercular activation has been observed for these items relative to other words in imaging studies of healthy readers. Fiez et al. found deficits for their patients on a variety of phonological tasks (verbal working memory and phonological discrimination, pseudohomophone and rhyme discrimination). These results show a strong association between frontal opercular damage and reading ability, and between frontal opercular damage and phonological processing, although the relationship between phonological and reading abilities was not directly assessed. While the results are highly consistent with a primary systems view, the focus on patients with frontal opercular lesions necessarily limits coverage of the broader left hemisphere reading network. Hence we cannot know from this study if there are other areas that are particularly important specifically for nonword or word reading.

A more comprehensive lesion-based approach was adopted by Rapcsak et al. (2009), who recruited 31 patients on the basis of the presence of lesions involving one or more of five left perisylvian regions, as identified by their consistent activation in phonological processing tasks in functional neuroimaging meta-analyses (Vigneau et al. 2006). Performance for nonwords was significantly worse than for words, and the majority of patients, therefore, qualified as phonologically dyslexic. Patients were worse than controls at reading aloud nonwords, and also both regular and irregular words. As expected according to the primary systems account, reading performance for both nonwords and words was strongly related to phonological processing ability as measured over a range of receptive and expressive tests (repetition, rhyme processing, and phoneme processing). Neuroanatomically, damage to any of the five left perisylvian regions of interest corresponded to enhanced lexicality effects in reading, consistent with the involvement of a distributed phonological processing network in supporting nonword reading. What is not known from this study is which elements of this network were also necessary for phonological processing.

Most recently, Boukrina et al. (2015) studied the reading performance of 11 left hemisphere stroke patients undergoing rehabilitation in the subacute stage. Their reading aloud of words and nonwords was assessed, along with receptive tests of semantic, phonological, and orthographic processing, all of which involved a reading component. Behaviourally, a significant relationship was found between reading of high imageability, low frequency, low consistency words, and semantic task performance, and between reading of nonwords and phonological task performance, as would be expected according to the primary systems account. Using a lesion overlap approach, they were able to isolate areas specific to impairment on the phonological rhyme task over patients with no impairment on this task, and this corresponded to a wide network of frontal, parietal, and temporal left hemisphere regions. While this study identified the neural network supporting phonological processing, the neural correlates of reading aloud words or nonwords were not considered.

In summary, previous research has focussed on the association between primary systems abilities and reading performance in behaviour, on the neural correlates of reading performance or phonological processing separately, or on the role of specific brain regions in both abilities. The goal of this study was to provide the first large-scale quantitative assessment of the primary systems hypothesis simultaneously at both the behavioural and neural levels. This account predicts a strong overlap between reading performance and the status of each primary system, both behaviourally and neurally. To achieve this, we advanced on previous work in four ways. First, we considered the reading performance of 43 chronic poststroke aphasic patients selected purely because they experience persistent difficulties in spoken language processing (Lambon Ralph et al. 2002; Butler et al. 2014), rather than on the basis of their reading behaviour or lesion location (cf. Crisp and Lambon Ralph 2006; Fiez et al. 2006; Rapcsak et al. 2009; Ripamonti et al. 2014). This approach allows a test of the primary systems prediction that problems with spoken language should be accompanied by reading deficits. Second, our representative sample of chronic stroke aphasic patients gave sufficient variation in lesion location to allow a continuous rather than categorical approach to identifying the neural basis of reading deficits (cf. Ripamonti et al. 2014; Sebastian et al. 2014; Boukrina et al. 2015). Third, we were able to isolate and quantify each patient’s primary systems capacities by using principal components analysis to distil optimal orthogonal measures of semantic and phonological ability from a large neuropsychological battery (Lambon Ralph et al. 2002, 2003). 
Finally, the uncorrelated nature of these phonological and semantic factors allowed us to identify the neural structures that uniquely correlate with these primary language abilities using voxel-based correlational methodology (see Butler et al. 2014), which treats both behavioural and neural measures as continuous variables. Accordingly, the specific targets of our study were to confirm the strong relationships between reading performance and the status of the primary systems and establish the intersection of their associated lesion maps.

Materials and methods

Participants

Forty-three chronic stroke patients (either ischaemic or haemorrhagic) were recruited, who had persistent impairments in producing and/or understanding spoken language. All patients were at least 12 months post-stroke at the time of scanning and assessment, and were native English speakers with normal or corrected-to-normal hearing and vision. Participants were excluded if they had any contraindications for scanning, were pre-morbidly left handed, had more than one stroke, or had any other significant neurological conditions. Informed consent was obtained from all participants prior to participation under approval from the local ethics committee. Data from a healthy age- and education-matched control group (8 female, 11 male) were used as the reference for identification of areas of neural abnormality.

Neuropsychological assessments

Assessments were conducted over as many testing sessions as each participant required. In addition to the BDAE (Goodglass and Kaplan 1983; Goodglass et al. 2000), a battery of language tests was administered to assess the participants’ language and cognitive abilities. The language assessments included a variety of subtests from the Psycholinguistic Assessments of Language Processing in Aphasia (PALPA) battery (Kay et al. 1992), including: same–different auditory discrimination using nonword minimal pairs (PALPA 1); same–different auditory discrimination using word minimal pairs (PALPA 2); immediate repetition of nonwords (PALPA 8); delayed repetition of nonwords (PALPA 8); immediate repetition of words (PALPA 9); and delayed repetition of words (PALPA 9). A number of tests from the 64-item Cambridge Semantic Battery (Bozeat et al. 2000) were also included: the spoken word-to-picture matching task; a written word-to-picture matching version of the same task; the picture version of the Camel and Cactus Test; and the picture naming test. To increase sensitivity to mild naming deficits, the 60-item Boston Naming Test (BNT) (Kaplan et al. 1983) was also used. Similarly, to increase sensitivity to subtle semantic deficits, a 96-trial synonym judgement test with words presented in spoken and written form (Jefferies et al. 2009) was also used. To capture syntax-level deficits, the spoken sentence comprehension task from the Comprehensive Aphasia Test (CAT) (Swinburn et al. 2005) was administered. The additional cognitive tests included forward and backward digit span (Wechsler 1987); the Brixton Spatial Rule Anticipation Task (Burgess and Shallice 1997); and Raven’s coloured progressive matrices (Raven 1962).

On language assessments, apart from the CAT sentence comprehension test (Swinburn et al. 2005), participants were scored on their first response. For the CAT test, two points are given for a correct response and one point is given for delayed correct responses or self-corrections. For the two naming assessments, participants’ responses were marked correct if they were given within 5 s of presentation. Minor articulatory dysfluencies, but not phonological errors, in responses were accepted as correct. Repetition of auditory stimuli was provided if requested by participants.

Assessments of reading aloud were included in the same testing sessions as the neuropsychological background measures. We used two tests from the PALPA (Kay et al. 1992): the 30-item nonword syllable length list (PALPA 8) and the 80-item imageability by frequency list (PALPA 31), which consists of 20 high-frequency concrete words, 20 low-frequency concrete words, 20 high-frequency abstract words, and 20 low-frequency abstract words. Normative data for the imageability by frequency list for 32 healthy control participants showed a lower bound for the hardest low-frequency abstract words of 98% correct (SD = 0.34). In all reading tasks, the patients’ first response was used for scoring purposes, and in the case of nonwords, any plausible pronunciation was considered correct. Scores for each word type for each patient are provided in Table 1.

Table 1 Demographic details, Boston Naming Test scores, aphasia types, principal components analysis factor scores, and reading accuracy for the 43 chronic stroke aphasic patients in this study

Behavioural data analysis

Participants’ scores on all assessments were entered into a principal components analysis (PCA) with varimax rotation (conducted with SPSS 16.0). Factors with an eigenvalue of 1.0 and above were extracted and then rotated. Following orthogonal rotation, the factor loadings of each test (presented in Table 2) allowed interpretation of what cognitive-language primary process was represented by that factor. Individual participants’ scores on each extracted factor were then used as predictors of reading behaviour and as covariates in the neuroimaging analysis.
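The rotation step described above can be illustrated with a minimal sketch. It applies Kaiser's pairwise planar rotations to an already extracted loading matrix. The loadings are hypothetical, the function names are ours, and the sketch uses raw varimax without the Kaiser normalisation that statistical packages often apply by default, so it is not a reproduction of the SPSS analysis.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (raw varimax)."""
    n_tests = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squares) / n_tests
        total += sum((s - mean_sq) ** 2 for s in squares) / n_tests
    return total

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Rotate a tests-by-factors loading matrix, one pair of factors at a time."""
    rotated = [list(row) for row in loadings]
    n_tests = len(rotated)
    n_factors = len(rotated[0])
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for p in range(n_factors - 1):
            for q in range(p + 1, n_factors):
                u = [row[p] ** 2 - row[q] ** 2 for row in rotated]
                v = [2.0 * row[p] * row[q] for row in rotated]
                a, b = sum(u), sum(v)
                c = sum(ui ** 2 - vi ** 2 for ui, vi in zip(u, v))
                d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Kaiser's closed-form angle that maximises the criterion for this pair.
                numerator = d - 2.0 * a * b / n_tests
                denominator = c - (a ** 2 - b ** 2) / n_tests
                angle = 0.25 * math.atan2(numerator, denominator)
                cos_a, sin_a = math.cos(angle), math.sin(angle)
                for row in rotated:
                    x, y = row[p], row[q]
                    row[p] = x * cos_a + y * sin_a
                    row[q] = -x * sin_a + y * cos_a
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:
            break
    return rotated

# Hypothetical unrotated loadings for six tests on two factors.
unrotated = [
    [0.70, 0.50], [0.80, 0.40], [0.75, 0.45],
    [0.60, -0.50], [0.65, -0.55], [0.70, -0.45],
]
rotated = varimax(unrotated)
```

Because the rotation is orthogonal, each test's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors becomes easier to interpret.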

Table 2 Factor loadings for each test on the rotated factors identified in the principal components analysis

The scores from the principal components analysis for each factor are presented for each patient in Table 1. To assess the primary systems hypothesis at the behavioural level, these individual factor scores were correlated with reading accuracy for concrete and abstract words and nonwords using Pearson’s correlations (conducted with SPSS 16.0), as we expected to see linear relationships (and these are the format of relationships assessed in the lesion-symptom mapping analyses). Effects were considered significant if their p value fell below 0.05.
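As a minimal sketch of the correlational step, the following computes Pearson's r between a factor score and a reading accuracy measure, together with the t statistic on n − 2 degrees of freedom from which the p value is obtained. The data values and function names are hypothetical and are not taken from Table 1.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical factor scores and proportion-correct reading accuracy for six patients.
phonology_scores = [-1.2, -0.4, 0.1, 0.6, 0.9, 1.5]
nonword_accuracy = [0.10, 0.35, 0.30, 0.60, 0.55, 0.90]

r = pearson_r(phonology_scores, nonword_accuracy)
t = r_to_t(r, len(phonology_scores))
```

With the 43 patients in this study, the corresponding test would have 41 degrees of freedom.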

Acquisition of neuroimaging data

High-resolution structural T1-weighted Magnetic Resonance Imaging (MRI) scans were acquired on a 3.0 T Philips Achieva scanner (Philips Healthcare, Best, The Netherlands) using an eight-element SENSE head coil for the first 31 patients and a 32-channel head coil for the remaining 12 patients. A T1-weighted inversion recovery sequence with 3D acquisition was employed, with the following parameters: TR (repetition time) = 9.0 ms, TE (echo time) = 3.93 ms, flip angle = 8°, 150 contiguous slices, slice thickness = 1 mm, acquired voxel size 1.0 × 1.0 × 1.0 mm3, matrix size 256 × 256, FOV = 256 mm, TI (inversion time) = 1150 ms, SENSE acceleration factor 2.5, total scan acquisition time = 575 s.

Neuroimaging data analysis

Each participant then had an MRI scan within a few weeks of completion of the behavioural assessments. Structural MRI scans were preprocessed with Statistical Parametric Mapping software (SPM8: Wellcome Trust Centre for Neuroimaging, http://www.fil.ion.ucl.ac.uk/spm/). The images were normalised into standard Montreal Neurological Institute (MNI) space using a modified unified segmentation–normalisation procedure optimised for focal lesioned brains (Seghier et al. 2008). Although referred to as an automated ‘lesion’ segmentation method, the technique detects areas of unexpected tissue class and, therefore, identifies diminished grey and white matter and increased CSF. This method has been shown to perform at an acceptable level relative to the gold standard of cost-function masking with a hand-traced lesion mask (Wilke et al. 2011), particularly in the case of large lesions, as seen in the majority of patients in our sample. Data from all participants with stroke aphasia and all healthy controls were entered into the segmentation–normalisation. Images were then smoothed with an 8 mm full-width-half-maximum (FWHM) Gaussian kernel. The lesion of each patient was automatically identified using an outlier detection algorithm, compared to healthy controls, based on fuzzy clustering. The default parameters were used apart from the lesion definition ‘U-threshold’, which was set to 0.5 to create a binary lesion image. The images generated were used to create the lesion overlap presented in Fig. 1.
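The final two steps, binarising each patient's abnormality map and summing across patients, can be sketched as follows. This is an illustration only: the maps are flattened to short lists of hypothetical values, the function names are ours, and treating values strictly above the threshold as lesioned is an assumption rather than a documented property of the toolbox.

```python
def binarise(abnormality_map, u_threshold=0.5):
    """Mark a voxel as lesioned (1) when its fuzzy abnormality value exceeds the threshold."""
    return [1 if value > u_threshold else 0 for value in abnormality_map]

def lesion_overlap(binary_maps):
    """Count, for each voxel, how many patients have a lesion there."""
    return [sum(voxel_values) for voxel_values in zip(*binary_maps)]

def display_mask(overlap, min_patients):
    """Zero out voxels lesioned in fewer than min_patients, as in a thresholded overlap figure."""
    return [count if count >= min_patients else 0 for count in overlap]

# Hypothetical fuzzy abnormality maps for three patients over four voxels.
fuzzy_maps = [
    [0.90, 0.70, 0.20, 0.00],
    [0.80, 0.40, 0.60, 0.10],
    [0.95, 0.55, 0.30, 0.00],
]
binary_maps = [binarise(m) for m in fuzzy_maps]
overlap = lesion_overlap(binary_maps)           # [3, 2, 1, 0]
shown = display_mask(overlap, min_patients=2)   # [3, 2, 0, 0]
```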

Fig. 1 a Lesion overlap map across the 43 patients (threshold 3–35); b regions found to relate significantly and uniquely to phonological (blue) and semantic (red) factors in simultaneous VBCM analyses, with the overlap shown in violet; c regions found to relate significantly to nonwords (green), abstract words (blue), and concrete words (red), with overlap of nonwords and abstract words in cyan, of abstract and concrete words in violet, and of all three in white. Overlap between maps represents a conjunction as recommended by Nichols et al. (2005) with a significance level that is the product of that of each map (i.e., p < 0.000025, voxel level; p < 0.0025, FWE-corrected). Overlays show areas significant at p < 0.005 voxel level, p < 0.05 FWE-corrected cluster level, image threshold (t) = 2.7

Brain regions where tissue concentration (as represented by the continuous values of the abnormality map from the unified segmentation–normalisation procedure) was related to behavioural measures were identified using voxel-based correlational methodology (VBCM) (Tyler et al. 2005), a variant of voxel-based lesion-symptom mapping (VLSM) (Bates et al. 2003) in which both the behaviour and tissue concentration measures are treated as continuous variables (conducted in SPM8 using voxel-based morphometry). We first identified the key dimensions underpinning performance in this sample of stroke aphasic patients on our neuropsychological battery by entering the phonology, semantics, and cognitive factors simultaneously using VBM in SPM8. We then conducted independent analyses (due to the intercorrelation between reading measures) of reading accuracy for concrete and abstract words and nonwords. The degree of overlap of each of these reading maps with the PCA factor maps for phonology and semantics provides an assessment of the primary systems hypothesis that there should not be large areas associated with reading of particular stimulus types that do not overlap with the regions supporting spoken language processing. Identification of anatomical regions was guided by the Harvard–Oxford cortical and NatBrainLab white-matter templates provided in MRIcron. All results are thresholded at p < 0.005 voxel level, p ≤ 0.05 family wise error (FWE)-corrected cluster level.

To examine the robustness of our results, we repeated the analyses using the SnPM toolbox (https://warwick.ac.uk/fac/sci/statistics/staff/academic-research/nichols/software/snpm) (Nichols and Holmes 2002), and the results are presented in Figures S2–S5, and Tables S2 and S3 of the Supplementary Materials. Reassuringly, the results were almost identical to the main analyses, with the one difference being that the semantic cluster did not survive cluster level FWE correction (p = 0.148). This difference is not surprising given the nature of permutation testing and the lower incidence of lesions within the semantic cluster in this predominantly phonologically impaired stroke aphasic sample.
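The logic of permutation-based familywise error control can be sketched as below. This is a simplified voxel-level maximum-statistic test on hypothetical data, with names of our own choosing. It is not the SnPM implementation, and it does not cover the cluster-level inference reported here.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def max_abs_r(voxel_columns, scores):
    """Largest absolute correlation over all voxels for one labelling of the scores."""
    return max(abs(pearson_r(column, scores)) for column in voxel_columns)

def permutation_threshold(voxel_columns, scores, n_permutations=500, alpha=0.05, seed=0):
    """FWE-corrected |r| threshold from the permutation distribution of the maximum."""
    rng = random.Random(seed)
    shuffled = list(scores)
    null_maxima = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between behaviour and tissue values
        null_maxima.append(max_abs_r(voxel_columns, shuffled))
    null_maxima.sort()
    index = min(n_permutations - 1, math.ceil((1.0 - alpha) * n_permutations) - 1)
    return null_maxima[index]

# Hypothetical data: 20 patients, 5 voxels, only the first voxel tracks behaviour.
rng = random.Random(1)
scores = [rng.gauss(0.0, 1.0) for _ in range(20)]
voxel_columns = [[s + rng.gauss(0.0, 0.3) for s in scores]]
voxel_columns += [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(4)]

threshold = permutation_threshold(voxel_columns, scores)
surviving = [i for i, column in enumerate(voxel_columns)
             if abs(pearson_r(column, scores)) >= threshold]
```

Because the threshold is taken from the distribution of the maximum across voxels, a voxel that exceeds it is significant after correction for all voxels tested.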

Results

Behavioural analyses

Across the battery of background neuropsychological tests, these patients had marked and varied deficits. A varimax-rotated principal components analysis (PCA) produced a three-factor solution which accounted for 79% of variance in participants’ performance (F1 = 58%; F2 = 14%; F3 = 7%). The factor loadings of each of the different behavioural assessments are given in Table 2. Tasks which tapped input and/or output phonology (e.g., repetition, picture naming, and digit span) loaded heavily on the first factor, which we term ‘phonology’. Tasks involving conceptual processing (e.g., picture naming, word-to-picture matching, and synonym judgement) loaded heavily on the second factor, which we term ‘semantics’. The assessments that loaded heavily on the third factor involved reasoning (e.g., the Brixton test, Raven’s matrices, and the Camel and Cactus Test) and we, therefore, labelled this factor ‘cognitive’.

Correlations between the scores from the principal components analysis and reading accuracy are presented in Table 3. As expected according to the primary systems hypothesis, reading of concrete words correlated significantly with phonology and slightly more strongly with semantics, whereas reading of abstract words correlated significantly with semantics and more strongly with phonology; accuracy of nonword reading correlated only with phonology. Unsurprisingly given the automatic nature of adult performance, no reading measure correlated significantly with the cognitive factor, and hence, it will not be considered further.

Table 3 Correlations between the principal components analysis factors and reading measures

A series of repeated-measures ANCOVAs were conducted in SPSS 16.0 to assess whether there were significant differences between the predictive capacity of each of the PCA factors over different word types. These involved the PCA factors as predictors (either phonology or semantics) and word type as a within-participant variable (nonwords vs abstract words or abstract vs concrete words), yielding four separate analyses. Phonology predicted performance for nonwords and abstract words to a similar degree [F(1, 41) = 2.84, p = 0.100] and also abstract and concrete words to a similar degree [F(1, 41) = 1.34, p = 0.254]. Semantics predicted performance for nonwords significantly less well than for abstract words [F(1, 41) = 12.08, p = 0.001], and performance for concrete words significantly better than abstract words [F(1, 41) = 5.57, p = 0.023]. Overall, then, the behavioural results show that phonology is an important predictor for all word classes, but there is a graded relationship with semantics which is weakest for nonwords and strongest for concrete words.
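For a within-participant variable with only two levels, the predictor-by-word-type interaction in such an ANCOVA is equivalent to regressing each patient's difference score on the predictor, which yields F(1, n − 2), that is F(1, 41) for 43 patients. The sketch below illustrates this equivalence on simulated data. All values and names are hypothetical, and it is not the SPSS procedure itself.

```python
import random

def interaction_f(covariate, scores_a, scores_b):
    """F test of whether a covariate predicts condition A and condition B differently."""
    n = len(covariate)
    diff = [a - b for a, b in zip(scores_a, scores_b)]
    mean_x = sum(covariate) / n
    mean_d = sum(diff) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(covariate, diff))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_x
    residual_ss = sum((d - (intercept + slope * x)) ** 2 for x, d in zip(covariate, diff))
    df_error = n - 2
    f_value = (slope ** 2 * sxx) / (residual_ss / df_error)
    return f_value, (1, df_error)

# Simulated data for 43 hypothetical patients.
rng = random.Random(1)
semantics = [rng.gauss(0.0, 1.0) for _ in range(43)]
abstract_acc = [0.60 + 0.15 * s + rng.gauss(0.0, 0.10) for s in semantics]
nonword_acc = [0.40 + 0.02 * s + rng.gauss(0.0, 0.10) for s in semantics]

f_value, df = interaction_f(semantics, abstract_acc, nonword_acc)  # df == (1, 41)
```

A large F indicates that the slope relating the factor score to accuracy differs between the two word types, which is the comparison of "predictive capacity" described above.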

Neuroimaging analyses

A lesion overlap map for stroke aphasic participants is provided in Fig. 1a (range 3–35 patients). As would be expected, this primarily covers the left hemisphere area supplied by the middle cerebral artery (Phan et al. 2005). The maximum number of participants who had a lesion in any one voxel was 35 (in the central operculum and anterior arcuate). Anatomical labels are based on the Harvard–Oxford and NatBrainLab templates provided with MRIcron (version 4).

The lesion map demonstrates that we had good coverage of key left hemisphere regions associated with spoken language processing. In voxel-based correlational methodology (VBCM), both voxel integrity and behavioural measures are treated as continuous; all observations are used in the analysis for each voxel over the whole brain. This contrasts with Voxel-Based Lesion-Symptom Mapping, in which voxel integrity is binarised to produce groups of intact vs lesioned patients. This approach is problematic when one group is very small (i.e., in voxels lesioned in very few patients), and hence, a minimum lesion cutoff is applied. We acknowledge that, in this sample, the distribution of voxel integrity values could be bi-modal; however, as previous VBCM studies have not applied a lesion cutoff (Tyler et al. 2005; Butler et al. 2014; Halai et al. 2017), we did not do so in the present study.
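The contrast between the two approaches can be sketched for a single voxel. In the illustration below, higher tissue values are assumed to mean more preserved tissue, the lesion threshold and minimum group size are hypothetical choices, and the function names are ours. Neither function reproduces the SPM8 analysis.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vbcm_voxel(tissue, behaviour):
    """Continuous approach: correlate tissue values with behaviour across all patients."""
    return pearson_r(tissue, behaviour)

def vlsm_voxel(tissue, behaviour, lesion_threshold=0.5, min_group=3):
    """Binarised approach: Welch t comparing intact and lesioned patients at this voxel."""
    lesioned = [b for v, b in zip(tissue, behaviour) if v < lesion_threshold]
    intact = [b for v, b in zip(tissue, behaviour) if v >= lesion_threshold]
    if min(len(lesioned), len(intact)) < min_group:
        return None  # too few patients in one group: the voxel is skipped
    standard_error = math.sqrt(statistics.variance(intact) / len(intact)
                               + statistics.variance(lesioned) / len(lesioned))
    return (statistics.mean(intact) - statistics.mean(lesioned)) / standard_error

# Hypothetical tissue values and behavioural scores for eight patients at one voxel.
tissue = [0.90, 0.80, 0.85, 0.30, 0.20, 0.95, 0.10, 0.70]
behaviour = [0.90, 0.85, 0.80, 0.40, 0.30, 0.95, 0.20, 0.75]

r = vbcm_voxel(tissue, behaviour)
t = vlsm_voxel(tissue, behaviour)
rarely_lesioned = vlsm_voxel([0.9, 0.8, 0.85, 0.3, 0.7, 0.95, 0.6, 0.7], behaviour)  # None
```

The last call shows why the binarised approach needs a minimum lesion cutoff: with a single lesioned patient the group comparison is undefined, whereas the continuous correlation can still be computed.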

As can be seen in Figure S1, and in agreement with our previous work (Butler et al. 2014; Halai et al. 2017), lesion volume was correlated with a number of regions around the left perisylvian fissure, which is to be expected given that all patients in this sample had MCA infarcts (Phan et al. 2005). Although some researchers suggest that lesion volume should be entered as a covariate in lesion-symptom mapping analyses to control for global severity, we would argue that this approach is too conservative when the areas correlated with lesion volume overlap with the key functional areas of interest. This is certainly so in this study, as the inferior, posterior, and superior extensions of lesions involve areas that we expect to be involved in reading. Moreover, it is these areas farther from the perisylvian focus that are most likely to be reading specific, so to control for lesion volume in our analyses would, in fact, work against detecting such regions. This would bias the results in favour of confirming the primary systems prediction that there should be no areas associated with reading of particular stimulus types that do not overlap with the regions supporting spoken language processing. For this reason, we did not control for lesion volume in our main analyses, but we do provide the results controlled for lesion volume in Table S1. It must be kept in mind that significant results within areas correlated with lesion volume must be interpreted with caution with respect to the specificity of their involvement in a particular behaviour.

Localising primary systems

The VBCM results for the phonological and semantic factors are shown in Fig. 1b and Table 4. Each map shows where tissue concentration covaries uniquely with a given factor score. All results are thresholded at p < 0.005 voxel level, p ≤ 0.05 family wise error (FWE)-corrected cluster level.

Table 4 Results of the whole-brain VBCM analyses for speech measures

Performance on the phonological factor was uniquely correlated with a large cluster spanning a number of left hemisphere regions: frontal pole, middle frontal gyrus (MFG), inferior frontal gyrus (IFG) (triangularis and orbitalis), frontal and central opercular cortex, posterior insular cortex, temporal pole, planum polare, Heschl’s gyrus, planum temporale, superior temporal gyrus (STG), middle temporal gyrus (MTG), parietal opercular cortex, and posterior supramarginal gyrus (SMG). The phonological cluster, therefore, overlapped with the anterior and posterior segments of the arcuate fasciculus, a key aspect of the dorsal language pathway (Wise 2003; Catani and ffytche 2005; Catani et al. 2005; Duffau et al. 2005; Parker et al. 2005; Saur et al. 2008) as well as much of the inferior longitudinal fasciculus, part of the ventral language pathway (Wise 2003; Parker et al. 2005; Catani and Mesulam 2008; Saur et al. 2008; Schmahmann and Pandya 2008; Duffau et al. 2009).

Performance on the semantic factor was uniquely related to a cluster of voxels centred on the white matter in the left anterior temporal lobe (ATL), extending to the temporal pole, planum polare, anterior and posterior MTG, and inferior temporal gyrus (ITG), and the anterior fusiform. The semantic cluster involved the anterior section of the inferior longitudinal fasciculus (ILF), the inferior aspect of the anterior commissure, and uncinate fasciculus, all of which comprise the ventral language pathway (Wise 2003; Parker et al. 2005; Catani and Mesulam 2008; Saur et al. 2008; Schmahmann and Pandya 2008; Duffau et al. 2009).

Mapping reading deficits

The VBCM results for reading performance are shown in Fig. 1c and Table 5. There was a large area that was associated with reading accuracy for all three stimulus types. We refer to this as the core reading network, which spanned superior temporal and inferior frontal regions, including: frontal pole, middle frontal gyrus, frontal orbital cortex, frontal and central opercular cortex, IFG (pars triangularis and pars opercularis), insular cortex, precentral gyrus, temporal pole, planum polare, anterior STG, and anterior MTG. In terms of white-matter connectivity, this region overlapped primarily with the ILF but also the uncinate fasciculus.

Table 5 Results of the whole-brain VBCM analyses for reading measures

Both nonword and abstract word reading were associated with tissue concentration in: the planum temporale and polare, parietal and central opercular cortex, insular cortex, precentral gyrus, posterior STG, and anterior MTG, and overlapped the anterior, posterior, and long segments of the arcuate fasciculus. Abstract and concrete word reading were associated with the frontal pole, frontal orbital cortex, IFG pars triangularis, insular cortex, anterior parahippocampal and anterior fusiform cortices, anterior and posterior ITG, temporal pole, planum polare, and anterior and posterior MTG, overlapping with the uncinate fasciculus, anterior ILF and IFOF. There were no regions that were associated with both nonword and concrete word reading.

Nonword reading was specifically associated with tissue concentration in: MFG, IFG (pars opercularis and triangularis), precentral gyrus, insular cortex, central and parietal opercular cortex, and anterior MTG, and overlapped the anterior and long segments of the arcuate fasciculus, frontal aslant tract [as presented by Catani et al. (2013)] and edged onto the internal capsule. Abstract word reading was associated specifically with posterior and temporo-occipital MTG, posterior STG, planum temporale, and anterior and posterior SMG, and overlapped the posterior arcuate, posterior ILF, and posterior inferior fronto-occipital fasciculus (IFOF). Concrete word reading was associated specifically with only a few areas in the frontal pole and insular cortex, just overlapping the anterior IFOF.

In summary, areas specific to nonword reading focussed on the MFG, IFG, inferior precentral gyrus, and insular and opercular cortices, and overlapped the arcuate fasciculus/frontal aslant tract, while those areas specific to concrete and abstract words centred on inferior frontal and temporal regions, mainly MTG but also involving the fusiform, and overlapping both the ILF and uncinate. Interestingly, areas supporting reading of concrete words fell almost entirely within those supporting abstract words, with the latter showing an additional correlation with more posterior MTG and parietal regions.

Testing the primary systems hypothesis

Figure 2 shows the overlap between areas associated with phonological processing and reading of each word type, next to the correlations between scores on the phonological factor and reading accuracy. There is clearly a strong relationship between non-reading phonological skills and reading accuracy for all three string types at both the behavioural and neural levels. This is consistent with the primary systems view that phonology is involved in reading of all strings, but, for words, semantic processing would make an appreciable contribution. Indeed, as can be seen in comparison of Fig. 1b, c, the core reading network falls within the phonological but not the semantic processing region.

Fig. 2

Relationship between phonological processing and reading of: a nonwords, b abstract words, and c concrete words. Behavioural data are shown in the scatter plots on the left. VBCM results are shown on the right, with areas relating to the phonological factor in blue, areas related to reading performance in green, and the overlap between them in cyan. Overlays show areas significant at p < 0.005 voxel level, p < 0.05 FWE-corrected cluster level, image threshold (t) = 2.7. Overlap between phonological processing and reading maps represents a conjunction as recommended by Nichols et al. (2005) with a significance level that is the product of that of each map (i.e., p < 0.000025 voxel level, p < 0.0025 FWE-corrected)

As shown in Fig. 2a, a large inferior portion of the areas involved in nonword reading overlapped with the phonology map. This included areas specifically associated with nonword reading (parietal, central, and frontal opercula, insula and pre- and post-central gyri, pars opercularis and triangularis) and areas shared between nonword and abstract word reading (central operculum, precentral gyrus, insula, and planum temporale). As can be seen in Fig. 2b, the areas involved in abstract word reading were almost entirely contained within the phonology map. This involved areas specifically associated with abstract word reading (planum temporale and polare, insular cortex, parietal and central opercular cortex, posterior MTG, posterior STG, and SMG). In addition, as can be seen in Fig. 2c, areas shared between concrete and abstract word reading overlapped with the anterior and inferior aspects of the phonology cluster: frontal and temporal pole, IFG (pars triangularis), frontal orbital and insular cortex, frontal and central opercular cortex, planum polare, Heschl’s gyrus, anterior and posterior MTG, and anterior STG.

Figure 3 shows the overlap between areas associated with semantic processing and reading of each word type, along with the correlations between semantic factor scores and reading performance. The strength of the relationship between the semantic factor and reading accuracy for each string type mirrors the proportion of the reading map that overlaps with the semantics map: this is minimal for the nonwords, intermediate for the abstract words, and highest for the concrete words. This is precisely what would be predicted by the primary systems hypothesis, as nonwords rely solely on the integrity of the phonological network, and hence are most often undermined after middle cerebral artery stroke. Words are doubly represented along the semantic and phonological pathways, and are therefore more robust to this damage.

Fig. 3

Relationship between semantic processing and reading of: a nonwords, b abstract words, and c concrete words. Behavioural data are shown in the scatter plots on the left. VBCM results are shown on the right, with areas relating to the semantic factor in red, areas related to reading performance in green, and the overlap between them in yellow. Overlays show areas significant at p < 0.005 voxel level, p < 0.05 FWE-corrected cluster level, image threshold (t) = 2.7. Overlap between semantic processing and reading maps represents a conjunction as recommended by Nichols et al. (2005) with a significance level that is the product of that of each map (i.e., p < 0.000025 voxel level, p < 0.0025 FWE-corrected)

As shown in Fig. 3a, there was minimal overlap between the areas supporting nonword reading and the semantics map. In contrast, as shown in Fig. 3b, there was a considerable overlap between the areas supporting abstract word reading and the semantics map. This involved the posterior IFOF specifically associated with abstract word reading and areas shared between abstract and concrete (Fig. 3c) word reading (anterior parahippocampal and anterior fusiform cortices, anterior and posterior ITG, temporal pole, planum polare, and anterior and posterior MTG, overlapping with the uncinate fasciculus, anterior ILF, and anterior IFOF).
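The overlap maps described for Figs. 2 and 3 can be thought of as a voxel-wise logical AND of two thresholded statistical maps. The following is a minimal sketch of that idea only, using toy arrays rather than real images; the function names `conjunction_mask` and `overlap_fraction` are hypothetical, and the cluster-level FWE correction applied in the actual analysis is not reproduced here. The threshold of t = 2.7 and the voxel-level p value of 0.005 are taken from the figure captions.

```python
import numpy as np

T_THRESHOLD = 2.7  # image threshold quoted in the figure captions


def conjunction_mask(t_map_a, t_map_b, threshold=T_THRESHOLD):
    """Voxels that exceed the threshold in BOTH maps (logical AND)."""
    return (np.asarray(t_map_a) > threshold) & (np.asarray(t_map_b) > threshold)


def overlap_fraction(t_map_reading, t_map_system, threshold=T_THRESHOLD):
    """Proportion of supra-threshold reading voxels that also fall in the system map."""
    reading = np.asarray(t_map_reading) > threshold
    if not reading.any():
        return 0.0
    both = conjunction_mask(t_map_reading, t_map_system, threshold)
    return float(both.sum() / reading.sum())


# Toy 4 x 4 x 4 "brains" (assumption: purely illustrative values).
shape = (4, 4, 4)
reading_t = np.zeros(shape)
system_t = np.zeros(shape)
reading_t[:2] = 3.0   # 32 supra-threshold voxels in the reading map
system_t[1:3] = 3.0   # 32 supra-threshold voxels in the primary-system map

shared = conjunction_mask(reading_t, system_t)
frac = overlap_fraction(reading_t, system_t)

# Significance level of the overlap as stated in the captions:
# the product of the voxel-level thresholds of the two maps.
voxel_p_conjunction = 0.005 * 0.005
```

Comparing `overlap_fraction` across nonword, abstract word, and concrete word maps is one simple way to express the graded overlap with the semantics map described in the text.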

In addition to the reading areas identified that overlapped with primary systems, there were also some areas that did not. Most of these were small variations in the extent of the comparable lesion correlates. In the core reading network, a small portion of the pars opercularis that was implicated in reading all three types of strings fell outside the phonology cluster. Areas shared between nonwords and abstract words fell almost entirely within the phonology cluster, while those shared between the abstract and concrete words fell almost entirely within the phonology and semantics maps. The small areas associated specifically with concrete word reading (frontal pole and insular cortex, plus IFOF) did fall outside the primary systems regions, as did a few of the areas specifically associated with abstract word reading (posterior STG and particularly anterior and posterior SMG).

Aside from these relatively small variations in the extents of the lesion correlates, there was one notable additional finding: a large superior frontal region associated specifically with nonword reading that did not overlap with the phonology map. This area included IFG pars opercularis and triangularis, middle frontal gyrus, pre- and particularly post-central gyrus, plus central and parietal opercular cortex, and involved the anterior and long segments of the arcuate fasciculus, overlapping the inferior portion of the frontal aslant tract. This dorsal, nonword-specific region has previously been shown to correlate with the control of motor speech output (Price 2012; Richardson et al. 2012) and with speech fluency/quanta (Catani et al. 2013; Basilakos et al. 2014), which is a distinct underlying component of the aphasic multi-dimensional profile, statistically separate from phonology, semantics, and cognitive factors (Halai et al. 2017).

To isolate this additional primary speech motor output system, we mapped the lesion correlates of “words per minute” during description of the Cookie Theft picture (Goodglass et al. 2000), a commonly used index of speech fluency. As can be seen in Fig. 4, the behavioural Pearson’s correlation between words per minute and reading performance, although significant for all word types (nonwords r = 0.553, p < 0.0005; abstract words r = 0.400, p = 0.008; concrete words r = 0.309, p = 0.043), is strongest for nonwords and weakest for concrete words. Moreover, there is considerable overlap between the maps of words per minute and nonword reading performance. Critically, this included many of the areas associated with nonword reading that did not overlap with the phonology factor, namely: the MFG, pre- and post-central gyrus and IFG pars opercularis, and the anterior and long segments of the arcuate and the frontal aslant tract. In contrast, the overlap between words per minute and reading performance for abstract words and concrete words fell almost entirely within the areas associated with phonology. Hence the lesion data indicate that there is a unique contribution of fluency to nonword reading over and above that of phonology. This then makes the very specific prediction that only nonword reading should show a significant relationship to fluency measures after controlling for phonological ability, and this was, indeed, the case in partial Pearson’s correlations (nonwords r = 0.429, p = 0.005; abstract words r = 0.180, p = 0.253; concrete words r = 0.072, p = 0.648).
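The partial correlations reported above can be illustrated with a minimal sketch. It assumes one score per patient for fluency, reading accuracy, and the phonological factor; the data below are synthetic, the sample size is arbitrary, and the function names `pearson_r` and `partial_r` are hypothetical rather than the software used in the study. The covariate is regressed out of both variables and the residuals are then correlated, which for a single covariate is equivalent to the standard first-order partial correlation formula.

```python
import numpy as np


def pearson_r(x, y):
    """Plain Pearson correlation between two 1-D score vectors."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])


def partial_r(x, y, covariate):
    """Correlation between x and y after regressing the covariate out of both."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, covariate))
    design = np.column_stack([np.ones_like(z), z])  # intercept + covariate
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return pearson_r(res_x, res_y)


# Synthetic scores (assumption: illustrative only, not the patient data).
rng = np.random.default_rng(0)
n = 40
phonology = rng.normal(size=n)
fluency = 0.6 * phonology + rng.normal(size=n)
# Nonword reading depends on phonology AND fluency in this toy example;
# concrete word reading depends on phonology only.
nonword = 0.5 * phonology + 0.5 * fluency + rng.normal(scale=0.5, size=n)
concrete = 0.8 * phonology + rng.normal(scale=0.5, size=n)

r_partial_nonword = partial_r(fluency, nonword, phonology)
r_partial_concrete = partial_r(fluency, concrete, phonology)
print(f"nonwords: {r_partial_nonword:.3f}, concrete words: {r_partial_concrete:.3f}")
```

In this toy setup the fluency term is built into the nonword scores, so a relationship with fluency can survive the removal of phonology, whereas for the concrete word scores any raw correlation with fluency arises only through shared phonological variance.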

Fig. 4

Relationship between fluency and reading of: a nonwords, b abstract words, and c concrete words. Behavioural data are shown in the scatter plots on the left. VBCM results are shown on the right, with areas relating to fluency (as measured by words per minute in picture description as a percentage of the highest score) in violet, areas related to reading performance in green, and the overlap between them in white. Overlays show areas significant at p < 0.005 voxel level, p < 0.05 FWE-corrected cluster level, image threshold (t) = 2.7. Overlap between fluency and reading maps represents a conjunction as recommended by Nichols et al. (2005) with a significance level that is the product of that of each map (i.e., p < 0.000025 voxel level, p < 0.0025 FWE-corrected)

This result is theoretically significant for three reasons: it indicates that, by encompassing a large group of post-stroke aphasic cases, we have been able to add an important additional component to the primary systems framework (phonology, semantics, vision, and controlled speech output); it demonstrates that nonword reading loads much more heavily on controlled speech output, presumably because the pronunciation and articulation of nonwords involves a novel sequence; finally and relatedly, it suggests that patients with damage to this region will show nonword reading impairments disproportionate to those expected on the basis of their phonological scores alone.

Discussion

The purpose of this study was to provide, for the first time, a large-scale quantitative assessment of the primary systems account of acquired reading disorders, both behaviourally and neurally. This neurocomputationally rooted theoretical framework predicts that the patterns of patients’ reading deficits are systematically related to the status of more domain-general primary systems (Patterson and Lambon Ralph 1999). We isolated these primary systems using a principal components analysis of a large battery of neuropsychological data. Consistent with the primary systems hypothesis, we found that nonword reading correlated only with the phonological factor, and its lesion map clearly overlapped with the phonological neural cluster. Abstract word reading correlated with both phonological and semantic scores, and its lesion map overlapped with both the semantic and the phonological neural clusters. Concrete word reading also correlated with semantic and phonological scores, and its lesion correlate overlapped with the semantic and frontal and inferior aspects of the phonological cluster. In addition, our large-scale neuropsychological study allowed us to identify a novel additional primary systems component (prefrontal regions associated with controlled speech production) that is particularly important for nonword reading.
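As a rough illustration of how a principal components analysis can reduce a test battery to a small number of factor scores per patient, consider the following sketch. It is an unrotated PCA computed by singular value decomposition on z-scored synthetic data; the battery, the number of components, and the function name `pca_scores` are assumptions made for illustration, and any rotation or component-selection criteria used in the study itself are not reproduced.

```python
import numpy as np


def pca_scores(data, n_components):
    """Unrotated PCA on z-scored columns; returns scores, loadings, variance ratios."""
    data = np.asarray(data, dtype=float)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # proportion of variance per component
    scores = u[:, :n_components] * s[:n_components]   # one row of scores per patient
    loadings = vt[:n_components].T                    # one row of weights per test
    return scores, loadings, explained[:n_components]


# Synthetic battery (assumption): three tests driven by a latent "phonology"
# ability and three driven by a latent "semantics" ability.
rng = np.random.default_rng(1)
n_patients = 30
phon = rng.normal(size=n_patients)
sem = rng.normal(size=n_patients)


def noisy(latent):
    return latent + rng.normal(scale=0.3, size=n_patients)


battery = np.column_stack(
    [noisy(phon), noisy(phon), noisy(phon), noisy(sem), noisy(sem), noisy(sem)]
)

scores, loadings, explained = pca_scores(battery, n_components=2)
```

Component scores obtained in this way are uncorrelated with one another, which is what makes them convenient as separate behavioural predictors in a lesion-symptom analysis.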

Our results demonstrate a superior-to-inferior gradation of reading specialisation across the dorsal and ventral pathways according to lexicality and concreteness. Given the typical perisylvian distribution of middle cerebral artery stroke lesions and its overlap with the phonological network, our results provide a unified explanation not only for the prominence of phonological deficits in post-stroke aphasia (Schwartz et al. 2012; Butler et al. 2014), but also for why nonword reading is so strongly undermined (Fiez et al. 2006; Rapcsak et al. 2009; Brookshire et al. 2014). Indeed, our study aligns with previous work concerning lesion sites associated with phonological dyslexia, which has implicated a variety of perisylvian regions, particularly the insula and LIFG, not only in post-stroke aphasia (Fiez et al. 2006; Rapcsak et al. 2009; Ripamonti et al. 2014) but also in primary progressive aphasia (Henry et al. 2012). Previous functional neuroimaging meta-analyses have found these areas to be consistently more active for nonword than word reading (Taylor et al. 2013) and reliably involved in phonological processing (Vigneau et al. 2006).

Performance for words is more robust not only through practice and experience (which, by definition, nonwords do not have), but also because words can draw upon both semantic and phonological processing, as reflected by the overlap of word reading with both of these primary systems in our results. We observed a strong influence of concreteness on the extent to which word reading draws upon phonology and semantics. The semantic representations of abstract words are less rich than those of concrete words (Plaut and Shallice 1993; Paivio 2010) and, as a consequence, abstract words need to draw more heavily on phonological processing (Westbury and Moroschan 2009). Indeed, we found that abstract words overlapped with more of the phonological network than concrete words, encompassing areas in the planum temporale and polare, insula, parietal and central opercula, posterior MTG, and posterior STG and SMG. These results are in line with neuroimaging studies that have compared concrete and abstract words (Binder et al. 2005; Sabsevitz et al. 2005) and also with recent TMS and lesion mapping investigations that have implicated the SMG as involved in phonological processing specifically for words (Mirman and Graziano 2013; Pattamadilok et al. 2015; Sliwinska et al. 2015).

Both abstract and concrete words relied upon a ventral semantic pathway involving regions of the ATL, including the anterior fusiform, ITG, MTG, temporal pole, and underlying white-matter connections. This result converges with recent fMRI findings implicating the ventrolateral ATL in representing the meaning of both concrete and abstract words (Hoffman et al. 2015b). These anterior and inferior temporal semantic regions, implicated in reading of abstract and concrete words, are the same areas that have been associated with the deficits of exception word reading that define surface dyslexia (Ripamonti et al. 2014), which usually co-occurs with semantic dementia (Woollams et al. 2007, 2010; Wilson et al. 2009, 2012; Henry et al. 2012). Although this area can be challenging to image successfully in functional neuroimaging studies (Visser et al. 2010a), it can be imaged reliably with appropriate methods (Binney et al. 2010; Visser et al. 2010b). Such studies have found that, when healthy readers pronounce exception words, activation is observed in precisely the same ATL regions identified in the current study as supporting both semantics and word reading (Wilson et al. 2012; Hoffman et al. 2015a).

As predicted by the primary systems hypothesis, we observed strong associations between semantic and phonological aspects of speech processing and word and nonword reading. Our large-scale neuropsychological and lesion-symptom mapping study also allowed us to identify a new additional component for inclusion in the primary systems framework. Specifically, we found a large prefrontal cluster for nonword reading that fell outside the phonological and semantic neural clusters. This additional region included the IFG pars opercularis and triangularis, middle frontal gyrus, pre- and post-central gyrus, plus central and parietal opercular cortex, and overlapped the anterior and long segments of the arcuate fasciculus and also the inferior portion of the frontal aslant tract.

This constellation of cortical and white-matter areas is implicated in aspects of preparation for and execution of speech (Price 2012; Dick et al. 2014). For patients with chronic stroke aphasia, damage to the IFG predicts motor speech impairments (Richardson et al. 2012), and the integrity of both the anterior arcuate and the aslant tract has been linked to speech fluency not only in stroke and progressive aphasia (Catani et al. 2013; Basilakos et al. 2014), but also in developmental speech production disorders (Kronfeld-Duenias et al. 2016) and neurosurgical stimulation studies (Kinoshita et al. 2015; Vassal et al. 2014; Fujii et al. 2015). Recently, fluency has been identified as a distinct factor that explains significant variance in stroke aphasic performance, over and above phonology, semantics, and cognition (Halai et al. 2017), with its lesion correlates encompassing the inferior portions of the frontal aslant tract.

The involvement of these speech-control areas and connections in nonword reading is to be expected, given that nonwords are by definition sequences of phonemes that would never have been pronounced before, and they will, therefore, draw most heavily on speech planning mechanisms. To directly establish the specific link between speech motor output and nonword reading, we considered the neural correlates of fluency in our sample, as measured by words spoken per minute. We found that there was clear overlap between the fluency maps and the areas specifically involved in nonword reading that fell outside the phonology maps. As predicted by the neural data, when we assessed the correlation between words per minute and reading performance controlling for phonological ability, it was only nonword reading that showed a significant relationship to fluency. Taken together, this pattern of results clearly indicates that the superior frontal cluster which we have identified is best seen as reflecting an additional primary motor output system that supports the speech production component of reading aloud.

The prefrontal/controlled-speech component which we have identified might explain why there have been a small number of reported individual cases with poor nonword reading yet better than expected phonological processing skills. If nonword reading is supported by phonology and speech motor output, then impairments of controlled speech output may not generate a large phonological impairment but would reduce nonword reading accuracy (e.g., Tree and Kay 2006). This is potentially theoretically significant as these kinds of cases have been cited as evidence against the primary systems account (Tree 2008). There is, however, more than one reason for poor nonword reading. Pure alexic patients (Cumming et al. 2006), who have reading and visual impairments consequent on lesions in the ventral occipito-temporal region (Roberts et al. 2013), show a nonword reading impairment not due to any phonological impairment, but, rather, because their impaired recognition of words is ameliorated to some degree by top-down semantic support (Roberts et al. 2010) which nonwords, by definition, do not possess.

The primary systems view originated in the context of connectionist triangle models of reading aloud (Seidenberg and McClelland 1989; Plaut et al. 1996; Welbourne and Lambon Ralph 2007; Woollams et al. 2007; Welbourne et al. 2011) which focussed primarily on the cognitive–computational aspects of the theory and abstracted away from the specifics of neural implementation. More recently, ‘neurocomputational’ connectionist models have begun to incorporate neural constraints into the processing architecture whilst maintaining the requirement to explore the key cognitive principles and to generate detailed behavioural data. Accordingly, such models offer the chance to explore the bridge between neural, systems-level processes and higher cognitive behaviour. Recent prominent examples of this approach include explorations of the relative contributions of the dorsal and ventral language pathways in normal and aphasic language processing (Ueno et al. 2011) and the role of graded lateralisation of visual function in the posterior fusiform in pure alexia (Plaut and Behrmann 2011). The combined neural and behavioural exploration of the primary systems framework undertaken in this study provides important information on how to extend these neurocomputational approaches to model normal and impaired reading.

The multi-dimensional correlation-based approach which we have applied to reading in this study allows us to consider associations between the neural bases of distinct but hypothetically related abilities, in contrast to a more traditional focus on dissociating task-specific brain regions. Our study has allowed us to harness lesion data to reveal the basis for behavioural correlations between spoken language impairments and reading deficits in terms of shared brain regions that support both abilities. This approach has the potential to be applied to other domains of higher cognition, where data reduction techniques can offer behavioural variables suitable for use in lesion-symptom analyses of cross-task association. This approach has the advantage of offering control for factors such as global severity, which is a key issue in neurodegenerative conditions (Lambon Ralph et al. 2003) and also neurodevelopmental disorders. Application of the multi-dimensional correlation-based approach to these populations will allow us to understand how task-specific cognitive processes draw on domain-general networks to optimise the efficiency of the neural systems supporting higher cognition.