1 Introduction

Focal cortical dysplasia (FCD) type II is a developmental malformation characterized by intracortical dyslamination and dysmorphic neurons, either in isolation (FCD-IIA) or together with balloon cells (FCD-IIB). This malformation is a frequent cause of pharmaco-resistant epilepsy. Its surgical removal is currently the only treatment option to arrest seizures. MRI has been instrumental in detecting FCD by unveiling its abnormal morphology and intensity features, thereby defining the surgical target [1]. While recent evidence suggests abnormal peri-lesional connectivity [2] and diverging functional profiles across histological grades [3], patterns of FCD connectivity to the rest of the brain remain unknown.

Here, we present a novel framework that subdivides a given FCD lesion into a set of constituents through an agglomerative hierarchical clustering procedure based on resting-state functional MRI (rs-fMRI) connectivity. Using supervised statistical learning, we assessed the yield of this connectome-based biotyping to predict histological grade and response to surgery in individual patients.

2 Methods

2.1 MRI Acquisition

Images were acquired on a 32-channel 3T Siemens TimTrio using 3D T1-weighted MPRAGE (TR = 2300 ms, TE = 2.98 ms, flip angle = 9°, voxel size = 1 × 1 × 1 mm³) and echo planar resting-state functional MRI (rs-fMRI; TR = 2020 ms, TE = 30 ms, flip angle = 90°, 34 slices, voxel size = 4 × 4 × 4 mm³, 150 volumes). For the latter, participants were instructed to lie still with their eyes closed while remaining awake.

2.2 MRI Processing

Structural MRI.

T1-weighted MRI underwent intensity inhomogeneity correction and normalization, linear registration to MNI152 space [4], as well as classification into white matter (WM), gray matter (GM), and cerebro-spinal fluid (CSF). GM-WM and GM-CSF surface models were constructed using CLASP, an algorithm relying on intensity and geometric constraints [5]. This approach iteratively warps a surface to the GM-WM boundary and estimates the outer surface by expanding the inner one along a Laplacian map. Surfaces were aligned based on cortical folding to improve inter-individual correspondence. Surface extraction accuracy was visually verified.

Resting state fMRI (rs-fMRI).

Processing was conducted using DPARSF (http://www.restfmri.net) [6]. After discarding the first 5 volumes, rs-fMRI underwent slice-timing correction, motion correction, and realignment, followed by statistical correction for effects of WM signal, CSF signal, and head motion. To correct for residual motion, we included time points with a frame-wise displacement >0.5 mm as separate covariates [7]. Signals were band-pass filtered at 0.01–0.08 Hz. The mean volume after filtering was linearly mapped to the T1-weighted image using a boundary-based registration that prioritizes cortical alignment [8].
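The motion-scrubbing step can be illustrated with a minimal sketch. It assumes the frame-wise displacement (FD) per volume has already been computed and builds one indicator covariate for each volume above the 0.5 mm threshold. The function name `spike_regressors` and the FD values are hypothetical and are not taken from DPARSF.

```python
def spike_regressors(fd, threshold=0.5):
    """Build one indicator covariate per volume whose FD exceeds the threshold.

    fd: list of frame-wise displacement values (mm), one per volume.
    Returns a list of regressors, each a list with a single 1.0 at the
    flagged time point and 0.0 elsewhere.
    """
    n_volumes = len(fd)
    flagged = [t for t, value in enumerate(fd) if value > threshold]
    return [[1.0 if t == bad else 0.0 for t in range(n_volumes)]
            for bad in flagged]


# Illustrative FD trace (made-up values, not study data).
fd_mm = [0.10, 0.62, 0.08, 0.30, 0.55]
regressors = spike_regressors(fd_mm)
```

Each returned column would be added to the nuisance regression alongside the WM, CSF, and head-motion terms, so that a flagged volume is modeled by its own covariate.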

2.3 Functional Connectivity Mapping of FCD Lesions

Figure 1 summarizes our profiling and prediction approach.

Fig. 1. General summary of the proposed framework. Each step is detailed in the text below.

Construction of whole-brain connectivity matrix.

Based on the automated anatomical labeling (AAL) atlas [9], we first parcellated the entire neocortex into 78 macroscopic regions of interest (ROIs). We then subdivided these ROIs into smaller constituents with comparable cortical surface area, yielding 1000 parcels in total [10]. We sampled the rs-fMRI signal at mid-thickness between GM-CSF and GM-WM boundaries. For each parcel, we averaged the vertex-wise signals, yielding a 1000 \( \times \) 145 time-series matrix per subject (150 acquired volumes minus the 5 discarded). We then computed the Pearson product-moment correlation between all pairs of parcels to generate a subject-wise functional connectome (F; 1000 \( \times \) 1000).
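As a rough illustration of this step, the sketch below averages vertex-wise signals within a parcel and correlates all pairs of parcel time series. It uses plain Python on a toy example with three parcels and four time points; the function names are hypothetical, and the sketch assumes that no parcel time series is constant (otherwise the correlation is undefined).

```python
import math


def parcel_mean(vertex_series):
    """Average equal-length vertex time series into one parcel time series."""
    n_vertices = len(vertex_series)
    return [sum(column) / n_vertices for column in zip(*vertex_series)]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def connectome(parcel_series):
    """Symmetric parcel-by-parcel correlation matrix with unit diagonal."""
    n = len(parcel_series)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(parcel_series[i], parcel_series[j])
            matrix[i][j] = matrix[j][i] = r
    return matrix


# Toy data: parcel 0 is the mean of two vertices; values are illustrative.
parcels = [
    parcel_mean([[0.0, 2.0, 2.0, 4.0], [2.0, 2.0, 4.0, 4.0]]),
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
F = connectome(parcels)
```

With the dimensions given in the text, `parcel_series` would hold 1000 series of 145 time points each, and `F` would be the 1000 \( \times \) 1000 connectome.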

Functional community detection (Fig. 1A-1). The functional network in the human brain is composed of several highly reproducible communities [11], i.e., the default mode, saliency, attention, fronto-parietal, visual, sensory-motor, and limbic networks. We extracted this canonical community structure based on F using a mixture model density estimation [11]. We included the data of patients and controls to make a group-unbiased functional community map, which was highly comparable to the one obtained in a large dataset of healthy controls (i.e., normalized mutual information between our result and a previously generated map: 0.58; r² = 0.54) [11].

Patch-wise lesion connectivity estimation.

Two experts manually segmented FCD lesions on T1-weighted images based on a combination of their typical signs [1]. The lesion volume was subsequently intersected with cortical surfaces, providing a surface-based FCD label. For a given subject, we selected the p patches that overlapped with the lesion label (Fig. 1A-2) and aggregated their connectivity profiles to the rest of the brain into a lesion-patch connectivity matrix L (Fig. 1A-3) of dimension p \( \times \) 1000. We normalized the connectivity of each lesional patch with respect to the connectivity at the corresponding patch location in controls using a z-transform. Connectivity of a lesional patch could then be expressed in terms of deviation from controls (high/low z-scores indicating hyper-/hypo-connectivity). The matrix L was reordered according to the functional community each lesion fell into.
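The normalization against controls can be sketched as follows: each connection of a lesional patch is z-scored against the same connection measured at the corresponding patch location in the control group. The use of the sample standard deviation is an assumption, since the text does not specify the estimator; the function name and all values are hypothetical.

```python
import statistics


def z_normalize(patient_row, control_rows):
    """z-score one lesional patch profile against matched control profiles.

    patient_row: connectivity of the lesional patch to each target parcel.
    control_rows: one profile per control for the same patch location.
    """
    z_scores = []
    for j, value in enumerate(patient_row):
        control_values = [row[j] for row in control_rows]
        mean = statistics.mean(control_values)
        sd = statistics.stdev(control_values)  # sample SD (assumption)
        z_scores.append((value - mean) / sd)
    return z_scores


# Toy example: three controls, two target parcels (illustrative values).
controls = [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5]]
patient = [0.8, 0.1]
z_row = z_normalize(patient, controls)
```

In this toy case the first connection lies two standard deviations above the control mean (hyper-connectivity) and the second one standard deviation below it (hypo-connectivity).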

2.4 Data-Driven Clustering of Community-Based Lesion Connectivity

Community-based reconfiguration.

The procedure described in Sect. 2.3 assigns each cortical parcel to a given functional community. Therefore, connectivity patterns of a lesional patch to all other parcels can be split, depending on whether these parcels fall into the same community as the lesional patch (intra-community) or into a different community (inter-community) (Fig. 1B-1). To determine the degree of deviation with respect to controls, we sorted the intra- and inter-community connectivity values separately by z-score and concatenated them to generate a community-based, lesion-centered connectivity matrix (\( \hat{L} \); dimension: p \( \times \) 1000; Fig. 1B-2). In sum, each row in this matrix corresponds to a lesion patch connectivity profile, dichotomized into intra- and inter-community connectivity sorted with respect to their deviation from controls. This approach controls for variable lesion locations across subjects, providing consistent inputs for the subsequent clustering procedure.
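A minimal sketch of this reconfiguration for a single lesional patch is given below. It splits the z-scored profile by community membership of the target parcels, sorts each part, and joins them. Ascending sort order is an assumption, as the text does not state the direction; the function name, community labels, and values are illustrative only.

```python
def reconfigure(z_row, parcel_community, patch_community):
    """Split a z-scored profile into sorted intra- and inter-community parts.

    z_row: z-scores of one lesional patch to each target parcel.
    parcel_community: community label of each target parcel.
    patch_community: community label of the lesional patch itself.
    Returns (intra, inter, row), where row is intra followed by inter.
    """
    intra = sorted(z for z, c in zip(z_row, parcel_community)
                   if c == patch_community)
    inter = sorted(z for z, c in zip(z_row, parcel_community)
                   if c != patch_community)
    return intra, inter, intra + inter


# Toy profile over five target parcels in two communities.
z_profile = [0.5, -2.0, 1.5, -0.3, 2.2]
labels = ["DMN", "DMN", "VIS", "VIS", "DMN"]
intra, inter, row = reconfigure(z_profile, labels, "DMN")
```

Because the values are sorted rather than indexed by anatomical position, two patches in different brain locations yield rows that can be compared directly, which is the property the clustering relies on.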

Unsupervised clustering of lesion connectivity (Fig. 1C). We computed the similarity of intra- and inter-community connectivity for all pairs of lesional patches in \( \hat{L} \). We first binarized \( \hat{L} \) by |z| ≥ 1 to highlight patches with hypo- and hyper-connectivity profiles in patients. To address the variable length of intra/inter-community connectivity across lesional patches (as each community consists of a different number of parcels), we applied dynamic time warping [12], a non-linear similarity estimation between vectors of different length. Similarity matrices for intra- and inter-community connectivity were then averaged and fed into an agglomerative hierarchical clustering algorithm (linkage: Ward [13]), which groups lesional patches into discrete classes based on their similarity. To obtain the optimal number of classes k, we repeated the clustering with k = 0–20 and chose the k that maximizes the ratio of between- to within-cluster distance.
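The pairwise comparison of patches can be sketched with a textbook dynamic time warping (DTW) recursion applied to the binarized profiles. Note that DTW as written here returns a distance (lower means more similar) rather than a similarity; the absolute-difference local cost, the function names, and the example values are assumptions for illustration, not the published implementation.

```python
def binarize(z_values, threshold=1.0):
    """Mark connections with |z| >= threshold as 1, all others as 0."""
    return [1 if abs(z) >= threshold else 0 for z in z_values]


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of possibly different length."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def patch_distance(intra_a, inter_a, intra_b, inter_b):
    """Average the intra- and inter-community DTW distances of two patches."""
    d_intra = dtw_distance(binarize(intra_a), binarize(intra_b))
    d_inter = dtw_distance(binarize(inter_a), binarize(inter_b))
    return (d_intra + d_inter) / 2.0


# Two toy patches whose communities contain different numbers of parcels.
distance = patch_distance([-2.1, -1.3, 0.2], [-0.4, 0.3, 1.6, 2.0],
                          [-1.8, 0.1], [1.2, 1.9, 2.4])
```

Evaluating `patch_distance` for all pairs of patches would give the matrix passed to the agglomerative hierarchical clustering.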

Cluster description.

For each newly-discovered class (i.e., a collection of lesional patches with similar intra/inter-community connectivity profiles), we evaluated the overall connectivity patterns, comparing their average z-scores to those of controls using two-sample t-tests. We furthermore assessed their selective participation in functional communities using Fisher’s exact test. Findings were corrected at a false-discovery-rate (FDR) of q < 0.05.
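The FDR correction can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not name the specific FDR method, so this choice is an assumption; the function name and p-values below are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative p-values from four hypothetical tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in q_values]
```

In this toy case only the first test survives correction at q < 0.05, even though three raw p-values fall below 0.05.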

2.5 Supervised Prediction of Histology and Surgical Outcome (Fig. 1D)

Connectome-based prediction.

We used our connectivity-based profiling framework to predict the histological grade (i.e., FCD-IIA vs. IIB) and post-operative outcome (i.e., seizure-free vs. not seizure-free) in individual patients. Notably, as the above clustering procedure is patch-based (not lesion-based), lesions consisting of multiple patches and embracing heterogeneous connectivity profiles may be partitioned into more than one class. To address this issue, we developed a flexible ensemble classifier that could work even when the lesion labels are subdivided into a different number of classes (in other words, when labels have feature vectors of different length). Specifically, our classifier was composed of k base classifiers (one base classifier per class). Each base classifier was trained and tested using only patches belonging to that specific class. The input to these classifiers was a two-dimensional vector (i.e., the intra- and inter-community connectivity averaged across all target parcels) for a given patch. For a testing patch, the base classifier estimated the posterior probability that the patch belongs to a histological grade (i.e., IIA vs. IIB) or seizure outcome (i.e., seizure-free vs. not seizure-free). The final ensemble meta-classifier collected these probabilities from all patches within a given lesion label and predicted the category according to the average posterior probability. Base classifiers were implemented using AdaBoost (learner: decision tree, 20 boosts, minimum leaf size = 5), a learning algorithm that shows reduced bias in accuracy estimation compared to linear classifiers [14].
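The meta-classifier logic can be sketched as follows: each patch is routed to the base classifier of its connectivity class, and the lesion-level prediction follows from the mean posterior probability. The base classifiers here are fixed stand-in functions rather than trained AdaBoost models, and the 0.5 decision threshold, function names, and feature values are assumptions for illustration.

```python
def lesion_posterior(patches, base_classifiers):
    """Average patch-level posteriors over all patches of one lesion.

    patches: list of (class_id, (intra_z, inter_z)) tuples.
    base_classifiers: dict mapping class_id to a callable that returns
    the posterior probability of the positive category for a patch.
    """
    probabilities = [base_classifiers[class_id](features)
                     for class_id, features in patches]
    return sum(probabilities) / len(probabilities)


def predict_lesion(patches, base_classifiers, threshold=0.5):
    """Return True if the mean posterior reaches the threshold."""
    return lesion_posterior(patches, base_classifiers) >= threshold


# Stand-in base classifiers with constant outputs (hypothetical).
stand_ins = {
    1: lambda features: 0.9,
    2: lambda features: 0.4,
}
# One lesion split into three patches belonging to two classes.
lesion = [(1, (-1.2, -0.8)), (2, (-1.0, 0.1)), (1, (-1.5, -0.9))]
mean_probability = lesion_posterior(lesion, stand_ins)
is_positive = predict_lesion(lesion, stand_ins)
```

Because the average is taken over however many patches a lesion contains, the same rule applies whether a lesion falls into one class or several.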

Validation.

To evaluate the benefits of our connectome-based clustering, we compared its prediction performance to that of a class-naïve classifier, whose input is likewise community-based connectivity but without any connectome stratification. For classifier building and testing, we used a 10-fold cross-validation with 100 iterations; at each iteration, 9 folds were used for model training while the remaining one was used for testing. Given that only two features (i.e., average z-scores of intra- and inter-community connectivity) were used for training, we did not set up any feature selection in our classifications. A permutation test (with 1,000 iterations) that randomly shuffled cortical patches determined the significance of classifier performance exceeding chance.
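A permutation test of this kind can be sketched as below. The sketch shuffles category labels and counts how often the permuted score reaches the observed one; both the label-shuffling scheme and the add-one correction in the p-value are assumptions, since the text does not give these details. The scoring function is a simple accuracy against fixed predictions and stands in for the full cross-validated classifier.

```python
import random


def permutation_p_value(observed, score_fn, labels, n_perm=1000, seed=0):
    """Estimate how often a permuted score reaches the observed score."""
    rng = random.Random(seed)
    shuffled = list(labels)
    n_reached = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score_fn(shuffled) >= observed:
            n_reached += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (n_reached + 1) / (n_perm + 1)


def accuracy_against(predictions):
    """Return a scoring function comparing labels to fixed predictions."""
    def score(labels):
        hits = sum(p == t for p, t in zip(predictions, labels))
        return hits / len(labels)
    return score


# Toy data: eight cases with perfectly matching predictions.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0]
score_fn = accuracy_against(predicted)
p_value = permutation_p_value(score_fn(true_labels), score_fn, true_labels)
```

Under this estimator, 1,000 permutations cannot yield a p-value below 1/1001, which bounds the resolution of the test.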

3 Experiment and Results

3.1 Subjects

We studied 27 consecutive patients (13 males; mean ± SD age = 27.3 ± 8.7 years) with histologically-proven FCD Type-II (11 IIA, 16 IIB). The presurgical workup included neurologic examination, assessment of seizure history, neuroimaging, and video-EEG telemetry. After surgery, 19 patients became completely seizure-free, while 8 had residual seizures. The FCD subgroups did not differ in age, sex, age at onset, disease duration, or focus lateralization. As a control group, we studied 34 age- and sex-matched healthy individuals (20 males; mean ± SD age = 29.3 ± 7.0 years).

3.2 Connectome-Based Profiling of FCD Lesions

The high-resolution parcellation subdivided the lesion labels of the 27 patients into 210 different cortical patches with similar areas (mean ± SD = 139.2 ± 12.3 mm²). The number of patches in each lesion was highly variable across cases (mean ± SD = 8 ± 7, range = 1–27). To avoid the risk that our clustering could be biased by a few large lesions, we merged adjacent patches in a given lesion if they showed a high similarity of functional connectivity profiles (p for correlation <0.05), resulting in 55 patches across patients (# of patches per patient: mean ± SD = 2.0 ± 1.2, range = 1–5).

Hierarchical clustering applied to these 55 patches identified three similarly-sized subgroups. This solution was independently confirmed by a k-means clustering algorithm, which showed that the selected k provided the most consistent results across 1000 random-seed iterations compared to other cluster numbers. The identified subgroups showed distinctive connectivity profiles compared to controls (Fig. 2A, B). Specifically, Class-I (# of cortical patches, n = 20) was characterized by decreased intra- and inter-community connectivity (qFDR < 0.01); Class-II (n = 18) displayed a selective decrease in intra-community connectivity (qFDR < 0.04); and Class-III (n = 17) presented with increased intra- as well as inter-community connectivity (qFDR < 0.01). Functional communities were similarly represented across classes (Fig. 2B; Fisher’s exact test: p > 0.4), with the exception of the default mode network, which was more frequently associated with Class-I (50/22/18% for Class-I, -II, and -III, respectively; p < 0.05).

Fig. 2. Lesion connectivity profiling and prediction. (A) Connectivity profiles across the three identified classes. (B) Overall connectivity profiles and functional community composition. (C) Prediction performance across 100 iterations for class-informed and class-naïve classifiers (upper), and examples of the prediction result at a median performance (lower). Abbreviations: SF, seizure-free; NSF, not seizure-free.

3.3 Machine-Learning Prediction of Histological Grade and Surgical Outcome

The class-informed classifier predicted histopathological grade and surgical outcome with high accuracy (86 ± 4% and 84 ± 5%, respectively; Fig. 2C). Repeating the experiments using a 5-fold cross-validation, we observed similar prediction accuracies (76/83% in histopathology/surgical outcome), suggesting generalizability. Several additional analyses supported the benefits of this approach. Firstly, permutation tests with randomly shuffled outcome labels confirmed that the accuracy of our approach exceeded chance level (p = 0.001). Secondly, the class-naïve classifier, which operated on connectivity patterns averaged within an entire lesion, yielded a lower accuracy (histological grade: 68 ± 3%, surgical outcome: 69 ± 4%; comparison between class-informed and class-naïve classifiers: p < 0.001). Our approach also outperformed a classifier built on random partitioning of patches into three classes with the same sizes as the identified classes (i.e., n = 20/18/17; p = 0.001). These results suggest that the high performance of our class-informed prediction was not driven by overfitting (given the relatively small sample size), but rather by the clustering framework, which captured biologically meaningful connectivity variations across the FCD lesions.

4 Discussion

Neuroimaging-based disease subtyping has recently become an active line of research as it allows for the identification of novel subgroups [15], which may help to develop more targeted therapeutic strategies and improved diagnostic procedures for individual patients. We have developed a novel approach to profile functional connectivity of the dysplastic cortex using a network-community schema and applied hierarchical clustering to derive connectome-level subtypes. Notably, we could show that FCD lesions do not present with a predominant connectivity pattern to the rest of the brain, but are variably composed of areas that belong to three subtypes of whole-brain functional embedding: overall disconnectivity, selective disconnectivity from the overarching functional community, and connectivity increases. These data provide the first evidence for substantial heterogeneity in the relation of FCD lesions to whole-brain networks. In future work, it will be of interest to explore whether the functionally-derived subtypes also relate to differential patterns of underlying structural features and/or anatomical connectivity information derived from diffusion MRI.

The proposed clustering approach provided valuable information for the accurate prediction of histological grade and post-surgical response in a given patient. Given the irreversible nature of surgery, identifying pathology- and outcome-specific imaging signatures on pre-operative MRI has important clinical utility. Indeed, while the in vivo staging of histopathology becomes increasingly relevant with the emergence of minimally invasive surgical procedures (which do not supply specimens for histologic diagnosis) [16], improved outcome prediction may be of benefit for surgical planning and optimized patient counseling.