1 Introduction

It is now well established in neuroscience that functional segregation is a fundamental principle of brain organization. However, the human brain is a complex network of spatially interconnected regions that can be causally or concurrently activated during specific tasks or at rest. As a consequence, integration of segregated regions is emerging as the most probable organization explaining the complexity of brain function. This principle is, however, difficult to prove; a possible approach is to investigate it through functional connectivity analysis, which is usually based on the determination of functional correlations.

The main limitation of functional connectivity analysis is that structural information is not used at all, preventing any interpretation in terms of effective connectivity. Indeed, understanding the relationships between the functional activity in different brain regions and the structural network highlighted by tractography can convey useful information about brain functions. While it has been shown in [19] that there is a significant overlap between neuroanatomical connections and correlations of fMRI signals, it is yet to be understood how whole-brain network interactions relate during specific tasks or at rest, where fundamental signals have been suggested to play a key role [4].

It has also been shown, for example, that the intensity of functional activity can be predicted by the strength of structural connections, suggesting that it is possible to predict resting-state functional activity from structural information [11]. Along the same lines, a predictive framework based on multiple sparse linear regression has recently been used to predict functional series from structural data [2], and a model called the “virtual brain” has been designed to simulate brain activity in injured and healthy subjects [12]. With the aim of highlighting the relationships between functional and structural connectivity, a different approach has been proposed that relies on a Bayesian framework to estimate the functional connectivity using a structural graph as a prior [10]. However, generating functional connectivity from structural information has so far proved challenging, because certain high correlations appear between brain regions that are not directly linked by structural connections.

An alternative scheme for inferring directed functional connections from brain activity is based on methods investigating functional causality, such as the Dynamic Causal Model (DCM) [5] and Granger Causality (GC) [8]. DCM uses an explicit model of regional neural dynamics to capture changes in regional activations and inter-regional coupling in response to stimulus or task demands. Statistical inference is used to estimate parameters for directed influences between neuronal elements. While this is a powerful method to study effective connectivity, its main limitation is the combinatorial complexity in the number of modeled regions and connections, which limits its applicability to only a few regions. GC also measures the causal influence and the flow of information between regions. Although the slow dynamics and the regional variability of the hemodynamic response make it a controversial method for fMRI data, it has also been used to identify the dynamics of Blood-Oxygen-Level Dependent (BOLD) signal flow between brain regions [7, 14]. The advantage of GC is that it is based on a Multivariate Autoregressive (MAR) model, a random process specifying that the variables depend linearly on their own previous values and on a stochastic term. Thanks to this, GC does not suffer from excessive computational complexity. Nonetheless, when a large number of regions is involved in the analysis, it is affected by cancellation issues and high sensitivity to noise. This makes it impossible to perform a whole-brain analysis involving many brain regions. Moreover, this approach usually does not consider the structural connectivity, unless one fiber/connection at a time is considered with univariate analysis [13].

Motivated by these limitations, this paper presents an extension of the MAR model, a constrained MAR (CMAR), which uses the structural connectivity as a prior to bound the search space during parameter fitting. This fusion of structural connectivity and functional time series aims at representing an effective brain connectivity, and makes a whole-brain analysis feasible thanks to the sparse representation of the connectivity matrix. In addition, with the proposed method only the structural connections justified by the functional data survive, so that possible false-positive connections can be refined [1]. The proposed method has been validated by testing the quality of its results in a typical brain community detection framework, using a recently developed technique based on group-wise spectral clustering [3]. This investigation allowed us to validate the hypothesis that clustering the brain using the effective connectivity matrix retrieved with CMAR preserves the structural partitioning while also optimizing the graph cut so as to minimize the loss of functional interactions.

2 Method

The aim of the proposed method is to reinforce the association between structural brain connectivity and functional brain activation. To obtain this result, we resort to a multivariate autoregressive model properly modified in order to allow for the estimation of the temporal brain activation biased by the structural connectivity.

2.1 Structurally Constrained Autoregressive Model

A multivariate autoregressive model of order n (MAR(n)) is a stochastic process defining r variables \(\mathbf {y}(t)\) as linearly dependent on their own previous values and on a stochastic term \(\epsilon \): \(\mathbf {y}(t) = \sum _{i=1}^n \mathbf {A}_i \cdot \mathbf {y}(t-i) + \mathbf {\epsilon }\), where the coefficient matrices \(\mathbf {A}_i\) are the model parameters. These can be estimated by minimizing the discrepancy between the observed and the implied covariance or constrained by prior knowledge, though this is cumbersome for detailed brain-wise tasks.

Given a structural connectivity matrix \(\mathbf {A}_\mathrm{init}\) and the functional signal arranged as a column vector \(\mathbf {y}_t\) over all r regions of interest for each of the T frames, an improved connectivity matrix can be determined by minimizing the reconstruction error of a MAR(1) model. To fit the model parameters \(\mathbf {A}\), we used a gradient descent approach minimizing the reconstruction error in the least-squares sense:

$$\begin{aligned} E = \frac{1}{2} \sum _{t=0}^{T-1} ||\mathbf {A} \cdot \mathbf {y}_t - \mathbf {y}_{t+1} ||_2^2, \end{aligned}$$
(1)

with the gradient computed with respect to \(\mathbf {A}\) as follows:

$$\begin{aligned} \mathbf {\nabla E} = \sum _{t=0}^{T-1} (\mathbf {A} \cdot \mathbf {y}_t - \mathbf {y}_{t+1}) \cdot \mathbf {y}_t', \end{aligned}$$
(2)

where \(()'\) denotes the transpose. To introduce the structural bias into the model, the parameter fitting is constrained by the structural information in the update rule:

$$\begin{aligned} \mathbf {A}_\mathrm{new} = \left( \mathbf {A}_\mathrm{old} - \eta \mathbf {\nabla E} \right) \odot \mathbf {B}, \end{aligned}$$
(3)

where \(\eta \) is the learning rate, \(\odot \) is the Hadamard (element-wise) product, and \(\mathbf {B}\) is a matrix of the same size as \(\mathbf {A}_\mathrm{init}\) with each element \(b_{ij}\) defined as

$$\begin{aligned} b_{ij} = \left\{ \begin{array}{lcl} 0 &{} {} &{} \text {if the }ij \text { element of } \mathbf {A}_\mathrm{init} \text { is 0}\\ 1 &{} &{} \text {otherwise.} \end{array} \right. \end{aligned}$$

In this manner, the absence of connections between certain regions in the initial matrix \(\mathbf {A}_\mathrm{init}\) is enforced at each iteration: entries originally set to zero are reset to zero whenever the gradient step has introduced non-null values in them. Thus, \(\mathbf {B}\) and \(\mathbf {A}_\mathrm{init}\) encode the prior structural connectivity that reinforces the relationship between functional and structural data. During the gradient descent, some values of the matrix \(\mathbf {A}_\mathrm{new}\) may become negative, which might be meaningless in terms of causality, since negative causalities cannot be interpreted. Therefore, at each iteration those values are set to zero, which is a common way to enforce non-negativity during learning. Clearly, this has an effect on accuracy, as the solution is sub-optimal with respect to an unconstrained fitting.
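The fitting procedure of this section can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `fit_cmar`, the parameters `eta` and `n_iter`, the synthetic data and the choice of step size are all assumptions made for illustration. The step is taken against the gradient of Eq. (1), as minimizing the reconstruction error requires, and is followed by the structural mask and the zeroing of negative entries.

```python
import numpy as np

def fit_cmar(A_init, Y, eta, n_iter=200):
    """Sketch of a structurally constrained MAR(1) fit.

    A_init : (r, r) structural connectivity matrix (zeros = no connection).
    Y      : (r, T + 1) functional signal, one column per frame.
    Returns the fitted matrix and the reconstruction error per iteration.
    """
    B = (A_init != 0).astype(float)      # binary structural mask
    A = A_init.astype(float).copy()
    X, X_next = Y[:, :-1], Y[:, 1:]      # y_t and y_{t+1} for all t
    errors = []
    for _ in range(n_iter):
        R = A @ X - X_next               # residuals A y_t - y_{t+1}
        errors.append(0.5 * np.sum(R ** 2))   # reconstruction error, Eq. (1)
        grad = R @ X.T                   # gradient with respect to A, Eq. (2)
        A = (A - eta * grad) * B         # masked descent step
        A = np.maximum(A, 0.0)           # zero the negative coefficients
    return A, np.array(errors)

# Synthetic example (assumption): a sparse, stable, non-negative MAR(1) process.
rng = np.random.default_rng(0)
r, T = 5, 200
mask = rng.random((r, r)) < 0.5
A_true = 0.15 * mask * rng.random((r, r))
Y = np.zeros((r, T + 1))
Y[:, 0] = rng.standard_normal(r)
for t in range(T):
    Y[:, t + 1] = A_true @ Y[:, t] + 0.1 * rng.standard_normal(r)

A_init = mask.astype(float)
# Step size from the largest singular value of the data (illustrative choice).
eta = 1.0 / np.linalg.norm(Y[:, :-1], 2) ** 2
A_eff, errors = fit_cmar(A_init, Y, eta=eta)
```

In this sketch the entries that are zero in `A_init` stay zero in `A_eff`, and `errors` plays the role of the convergence curve of Eq. (1). With real tract counts in `A_init` the learning rate would likely need to be tuned to the scale of the data.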

The approach can be easily generalized to higher-order MAR models, where a different matrix \(\mathbf {A}_i\) has to be optimized for each order. Choosing the right model order is a trade-off between the explained variance and the model complexity. In our experiments, to have a direct validation of the method, we limited the model to the first order.
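As a hedged illustration of the higher-order case, a MAR(n) model can be rewritten as a single linear map acting on the stacked lagged frames, so that the same least-squares error applies to the concatenated matrix \([\mathbf {A}_1, \ldots , \mathbf {A}_n]\). The helper name `lagged_design` is hypothetical, and reusing the structural mask for every lag (e.g. `np.tile(B, (1, n))`) is an assumption, not something stated in the text.

```python
import numpy as np

def lagged_design(Y, n):
    """Stack the n previous frames for a MAR(n) regression.

    Y : (r, T) signal, one column per frame.
    Returns X of shape (n * r, T - n), whose column for time t holds
    [y(t-1); y(t-2); ...; y(t-n)], and the targets Y[:, n:].
    """
    r, T = Y.shape
    X = np.vstack([Y[:, n - i : T - i] for i in range(1, n + 1)])
    return X, Y[:, n:]

# Tiny deterministic example to check the alignment of lags and targets.
Y_demo = np.arange(12).reshape(2, 6)
X_demo, target_demo = lagged_design(Y_demo, n=2)
```

With this layout the prediction for all frames is `A_stacked @ X`, where `A_stacked` has shape `(r, n * r)` and its i-th block of r columns corresponds to \(\mathbf {A}_i\).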

2.2 Effective Brain Community Detection

To assess whether the proposed method produces effective connectivity information that characterizes the structural connectivity enriched with functional information, a community detection analysis has been performed on both the set of structural connectivity matrices and the set of effective connectivity matrices, using the group-wise graph clustering algorithm recently proposed in [3].

Given a set of connectivity matrices \(\mathcal {W} = \{\mathbf {W}_{i}\}\) representing undirected weighted graphs with positive weights, the normalized graph Laplacian is built as \(\mathbf {L_i} = \mathbf {D_{i}^{-\frac{1}{2}}} (\mathbf {D_{i}}- \mathbf {W}_{i}) \mathbf {D_{i}^{-\frac{1}{2}}}\), where \(\mathbf {D_i}\) is the diagonal degree matrix of \(\mathbf {W}_{i}\). However, in general, the connectivity matrices resulting from the above CMAR model computed for each subject are asymmetric (i.e., edges are directed). They have therefore been converted to undirected graphs, aiming at maintaining the properties of the original graphs estimated from CMAR. To this end, a symmetrization based on random walks was applied [15].

More specifically, given a directed graph \(\mathbf {M}\), the transition matrix of the random walk can be defined as \(\mathbf {P} = \mathbf {D_{out}^{-1}}\mathbf {M}\), where \(\mathbf {D_{out}}\) is a diagonal matrix built using the nodes’ out-degree. The symmetric graph can therefore be defined as \(\mathbf {M_{sym}}= \frac{1}{2}(\mathbf {\Pi } \mathbf {P} + \mathbf {P}'\mathbf {\Pi })\), where \(\mathbf {\Pi }\) is the diagonal matrix which defines the probability of a walker staying in each node in a stationary distribution, defined as \(\mathbf {\Pi } = \frac{d_\mathrm{out}}{m}\), where \(d_\mathrm{out}\) is the vector of the out-degree of each node and m is the number of nodes. Thanks to this new representation, the pipeline described in [3] can be applied: the normalized graph Laplacian is generated for each subject, the multiple Laplacians are jointly diagonalized to find a unique eigenspace and, finally, spectral clustering is applied to the smallest joint eigenvectors. To decide the number of clusters, as usual in spectral clustering, we can look at the spectral gap of the mean approximated eigenvalues.
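The symmetrization and the normalized Laplacian can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, `M[i, j]` is taken to be the weight of the edge from node i to node j (so that out-degrees are row sums), and nodes with zero degree are handled by leaving their rows at zero. The joint diagonalization step of [3] is not reproduced here.

```python
import numpy as np

def random_walk_symmetrize(M):
    """M_sym = (Pi P + P' Pi) / 2 with P = D_out^{-1} M and Pi = diag(d_out / m)."""
    M = np.asarray(M, dtype=float)
    d_out = M.sum(axis=1)                       # out-degree of each node
    m = M.shape[0]                              # number of nodes, as in the text
    inv = np.divide(1.0, d_out, out=np.zeros_like(d_out), where=d_out > 0)
    P = inv[:, None] * M                        # random-walk transition matrix
    Pi = np.diag(d_out / m)
    return 0.5 * (Pi @ P + P.T @ Pi)

def normalized_laplacian(W):
    """L = D^{-1/2} (D - W) D^{-1/2} for an undirected weighted graph W."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    inv_sqrt = np.divide(1.0, np.sqrt(d), out=np.zeros_like(d), where=d > 0)
    return inv_sqrt[:, None] * (np.diag(d) - W) * inv_sqrt[None, :]

# Small directed example (assumption): random positive weights, no self-loops.
rng = np.random.default_rng(1)
M_demo = rng.random((4, 4))
np.fill_diagonal(M_demo, 0.0)
M_sym = random_walk_symmetrize(M_demo)
L_demo = normalized_laplacian(M_sym)
eigvals = np.linalg.eigvalsh(L_demo)            # ascending eigenvalues
```

With \(\mathbf {\Pi }\) defined as above, \(\mathbf {\Pi }\mathbf {P} = \mathbf {M}/m\), so the symmetrized graph in this sketch reduces to \((\mathbf {M} + \mathbf {M}')/(2m)\). The spectral gap would then be inspected on eigenvalues such as `eigvals`.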

3 Data and Experimental Settings

All experiments have been performed on 20 right-handed healthy subjects from the Nathan Kline Institute-Rockland dataset [16]. For each subject, fMRI, DTI and T1 have been acquired and co-registered. fMRI data were acquired using a 3 T scanner, with TR/TE of 1.4 s/30 ms, flip angle \(65^{\circ }\), and isotropic voxel size of 2 mm, resulting in resting-state time series 10 min long, during which subjects were asked to keep their eyes open. DTI volumes were acquired with a 1.5 T scanner and isotropic voxel size of 2.5 mm using 35 gradient directions. The T1-weighted MRI data were acquired with the same scanner, with TR/TE of 1.1 s/4.38 ms, flip angle \(15^{\circ }\), and isotropic voxel size of 1 mm.

3.1 Pre-processing and Connectome Construction

fMRI data have been pre-processed according to a standard pipeline: motion correction, mean intensity subtraction, band-pass filtering with cutoff frequencies of 0.005–0.1 Hz, and skull removal. To account for potential noise from physiological processes such as cardiac and respiratory fluctuations, nine covariates of no interest have been identified for inclusion in our analyses [18]. To further reduce the effects of motion, compensation for frame-wise displacement has been carried out [17]. Eddy current correction and skull stripping have been carried out as pre-processing for the DTI data. The AAL atlas has been aligned to the T1 reference volume using linear registration with 12 degrees of freedom.

Tractographies for all subjects have been generated by processing DTI data with the Python library Dipy [6]. In particular, a deterministic algorithm called Euler Delta Crossings has been used, starting from 2,000,000 seed points and stopping when the fractional anisotropy was smaller than 0.1. Tracts shorter than 30 mm or in which a sharp angle occurred have been discarded. This yielded about 250,000 fibers. To construct the connectome, the graph nodes have been determined using the 90 regions in the AAL atlas. Specifically, the structural connectome has been built by counting the number of tracts connecting two regions, for every pair of regions. The same regions have been used to compute the averaged functional time series from the voxels in each region.
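The connectome construction described above can be sketched as follows, assuming that each tract has already been reduced to the atlas labels of its two endpoints and that each voxel carries a region label in the range 0 to r-1. These inputs, the function names and the decision to discard tracts that start and end in the same region are assumptions for illustration; no Dipy call is reproduced here.

```python
import numpy as np

def count_connectome(start_labels, end_labels, n_regions):
    """Symmetric matrix counting the tracts that connect each pair of regions."""
    start = np.asarray(start_labels, dtype=int)
    end = np.asarray(end_labels, dtype=int)
    C = np.zeros((n_regions, n_regions), dtype=int)
    np.add.at(C, (start, end), 1)        # accumulate repeated (start, end) pairs
    C = C + C.T                          # a tract has no preferred direction
    np.fill_diagonal(C, 0)               # drop within-region tracts (assumption)
    return C

def region_mean_series(voxel_ts, voxel_labels, n_regions):
    """Average the voxel time series within each region; returns (r, T)."""
    voxel_ts = np.asarray(voxel_ts, dtype=float)
    voxel_labels = np.asarray(voxel_labels)
    return np.vstack([voxel_ts[voxel_labels == k].mean(axis=0)
                      for k in range(n_regions)])

# Toy example: five tracts over three regions, three voxels over two regions.
C_demo = count_connectome([0, 1, 0, 2, 2], [1, 0, 2, 2, 1], n_regions=3)
ts_demo = region_mean_series([[1, 2], [3, 4], [10, 20]], [0, 0, 1], n_regions=2)
```

Here `C_demo` plays the role of \(\mathbf {A}_\mathrm{init}\) and `ts_demo` that of the regional functional signal, both at toy scale.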

Fig. 1. Example of adjacency matrix for one subject plotted in logscale: (a) initial matrix obtained from the tractography; (b) effective matrix obtained with the proposed autoregressive model. The convergence process is shown in subfigure (c) as a decrease of error defined in Eq. (1), where each color line is a different subject.

Fig. 2. Axial view of joint spectral clustering using k = 8 on (a) the original structural joint eigenspace, and (b) on the joint eigenspace given by effective connectivity matrices.

4 Results and Discussions

Figure 1 depicts an example from one subject of (a) the original structural connectivity matrix and (b) the effective connectivity resulting from autoregressive model filtering. The figures highlight that some structural connections which are not “used” by the resting-state functional data are canceled out. This clearly has an effect on the subsequent clustering, which is shown in Fig. 2. Indeed, by analyzing the group-wise eigenvalues resulting from the joint diagonalization of the Laplacians, a spectral gap has been observed at the 4th and 8th eigenvalues for both structural and effective connectivity matrices, in agreement with previous studies on other datasets [9]. The value \(k=8\) has been chosen to cluster the brain, as it better explains the known brain communities. The resulting clusterings of the brain regions based on the structural connectome (Fig. 2(a)) and on the effective connectome (Fig. 2(b)) are slightly different while preserving the overall organization. Regarding convergence, the gradient descent finds a sub-optimal solution by definition. The zeroing step makes convergence more cumbersome but does not prevent the descent from reaching a minimum, as shown in Fig. 1(c).

Fig. 3. (a) Reconstruction error of CMAR model after converging to the effective connectivity (green circles) or according to block-wise MAR based on the structural communities (red squares) and the effective communities (blue crosses). The lower the better. (b) Functional segregation of clusters using the effective communities (green dots) or structural communities (black stars). The higher the better.

We also devised an analysis to assess whether the clusters obtained from the autoregressive filtered data are more meaningful in relation to the fMRI time series than the clusters obtained from the structural information. We carried out a block-wise definition of the effective connectivity matrices, where one block at a time, defined by the brain regions belonging to a cluster, is used in a CMAR model involving only the corresponding fMRI series. Then, the reconstruction errors of the fitted CMAR models have been summed over all clusters and compared with each other. The underlying intuition is that partitioning the brain using effective connectivity information would remove those structural connections which are also meaningless from a functional perspective, at least in the analyzed experimental data.

The reconstruction error per subject in Fig. 3(a) shows, as expected, that the lowest error is given by considering the whole network in the CMAR computation. However, when some connections are removed according to the clustering results, the communities determined from the effective connectivity matrix prove to be more self-explanatory in terms of functional activity than the communities obtained from the structural connectivity only. Similar evidence is obtained when analyzing the cluster functional separation (CFS), defined as the average ratio between the intra- and inter-cluster cross-correlation as follows:

$$\begin{aligned} CFS = \frac{1}{k}\sum _{s=1}^k \frac{\sum _{i<j\in C_s}w_{ij}}{\sum _{i<j\in C_s}w_{ij} + \sum _{i\in C_s}\sum _{j\in C_t\ne C_s}w_{ij}} \end{aligned}$$
(4)

where \(w_{ij}\) is the functional cross-correlation of the time-series for nodes i and j.
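A direct transcription of Eq. (4) may help to make the index concrete. The sketch below assumes a symmetric, non-negative cross-correlation matrix `W` (for instance absolute correlations, which is an assumption not stated in the text) and a vector of cluster labels; the function name is hypothetical.

```python
import numpy as np

def cluster_functional_separation(W, labels):
    """CFS of Eq. (4): mean over clusters of intra / (intra + inter)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    ratios = []
    for s in np.unique(labels):
        inside = labels == s
        # Sum of w_ij over pairs i < j that both belong to cluster s.
        intra = np.triu(W[np.ix_(inside, inside)], k=1).sum()
        # Sum of w_ij with i in cluster s and j in any other cluster.
        inter = W[np.ix_(inside, ~inside)].sum()
        ratios.append(intra / (intra + inter))
    return float(np.mean(ratios))

# Toy example: two strongly correlated pairs, weakly correlated across pairs.
W_demo = np.array([[1.0, 0.8, 0.1, 0.1],
                   [0.8, 1.0, 0.1, 0.1],
                   [0.1, 0.1, 1.0, 0.8],
                   [0.1, 0.1, 0.8, 1.0]])
cfs_matched = cluster_functional_separation(W_demo, [0, 0, 1, 1])
cfs_mixed = cluster_functional_separation(W_demo, [0, 1, 0, 1])
```

In this toy case the partition that follows the correlated pairs gives a CFS of 2/3, whereas the mixed partition gives a much lower value, which is the behaviour the index is meant to capture.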

This index has been computed for both the structural and the effective clustering results. Figure 3(b) shows that CFS with clusters determined using our CMAR approach is significantly higher than with the structural clusters (\(p<0.001\)), demonstrating that the effective clusters are also underpinned by the functional connectivity. Experiments on larger datasets are nevertheless required.

5 Conclusions

The effective connectivity inferred by the proposed CMAR model highlights a different brain architecture underpinned by both structural and functional connectivity. Thanks to this, the method can lead to new insights into understanding brain effective connections in healthy and pathological subjects.