Identifying Relationships in Functional and Structural Connectome Data Using a Hypergraph Learning Method

Munsell, Brent C.; Wu, Guorong; Gao, Yue; Desisto, Nicholas; Styner, Martin

doi:10.1007/978-3-319-46723-8_2

Conference paper
First Online: 02 October 2016

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9901)

Abstract

The brain connectome provides an unprecedented degree of information about the organization of neuronal network architecture, both at a regional level and across the entire brain network. Over the last several years the neuroimaging community has made tremendous advancements in the analysis of structural connectomes derived from white matter fiber tractography and of functional connectomes derived from time-series blood oxygen level signals. However, computational techniques that combine structural and functional connectome data to discover complex relationships between fiber density and signal synchronization, including the relationship with health and disease, have not been consistently applied. To overcome this shortcoming, a novel connectome feature selection technique is proposed that uses hypergraphs to identify connectivity relationships when structural and functional connectome data are combined. Using publicly available connectome data from the UMCD database, experiments are provided that show SVM classifiers trained with structural and functional connectome features selected by our method are able to correctly identify autism subjects with 88% accuracy. These results suggest our combined connectome feature selection approach may improve outcome forecasting in the context of autism.

1 Introduction

Improvements in computational analyses of neuroimaging data now permit the assessment of whole brain maps of connectivity, commonly referred to as the brain connectome [7]. The brain connectome provides unprecedented information about global and regional conformations of neuronal network architecture (or network architecture for short) that is particularly relevant as it relates to neurological disorders. For this reason, the brain connectome has recently become instrumental in the investigation of network architecture organization and its relationship with health and disease, notably in the context of neurological conditions such as epilepsy, autism, Alzheimer’s, and Parkinson’s. In general, two connectome categories exist: (1) a structural connectome that is reconstructed using white matter fiber tractography from diffusion tensor imaging (DTI), and (2) a functional connectome that is reconstructed using resting-state time-series signal data from blood oxygen level dependent (BOLD) functional MRI (rsfMRI).

In mathematical terms, a connectome is a weighted undirected graph, where nodes in the graph represent brain regions (defined in an anatomical parcellation, or brain atlas), and the edge that connects two different nodes is weighted by a value that represents the level of neural-connectivity, or information exchange. To better understand how the brain network is organized, network analysis algorithms [4] are applied to the connectome to reveal the underlying network architecture of the brain, which can then be used to quantify the differences between healthy and disease conditions. Currently, network analysis techniques have mainly been applied to just structural or functional connectivity data. However, research that combines both types of data [1, 5, 6, 8] to better understand functional and structural connectivity relationships has gained attention in recent years.

Here a novel combined connectome feature selection technique is proposed that uses hypergraphs to discover latent relationships in node-based graph-theoretic measures found in structural and functional connectomes. The primary rationale behind selecting features where structural and functional connectivity agree is that fiber density and signal synchronization similarities are likely to be correlated, and when combined these similarities may be easier to identify and quantify. More specifically, for each diagnosis label (i.e. disease and healthy) the proposed feature selection technique uses a hypergraph learning algorithm to find a hypergraph Laplacian that combines structural and functional node-based connectivity measures. A hierarchical partitioning algorithm is then applied to the hypergraph Laplacian, which in turn creates a code vector that encodes structural and functional connectivity similarities. The resulting code vectors are then used to create a binary weight vector that only selects brain regions associated with structural or functional node-based connectivity measures capable of differentiating the disease condition from the healthy one. Lastly, the selected structural and functional connectome features are used to train an SVM classifier that can predict the diagnosis label of subjects not included in the training procedure.

2 Materials and Methods

2.1 Participants and MRI Data Acquisition

All participant data was acquired from the publicly available University of Southern California (USC)/University of California Los Angeles (UCLA) multimodal connectivity database^{Footnote 1} (UMCD). In particular, high-functioning children and adolescents with an autism spectrum disorder (ASD), and healthy control (HC) children and adolescents were recruited. In total, the autism study has 70 participants (35 ASD and 35 HC) that had both rsfMRI and DTI scan data. A complete list of all the demographic data, including the scan parameters, from the original study can be found at [5].

2.2 Preprocessing and Connectome Reconstruction

Functional preprocessing steps were performed using the FSL^{Footnote 2} and AFNI^{Footnote 3} software libraries. In general, the following steps were performed: skull stripping, slice timing correction, motion correction with rigid-body alignment using MCFLIRT, and geometric distortion correction using FUGUE. Structural preprocessing steps were performed using the FSL and Diffusion Toolkit^{Footnote 4} software libraries. In general, the following steps were performed: skull stripping, eddy current correction, motion correction with rigid-body alignment using MCFLIRT, voxel-wise fractional anisotropy (FA) estimation, and fiber track assignment using the FACT algorithm. A complete overview of all the preprocessing steps can be found in [5].

The FSL FEAT query function is then applied to the functional and structural preprocessed images. In particular, the atlas proposed in Power et al. [2] defines $m=264$ ROIs, each represented by a 10 mm diameter sphere. A symmetric $m \times m$ functional connectivity matrix $C_f$ is constructed using the extracted ROIs, where each element in the functional connectivity matrix reflects the signal synchronization between two ROIs, which is estimated by computing the correlation between the two discrete time-series rsfMRI signals. Likewise, a symmetric $m \times m$ structural connectivity matrix $C_s$ is constructed using the same ROIs, where each element reflects the average number of fiber tracks, or fiber density, that connect the two ROIs.
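As an illustration of how a matrix like $C_f$ can be assembled, the following minimal sketch computes pairwise Pearson correlations between ROI time series. It assumes Pearson correlation is the correlation measure and that the diagonal is set to zero; neither detail is stated above, the function names are hypothetical, and the signal values are synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant signal is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)

def functional_connectivity(roi_signals):
    """Build a symmetric m x m matrix from m ROI time series.

    Assumption: self-connections (the diagonal) are left at zero.
    """
    m = len(roi_signals)
    c_f = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            r = pearson(roi_signals[i], roi_signals[j])
            c_f[i][j] = c_f[j][i] = r
    return c_f

# Toy example with m = 3 ROIs and 5 time points (synthetic values).
signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # rises together with ROI 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as ROI 0 rises
]
C_f = functional_connectivity(signals)
```

In this toy input, ROI 1 is a scaled copy of ROI 0 and ROI 2 is its mirror image, so the corresponding entries sit at the two extremes of the correlation range.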

2.3 Node-Based Connectome Feature Vector

The next step is to convert the values defined in $C_{\alpha }$ into a node-based connectome feature vector $\mathbf{c}_{\alpha }=(c_{\alpha 1},\ldots , c_{\alpha m})$ using the betweenness centrality graph-theoretic connectivity measure, where $\alpha =s$ represents a node-based structural connectome feature vector, and $\alpha =f$ represents a node-based functional connectome feature vector. Betweenness centrality is a global measure that represents the fraction of shortest paths that go through a particular node (or brain region) defined in the connectome. The betweenness centrality measure for node i is

$$\begin{aligned} c_{\alpha i}=\frac{1}{(m-1)(m-2)}\sum _{h,j \in m} \frac{\rho _{hj}^i}{\rho _{hj}}, \end{aligned}$$

(1)

where $h \ne j$, $h \ne i$, and $j \ne i$. The number of shortest paths between nodes h and j is represented by $\rho _{hj}$, and the number of these shortest paths going through node i is represented by $\rho _{hj}^i$. The measure is normalized to a value in [0, 1], where $(m-1)(m-2)$ is the highest score attainable in the network.
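Eq. (1) can be evaluated with Brandes' accumulation scheme. The sketch below is a simplified illustration that assumes an unweighted, undirected graph (for example, a thresholded connectome); how edge weights enter the shortest-path computation is not specified above, so that part is left out. The function name and the toy graph are hypothetical.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Eq. (1) for an unweighted, undirected graph {node: set(neighbors)}."""
    nodes = list(adjacency)
    m = len(nodes)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Summing over every source counts ordered pairs (h, j), so the
    # normalizer is (m - 1)(m - 2), matching Eq. (1).
    norm = (m - 1) * (m - 2)
    if norm <= 0:
        return score
    return {v: score[v] / norm for v in nodes}

# Toy star graph: node 0 is the hub, nodes 1-3 are leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
bc = betweenness_centrality(star)
```

In the star graph every shortest path between two leaves passes through the hub, so the hub attains the maximum normalized score and the leaves score zero.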

2.4 Combined Connectome Feature Selection

Given a training data set $A=\{\mathbf{a}_{\phi } \}_{\phi =1}^n$ of n ASD subjects, we compute a set of graph Laplacians $\{L_{\phi }\}_{\phi = 1}^{n}$, where $\mathbf{a}_{\phi }=( \mathbf{c}_{\phi s}~|~\mathbf{c}_{\phi f})$ is a 2m-dimensional feature vector that combines structural and functional node-based connectome values for subject $\phi $. To do so, we first create a complete bipartite graph $G_{\phi }=(\mathbf{c}_{\phi s},\mathbf{c}_{\phi f},E_{\phi })$ for each subject, where the edge that connects structural node i to functional node j in the bipartite graph is weighted by $w_{ij}=1 - | c_{si} - c_{fj} |$. The proposed edge weight strategy has a straightforward and intuitive meaning: if two brain regions have similar connectivity values then $w_{ij} \approx 1$; conversely, if two brain regions do not have similar connectivity values then $w_{ij} \approx 0$.
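The edge weights of the complete bipartite graph follow directly from the formula above. A minimal sketch, with a hypothetical function name and made-up centrality values:

```python
def bipartite_weights(c_s, c_f):
    """Return the m x m matrix with w[i][j] = 1 - |c_s[i] - c_f[j]|."""
    return [[1.0 - abs(s - f) for f in c_f] for s in c_s]

# Made-up node-based centrality values for m = 3 regions.
c_s = [0.10, 0.80, 0.35]  # structural
c_f = [0.15, 0.20, 0.90]  # functional
W = bipartite_weights(c_s, c_f)
```

Because the centrality values lie in [0, 1], every weight also lies in [0, 1]; here the pair (structural 0, functional 0) has nearly equal values and therefore a weight close to one.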

A $2m \times m^2$ hypergraph incidence matrix $H_{\phi }$ for subject ${\phi }$ is then created using $G_{\phi }$. Because we use a bipartite graph, each hyper-edge only represents the structural-functional relationship between two node-based connectome features. Once $H_{\phi }$ is found, the normalized hypergraph Laplacian^{Footnote 5}

$$\begin{aligned} L_{\phi }= I - D^{-1/2}_{v} H_{\phi } D^{-1}_e H_{\phi }^t D^{-1/2}_{v} \end{aligned}$$

(2)

is computed [11], where $D_v$ is a diagonal matrix that defines the strength of each vertex in $H_{\phi }$, $D_e$ is a diagonal matrix that defines the strength of each edge in $H_{\phi }$, and I is the identity matrix. In general, our design has two advantages: (1) we only identify functional and structural connectivity relationships between two different regions in the brain, and (2) the resulting hypergraph Laplacian is very sparse. A median hypergraph Laplacian $L_m$ is then found using each subject-specific hypergraph Laplacian in $\{L_{\phi }\}_{\phi = 1}^{n}$, where $L_m(i,j)=median(\{L_1(i,j),L_2(i,j),\ldots ,L_n(i,j)\})$.
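To make Eq. (2) and the element-wise median concrete, here is a small sketch. It rests on an assumption the text does not confirm: both incidence entries of the hyper-edge joining structural node i and functional node j are set to $w_{ij}$, with vertex and edge strengths taken as the row and column sums of $H_{\phi }$. The function names and subject values are hypothetical, and the dense pure-Python loops are only practical for toy sizes.

```python
import math
from statistics import median

def hypergraph_laplacian(c_s, c_f):
    """Normalized hypergraph Laplacian of Eq. (2) for one subject.

    Vertices 0..m-1 are structural nodes, m..2m-1 are functional nodes.
    Hyper-edge e = i * m + j joins structural node i and functional node j.
    Assumption: both incidence entries of hyper-edge e equal w_ij.
    """
    m = len(c_s)
    n_v, n_e = 2 * m, m * m
    H = [[0.0] * n_e for _ in range(n_v)]
    for i in range(m):
        for j in range(m):
            w = 1.0 - abs(c_s[i] - c_f[j])
            H[i][i * m + j] = w
            H[m + j][i * m + j] = w
    d_v = [sum(row) for row in H]  # vertex strengths (row sums)
    d_e = [sum(H[v][e] for v in range(n_v)) for e in range(n_e)]  # edge strengths
    L = [[1.0 if u == v else 0.0 for v in range(n_v)] for u in range(n_v)]
    for u in range(n_v):
        for v in range(n_v):
            if d_v[u] == 0 or d_v[v] == 0:
                continue  # isolated vertex: keep the identity entry
            theta = sum(H[u][e] * H[v][e] / d_e[e]
                        for e in range(n_e) if d_e[e] > 0)
            L[u][v] -= theta / math.sqrt(d_v[u] * d_v[v])
    return L

def median_laplacian(laplacians):
    """Element-wise median over the subject-specific Laplacians."""
    size = len(laplacians[0])
    return [[median(L[i][j] for L in laplacians) for j in range(size)]
            for i in range(size)]

# Three hypothetical subjects with m = 2 regions each: (c_s, c_f).
subjects = [([0.2, 0.7], [0.3, 0.6]),
            ([0.1, 0.9], [0.4, 0.5]),
            ([0.3, 0.3], [0.2, 0.8])]
laplacians = [hypergraph_laplacian(c_s, c_f) for c_s, c_f in subjects]
L_m = median_laplacian(laplacians)
```

Under these assumptions two structural vertices never share a hyper-edge (and likewise for two functional vertices), so the corresponding off-diagonal entries of each Laplacian are zero.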

Eigendecomposition is applied to $L_m$, creating a 2m-dimensional embedding space, and then a hierarchical partition is performed as illustrated in Fig. 1. More specifically, each embedding space partition in the hierarchy defines three cluster groups: (1) clusters that only have DTI brain regions, (2) clusters that only have rsfMRI brain regions, and (3) clusters that have both DTI and rsfMRI brain regions. At each partition the three cluster groups are found using the well-known normalized spectral clustering technique in [10]. However, instead of using a k-means algorithm, the density estimation algorithm in [3] is applied, primarily because the number of clusters is found automatically and outliers can be automatically recognized and excluded. As shown in Fig. 1, the DTI and rsfMRI brain region cluster becomes the search space for the next partition in the hierarchy, and the procedure terminates when a cluster containing both DTI and rsfMRI brain regions no longer exists.

In our approach, each partition level in the hierarchy is assigned a unique integer code. Partitions at the top of the hierarchy represent brain regions that show low structural and functional connectivity similarities (i.e. a small code value), and partitions near the bottom of the hierarchy represent brain regions that show high structural and functional connectivity similarities (i.e. a large code value). Lastly, a code vector $\mathbf{x}_{ad}=(x_{s1}, x_{s2}, \ldots , x_{sm}, x_{f1}, x_{f2}, \ldots , x_{fm})$ is created using the code values in the partition hierarchy, and then normalized by dividing all the code values by the height of the partition hierarchy.

The same procedure is then applied to a training set of HC subjects, and an HC code vector $\mathbf{x}_{hc}$ is produced. Next, a weight vector

$$\begin{aligned} \mathbf{w} = |~\mathbf{x}_{ad} - \mathbf{x}_{hc}~| \end{aligned}$$

(3)

is created, where a weight value close to one represents structural or functional brain regions that have dramatically different code values, which suggests these regions may better differentiate the disorder from the normal condition. On the other hand, a weight value close to zero represents structural or functional brain regions that have the same (or very similar) code values, which suggests these regions may not be able to differentiate the disorder from the normal condition. Lastly, we make $\mathbf{w}$ binary by applying a threshold $t_h$: $w_i$ is set to 1 if $w_i \ge t_h$ and to 0 if $w_i < t_h$. The primary motivation behind making the weight vector binary was to reduce the number of dimensions, which in turn will reduce the amount of error that may be introduced into the chosen classifier.
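The normalization of the code vectors, Eq. (3), and the thresholding step can be sketched in a few lines. The integer codes, hierarchy heights, and function names below are hypothetical and serve only to illustrate the arithmetic.

```python
def normalize_codes(levels, height):
    """Divide integer partition-level codes by the hierarchy height."""
    return [level / height for level in levels]

def binary_weights(x_ad, x_hc, t_h):
    """Eq. (3) followed by thresholding: 1 where |x_ad - x_hc| >= t_h, else 0."""
    return [1 if abs(a - h) >= t_h else 0 for a, h in zip(x_ad, x_hc)]

# Hypothetical codes for 2m = 6 features (3 structural, then 3 functional).
x_ad = normalize_codes([1, 5, 2, 5, 1, 3], height=5)
x_hc = normalize_codes([1, 1, 2, 1, 4, 3], height=4)
w = binary_weights(x_ad, x_hc, t_h=0.7)
```

Only the features whose normalized codes differ by at least the threshold between the two groups keep a weight of one; the remaining features are zeroed out and effectively dropped.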

2.5 Linear SVM Classifier

Using a training data set $A=\{\mathbf{a}_{\phi } \}_{\phi =1}^n$ that now includes both ASD and HC subjects, the binary diagnosis labels $\mathbf{y}=(y_1,y_2,\ldots ,y_n)$, e.g. ASD = 1 and HC = 0, and the binary weight vector $\mathbf{w}$, a linear two-class SVM classifier based on the LIBSVM library^{Footnote 6} is trained. In particular, the binary values in $\mathbf{w}$ are applied to each feature vector in A, creating a new sparse training data matrix $\tilde{A}$. Finally, an SVM classifier is trained using $\tilde{A}$. Once the SVM classifier is trained, the diagnosis label of a subject not included in the training data set can be predicted as follows: first compute $\mathbf{a}=(\mathbf{c}_{s}~|~\mathbf{c}_{f})$, then create the sparse feature vector $\tilde{\mathbf{a }}=(a_1 w_1, a_2 w_2, \cdots , a_{2m} w_{2m})$ by applying the learned binary weights, and lastly calculate the predicted diagnosis label y using the trained SVM classifier, where the sign of y (i.e., $y \ge 0$ or $y <0$) determines the diagnosis label.
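The masking step and the sign-based decision can be illustrated as follows. This is only a sketch: `coef` and `bias` are hypothetical stand-ins for the parameters of a trained linear SVM (no LIBSVM call is made here), and mapping $y \ge 0$ to ASD is an assumption, since the text does not say which sign corresponds to which label.

```python
def apply_mask(a, w):
    """Element-wise product of a feature vector and the binary weights."""
    if len(a) != len(w):
        raise ValueError("feature and weight vectors must have equal length")
    return [a_i * w_i for a_i, w_i in zip(a, w)]

def predict_label(a, w, coef, bias):
    """Sign-based decision of a linear classifier on the masked features.

    coef and bias are hypothetical stand-ins for a trained linear SVM.
    Assumption: y >= 0 maps to ASD and y < 0 maps to HC.
    """
    a_tilde = apply_mask(a, w)
    y = sum(c * x for c, x in zip(coef, a_tilde)) + bias
    return "ASD" if y >= 0 else "HC"

# Two made-up feature vectors with 2m = 4 and a binary weight vector.
A = [[0.2, 0.5, 0.1, 0.4],
     [0.3, 0.1, 0.6, 0.2]]
w = [0, 1, 0, 1]
A_tilde = [apply_mask(a, w) for a in A]
labels = [predict_label(a, w, coef=[1.0, 2.0, -1.0, -1.0], bias=-0.5) for a in A]
```

Features with a zero weight contribute nothing to the decision value, so in practice they could be removed from the matrix entirely rather than stored as zeros.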

Since the proposed combined connectome feature selection has two free parameters, namely the number of eigenvalues (or dimensions) used by the clustering algorithm ($d$) and the binary weight threshold ($t_h$), a grid search procedure is performed that uses a 10-fold cross-validation strategy. Specifically, an independent two-dimensional grid search is performed for each left-out fold, where the values stored at grid coordinate $(d,t_h)$ are the mean and standard deviation of the accuracy (ACC), sensitivity (SEN), specificity (SPC), negative predictive value (NPV), and positive predictive value (PPV) measures. In particular, d is adjusted at increments of 1 starting at 1 and ending at 2m, and $t_h$ is adjusted at increments of 0.05 starting at 0.1 and ending at 1.0. Lastly, when the grid search procedure completes, the parameter values that have the highest ACC and PPV scores are selected.
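The parameter grid and the five performance measures can be written down compactly. In the sketch below, the function names are hypothetical, undefined ratios are mapped to zero, and ranking grid points by ACC first and PPV second is an assumption about how "highest ACC and PPV" is resolved; the scores in the example are invented and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return ACC, SEN, SPC, PPV, and NPV from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # assumption: undefined ratios become 0
    return {
        "ACC": ratio(tp + tn, tp + tn + fp + fn),
        "SEN": ratio(tp, tp + fn),
        "SPC": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }

def parameter_grid(m):
    """All (d, t_h) pairs: d = 1..2m in steps of 1, t_h = 0.1..1.0 in steps of 0.05."""
    thresholds = [round(0.1 + 0.05 * k, 2) for k in range(19)]
    return [(d, t_h) for d in range(1, 2 * m + 1) for t_h in thresholds]

def select_best(results):
    """Pick the grid point with the highest (ACC, PPV), compared in that order."""
    return max(results, key=lambda p: (results[p]["ACC"], results[p]["PPV"]))

grid = parameter_grid(264)  # m = 264 ROIs
example = classification_metrics(tp=3, tn=4, fp=0, fn=1)

# Invented mean scores for three grid points, for illustration only.
toy_results = {
    (3, 0.8): {"ACC": 0.9, "PPV": 0.8},
    (5, 0.5): {"ACC": 0.9, "PPV": 0.7},
    (1, 0.1): {"ACC": 0.6, "PPV": 0.9},
}
best = select_best(toy_results)
```

With $m=264$ the grid contains 528 values of d and 19 values of $t_h$, so each left-out fold evaluates 10,032 parameter combinations.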

3 Results

The grid search parameters that yielded the best ACC and PPV classification results are $d=3$ and $t_h = 0.8$. To assess the performance of the proposed feature selection method, SVM classifiers are also trained using structural and functional connectome features in training data set A that are selected by: (1) a linear regression technique that includes $\ell _1$ regularization (i.e. Lasso), and (2) no feature selection. As shown in Table 1, an SVM classifier trained using structural and functional connectome features selected by the proposed method is the most accurate at $88.3\,\%$, can predict the disease case (i.e. PPV) approximately $87.2\,\%$ of the time, and consistently shows the highest sensitivity, specificity, and NPV.

Table 1. ASD vs. HC 10-fold classification results in $\bar{x} \pm \sigma $ format. The highest performance measures are shown in bold font.

The bar plots in Fig. 2 show the median^{Footnote 7} structural and functional weight values found using Eq. (3) when grid search parameter $t_h=0.8$ is used. The SVM classifier in Table 1 is trained using only the node-based connectivity values from the selected 47 regions (24 structural regions and 23 functional regions) also shown in Fig. 2. In general, the 47 regions have the largest differences in code values, which suggests the structural and functional connectivity characteristics in these brain regions are significantly different between ASD and HC subjects.

Lastly, Fig. 3 shows the median (see footnote 7) top, middle, and bottom DTI and rsfMRI regions in the learned partition hierarchy. Included are tables that list the brain regions in the bottom level (i.e. last partition) of the hierarchy. These regions have the most similar structural and functional connectivity characteristics. Note: in this figure, the term "shared" means that the same brain region is present in both connectomes within the grouping.

4 Conclusion

A novel connectome feature selection technique is proposed that uses a hypergraph learning algorithm to identify brain regions that have similar structural and functional connectivity characteristics. In our experiments, SVM classifiers trained using structural and functional connectome features selected by our method were more accurate than SVM classifiers trained using connectome features selected by a state-of-the-art regression algorithm. Furthermore, since our approach converts a subject-specific complete bipartite graph to an incidence matrix, the resulting incidence matrix is very sparse, which in turn greatly improves the space and time complexity of our approach. Visualizations that display brain regions in the top, middle, and bottom partitions in the proposed partition hierarchy show significant structural and functional connectivity differences between ASD and HC subjects, as seen in Fig. 3. Lastly, although the betweenness centrality node-based connectivity measure is used here, our method achieved similar accuracy and PPV classification results (mean $\pm 3\,\%$) when it was replaced by the eigenvector centrality or clustering coefficient connectivity measures.

Notes

1. http://umcd.humanconnectomeproject.org.
2. http://www.fmrib.ox.ac.uk/fsl.
3. https://afni.nimh.nih.gov/afni/.
4. http://trackvis.org/dtk/.
5. In our approach each hyper-edge has the same influence, therefore W is the identity matrix and is omitted in Eq. (2).
6. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
7. Median value is found using the results from all 10 folds.

References

1. Greicius, M.D., Supekar, K., Menon, V., Dougherty, R.F.: Resting-state functional connectivity reflects structural connectivity in the default mode network. Cereb. Cortex 19(1), 72–78 (2009)
2. Power, J.D., Barnes, K.A., Snyder, A.Z., Schlaggar, B.L., Petersen, S.E.: Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage 59(3), 2142–2154 (2012)
3. Rodriguez, A., Laio, A.: Clustering by fast search and find of density peaks. Science 344(6191), 1492–1496 (2014)
4. Rubinov, M., Sporns, O.: Complex network measures of brain connectivity: uses and interpretations. NeuroImage 52(3), 1059–1069 (2010)
5. Rudie, J., Brown, J., Beck-Pancer, D., Hernandez, L., Dennis, E., Thompson, P., Bookheimer, S., Dapretto, M.: Altered functional and structural brain network organization in autism. NeuroImage Clin. 2, 79–94 (2013)
6. Saur, D., Schelter, B., Schnell, S., Kratochvil, D., Küpper, H., Kellmeyer, P., Kümmerer, D., Klöppel, S., Glauche, V., Lange, R., Mader, W., Feess, D., Timmer, J., Weiller, C.: Combining functional and anatomical connectivity reveals brain networks for auditory language comprehension. NeuroImage 49(4), 3187–3197 (2010)
7. Sporns, O.: The human connectome: origins and challenges. NeuroImage 80, 53–61 (2013)
8. Teipel, S.J., Bokde, A.L., Meindl, T., Amaro, E., Soldner, J., Reiser, M.F., Herpertz, S.C., Möller, H.J., Hampel, H.: White matter microstructure underlying default mode network connectivity in the human brain. NeuroImage 49(3), 2021–2032 (2010)
9. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B 58, 267–288 (1996)
10. Von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)
11. Zhu, D., Li, K., Terry, D.P., Puente, A.N., Wang, L., Shen, D., Miller, L.S., Liu, T.: Connectome-scale assessments of structural and functional connectivity in MCI. Hum. Brain Mapp. 35(7), 2911–2923 (2014)

Author information

Authors and Affiliations

Department of Computer Science, College of Charleston, Charleston, South Carolina, USA
Brent C. Munsell & Nicholas Desisto
Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Guorong Wu
School of Software, Tsinghua University, Beijing, China
Yue Gao
Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Martin Styner

Corresponding author

Correspondence to Brent C. Munsell.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
Sebastien Ourselin
The Hebrew University of Jerusalem, Jerusalem, Israel
Leo Joskowicz
Harvard Medical School, Boston, Massachusetts, USA
Mert R. Sabuncu
Istanbul Technical University, Istanbul, Turkey
Gozde Unal
Harvard Medical School and Brigham and Women's Hospital, Boston, Massachusetts, USA
William Wells

Munsell, B.C., Wu, G., Gao, Y., Desisto, N., Styner, M. (2016). Identifying Relationships in Functional and Structural Connectome Data Using a Hypergraph Learning Method. In: Ourselin, S., Joskowicz, L., Sabuncu, M., Unal, G., Wells, W. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016. Lecture Notes in Computer Science, vol 9901. Springer, Cham. https://doi.org/10.1007/978-3-319-46723-8_2

DOI: https://doi.org/10.1007/978-3-319-46723-8_2
Published: 02 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46722-1
Online ISBN: 978-3-319-46723-8
eBook Packages: Computer Science (R0)
