
1 Introduction

Exploring the organizational architecture of human brain function has long been of great interest in the neuroscience community [1]. After decades of active research using noninvasive neuroimaging methods such as functional magnetic resonance imaging (fMRI), there is mounting evidence that brain function is realized by the interaction of multiple concurrent neural processes or functional brain networks [2], and that these networks are spatially distributed across specific structural substrates of neuroanatomical areas [3]. In these fMRI-based studies, researchers have developed a variety of brain network reconstruction and modeling techniques, such as the general linear model (GLM) [4], independent component analysis (ICA) [5], and sparse representation/dictionary learning methods [6, 7]. These methods have reconstructed many meaningful functional brain networks, characterized by both spatial maps and corresponding time series, from both tfMRI and rsfMRI datasets, and have greatly advanced our understanding of the regularity and variability of brain function [4, 5].

However, these existing approaches are based on shallow models and are therefore limited in faithfully reconstructing and modeling the hierarchical and temporal structures of functional brain networks in tfMRI data [8]. Recently, deep learning methods have attracted much attention across a variety of challenging tasks [9]. Their success lies in the ability to represent raw data automatically and hierarchically. Inspired by this success, a growing number of researchers have applied deep learning methods to functional brain network analysis [10,11,12]. Although these recent works demonstrate the advantages of deep learning methods, information at multiple temporal scales is rarely taken into consideration in their models, even though brain activities are known to unfold on multiple time scales [13].

Recently, recurrent neural networks (RNNs) have been gaining increasing attention [14]. Unlike traditional feedforward neural networks, RNNs can use their internal memory units to process arbitrary sequences of inputs and to model sequential and temporal dependencies on multiple time scales [15]. That is, an RNN makes its predictions based not only on the information available at a given time point, but also on information available in the past. Since brain activity is modulated by long temporal dependencies [16], which coincides well with the characteristics of RNN models, it is natural and well justified to adopt RNNs to explore functional brain networks in tfMRI data. However, whether RNNs can be utilized to infer functional brain networks from whole-brain tfMRI data has rarely been explored. To explore the possible advantages of RNN models, in this study we propose a novel, alternative deep recurrent neural network (DRNN) framework for modeling functional brain networks in tfMRI data. An important characteristic of the DRNN framework is that the task stimulus information is processed sequentially through the model, which then automatically generates the observed whole-brain voxel signals. In this way, the hierarchical and temporal structures of brain activity are captured, and brain networks at multiple time scales (especially time-dependency-sensitive brain networks) can be identified. We used the motor task tfMRI dataset of the HCP 900-subject data release as a test bed, and extensive experimental results demonstrate the superiority of the proposed method in identifying functional brain networks at multiple time scales in tfMRI.

2 Materials and Methods

2.1 Overview

Figure 1 summarizes the proposed deep recurrent neural network (DRNN) model. There are three major steps in modeling tfMRI functional brain networks using the DRNN. First, for each subject, the task design stimulus curves are gathered into a stimulus matrix \( \mathbf{X} \) (\( k \) stimuli over \( t \) time points) as the input layer, and the whole-brain tfMRI signals are aggregated into a large signal matrix \( \mathbf{Y} \) (\( m \) voxels over \( t \) time points). Then the task stimulus patterns are passed through two hidden layers, each consisting of \( n_{h} \) RNN units. Next, the response of the top hidden layer is connected to the whole-brain signal matrix via a fully connected layer of size \( [n_{h}, m] \). Specifically, each hidden node's connection weight vector represents a typical functional brain network, and its hidden response to specific stimulus patterns represents the temporal activity pattern of that network.

Fig. 1. Overview of the DRNN model.
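As a concrete illustration, the following is a minimal TensorFlow/Keras sketch of this architecture. The paper implements its models in TensorFlow [18], but this sketch is our own reading of the description above, not the authors' code; the dimensions `k`, `t`, `m`, and `n_h` are placeholder values, and the linear readout is a simplification of the nonlinear readout in Eq. (1).

```python
import tensorflow as tf

# Illustrative dimensions (assumptions, not taken from the paper):
k, t, m, n_h = 6, 284, 50000, 32   # stimuli, time points, voxels, hidden units

# Input: the stimulus matrix X, arranged as a length-t sequence of k-dim vectors.
inputs = tf.keras.Input(shape=(t, k))

# Two stacked recurrent layers; return_sequences=True keeps a hidden state
# at every time step, so the model emits an output for each fMRI frame.
h1 = tf.keras.layers.LSTM(n_h, return_sequences=True)(inputs)
h2 = tf.keras.layers.LSTM(n_h, return_sequences=True)(h1)

# Fully connected readout [n_h, m]: maps the top hidden state at each time
# step to the m whole-brain voxel signals (the matrix Y).
outputs = tf.keras.layers.Dense(m)(h2)

model = tf.keras.Model(inputs, outputs)
# Section 2.3 states the parameters are optimized for mean squared error
# between the whole-brain signals and their reconstructions.
model.compile(optimizer="adam", loss="mse")
```

After training, the kernel of the final `Dense` layer (shape `[n_h, m]`) would hold one candidate spatial map per hidden unit, matching the interpretation given above.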

2.2 Data Acquisition and Pre-processing

The Human Connectome Project (HCP) dataset is one of the most systematic and comprehensive neuroimaging datasets currently available; it aims to bring data from the major MRI neuroimaging modalities together into a cohesive framework to enable detailed comparisons of brain architecture, connectivity, and function across individual subjects. Importantly, this dataset is publicly available, which makes it a good test bed for different researchers. In this paper, we adopt the motor tfMRI dataset of the HCP 900-subject data release to test the proposed method. The detailed design paradigms of the motor task and other tasks are available in [17].

The detailed acquisition parameters of the tfMRI data were as follows: 220 mm FOV, in-plane FOV: 208 × 180 mm, flip angle = 52°, BW = 2290 Hz/Px, 2 × 2 × 2 mm spatial resolution, 90 × 104 matrix, 72 slices, TR = 0.72 s, TE = 33.1 ms. Preprocessing of the tfMRI data included skull removal, motion correction, slice timing correction, spatial smoothing, and global drift removal (high-pass filtering). All preprocessing steps were implemented in FSL FEAT. The individual fMRI datasets were first registered to the MNI common space for further analysis. In addition, GLM-based activation results were derived using FSL FEAT for comparison.

2.3 Deep Recurrent Neural Network Model

RNNs are feedforward neural networks augmented with edges spanning adjacent time steps, so that connections between units form a directed cycle. These connections introduce a notion of time and provide a memory of past states. In contrast to traditional neural networks, which receive information only at the bottom layer and produce output only at the highest layer, RNNs receive input and produce output at each time step. However, a common RNN processes information through only one layer before producing output, which provides neither a hierarchical structure for processing the input information nor an explicit temporal hierarchy of the input signals. To overcome these limitations, we propose a deep recurrent neural network (DRNN) framework for modeling functional brain networks in tfMRI data. The basic idea of the DRNN is to stack RNNs into a hierarchical network architecture: each hidden layer is a recurrent neural network, and the hidden state sequence of each layer is the input of the next layer. In this way, new information propagates through the hierarchy during each network update, and temporal context is added in each layer (Fig. 2). As demonstrated in character-based language modeling studies [15], stacking RNNs automatically creates different time scales across different levels and thus forms a temporal hierarchy of information processing.

Fig. 2. Illustrative map of the DRNN. Blue circles represent input units, green circles represent hidden units, and red circles represent output units.

We define a DRNN with \( L \) layers, where layer \( i \) has \( n_{i} \) hidden units. The input sequence is denoted as \( \left( \mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(t)} \right) \), where each data point is a real-valued vector; the target sequence is denoted as \( \left( \mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots, \mathbf{y}^{(t)} \right) \); and the hidden state of the \( i \)-th layer is denoted as \( \mathbf{h}_{i}^{(t)} \). To avoid confusion between node indices and sequence steps, we use superscripts for time and subscripts for layer index. The output of the DRNN model is given by Eq. (1), where \( \hat{\mathbf{y}}^{(t)} \) is the output estimated from the top hidden layer \( \mathbf{h}_{L}^{(t)} \), \( \mathbf{V} \) is the weight matrix between the top hidden layer and the output, and \( \mathbf{b} \) is the bias vector containing the offset of each output node.

$$ \hat{\mathbf{y}}^{(t)} = \sigma \left( \mathbf{V} \mathbf{h}_{L}^{(t)} + \mathbf{b} \right) $$
(1)
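Read as code, Eq. (1) is simply a dense readout of the top hidden state at each time step. A minimal NumPy rendering with hypothetical sizes (taking \( \sigma \) to be the logistic sigmoid, one plausible reading) is:

```python
import numpy as np

rng = np.random.default_rng(0)
n_L, m = 32, 1000                    # hypothetical sizes: top-layer units, voxels
V = rng.standard_normal((m, n_L))    # readout weight matrix V
b = np.zeros(m)                      # bias vector b
h_t = rng.standard_normal(n_L)       # top hidden state h_L at time t

# Eq. (1): dense readout of the top hidden state into m voxel signals.
y_hat_t = 1.0 / (1.0 + np.exp(-(V @ h_t + b)))
```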

There are different types of RNN architectures, and the long short-term memory (LSTM) is among the most popular specialized memory units for RNNs; it was developed to handle long time series. The first-layer hidden states of an LSTM unit are defined as:

$$ \mathbf{h}^{(t)} = \mathbf{o}^{(t)} \odot \tanh \left( \mathbf{c}^{(t)} \right) $$
(2)
$$ \mathbf{o}^{(t)} = \sigma \left( \mathbf{U}_{o} \mathbf{h}^{(t-1)} + \mathbf{W}_{o} \mathbf{x}^{(t)} + \mathbf{b}_{o} \right) $$
(3)

where \( \mathbf{c}^{(t)} \) is the cell state, \( \mathbf{o}^{(t)} \) is the output gate activation, and \( \odot \) denotes elementwise multiplication. Information about previous time points is stored in the cell state, and the output gate controls what information is retrieved from it. The hidden and cell states of the second and higher layers are defined analogously to those of the first layer, except that the input is replaced by the hidden states of the layer below. The parameters of the DRNN framework are optimized to minimize the mean squared error between the whole-brain signals and their reconstructions. The TensorFlow [18] system was adopted to implement the models.
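For completeness, the sketch below spells out one step of a standard LSTM cell in NumPy. Eqs. (2)-(3) correspond to `h_t` and `o_t`; the input gate, forget gate, and candidate cell update follow the common LSTM formulation and are an assumption on our part, since the paper does not spell them out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step. h_t and o_t match Eqs. (2)-(3) of the text;
    the gates i_t, f_t and candidate g_t are the standard LSTM updates
    (assumed, not stated in the paper)."""
    W, U, b = params["W"], params["U"], params["b"]  # rows stacked as [i, f, g, o]
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b                     # all gate pre-activations
    i_t = sigmoid(z[0*n:1*n])                        # input gate
    f_t = sigmoid(z[1*n:2*n])                        # forget gate
    g_t = np.tanh(z[2*n:3*n])                        # candidate cell state
    o_t = sigmoid(z[3*n:4*n])                        # output gate, Eq. (3)
    c_t = f_t * c_prev + i_t * g_t                   # cell state update
    h_t = o_t * np.tanh(c_t)                         # hidden state, Eq. (2)
    return h_t, c_t
```

Here `params["W"]` has shape `[4n, k]`, `params["U"]` has shape `[4n, n]`, and `params["b"]` has shape `[4n]`, with the four gate blocks stacked row-wise.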

2.4 Identification of Functional Brain Networks

In the DRNN model, the task design stimulus information is split across time points and fed into the model step by step. At each network update, new information is propagated through the hierarchical structure and temporal context is added in each RNN layer. Each hidden layer in the DRNN is a recurrent neural network, and each upper layer receives the hidden state sequence of the previous layer as input; the information emerging from the stacked RNN structure therefore spans different time scales. Finally, the top hidden layer's output is connected to the whole-brain signal matrix via a fully connected layer. Specifically, each hidden node's connection weight vector represents the spatial distribution of a typical functional brain network, and its hidden response to a specific stimulus represents the temporal pattern of that network. To compare the derived brain networks with those obtained by other methods, a spatial matching procedure is adopted to calculate the spatial similarity between the identified networks and network templates derived from other methods. The spatial similarity is defined as the spatial pattern overlap rate R:

$$ R\left( \mathbf{S}, \mathbf{T} \right) = \frac{\left| \mathbf{S} \cap \mathbf{T} \right|}{\left| \mathbf{T} \right|} $$
(4)

where \( \mathbf{S} \) and \( \mathbf{T} \) are the cortical spatial maps of a brain network component and the brain network template, respectively.
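A minimal sketch of Eq. (4) on binarized spatial maps might look as follows; the thresholding step is an assumption, since the paper does not state how the maps are binarized.

```python
import numpy as np

def overlap_rate(S, T, thresh=0.0):
    """Spatial pattern overlap rate R(S, T) = |S ∩ T| / |T| (Eq. 4).
    S, T: 1-D arrays of voxel values for an identified network and the
    template. Maps are binarized by thresholding (an assumed choice)."""
    S_bin = S > thresh
    T_bin = T > thresh
    return np.logical_and(S_bin, T_bin).sum() / T_bin.sum()
```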

3 Experimental Results

3.1 Identified Typical Functional Brain Networks

Figure 3 illustrates a few typical brain networks identified from the motor tfMRI dataset of the HCP 900-subject release using the DRNN model. For comparison, the GLM group-wise activation maps are listed in the right column. The figure clearly shows that some of the trained functional networks are quite similar to the corresponding GLM activation maps. To quantify this similarity, we used Eq. (4) to calculate the spatial overlap rate between the identified DRNN networks and the corresponding GLM activation maps; the results are listed in the first row of Table 1. In addition, the corresponding temporal patterns are quite similar to the common HRF response patterns (the convolution of the task design paradigm with the HRF). Figure 4 shows the corresponding temporal response patterns together with the task design patterns and the HRF response patterns. It is easy to see that the temporal patterns of the DRNN brain networks are highly correlated with the HRF responses. The high spatial overlap rates and close temporal correlations together suggest that the proposed DRNN model can identify meaningful and reliable functional networks in an automatic way.
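The HRF response patterns referred to above can be obtained by convolving the task design with a canonical HRF. The sketch below uses a double-gamma HRF and SciPy's Pearson correlation; the HRF parameters and the block timing are standard illustrative values, not taken from the paper.

```python
import numpy as np
from scipy.stats import pearsonr
from scipy.special import gamma as gamma_fn

def canonical_hrf(tr=0.72, duration=32.0):
    """Double-gamma canonical HRF sampled at the TR (standard textbook
    parameters; the paper does not state its exact HRF model)."""
    t = np.arange(0.0, duration, tr)
    peak = t**5 * np.exp(-t) / gamma_fn(6)
    undershoot = t**15 * np.exp(-t) / gamma_fn(16)
    return peak - undershoot / 6.0

def hrf_regressor(boxcar, tr=0.72):
    """Convolve a 0/1 task design time series with the canonical HRF."""
    return np.convolve(boxcar, canonical_hrf(tr))[: len(boxcar)]

# Toy usage with a synthetic ~12 s on/off block design (hypothetical):
boxcar = np.tile(np.r_[np.ones(17), np.zeros(17)], 8)
regressor = hrf_regressor(boxcar)
# For a network's temporal pattern p of the same length, the reported
# similarity would be: r, _ = pearsonr(p, regressor)
```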

Fig. 3. A few identified functional brain networks in the motor task tfMRI dataset of the HCP 900-subject release. The left column shows the networks identified using the DRNN model with LSTM units, and the right column shows the GLM-derived group-wise activation maps. M1–M6 represent different stimuli.

Table 1. The first row shows the spatial overlap rates between the networks identified by the DRNN and the corresponding GLM-derived group-wise activation maps. The second row shows the Pearson correlations between the temporal patterns and the common HRF response patterns.

Fig. 4. Temporal response patterns corresponding to the identified brain networks in Fig. 3.

3.2 Identified Functional Brain Networks of Multiple Time Scales

During the training stage, the task stimulus information passes through the hierarchical and temporal model iteratively, and the final output naturally reflects the brain networks' responses to the original stimulus information across multiple time scales. After the training stage, we input each stimulus separately and obtain the corresponding temporal pattern for each network. To better interpret the identified functional brain networks, we further calculated the correlations between the identified temporal brain activity patterns and the theoretical regressor groups adopted in previous studies [12]. Essentially, the theoretical regressor groups represent possible brain responses at multiple time scales. Our basic idea is that if a specific temporal pattern is highly correlated with an extended theoretical regressor, the corresponding identified DRNN network should belong to a similar time scale.
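Extended regressor groups of this kind can be generated from a base HRF response by applying delays, derivatives, integrals, and sign inversion. The sketch below is our own hedged reading of that construction, with the 3 s delay interval taken from the Fig. 5 caption; the exact operations in [12] may differ.

```python
import numpy as np

def extended_regressors(base, tr=0.72, n_delays=7, delay_step=3.0):
    """Build six extended regressor groups from a base HRF response:
    delays, derivative and integral forms, and their inversions. The
    exact construction is assumed, guided by the Fig. 5 caption."""
    groups = {"delay": [], "derivative": [], "integral": []}
    for d in range(n_delays):
        shift = int(round(d * delay_step / tr))      # delay in TRs
        delayed = np.roll(base, shift)
        delayed[:shift] = 0.0                        # zero-pad the onset
        groups["delay"].append(delayed)
        groups["derivative"].append(np.gradient(delayed))
        groups["integral"].append(np.cumsum(delayed))
    # Inversed groups are the sign-flipped versions of each form.
    for name in list(groups):
        groups["inversed_" + name] = [-r for r in groups[name]]
    return groups
```

Correlating each network's temporal pattern against every regressor in these groups yields correlation maps of the kind shown in Fig. 5.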

Figure 5 shows the temporal correlation maps between the temporal response patterns of the 30 identified DRNN networks under stimulus M6 and the extended hypothetical regressor groups in [12]. Following [12], we extended the basic HRF response patterns with multiple delays and with derivative, integral, and inversion operations. The figure shows that several network temporal patterns are highly correlated with the extended hypothetical regressors; these represent identified brain networks at different time scales. Figure 6 illustrates a few typical brain networks at different time scales and their corresponding temporal patterns. From this result, we can see that theoretical response networks at a variety of time scales, including multiple delays, multiple inversed HRFs with delays, and different derivative and integral operations, could be identified. We further inspected the spatial patterns of these networks, and it is interesting that they are similar but not identical. This is reasonable, since these networks are evoked by the same stimulus but at different time scales. The ability to effectively identify these brain networks at multiple time scales is a major advantage of the proposed DRNN framework.

Fig. 5. Temporal correlation maps between the temporal response patterns of the 30 identified DRNN networks and the extended hypothetical regressor groups in [12]. (a) HRF delay group; (b) derivative form group; (c) integral form group; (d) inversed HRF group; (e) inversed derivative form group; (f) inversed integral form group. In each subfigure, each row represents a DRNN network and the 7 columns represent 7 different time delays with an interval of 3 s.

Fig. 6. The spatial and temporal patterns of a few identified brain networks of multiple time scales shown in Fig. 5.

The proposed DRNN model was also applied to half of the HCP Q1 release dataset (34 subjects) and obtained similar and consistent results, although more training data (e.g., the HCP Q3 release) would improve the reliability and interpretability of the results. L1 and L2 norm regularization were tried during the training stage, but the training loss increased rapidly with either regularizer; therefore, only the MSE was used as the loss function.

4 Discussion and Conclusion

In this work, we proposed a novel deep recurrent neural network (DRNN) for modeling functional brain networks in tfMRI data. The DRNN framework naturally combines common deep neural networks with RNNs: each hidden layer of the DRNN is a recurrent neural network, and the output of each layer is the input time series of the layer above. This structure automatically creates different time scales across different levels and thus forms a temporal hierarchy. After training with the task stimulus, the whole-brain voxel signals are automatically reconstructed from the top hidden layer's output. Specifically, the weight vector between the hidden units and the whole-brain fMRI signals describes the spatial distribution of a network, and the top hidden layer's output under a specific stimulus naturally represents the corresponding temporal pattern of that network. In this way, the hierarchical and temporal information of brain activity is captured, and brain networks at different time scales can be identified. Extensive experimental results demonstrate the superiority of the proposed DRNN framework.