Abstract
Obsessive-compulsive disorder (OCD) is a serious mental illness that affects the overall quality of patients’ daily life. Since sparse learning can remove redundant information from resting-state functional magnetic resonance imaging (rs-fMRI) data via the brain functional connectivity network (BFCN) while retaining good biological characteristics, it is an important tool for OCD analysis. However, most existing methods ignore the relationship among subjects. To solve this problem, we propose a smoothing sparse network (SSN) to construct the BFCN. Specifically, we add a smoothing term to the model to constrain the relationship and increase the similarity among subjects. As a deep learning method, the stacked sparse auto-encoder (SSAE) can learn high level internal features from data and reduce its dimension. For this reason, we design an improved SSAE to learn high level features of the BFCN and reduce the data dimension, and we also add an \( \ell_{2} \)-norm regularization term to prevent overfitting. We apply this framework to an OCD dataset self-collected from a local hospital. The experimental results show that our method achieves quite promising performance and outperforms the state-of-the-art methods.
This work was supported partly by National Natural Science Foundation of China (Nos. 31871113, 61871274, 61801305 and 81571758), National Natural Science Foundation of Guangdong Province (No. 2017A030313377), Guangdong Pearl River Talents Plan (2016ZT06S220), Shenzhen Peacock Plan (Nos. KQTD2016053112051497 and KQTD2015033016104926), and Shenzhen Key Basic Research Project (Nos. JCYJ20170413152804728, JCYJ20180507184647636, JCYJ20170818142347251 and JCYJ20170818094109846).
1 Introduction
Obsessive-compulsive disorder (OCD) is a mental disease characterized by compulsive thoughts or behaviors, which often have negative impacts on the daily life of patients [1]. According to clinical studies, OCD is hereditary, and siblings of patients often show similar symptoms. Worldwide, about 2% to 3% of people are affected by this disease. However, there are still no accurate physiological or biochemical indicators for the clinical diagnosis of OCD. Moreover, OCD often co-occurs with depression and anxiety, which may cause misdiagnosis [2].
For accurate and objective OCD diagnosis, resting-state functional magnetic resonance imaging (rs-fMRI) can reveal steady-state patterns of brain co-activation. To exploit this, the brain functional connectivity network (BFCN) is first built from rs-fMRI to characterize the functional interactions among brain areas. Many BFCN construction methods have been proposed. For example, Sen et al. [2] combined Pearson’s correlation (PC) network features and adjacency-matrix features selected by the minimum redundancy maximum relevance method for OCD diagnosis. However, this method only considers pairwise relationships between brain regions, ignoring the relationship between a target brain region and multiple other regions. To address this, Xing et al. [3] proposed a Riemann kernel to build the BFCN and used principal component analysis (PCA) to reduce the feature dimension. However, the resulting BFCN is too dense to represent features well. To construct a sparser BFCN, Wee et al. [4] proposed a group-constrained sparse (GCS) model for mild cognitive impairment identification. Although this method removes much irrelevant information, the dimension of the BFCN features remains very high, and the method ignores the similarity among subjects. Commonly used dimension reduction methods (e.g., Lasso and PCA) cannot learn the internal relations of BFCN features. To address these issues, we first propose a smoothing sparse network (SSN), built on the GCS method, to construct the BFCN; it controls the BFCN density and adds similarity constraints among subjects.
Deep learning has witnessed great success in addressing the curse of dimensionality. For example, Chen et al. [5] used a sparse auto-encoder (SAE) to reduce the data dimension for polarimetric synthetic aperture radar image classification. The stacked sparse auto-encoder (SSAE) stacks multiple SAEs to learn high level features and reduce the data dimension, and has achieved good results for nuclei detection [6]. Inspired by these methods, we propose a novel method that combines traditional machine learning and deep learning for OCD diagnosis. Specifically, the features extracted from the BFCN are fed to an \( \ell_{2} \)-regularized SSAE, which learns the nonlinear relations inside the features to produce low dimensional, disease-related high level features; these high level features are then exploited for OCD diagnosis. In this way, our method both accounts for the similarity among subjects and learns high level features from the BFCN to reduce the data dimension. Experimental results on our self-collected data show that our method achieves quite promising performance.
2 Methodology
2.1 Proposed Framework
Figure 1 shows our framework combining the traditional machine learning and deep learning techniques for OCD diagnosis. Firstly, we preprocess the original rs-fMRI data in a standard way. Secondly, we construct BFCN by the SSN method. Then, SSAE is applied to learn high level features, which can reduce the feature dimension effectively and enhance the feature representation ability for final classification.
2.2 Data Acquisition and Image Preprocessing
A Philips Medical Systems 3.0-T MR scanner was used for data acquisition. Subjects were instructed to relax with their eyes closed and remain awake without moving. The parameters were as follows: TR = 2000 ms; TE = 60 ms; flip angle = 90°; 33 slices; field of view = 240 mm × 240 mm; matrix = 64 × 64; slice thickness = 4.0 mm. The Statistical Parametric Mapping toolbox (SPM8) and the Data Processing Assistant for Resting-State fMRI (DPARSFA, version 2.2) were used to preprocess the data. We discard the first 10 rs-fMRI volumes of each subject before any further processing to allow the magnetization to reach equilibrium. The remaining 170 volumes are corrected for the staggered order of slice acquisition, which takes advantage of the echo planar scan to ensure that the data on each slice correspond to the same point in time. The image preprocessing includes: slice timing correction; head motion correction; realignment with the corresponding T1 volume; nuisance covariate regression (six head motion parameters, white matter signal, and cerebrospinal fluid signal); spatial normalization into the stereotactic space of the Montreal Neurological Institute and resampling at 3 × 3 × 3 mm³; spatial smoothing with a 6-mm full-width half-maximum isotropic Gaussian kernel; and band-pass filtering (0.01–0.08 Hz).
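The band-pass filtering step above can be sketched in a few lines. In the paper this is handled inside DPARSF; the following is only a minimal Python stand-in, assuming SciPy is available and using the TR of 2 s from the acquisition parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(ts, low=0.01, high=0.08, tr=2.0):
    # Zero-phase Butterworth band-pass (0.01-0.08 Hz) along the time axis;
    # filtfilt applies the filter forward and backward to avoid phase shift.
    nyq = 0.5 / tr                                   # Nyquist frequency for TR = 2 s
    b, a = butter(2, [low / nyq, high / nyq], btype='band')
    return filtfilt(b, a, ts, axis=0)
```

For a 170-volume series this removes the DC offset and slow drifts while keeping fluctuations in the 0.01–0.08 Hz band.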
The brain is parcellated into 116 regions of interest (ROIs) using the automated anatomical labeling (AAL) template. In addition, a high-pass filter is used to refine the average rs-fMRI time series of each brain region. Furthermore, we regress out the head movement parameters, the cerebrospinal fluid signal, and the mean BOLD time series of the white matter. We extract the mean BOLD signal of each ROI as the original rs-fMRI signal [7].
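Given the ROI mean time series, the simplest baseline connectivity used later in the comparisons (PC) is just the pairwise Pearson correlation between ROI signals. A minimal Python sketch (the paper's pipeline is MATLAB-based; the function name is illustrative):

```python
import numpy as np

def pc_network(ts):
    """Pearson-correlation BFCN from ROI mean time series.

    ts: (M, R) array, M timepoints by R ROIs; returns an R x R matrix."""
    C = np.corrcoef(ts.T)       # pairwise correlation between ROI columns
    np.fill_diagonal(C, 0.0)    # discard trivial self-connections
    return C
```

With the AAL template, `ts` would be 170 × 116 per subject, yielding a 116 × 116 symmetric connectivity matrix.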
2.3 Smoothing Sparse Network
In this paper, matrices are denoted by bold capital letters, vectors by bold lowercase letters, and scalars by normal italic letters. Assume there are \( N \) subjects and \( {\mathbf{X}} = \left[ {{\mathbf{x}}_{1} , \ldots ,{\mathbf{x}}_{r} , \ldots ,{\mathbf{x}}_{R} } \right] \in {\mathbb{R}}^{R \times N} \) denotes our input data. The AAL template is utilized to divide the brain into \( R \) ROIs, and the \( r \)-th ROI of the \( n \)-th subject, with a BOLD regional mean time series of length \( M \), is represented as \( {\mathbf{x}}_{r}^{n} = \left[ {x_{1r}^{n} ,x_{2r}^{n} , \ldots ,x_{Mr}^{n} } \right]^{\text{T}} \in {\mathbb{R}}^{M \times 1} \). \( {\mathbf{A}}_{r}^{n} = \left[ {{\mathbf{x}}_{1}^{n} , \ldots ,{\mathbf{x}}_{r - 1}^{n} ,{\mathbf{x}}_{r + 1}^{n} , \ldots ,{\mathbf{x}}_{R}^{n} } \right] \) denotes the signal matrix of all ROIs except \( {\mathbf{x}}_{r}^{n} \), \( {\mathbf{w}}_{r}^{n} \in {\mathbb{R}}^{R - 1} \) is a weighting regression coefficient vector, and \( {\mathbf{W}}_{r} = \left[ {{\mathbf{w}}_{r}^{1} , \ldots ,{\mathbf{w}}_{r}^{n} , \ldots ,{\mathbf{w}}_{r}^{N} } \right] \). The sparse networks used to represent brain functional connectivity can be constructed using GCS, which is defined as

\( \mathop {\min }\limits_{{{\mathbf{W}}_{r} }} \frac{1}{2}\sum\nolimits_{n = 1}^{N} {\left\| {{\mathbf{x}}_{r}^{n} - {\mathbf{A}}_{r}^{n} {\mathbf{w}}_{r}^{n} } \right\|_{2}^{2} } + R_{g} \left( {{\mathbf{W}}_{r} } \right) \)
where \( R_{g} \left( {{\mathbf{W}}_{r} } \right) \) is a group regularization term, defined as

\( R_{g} \left( {{\mathbf{W}}_{r} } \right) = \lambda_{1} \left\| {{\mathbf{W}}_{r} } \right\|_{2,1} = \lambda_{1} \sum\nolimits_{d = 1}^{R - 1} {\left\| {{\mathbf{w}}_{r}^{d} } \right\|_{2} } \)
where \( \lambda_{1} \) is the group regularization parameter and \( \left\| {{\mathbf{W}}_{r} } \right\|_{2,1} \) is the summation of the \( l_{2} \)-norms of the rows of \( {\mathbf{W}}_{r} \), with \( {\mathbf{w}}_{r}^{d} \) denoting the \( d \)-th row vector of \( {\mathbf{W}}_{r} \); in this way, information is selected jointly across the \( R - 1 \) ROIs’ weights. As a sparse regression method, GCS enforces an identical connection topology across subjects: the \( l_{2} \)-norm is imposed on the same elements of the different subjects’ matrices \( {\mathbf{W}}_{r} \), which forces the weights corresponding to the same connection in different subjects to be grouped together. This constraint imposes a common connection topology among subjects while still allowing the connection weights to vary across them. The model thus rebuilds each target ROI from the remaining ROIs, and the reconstruction of each ROI is independent of the others. However, the existing GCS model ignores the smoothness among the subjects within the model. To overcome this drawback, a novel model is devised that jointly learns the shared functional brain network of each subject through both the group sparse regularization and a smoothness regularization. The objective function is defined as

\( \mathop {\min }\limits_{{{\mathbf{W}}_{r} }} \frac{1}{2}\sum\nolimits_{n = 1}^{N} {\left\| {{\mathbf{x}}_{r}^{n} - {\mathbf{A}}_{r}^{n} {\mathbf{w}}_{r}^{n} } \right\|_{2}^{2} } + R_{g} \left( {{\mathbf{W}}_{r} } \right) + R_{s} \left( {{\mathbf{W}}_{r} } \right) \)
where \( R_{g} \left( {{\mathbf{W}}_{r} } \right) \) is the group regularization, and \( R_{s} \left( {{\mathbf{W}}_{r} } \right) \) denotes the smoothness regularization, given by

\( R_{s} \left( {{\mathbf{W}}_{r} } \right) = \lambda_{2} \sum\nolimits_{n = 1}^{N - 1} {\left\| {{\mathbf{w}}_{r}^{n} - {\mathbf{w}}_{r}^{n + 1} } \right\|_{1} } \)
where \( \lambda_{2} \) is the parameter of the smoothness regularization. The term \( \left\| {{\mathbf{w}}_{r}^{n} - {\mathbf{w}}_{r}^{n + 1} } \right\|_{1} \) constrains the difference between two consecutive weighting vectors from the same group to be as small as possible. When \( \lambda_{2} \) is zero, the proposed method reduces to the original GCS method. Because the \( l_{1} \)-norm in the fused smoothness term encourages sparsity in the differences of weight vectors, many components of the weight-difference vectors become zero, and informative features are selected through the non-zero weights. In short, the smoothness term smooths the connectivity coefficients across subjects, and fusing the two regularization terms imposes a higher level of constraint. We call this sparse learning model the smoothing sparse network (SSN). Since an asymmetric BFCN does not contribute to the final classification accuracy, \( {\mathbf{W}}^{ *} = \left( {{\mathbf{W}}_{n} + {\mathbf{W}}_{n}^{\text{T}} } \right)/2 \) is computed for each subject to obtain a symmetric network. The local clustering coefficients of the weighted graph are then used to extract features from each established BFCN [8, 9].
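To make the objective concrete, the following Python sketch evaluates the SSN cost for one target ROI: the reconstruction error, the \( l_{2,1} \) group penalty, and the fused \( l_{1} \) smoothness penalty. This only computes the objective value (the paper solves the minimization with the SLEP toolbox); the function name and array layout are illustrative.

```python
import numpy as np

def ssn_objective(X, W, r, lam1, lam2):
    """Value of the SSN objective for target ROI r.

    X: (M, R, N) BOLD series (M timepoints, R ROIs, N subjects)
    W: (R-1, N) regression coefficients, one column per subject."""
    M, R, N = X.shape
    recon = 0.0
    for n in range(N):
        A = np.delete(X[:, :, n], r, axis=1)            # all ROIs except the target
        recon += 0.5 * np.sum((X[:, r, n] - A @ W[:, n]) ** 2)
    group = lam1 * np.linalg.norm(W, axis=1).sum()      # l2,1 norm over rows of W_r
    smooth = lam2 * np.abs(W[:, :-1] - W[:, 1:]).sum()  # fused l1 smoothness term
    return recon + group + smooth
```

Setting `lam2 = 0` recovers the GCS objective value, matching the statement that SSN reduces to GCS when the smoothness parameter is zero.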
2.4 Stacked Sparse Auto-Encoder
An auto-encoder (AE) is a symmetric neural network with a single hidden layer, consisting of an encoding network and a decoding network. The encoder maps the input data from a high dimensional space to a lower dimensional feature space, and the decoder reconstructs the input data from that feature space. Stacking multiple AEs yields a stacked AE, which can learn high level features from the original features. The SAE is a classic variant that learns relatively sparse features by penalizing deviations of the hidden units; it improves on the traditional AE and is more practical in applications. Our proposed SSAE not only learns high level features but also controls their sparsity, which further benefits classification performance. The loss function of the SSAE is

\( L = \frac{1}{N}\left\| {{\mathbf{X}} - {\mathbf{Y}}} \right\|_{F}^{2} + \beta_{1} \left\| {{\mathbf{w}}_{1} } \right\|_{2}^{2} + \beta_{2} S_{2} \left( \rho \right) \)
The first term is the mean squared reconstruction error. The second term is the \( \ell_{2} \) regularization on the encoding weights, where \( \beta_{1} \) is the penalty coefficient of the \( \ell_{2} \) regularization term. The third term is the sparsity constraint, where \( \beta_{2} \) is its coefficient and \( S_{2} \left( \rho \right) = \sum\nolimits_{k = 1}^{K} {\left[ {\rho \log \frac{\rho }{{\rho_{k}^{'} }} + \left( {1 - \rho } \right)\log \frac{1 - \rho }{{1 - \rho_{k}^{'} }}} \right]} \) is the Kullback-Leibler (KL) divergence between the target sparsity and the average activations. \( {\mathbf{X}} \in {\mathbb{R}}^{N \times F} \) denotes the input data, and \( {\mathbf{Y}} \in {\mathbb{R}}^{N \times D} \) denotes the reconstructed data. For \( N \) subjects, \( F \) and \( D \) represent the feature dimensions of the input data and the reconstructed data, respectively. \( {\mathbf{Z}} \in {\mathbb{R}}^{N \times K} \) is the activation matrix of a hidden layer with \( K \) nodes. The weights \( {\mathbf{w}}_{1} \) and the bias \( {\mathbf{b}}_{1} \) encode the input data \( {\mathbf{X}} \) as the activation matrix \( {\mathbf{Z}} = f\left( {{\mathbf{w}}_{1} {\mathbf{X}} + {\mathbf{b}}_{1} } \right) \), and the weights \( {\mathbf{w}}_{2} \) and bias \( {\mathbf{b}}_{2} \) decode \( {\mathbf{Z}} \) as \( {\mathbf{Y}} = f\left( {{\mathbf{w}}_{2} {\mathbf{Z}} + {\mathbf{b}}_{2} } \right) \). \( \rho_{k}^{'} = \frac{1}{N}\mathop \sum \limits_{n = 1}^{N} \left[ {\varvec{Z}_{k} \left( {{\mathbf{x}}_{n} } \right)} \right] \) is the average activation of the \( k \)-th hidden node, and \( \rho \) is the target sparsity level (a constant). The weights \( {\mathbf{w}} \) and biases \( {\mathbf{b}} \) are optimized by the scaled conjugate gradient descent algorithm. The output \( {\mathbf{Z}} \) of each AE layer is used as the input of the next layer; in this paper, the \( \ell_{2} \)-regularized SSAE stacks two different AEs.
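A minimal Python sketch of one sparse-AE layer's loss, combining the three terms described above (reconstruction error, \( \ell_{2} \) weight decay, and KL sparsity). Note the paper writes the affine maps as \( {\mathbf{w}}_{1}{\mathbf{X}} + {\mathbf{b}}_{1} \); this sketch uses the equivalent row-major convention `X @ w1 + b1`, and a sigmoid stands in for the unspecified activation \( f \).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sae_loss(X, w1, b1, w2, b2, beta1=1e-4, beta2=1.0, rho=0.05):
    """Loss of one sparse AE layer: reconstruction + l2 decay + KL sparsity."""
    Z = sigmoid(X @ w1 + b1)                     # encoder activations (N x K)
    Y = sigmoid(Z @ w2 + b2)                     # reconstruction (N x F)
    mse = np.mean(np.sum((X - Y) ** 2, axis=1))  # mean squared error term
    l2 = beta1 * np.sum(w1 ** 2)                 # l2 penalty on encoding weights
    rho_hat = Z.mean(axis=0)                     # average activation per hidden node
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return mse + l2 + beta2 * kl
```

Minimizing this loss (the paper uses scaled conjugate gradient) drives the average activation of each hidden node toward the target sparsity \( \rho \); stacking two such layers gives the SSAE used for dimension reduction.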
3 Experiments and Results
3.1 Experimental Setup
In this paper, we collect rs-fMRI data of 180 subjects from a local hospital, comprising 62 OCD patients, 53 siblings of OCD patients, and 65 normal controls (NC). All data were collected from the Chinese Han population and were labeled by two highly trained and experienced clinical psychiatrists and psychologists.
Since we have a small amount of data, the leave-one-out cross-validation (LOOCV) strategy is used to assess our proposed method. Specifically, given N subjects, one is left out for testing and the remaining N−1 subjects are used for training. The hyperparameters of each method are set empirically by greedy search to identify the optimal values. Three quantitative measurements are used to evaluate the diagnosis performance: accuracy (ACC), area under the receiver operating characteristic curve (AUC), and sensitivity (SEN). The experiments are conducted in MATLAB 2018a. The SLEP and LibSVM toolboxes are used for sparse representation and classification, respectively.
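The LOOCV loop above is straightforward to sketch. The following Python stand-in (the paper uses MATLAB with a LibSVM classifier; here a simple nearest-centroid classifier is substituted purely for illustration) shows the hold-one-out structure:

```python
import numpy as np

def nearest_centroid_predict(Xtr, ytr, x):
    # Assign x to the class whose training-set centroid is closest.
    classes = np.unique(ytr)
    d = [np.linalg.norm(x - Xtr[ytr == c].mean(axis=0)) for c in classes]
    return classes[int(np.argmin(d))]

def loocv_accuracy(X, y):
    # Hold out each subject once, train on the remaining N-1, and count hits.
    n = len(y)
    hits = 0
    for i in range(n):
        mask = np.ones(n, dtype=bool)
        mask[i] = False
        hits += int(nearest_centroid_predict(X[mask], y[mask], X[i]) == y[i])
    return hits / n
```

In the actual experiments, the classifier inside the loop would be an SVM, and hyperparameters would be tuned by greedy search within each training fold.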
3.2 Classification Performance
The experimental results are shown in Table 1 (boldface indicates the best performance), and the receiver operating characteristic (ROC) curves are shown in Fig. 2. To demonstrate the effectiveness of our proposed SSN approach, our BFCN construction method is compared with typical BFCN methods such as PC and GCS. Our classification results are also compared with typical dimensionality reduction (DR) methods such as PCA, Lasso, and SAE.
- PCP: uses PC for BFCN construction and PCA to reduce the features.
- PCL: uses PC to construct the BFCN and Lasso for feature selection.
- PCS: uses PC to generate the BFCN and SAE to reduce the data dimension.
- PCSS: uses PC to generate the BFCN and SSAE to reduce the data dimension.
- GCSP: uses GCS to build the BFCN and PCA to reduce the features.
- GCSL: uses GCS to build the BFCN and Lasso for feature selection.
- GCSS: uses GCS to build the BFCN and SAE to reduce the features.
- GCSSS: uses GCS to build the BFCN and SSAE to reduce the data dimension.
- SSNP: uses SSN to build the BFCN and PCA to reduce the features.
- SSNL: uses SSN to build the BFCN and Lasso for feature selection.
- SSNS: uses SSN to build the BFCN and SAE to reduce the data dimension.
- SSNSS: uses SSN to build the BFCN and SSAE for feature dimension reduction.
The SSNSS method clearly achieves the best performance. In the OCD vs. NC classification task, the accuracy of our SSNSS method reaches 88.82%, which is 6.30% higher than the best competing method. Similarly, for the Sibling vs. NC task, our SSNSS model achieves an accuracy of 79.15%, 2.03% higher than the best competing method. For the OCD vs. Sibling task, our SSNSS model obtains an accuracy of 79.48%, 1.74% higher than the best competing method. These results demonstrate that our SSNSS model is effective and outperforms the other competing methods.
We use the proposed method to build the functional connectivity networks of the OCD, Sibling, and NC groups. The feature maps of our BFCN are shown in Fig. 3: the lower image shows the whole-brain BFCN, while the upper image shows a magnified view of a local region. It can be seen that the networks constructed by SSN are sparse.
The brain functional connectivity networks are shown in Fig. 4. Our method clearly expresses brain activity, as verified by the classification results. The BFCN reveals differences between OCD and NC, while OCD patients show characteristics similar to their siblings. The regions differing in brain activity between OCD and NC include frontal_sup_orb, hippocampus, caudate, and putamen. These ROIs are consistent with the OCD-related areas reported by previous researchers [1, 10].
4 Conclusions
In this paper, a novel method for diagnosing OCD patients and their siblings has been proposed, which integrates the merits of both traditional machine learning and deep learning techniques. Specifically, the SSN model based on sparse learning has been proposed to construct the BFCN, which not only controls the density of the BFCN but also takes into account the similarity among subjects. The SSAE is then used to learn high level, discriminative features for the final classification. In future work, we will consider more modalities and constraints to improve the diagnosis accuracy. Dynamic and high-order BFCNs can also be incorporated into our framework to further enhance its performance.
References
Zhou, C., Cheng, Y., Ping, L., et al.: Support vector machine classification of obsessive-compulsive disorder based on whole-brain volumetry and diffusion tensor imaging. Front. Psychiatry 9, 1–9 (2018). https://doi.org/10.3389/fpsyt.2018.00524
Sen, B., Bernstein, G.A., Xu, T., et al.: Classification of obsessive-compulsive disorder from resting-state fMRI. In: 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 3606–3609. IEEE Press, New York (2016). https://doi.org/10.1109/embc.2016.7591508
Xing, X., Jin, L., Shi, F., et al.: Diagnosis of OCD using functional connectome and Riemann kernel PCA. In: SPIE Medical Imaging. SPIE, Washington DC (2019). https://doi.org/10.1117/12.2512316
Wee, C.Y., Yap, P.T., Zhang, D., et al.: Group-constrained sparse fMRI connectivity modeling for mild cognitive impairment identification. Brain Struct. Funct. 219, 641–656 (2014). https://doi.org/10.1007/s00429-013-0524-8
Chen, Y., Jiao, L., Li, Y., et al.: Multilayer projective dictionary pair learning and sparse autoencoder for PolSAR image classification. IEEE Trans. Geosci. Remote Sens. 55, 6683–6694 (2017). https://doi.org/10.1109/TGRS.2017.2727067
Xu, J., Xiang, L., Liu, Q., et al.: Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Trans. Med. Imaging 35, 119–130 (2016). https://doi.org/10.1109/TMI.2015.2458702
Wee, C.Y., Yang, S., Yap, P.T., et al.: Sparse temporally dynamic resting-state functional connectivity networks for early MCI identification. Brain Imag. Behav. 10, 342–356 (2016). https://doi.org/10.1007/s11682-015-9408-2
Chen, X., Zhang, H., Gao, Y., et al.: High-order resting-state functional connectivity network for MCI classification. Hum. Brain Mapp. 37, 3282–3296 (2016). https://doi.org/10.1002/hbm.23240
Rubinov, M., Sporns, O.: Complex network measures of brain connectivity: uses and interpretations. NeuroImage 52, 1059–1069 (2010). https://doi.org/10.1016/j.neuroimage.2009.10.003
Fan, J., Zhong, M., Zhu, X., et al.: Resting-state functional connectivity between right anterior insula and right orbital frontal cortex correlate with insight level in obsessive-compulsive disorder. NeuroImage Clin. 15, 1–7 (2017). https://doi.org/10.1016/j.nicl.2017.04.002
© 2019 Springer Nature Switzerland AG
Yang, P., Jin, L., Xu, C., Wang, T., Lei, B., Peng, Z. (2019). OCD Diagnosis via Smoothing Sparse Network and Stacked Sparse Auto-Encoder Learning. In: Zhang, D., Zhou, L., Jie, B., Liu, M. (eds) Graph Learning in Medical Imaging. GLMI 2019. Lecture Notes in Computer Science(), vol 11849. Springer, Cham. https://doi.org/10.1007/978-3-030-35817-4_19