Clustering-Induced Multi-task Learning for AD/MCI Classification

  • Heung-Il Suk
  • Dinggang Shen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8675)


In this work, we formulate a clustering-induced multi-task learning method for feature selection in Alzheimer’s Disease (AD) or Mild Cognitive Impairment (MCI) diagnosis. Unlike previous methods, which often assumed a unimodal data distribution, we take into account the underlying multipeak distribution of each class. The rationale is that neuroimaging data are likely to have multiple peaks, or modes, in their distribution due to inter-subject variability. We therefore use a clustering method to discover the multipeak distributional characteristics and define subclasses based on the clustering results, such that each cluster covers one peak. We then encode the respective subclasses, i.e., clusters, with unique codes, requiring that the codes of subclasses of the same original class be close to each other and those of different original classes be distinct from each other. Finally, we formulate a multi-task learning problem in an ℓ2,1-penalized regression framework by taking the codes as new label vectors of the training samples, through which we select features for classification. In experiments on the ADNI dataset, we validated the effectiveness of the proposed method by achieving maximal classification accuracies of 95.18% (AD vs. Normal Control, NC), 79.52% (MCI vs. NC), and 72.02% (MCI converter vs. MCI non-converter), outperforming the competing single-task learning method.
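The pipeline described in the abstract (cluster each class, encode the clusters as label vectors, then fit an ℓ2,1-penalized multi-output regression and keep the features with nonzero weight rows) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the clustering method, the code layout, or the solver, so the k-means routine, the code matrix (first column is the original ±1 label, remaining columns are one-hot subclass indicators), the proximal-gradient solver, the value of `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns a cluster index per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def subclass_codes(X, y, k_per_class=2):
    """Hypothetical code matrix: column 0 = original label, then one column per subclass."""
    classes = np.unique(y)
    Y = np.zeros((len(X), 1 + k_per_class * len(classes)))
    Y[:, 0] = y
    for c_idx, c in enumerate(classes):
        rows = np.where(y == c)[0]
        assign = kmeans(X[rows], k_per_class)  # clusters found within this class only
        Y[rows, 1 + c_idx * k_per_class + assign] = 1.0
    return Y

def l21_regression(X, Y, lam, n_iter=500):
    """Proximal gradient for min_W 0.5*||Y - XW||_F^2 + lam * sum_i ||W[i, :]||_2."""
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        V = W - step * (X.T @ (X @ W - Y))  # gradient step on the squared loss
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        # Row-wise soft-thresholding: shrinks whole feature rows toward zero.
        W = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12)) * V
    return W

# Synthetic toy data (an assumption): two classes, each with two modes.
# Feature 0 separates the classes, feature 1 separates the modes, the rest is noise.
rng = np.random.default_rng(0)
n_per_mode, n_feat = 50, 10
modes = [(+1, (3, 3)), (+1, (3, -3)), (-1, (-3, 3)), (-1, (-3, -3))]
X_parts, y_parts = [], []
for label, (m0, m1) in modes:
    block = 0.5 * rng.standard_normal((n_per_mode, n_feat))
    block[:, 0] += m0
    block[:, 1] += m1
    X_parts.append(block)
    y_parts.append(np.full(n_per_mode, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

Y = subclass_codes(X, y, k_per_class=2)   # new label vectors, one task per column
W = l21_regression(X, Y, lam=40.0)        # tasks share one row-sparse weight matrix
selected = np.flatnonzero(np.linalg.norm(W, axis=1) > 1e-8).tolist()
print("selected feature indices:", selected)
```

Because the ℓ2,1 penalty acts on whole rows of `W`, a feature is either kept for all tasks (the original label and every subclass indicator) or dropped for all of them, which is what makes the regression usable as a joint feature selector. The selected columns of `X` would then be passed to a separate classifier.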


Positron Emission Tomography; Feature Selection; Mild Cognitive Impairment; Sparse Code; Label Vector
These keywords were added by machine and not by the authors.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Heung-Il Suk
    • 1
  • Dinggang Shen
    • 1
  1. Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, USA
