Fused Group Lasso Regularized Multi-Task Feature Learning and Its Application to the Cognitive Performance Prediction of Alzheimer’s Disease

  • Xiaoli Liu
  • Peng Cao
  • Jianzhong Wang
  • Jun Kong
  • Dazhe Zhao
Original Article


Alzheimer’s disease (AD) is characterized by gradual neurodegeneration and loss of brain function, particularly memory in the early stages. Regression analysis has been widely applied in AD research to relate clinical and biomarker data, for example to predict cognitive outcomes from MRI measures. Recently, multi-task feature learning (MTFL) methods with the sparsity-inducing \( \ell _{2,1} \)-norm have been widely studied for selecting a discriminative subset of MRI features by exploiting the inherent correlations among multiple clinical cognitive measures. However, existing MTFL methods assume that the correlation among all tasks is uniform, and they model task relatedness by encouraging a common subset of features through sparsity-inducing regularization, which neglects the inherent structure of the tasks and of the MRI features. To address this issue, we propose a fused group lasso regularization that models the underlying structures, namely 1) a graph structure within tasks and 2) a group structure among the image features. We present a multi-task feature learning framework that combines the fused group lasso and the \( \ell _{2,1} \)-norm in a mixed norm to capture these more flexible structures. For optimization, we employ the alternating direction method of multipliers (ADMM) to solve the proposed non-smooth formulation efficiently. We evaluate the proposed method on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) datasets. The experimental results demonstrate that incorporating the two prior structures into multi-task feature learning through the fused group lasso norm can improve prediction performance over several competing methods, and that the estimated correlations among cognitive functions and the identified cognition-relevant imaging markers are clinically and biologically meaningful.
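As a rough illustration of the sparsity-inducing \( \ell _{2,1} \)-norm mentioned above, the following Python sketch computes the norm of a weight matrix and its proximal operator (row-wise soft-thresholding), the kind of closed-form step that ADMM-style solvers typically rely on. This is not the authors' formulation or implementation: the layout of `W` (rows as features, columns as tasks), the function names, and the threshold `tau` are assumptions made for the example, and the fused group lasso term is not shown.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W.

    Assumed layout (hypothetical): one row per feature, one column per task.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))

def prox_l21(W, tau):
    """Proximal operator of tau * ||W||_{2,1}.

    Each row is shrunk toward zero by tau in Euclidean norm; rows whose
    norm is at most tau become exactly zero, which removes that feature
    from every task at once.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(row_norms, 1e-12))
    return scale * W

# Toy weight matrix: 3 features (rows) by 2 tasks (columns).
W_demo = np.array([[3.0, 4.0],
                   [0.1, 0.1],
                   [0.0, 0.0]])
W_shrunk = prox_l21(W_demo, tau=1.0)
```

In this toy case the first row has norm 5 and is scaled by 0.8, while the second and third rows fall below the threshold and are set to zero. Zeroing a whole row is what lets the penalty select features jointly across all tasks.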


Keywords: Alzheimer’s disease; Multi-task learning; Sparse group lasso; Fused lasso



This research was supported by the National Science Foundation for Distinguished Young Scholars of China (Grant Nos. 71325002 and 61225012), the National Natural Science Foundation of China (No. 61502091), and the Fundamental Research Funds for the Central Universities (Nos. N161604001 and N150408001).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Science and Engineering, Northeastern University, Shenyang, China
  2. Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, China
  3. College of Information Science and Technology, Northeast Normal University, Changchun, China
  4. Key Laboratory of Applied Statistics of MOE, Changchun, China
