Weighted feature-task-aware regularization learner for multitask learning

Theoretical advances


Multitask learning has recently received extensive attention because it can share knowledge between tasks and leverage their shared structure to jointly build a better model for each task. However, most existing multitask learning methods focus only on selecting features across tasks; little attention has been paid to enhancing the sparsity of the learned variables. In this paper, we first present a weighted feature-task-aware regularization learning model for multitask learning that enhances the sparsity of the weight matrices, and then propose an online learning algorithm, with a theoretical guarantee, to train the proposed model. To verify the effectiveness of the proposed approach, extensive experiments are conducted on two widely used datasets for multitask learning. The encouraging performance of the proposed approach compared with related methods demonstrates its superiority.
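As a rough illustration of how weighting a sparsity penalty can drive small entries to zero, the sketch below applies one weighted soft-thresholding step to a vector of coefficients. It is not the model or the online algorithm of this paper, whose details are not given in the abstract. The function names, the reweighting rule 1/(|v| + eps), and all numeric values are assumptions chosen for illustration only.

```python
def weighted_soft_threshold(values, weights, lam):
    """Proximal step for the weighted l1 penalty lam * sum_i weights[i] * |x_i|."""
    result = []
    for v, w in zip(values, weights):
        threshold = lam * w  # each entry gets its own shrinkage level
        if v > threshold:
            result.append(v - threshold)
        elif v < -threshold:
            result.append(v + threshold)
        else:
            result.append(0.0)  # entries inside the threshold are set to zero
    return result


def reweight(values, eps=0.1):
    """Hypothetical reweighting rule: small entries receive large weights."""
    return [1.0 / (abs(v) + eps) for v in values]


# Illustrative coefficients (assumed values, not data from the paper).
coefficients = [2.0, 0.3, -0.05, -1.5]
weights = reweight(coefficients)
shrunk = weighted_soft_threshold(coefficients, weights, lam=0.1)
print(shrunk)
```

With these assumed values, the smallest coefficient receives the largest weight and is zeroed, while the large coefficients are shrunk only slightly. That is the general sense in which a weighted penalty can enhance sparsity relative to a uniform one.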


Keywords: Multitask learning; Online learning; Weighted optimization; Sparsity



The author thanks the editor and the reviewer for their numerous helpful comments, which greatly improved the presentation of this paper. This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61806004, U1636220, 11661007, 61602482 and 61472423), the Natural Science Foundation of Zhejiang Province, China (Grant No. LD19A010002), and the Major Technologies R & D Special Program of Anhui Province, China (Grant No. 16030901060).



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Technology, Anhui University of Technology, Maanshan, China
