
An Information-Theoretic Approach for Multi-task Learning

  • Conference paper
Advanced Data Mining and Applications (ADMA 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5678)

Abstract

Multi-task learning utilizes labeled data from other “similar” tasks and enables efficient knowledge sharing between tasks. In this paper, a novel information-theoretic multi-task learning model, IBMTL, is proposed. The key idea of IBMTL is to minimize the loss of mutual information during classification while constraining the Kullback–Leibler divergence between multiple tasks to some maximal level. The basic trade-off is between maximizing the relevant information and minimizing the “dissimilarity” between the tasks. The IBMTL algorithm is compared with TrAdaBoost, which extends AdaBoost for transfer learning. The experiments were conducted on two transfer-learning data sets: an email spam-filtering data set and a sentiment classification data set. The experimental results demonstrate that IBMTL outperforms TrAdaBoost.
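The paper's exact IBMTL formulation is in the full text; as a rough, hypothetical sketch of the trade-off the abstract describes — maximizing the mutual information the classifier retains while bounding the Kullback–Leibler divergence between task distributions (here folded into a Lagrangian with a hypothetical multiplier `beta`) — one might write:

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_xy = p_xy / p_xy.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal over rows
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal over columns
    mask = p_xy > 0                          # skip zero-probability cells
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

def kl_divergence(p, q):
    """D_KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def ibmtl_style_objective(p_xy_task, p_task, p_other, beta):
    """Hypothetical Lagrangian: lose as little relevant information as
    possible (maximize I) while penalizing task 'dissimilarity' (KL)."""
    return -mutual_information(p_xy_task) + beta * kl_divergence(p_task, p_other)
```

The function and parameter names (`ibmtl_style_objective`, `beta`) are illustrative, not the authors' notation; the paper constrains the KL divergence explicitly rather than penalizing it, and the Lagrangian form above is only the standard relaxation of such a constraint.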



References

  1. Ando, R.K., Zhang, T.: A framework for learning predictive structures from multiple tasks and unlabeled data. The Journal of Machine Learning Research 6(1), 1817–1853 (2005)


  2. Caruana, R.: Multi-task learning. Machine Learning 28(1), 41–75 (1997)


  3. Bakker, B., Heskes, T.: Task clustering and gating for Bayesian multitask learning. The Journal of Machine Learning Research 4(12), 83–89 (2003)


  4. Baxter, J.: A model of inductive bias learning. Journal of Artificial Intelligence Research 12, 149–198 (2000)


  5. Dai, W.Y., Yang, Q., Xue, G.R., et al.: Boosting for transfer learning. In: Proceedings of the 24th International Conference on Machine Learning, pp. 193–200. ACM Press, New York (2007)


  6. Evgeniou, T., Micchelli, C.A., Pontil, M.: Learning multiple tasks with kernel methods. Journal of Machine Learning Research 6, 615–637 (2005)


  7. Heskes, T.: Empirical Bayes for learning to learn. In: Proceedings of the 17th International Conference on Machine Learning, pp. 367–374. ACM Press, New York (2000)


  8. Lawrence, N.D., Platt, J.C.: Learning to learn with the informative vector machine. In: Proceedings of the 21st International Conference on Machine Learning (2004)


  9. Roy, D.M., Kaelbling, L.P.: Efficient Bayesian task-level transfer learning. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 2599–2604. ACM Press, New York (2007)


  10. Yu, S.P., Tresp, V., Yu, K.: Robust multi-task learning with t-processes. In: Proceedings of the 24th International Conference on Machine Learning, pp. 1103–1110. ACM Press, New York (2007)


  11. Yu, K., Tresp, V., Schwaighofer, A.: Learning Gaussian processes from multiple tasks. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 1012–1019. ACM Press, New York (2005)


  12. Zhang, Y., Koren, J.: Efficient Bayesian hierarchical user modeling for recommendation systems. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 47–54. ACM Press, New York (2007)


  13. Blitzer, J., Dredze, M., Pereira, F.: Biographies, Bollywood, Boom-boxes and Blenders: Domain adaptation for sentiment classification. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL) (2007)



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, P., Tan, Q., Xu, H., Ding, Y. (2009). An Information-Theoretic Approach for Multi-task Learning. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science, vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_37

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-03348-3_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03347-6

  • Online ISBN: 978-3-642-03348-3

  • eBook Packages: Computer Science (R0)
