Abstract
Multi-task learning exploits labeled data from other "similar" tasks and can achieve efficient knowledge sharing between tasks. In this paper, a novel information-theoretic multi-task learning model, IBMTL, is proposed. The key idea of IBMTL is to minimize the loss of mutual information during classification while constraining the Kullback-Leibler divergence between multiple tasks to some maximal level. The basic trade-off is between maximizing the relevant information and minimizing the "dissimilarity" between the tasks. The IBMTL algorithm is compared with TrAdaBoost, which extends AdaBoost to transfer learning. The experiments were conducted on two data sets for transfer learning: an email spam-filtering data set and a sentiment classification data set. The experimental results demonstrate that IBMTL outperforms TrAdaBoost.
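The abstract does not give IBMTL's formulas, but the two quantities it trades off, mutual information and Kullback-Leibler divergence, are standard. The following sketch computes both for discrete distributions and combines them in a Lagrangian-style objective in the spirit of the abstract; the function names, the multiplier `beta`, and the combined objective are illustrative assumptions, not the authors' actual model.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete
    distributions, with a small epsilon for numerical stability."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def mutual_information(joint, eps=1e-12):
    """Mutual information I(X; Y) computed from a joint probability
    table: sum over x,y of p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    joint = np.asarray(joint, dtype=float) + eps
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal over rows
    py = joint.sum(axis=0, keepdims=True)   # marginal over columns
    return float(np.sum(joint * np.log(joint / (px * py))))


def ibmtl_style_objective(joint, p_task_a, p_task_b, beta):
    """Illustrative trade-off only (not the paper's exact objective):
    maximize relevant information I(X; Y) while penalizing the
    inter-task "dissimilarity" KL(task_a || task_b), weighted by a
    Lagrange-style multiplier beta."""
    return mutual_information(joint) - beta * kl_divergence(p_task_a, p_task_b)
```

For identical task distributions the KL penalty vanishes and the objective reduces to the mutual information alone, which matches the intuition that fully "similar" tasks can share knowledge without cost.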
References
Ando, R.K., Zhang, T.: A framework for learning predictive structures from multiple tasks and unlabeled data. The Journal of Machine Learning Research 6(1), 1817–1853 (2005)
Caruana, R.: Multi-task learning. Machine Learning 28(1), 41–75 (1997)
Bakker, B., Heskes, T.: Task clustering and gating for Bayesian multitask learning. The Journal of Machine Learning Research 4(12), 83–89 (2003)
Baxter, J.: A model of inductive bias learning. Journal of Artificial Intelligence Research 12, 149–198 (2000)
Dai, W.Y., Yang, Q., Xue, G.R., et al.: Boosting for transfer learning. In: Proc. of the 24th International Conference on Machine Learning, pp. 193–200. ACM Press, New York (2007)
Evgeniou, T., Micchelli, C.A., Pontil, M.: Learning multiple tasks with kernel methods. Journal of Machine Learning Research 6, 615–637 (2005)
Heskes, T.: Empirical Bayes for learning to learn. In: Proceedings of the Seventeenth International Conference on Machine Learning, pp. 367–374. ACM Press, New York (2000)
Lawrence, N.D., Platt, J.C.: Learning to learn with the informative vector machine. In: Proceedings of the 21st International Conference on Machine Learning (2004)
Roy, D.M., Kaelbling, L.P.: Efficient Bayesian task-level transfer learning. In: Proc. of the 20th International Joint Conference on Artificial Intelligence, pp. 2599–2604. ACM Press, New York (2007)
Yu, S.P., Tresp, V., Yu, K.: Robust multi-task learning with t-processes. In: Proc. of the 24th International Conference on Machine Learning, pp. 1103–1110. ACM Press, New York (2007)
Yu, K., Tresp, V., Schwaighofer, A.: Learning Gaussian processes from multiple tasks. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 1012–1019. ACM Press, New York (2005)
Zhang, Y., Koren, J.: Efficient Bayesian hierarchical user modeling for recommendation systems. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 47–54. ACM Press, New York (2007)
Blitzer, J., Dredze, M., Pereira, F.: Biographies, Bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: Association for Computational Linguistics (ACL) (2007)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Yang, P., Tan, Q., Xu, H., Ding, Y. (2009). An Information-Theoretic Approach for Multi-task Learning. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science(), vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_37
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3