Adaptive Generative Initialization in Transfer Learning

Bai, Wenjun; Quan, Changqin; Luo, Zhi-Wei

doi:10.1007/978-3-319-98693-7_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 791)

Included in the following conference series:

International Conference on Computer and Information Science

Abstract

Despite extensive research on transfer learning, no consensus has been reached on the optimal method. To provide a unified theoretical understanding of transfer learning, we recast its crux as the pursuit of the optimal initialisation for the to-be-transferred task. To obtain such an initialisation, we propose a novel initialisation technique, adapted generative initialisation. Beyond boosting task transfer, the proposed initialisation can, more importantly, bound the transfer benefits and so defend against devastating negative transfer. In the first stage of the proposed initialisation, the incongruency between a task and its assigned learner (model) is alleviated by feeding the knowledge of the target learner into the training of the source learner; the later generative stage ensures that the adapted initialisation can be properly produced for the target learner. The superiority of the proposed initialisation over conventional neural-network-based approaches was validated in a preliminary experiment on the MNIST dataset.

Notes

1. A learner in this article represents any type of computational model, such as a neural network.
2. \(\rightarrow\) denotes the direction of knowledge transfer; e.g., T1D1 \(\rightarrow\) T2D2 means that knowledge is extracted from prior learning of a complex task by a cumbersome learner and then transferred to assist the learning of a simple task by a compact learner.

Acknowledgements

This study is partially supported by the Okawa Foundation for Information and Telecommunications, and National Natural Science Foundation of China under Grant No. 61472117.

Author information

Authors and Affiliations

Department of Computational Science, Kobe University, 1-1, Rokkodai, Nada, Kobe, 657-8501, Japan
Wenjun Bai, Changqin Quan & Zhi-Wei Luo

Corresponding author

Correspondence to Wenjun Bai.

Editor information

Editors and Affiliations

Software Engineering and Information Technology Institute, Central Michigan University, Mt. Pleasant, MI, USA
Roger Lee

About this chapter

Cite this chapter

Bai, W., Quan, C., Luo, ZW. (2019). Adaptive Generative Initialization in Transfer Learning. In: Lee, R. (eds) Computer and Information Science. ICIS 2018. Studies in Computational Intelligence, vol 791. Springer, Cham. https://doi.org/10.1007/978-3-319-98693-7_5

DOI: https://doi.org/10.1007/978-3-319-98693-7_5
Published: 04 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98692-0
Online ISBN: 978-3-319-98693-7
eBook Packages: Intelligent Technologies and Robotics (R0)
