Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks

Ahmed, Amr; Yu, Kai; Xu, Wei; Gong, Yihong; Xing, Eric

doi:10.1007/978-3-540-88690-7_6


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5304)

Included in the following conference series:

European Conference on Computer Vision


Abstract

Building visual recognition models that adapt across different domains is a challenging task for computer vision. While feature-learning machines in the form of hierarchical feed-forward models (e.g., convolutional neural networks) showed promise in this direction, they are still difficult to train, especially when few training examples are available. In this paper, we present a framework for training hierarchical feed-forward models for visual recognition, using transfer learning from pseudo tasks. These pseudo tasks are automatically constructed from data without supervision and comprise a set of simple pattern-matching operations. We show that these pseudo tasks induce an informative inverse-Wishart prior on the functional behavior of the network, offering an effective way to incorporate useful prior knowledge into the network training. In addition to being extremely simple to implement, and adaptable across different domains with little or no extra tuning, our approach achieves promising results on challenging visual recognition tasks, including object recognition, gender recognition, and ethnicity recognition.
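
As a rough illustration of what "pseudo tasks constructed from data without supervision" by "simple pattern-matching operations" could look like, the sketch below samples random patches from unlabeled images and, for every image, records how well each patch matches its best location. The abstract does not specify these details, so the patch sampling scheme, the squared-distance score, the max over locations, and the function names `sample_patches` and `pseudo_task_targets` are all assumptions made for illustration, not the authors' exact construction. The resulting real-valued targets could then serve as auxiliary regression outputs trained alongside the main recognition task.

```python
import numpy as np


def sample_patches(images, num_patches, patch_size, rng):
    """Sample random square patches from a stack of grayscale images (N, H, W)."""
    n, h, w = images.shape
    patches = []
    for _ in range(num_patches):
        i = rng.integers(n)                       # which image
        r = rng.integers(h - patch_size + 1)      # top-left row
        c = rng.integers(w - patch_size + 1)      # top-left column
        patches.append(images[i, r:r + patch_size, c:c + patch_size])
    return np.stack(patches)


def pseudo_task_targets(images, patches):
    """Best match score of every patch in every image (hypothetical pseudo-task).

    The score is the negative squared Euclidean distance between the patch and
    an image window, maximized over all window locations. Returns (N, K).
    """
    n = images.shape[0]
    k, p, _ = patches.shape
    # windows has shape (N, H - p + 1, W - p + 1, p, p)
    windows = np.lib.stride_tricks.sliding_window_view(images, (p, p), axis=(1, 2))
    targets = np.empty((n, k))
    for j in range(k):
        dist = ((windows - patches[j]) ** 2).sum(axis=(-1, -2))
        targets[:, j] = -dist.min(axis=(1, 2))
    return targets


# Usage on synthetic data standing in for unlabeled images (an assumption).
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))
patches = sample_patches(images, num_patches=3, patch_size=3, rng=rng)
targets = pseudo_task_targets(images, patches)   # shape (4, 3), all values <= 0
```

Each column of `targets` is one pseudo-task. A patch matches the image it was cut from exactly, so that image attains the maximum score of zero for that column.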

Author information

Authors and Affiliations

School of Computer Science, Carnegie Mellon University, USA
Amr Ahmed & Eric Xing
NEC Labs America, 10080 N Wolfe Road, Cupertino, CA 95014
Kai Yu, Wei Xu & Yihong Gong

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, IL 61801, Urbana, USA
David Forsyth
Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

Cite this paper

Ahmed, A., Yu, K., Xu, W., Gong, Y., Xing, E. (2008). Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_6

DOI: https://doi.org/10.1007/978-3-540-88690-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88689-1
Online ISBN: 978-3-540-88690-7
eBook Packages: Computer Science, Computer Science (R0)
