
Zero-Shot Learning

  • Zhengming Ding
  • Handong Zhao
  • Yun Fu
Chapter
Part of the Advanced Information and Knowledge Processing book series (AI&KP)

Abstract

Zero-shot learning aims to recognize unseen categories accurately through a shared visual-semantic function, which is built on the seen categories and is expected to adapt well to unseen ones. However, the semantic gap between visual features and their underlying semantics remains the most challenging obstacle. In this chapter, we tackle this issue by exploiting the intrinsic relationship in the semantic manifold and enhancing the transferability of the visual-semantic function. Specifically, we propose an Adaptive Latent Semantic Representation (ALSR) model in a sparse dictionary learning scheme, where a generic semantic dictionary is learned to connect the latent semantic space with the visual feature space. To build a fast inference model, we explore a non-linear network to approximate the latent sparse semantic representation, which lies in the semantic manifold space. Consequently, our model can extract a variety of visual characteristics within seen classes that generalize well to unobserved classes.
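As a rough illustration of the sparse coding idea the abstract refers to (not the ALSR model itself), the sketch below encodes a feature vector against a fixed dictionary by iterative soft-thresholding (ISTA). The function names `soft_threshold` and `ista`, the parameters `lam` and `n_iter`, and the toy dictionary are all assumptions introduced for illustration; nothing here is taken from the chapter.

```python
import numpy as np

def soft_threshold(v, t):
    # Shrink each entry toward zero by t; entries within [-t, t] become 0.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x, D, lam=0.1, n_iter=100):
    # Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.
    # D is a (feature_dim, n_atoms) dictionary; both are hypothetical inputs.
    L = np.linalg.norm(D, 2) ** 2  # squared spectral norm: step-size constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)             # gradient of the quadratic term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy usage: with an identity dictionary the result is soft-thresholding of x.
code = ista(np.array([3.0, -0.5, 1.0]), np.eye(3), lam=1.0, n_iter=5)
```

An iterative solver like this needs many passes per sample at test time. A fast inference model of the kind the abstract describes would presumably replace the loop with a trained non-linear network that predicts the sparse code in a single forward pass.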

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Indiana University-Purdue University Indianapolis, Indianapolis, USA
  2. Adobe Research, San Jose, USA
  3. Northeastern University, Boston, USA
