Zero-shot classification with unseen prototype learning

Abstract

Zero-shot learning (ZSL) aims to recognize instances from unseen classes by training a classification model with only seen data. Because the models are trained exclusively on seen data, most existing approaches suffer from a classification bias that pushes unseen instances toward seen categories. In this paper, we tackle ZSL with a novel Unseen Prototype Learning (UPL) model, a simple yet effective framework that learns visual prototypes for unseen categories from the corresponding class-level semantic information and treats the learned prototypes directly as latent classifiers. Two types of constraints are proposed to improve the quality of the learned prototypes. First, we use an autoencoder framework that learns visual prototypes from the semantic prototypes and reconstructs the original semantic information with a decoder, ensuring that each prototype remains strongly correlated with its category. Second, we employ a triplet loss in which the per-class mean of the visual features supervises the learned visual prototypes. The resulting prototypes are more discriminative, which substantially alleviates the classification bias problem. In addition, by adopting the episodic training paradigm from meta-learning, the model accumulates rich experience in predicting unseen classes. Extensive experiments on four datasets under both traditional ZSL and generalized ZSL settings demonstrate the effectiveness of the proposed UPL method.
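The two constraints described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the linear encoder/decoder weights, the margin value, and the hardest-negative mining strategy are all assumptions, since the abstract does not specify the network architecture or the triplet selection scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def upl_losses(W_enc, W_dec, semantic, visual_means, margin=1.0):
    """Sketch of the two UPL constraints (hypothetical linear networks).

    semantic:      (C, S) class-level semantic prototypes
    visual_means:  (C, V) per-class means of the visual features
    W_enc, W_dec:  stand-ins for the encoder/decoder networks
    """
    # Encoder maps semantic prototypes to visual prototypes,
    # which then act directly as latent classifiers.
    proto = semantic @ W_enc                              # (C, V)

    # Constraint 1: the decoder reconstructs the semantic prototypes,
    # tying each visual prototype to its own category.
    recon_loss = np.mean((proto @ W_dec - semantic) ** 2)

    # Constraint 2: a triplet loss pulls each prototype toward its own
    # class mean (positive) and away from the closest other class mean
    # (negative); hardest-negative choice is an assumption here.
    d = np.linalg.norm(proto[:, None, :] - visual_means[None, :, :],
                       axis=-1)                           # (C, C) distances
    pos = np.diag(d)                                      # own-class distance
    neg = np.min(d + np.eye(len(d)) * 1e9, axis=1)        # nearest other class
    triplet_loss = np.mean(np.maximum(pos - neg + margin, 0.0))
    return recon_loss, triplet_loss

# Toy dimensions: 5 classes, 8-dim semantics, 16-dim visual features.
C, S, V = 5, 8, 16
semantic = rng.normal(size=(C, S))
visual_means = rng.normal(size=(C, V))
W_enc = rng.normal(size=(S, V)) * 0.1
W_dec = rng.normal(size=(V, S)) * 0.1
recon, trip = upl_losses(W_enc, W_dec, semantic, visual_means)
```

At test time, an unseen class would be predicted by encoding its semantic prototype and assigning each visual feature to the nearest (or highest-scoring) prototype, which is what "treating the prototypes as latent classifiers" amounts to.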




Acknowledgements

This research is partially supported by the Fundamental Research Funds for the Central Universities under Grant 2020QNA5010, the National Natural Science Foundation of China under Grant 61771329 and Grant 62002320, and the Central Funds Guiding the Local Science and Technology Development (Grant No. 206Z5001G).

Author information


Corresponding author

Correspondence to Yunlong Yu.

Ethics declarations

Conflict of interest

We confirm that there are no known conflicts of interest associated with this publication and that there has been no significant financial support for this work that could have influenced its outcome.

Ethical approval

We confirm that the manuscript has been read and approved by all named authors. We further confirm that the order of authors listed in the manuscript has been approved by all of us. The roles of all authors are listed as follows: Zhong Ji contributed to conceptualization and writing—review. Biying Cui contributed to software and writing—original draft. Yunlong Yu (Corresponding author) contributed to methodology and supervision. Yanwei Pang contributed to writing—review and editing. Zhongfei Zhang contributed to writing—review and editing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ji, Z., Cui, B., Yu, Y. et al. Zero-shot classification with unseen prototype learning. Neural Comput & Applic (2021). https://doi.org/10.1007/s00521-021-05746-9


Keywords

  • Zero-shot learning
  • Meta-learning
  • Prototype learning
  • Image classification