Abstract
Zero-shot learning aims to recognize objects from unseen classes, for which no samples are available at the training stage, by transferring knowledge from seen classes, for which labeled samples are provided. It bridges seen and unseen classes via a shared semantic space, such as a class attribute space or a class prototype space. While previous approaches have tried to learn a mapping function from the visual space to the semantic space with different objective functions, we take a different approach and map from the semantic space to the visual space. The inverse mapping predicts the visual feature prototype of each unseen class from its semantic vector, and the predicted prototypes are then used for image classification. We also propose a heuristic algorithm to select a high-density set from the data of each seen class. The visual feature prototypes computed from the high-density sets are more discriminative, which benefits classification. Our approach is evaluated for zero-shot recognition on four benchmark data sets and significantly outperforms the state-of-the-art methods on AWA, SUN, and APY.
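The semantic-to-visual direction described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the mapping function, the manifold alignment, or the high-density selection heuristic, so this example assumes a plain ridge regression from class semantic vectors to class-mean visual prototypes, followed by nearest-prototype classification. All function names, the regularization weight `lam`, and the synthetic data are hypothetical.

```python
import numpy as np


def class_prototypes(X, y, classes):
    """Mean visual feature of each class (a simple stand-in for a prototype)."""
    return np.stack([X[y == c].mean(axis=0) for c in classes])


def fit_inverse_mapping(S_seen, P_seen, lam=1.0):
    """Ridge regression from semantic vectors to visual prototypes.

    Assumption: a linear map W minimizing ||S W - P||^2 + lam * ||W||^2.
    S_seen: (n_seen, d_s) semantic vectors, P_seen: (n_seen, d_v) prototypes.
    """
    d_s = S_seen.shape[1]
    A = S_seen.T @ S_seen + lam * np.eye(d_s)
    return np.linalg.solve(A, S_seen.T @ P_seen)


def classify(X, prototypes):
    """Assign each sample to the nearest prototype (squared Euclidean distance)."""
    dists = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def run_demo(seed=0):
    """Synthetic check: visual features are a noisy linear image of semantics."""
    rng = np.random.default_rng(seed)
    d_s, d_v, n_seen, n_unseen, per_class = 3, 5, 8, 2, 20
    W_true = rng.normal(size=(d_s, d_v))
    S_seen = rng.normal(size=(n_seen, d_s))
    S_unseen = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

    # Labeled samples exist only for the seen classes.
    y_seen = np.repeat(np.arange(n_seen), per_class)
    X_seen = S_seen[y_seen] @ W_true
    X_seen = X_seen + 0.05 * rng.normal(size=X_seen.shape)
    P_seen = class_prototypes(X_seen, y_seen, range(n_seen))

    # Learn semantic -> visual, then predict prototypes of the unseen classes.
    W = fit_inverse_mapping(S_seen, P_seen, lam=1e-3)
    P_unseen = S_unseen @ W

    # Test samples from unseen classes are classified by nearest prototype.
    y_unseen = np.repeat(np.arange(n_unseen), per_class)
    X_unseen = S_unseen[y_unseen] @ W_true
    X_unseen = X_unseen + 0.05 * rng.normal(size=X_unseen.shape)
    pred = classify(X_unseen, P_unseen)
    return float((pred == y_unseen).mean())
```

Calling `run_demo()` returns the accuracy on the synthetic unseen classes. Classifying in the visual space, rather than projecting images into the semantic space, is the design choice the abstract emphasizes; the high-density selection step would replace the plain class mean in `class_prototypes`.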
Acknowledgment
This work is supported by NSFC project Grant No. U1833101, Shenzhen Science and Technologies project under Grant No. JCYJ20160428182137473 and the Joint Research Center of Tencent and Tsinghua.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, X., Song, B., Wang, Z., Yuan, C. (2020). An Inverse Mapping with Manifold Alignment for Zero-Shot Learning. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_33
DOI: https://doi.org/10.1007/978-3-030-37734-2_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37733-5
Online ISBN: 978-3-030-37734-2
eBook Packages: Computer Science, Computer Science (R0)