Boosting Zero-Shot Image Classification via Pairwise Relationship Learning

  • Hanhui Li
  • Hefeng Wu (corresponding author)
  • Shujin Lin
  • Liang Lin
  • Xiaonan Luo
  • Ebroul Izquierdo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10111)

Abstract

Zero-shot image classification (ZSIC) is one of the emerging challenges in the computer vision, artificial intelligence, and machine learning communities. In this paper, we propose to exploit the pairwise relationships between test instances to improve the performance of conventional methods, e.g., direct attribute prediction (DAP), on the ZSIC problem. To infer pairwise relationships between test instances, we introduce two different methods: a binary-classification-based method and a metric-learning-based method. Based on the inferred relationships, we construct a similarity graph to represent the test instances, and then employ an adaptive graph anchors voting method to refine the results of DAP iteratively: in each iteration, we partition the similarity graph with the normalized spectral clustering method and determine the class label of each cluster via the voting of graph anchors. Extensive experiments validate the effectiveness of our method: with properly learned pairwise relationships, we boost the mean class accuracy of DAP on two standard ZSIC benchmarks, Animals with Attributes and aPascal-aYahoo, from 57.46% to 84.43% and from 26.59% to 70.09%, respectively. Moreover, experimental results on the SUN Attribute dataset suggest that our method also obtains considerable performance improvements on the large-scale ZSIC problem.
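The iterative refinement described above can be sketched roughly as follows. This is a minimal NumPy-only illustration of a single iteration, assuming a precomputed similarity matrix `W` over the test instances and known anchor indices with their tentative class labels; the function names (`spectral_partition`, `anchor_vote`), the farthest-point initialization, and the plain k-means step are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

def spectral_partition(W, k, n_iter=50):
    """Normalized spectral clustering on a similarity matrix W:
    embed via the k smallest eigenvectors of the symmetric normalized
    Laplacian, then run a small k-means in the embedding."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    U = vecs[:, :k]                          # k smallest eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic init: first point and the point farthest from it.
    far = np.argmax(((U - U[0]) ** 2).sum(axis=1))
    centers = np.stack([U[0], U[far]]) if k == 2 else U[:k].copy()
    for _ in range(n_iter):                  # plain k-means
        assign = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = U[assign == c].mean(axis=0)
    return assign

def anchor_vote(assign, anchor_idx, anchor_labels, k):
    """Label every member of a cluster with the majority label of the
    graph anchors inside that cluster (-1 if the cluster has no anchor)."""
    out = np.full(len(assign), -1)
    for c in range(k):
        votes = [l for i, l in zip(anchor_idx, anchor_labels) if assign[i] == c]
        if votes:
            out[assign == c] = np.bincount(votes).argmax()
    return out
```

For example, on a similarity graph with two tight blocks, partitioning with `k=2` and voting with one anchor per block propagates each anchor's label to its whole cluster; in the full method, the refined labels would then update the graph anchors for the next iteration.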

Acknowledgement

This research is supported by the National Natural Science Foundation of China (61320106008, 61232011, 61402120, 61572531, 61622214), the Educational Commission of Guangdong Province (2013CXZDB001), and the Natural Science Foundation of Guangdong Province (2014A030310348). The corresponding author is Hefeng Wu.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hanhui Li (1, 2)
  • Hefeng Wu (1, 3), corresponding author
  • Shujin Lin (1)
  • Liang Lin (2)
  • Xiaonan Luo (1, 4)
  • Ebroul Izquierdo (5)
  1. National Engineering Research Center of Digital Life, Sun Yat-sen University, Guangzhou, China
  2. School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
  3. School of Informatics, Guangdong University of Foreign Studies, Guangzhou, China
  4. Beijing Key Laboratory of Multimedia and Intelligent Software Technology, College of Metropolitan Transportation, Beijing University of Technology, Beijing, China
  5. Queen Mary, University of London, London, UK