Multimedia Tools and Applications

, Volume 77, Issue 17, pp 22185–22198 | Cite as

Semantic binary coding for visual recognition via joint concept-attribute modelling

  • Xing XuEmail author
  • Haiping Wu
  • Yang Yang
  • Fumin Shen
  • Ning Xie
  • Yanli Ji


Recent years have witnessed the unprecedented efforts of visual representation for enabling various efficient and effective multimedia applications. In this paper, we propose a novel visual representation learning framework, which generates efficient semantic hash codes for visual samples by substantially exploring concepts, semantic attributes as well as their inter-correlations. Specifically, we construct a conceptual space, where the semantic knowledge of concepts and attributes is embedded. Then, we develop an effective on-line feature coding scheme for visual objects by leveraging the inter-concept relationships through the intermediate representative power of attributes. The code process is formulated as an overlapping group lasso problem, which can be efficiently solved. Finally, we may binarize the visual representation to generate efficient hash codes. Extensive experiments have been conducted to illustrate the superiority of our proposed framework on visual retrieval task as compared to state-of-the-art methods.


Visual computing Binary representation Semantics Attributes 



This work was supported in part by the National Science Foundation of China under Project 61572108, Project 61602089, Project 61502081, Project 61632007, and the Fundamental Research Funds for the Central Universities under Project ZYGX2014Z007, Project ZYGX2015J055 and the 111 Project No. B17008.


  1. 1.
    Chiang C-K, Su T-F, Yen C, Lai S-H (2013) Multi-attributed dictionary learning for sparse coding. In: CVPR, pp 1137–1144Google Scholar
  2. 2.
    Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, vol 1, pp 886–893Google Scholar
  3. 3.
    Datar M, Immorlica N, Indyk P, Mirrokni VS (2004) Locality-sensitive hashing scheme based on p-stable distributions. In: SCG. ACM, pp 253–262Google Scholar
  4. 4.
    Farhadi A, Endres I, Hoiem D, Forsyth D (2009) Describing objects by their attributes. In: CVPR, pp 1778–1785Google Scholar
  5. 5.
    Farhadi A, Endres I, Hoiem D (2010) Attribute-centric recognition for cross-category generalization. In: CVPR, pp 2352–2359Google Scholar
  6. 6.
    Gao S, Chia L-T, Tsang IW-H (2011) Multi-layer group sparse coding—for concurrent image classification and annotation. In: CVPR, pp 2809–2816Google Scholar
  7. 7.
    Gong Y, Lazebnik S (2011) Iterative quantization: a procrustean approach to learning binary codes. In: CVPR, pp 817–824Google Scholar
  8. 8.
    Hu M, Yang Y, Shen F, Zhang L, Shen HT, Xuelong L (2017) Robust web image annotation via exploring multi-facet and structural knowledge. IEEE Trans Image Process 26(10):4871–4884MathSciNetCrossRefGoogle Scholar
  9. 9.
    Hu M, Yang Y, Shen F, Xie N, Shen HT (2018) Hashing with angular reconstructive embeddings. IEEE Trans Image Process 27(2):545–555MathSciNetCrossRefGoogle Scholar
  10. 10.
    Huang J, Liu H, Shen J, Yan S (2013) Towards efficient sparse coding for scalable image annotation. In: MM. ACM, pp 947–956Google Scholar
  11. 11.
    Jacob L, Obozinski G, Vert J-P (2009) Group lasso with overlap and graph lasso. In: ICML, pp 433–440Google Scholar
  12. 12.
    Kang W-C, Li W-J, Zhou Z-H (2016) Column sampling based discrete supervised hashing. In: AAAI, pp 1230–1236Google Scholar
  13. 13.
    Lampert CH, Nickisch H, Harmeling S (2009) Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp 951–958Google Scholar
  14. 14.
    Li C, Feng Z, Han Y (2016) Image attribute learning with ontology guided fused lasso. Multimedia Tools Appl 75(12):7029–7043CrossRefGoogle Scholar
  15. 15.
    Lin G, Shen C, Shi Q, van den Hengel A, Suter D (2014) Fast supervised hashing with decision trees for high-dimensional data. In: CVPR, pp 1963–1970Google Scholar
  16. 16.
    Liu J, Ji S, Ye J (2009) SLEP: sparse learning with efficient projections. Arizona State UniversityGoogle Scholar
  17. 17.
    Liu W, Wang J, Ji R, Jiang Y-G, Chang S-F (2012) Supervised hashing with kernels. In: CVPR, pp 2074–2081Google Scholar
  18. 18.
    Lowe DG (1999) Object recognition from local scale-invariant features. In: ICCV, vol 2, pp 1150–1157Google Scholar
  19. 19.
    Luo Y, Yang Y, Shen F, Huang Z, Zhou P, Shen HT (2017) Robust discrete code modeling for supervised hashing. Pattern Recogn 75:128–135CrossRefGoogle Scholar
  20. 20.
    Nie L, Yan S, Wang M, Hong R, Chua T-S (2012) Harvesting visual concepts for image search with complex queries. In: Proceedings of the 20th ACM international conference on multimedia, pp 59–68Google Scholar
  21. 21.
    Ouyang W, Li H, Zeng X, Wang X (2015) Learning deep representation with large-scale attributes. In: CVPR, pp 1895–1903Google Scholar
  22. 22.
    Raginsky M, Lazebnik S (2009) Locality-sensitive binary codes from shift-invariant kernels. In: NIPS, pp 1509–1517Google Scholar
  23. 23.
    Ri C, Yao M (2015) Bayesian network based semantic image classification with attributed relational graph. Multimedia Tools Appl 74(13):4965–4986CrossRefGoogle Scholar
  24. 24.
    Shih TK (2002) Distributed multimedia databases: techniques and applications. IGI Global, HersheyCrossRefGoogle Scholar
  25. 25.
    Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
  26. 26.
    Tang J, Shao L, Li X (2014) Efficient dictionary learning for visual categorization. Comput Vis Image Underst 124:91–98CrossRefGoogle Scholar
  27. 27.
    Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Methodol 73(3):273–282MathSciNetCrossRefzbMATHGoogle Scholar
  28. 28.
    Wang B, Yang Y, Xu X, Hanjalic A, Shen HT (2017) Adversarial cross-modal retrieval. In: ACM multimedia, pp 154–162Google Scholar
  29. 29.
    Wu L, Wang Y, Pan S (2016) Exploiting attribute correlations: a novel trace lasso-based weakly supervised dictionary learning method. IEEE Transactions on Cybernetics 47(12):4497–4508CrossRefGoogle Scholar
  30. 30.
    Wu H, Yang Y, Xu X, Shen F, Xie N, Ji Y (2017) Exploiting concept correlation with attributes for semantic binary representation learning. In: ICIMCSGoogle Scholar
  31. 31.
    Xu X, Shen F, Yang Y, Shen HT, Li X (2017) Learning discriminative binary codes for large-scale cross-modal retrieval. IEEE Trans Image Process 26(5):2494–2507MathSciNetCrossRefGoogle Scholar
  32. 32.
    Yan Y, Nie F, Li W, Gao C, Yang Y, Xu D (2016) Image classification by cross-media active learning with privileged information. IEEE Trans Multimedia 18 (12):2494–2502CrossRefGoogle Scholar
  33. 33.
    Yang Y, Yang Y, Huang Z, Shen HT, Nie F (2011) Tag localization with spatial correlations and joint group sparsity. In: CVPR, pp 881–888Google Scholar
  34. 34.
    Yang Y, Nie F, Xu D, Luo J, Zhuang Y, Pan Y (2012) A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Trans Pattern Anal Mach Intell 34(4):723–742CrossRefGoogle Scholar
  35. 35.
    Yang Y, Wu F, Nie F, Shen HT, Zhuang Y, Hauptmann AG (2012) Web and personal image annotation by mining label correlation with relaxed visual graph embedding. IEEE Trans Image Process 21(3):1339–1351MathSciNetCrossRefzbMATHGoogle Scholar
  36. 36.
    Yang Y, Zhang H, Zhang M, Shen F, Li X (2015) Visual coding in a semantic hierarchy. In: MM, pp 59–68Google Scholar
  37. 37.
    Yang Y, Zhang H, Zhang M, Shen F, Li X (2015) Visual coding in a semantic hierarchy. In: Proceedings of the 23rd ACM international conference on multimedia, MM ’15, pp 59–68Google Scholar
  38. 38.
    Yang Y, Luo Y, Chen W, Shen F, Shao J, Shen HT (2016) Zero-shot hashing via transferring supervised knowledge. In: Proceedings of the 2016 ACM on multimedia conference, pp 1286–1295Google Scholar
  39. 39.
    Yang B, Gu C, Wu K, Zhang T, Guan X (2017) Simultaneous dimensionality reduction and dictionary learning for sparse representation based classification. Multimedia Tools Appl 76(6):8969–8990CrossRefGoogle Scholar
  40. 40.
    Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B Stat Methodol 68(1):49–67MathSciNetCrossRefzbMATHGoogle Scholar
  41. 41.
    Zhang S, Huang J, Li H, Metaxas DN (2012) Automatic image annotation and retrieval using group sparsity. IEEE Trans Syst Man Cybern B Cybern 42(3):838–849CrossRefGoogle Scholar
  42. 42.
    Zhang H, Zha Z, Yang Y, Yan S, Gao Y, Chua T (2013) Attribute-augmented semantic hierarchy: towards bridging semantic gap and intention gap in image retrieval. In: ACM multimedia conference, MM ’13, Barcelona, Spain, October 21–25, 2013, pp 33–42Google Scholar
  43. 43.
    Zhang H, Shen F, Liu W, He X, Luan H, Chua T (2016) Discrete collaborative filtering. In: ACM SIGIR, pp 325–334Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Guizhou Provincial Key Laboratory of Public Big DataGuizhou UniversityGuiyangChina
  2. 2.Center for Future Media & School of Computer Science and EngineeringUniversity of Electronic Science and Technology of ChinaChengduChina

Personalised recommendations