Learning Contextual Metrics for Automatic Image Annotation

  • Zuotao Liu
  • Xiangdong Zhou
  • Yu Xiang
  • Yan-Tao Zheng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6297)


Semantic contextual information has been shown to be an important resource for improving scene and image recognition, but it has seldom been explored in previous work on distance metric learning (DML) for images. In this work, we present a novel Contextual Metric Learning (CML) method that learns a set of contextual distance metrics for real-world multi-label images. The relationships between classes are formulated as contextual constraints in the optimization framework to improve learning performance. In the experiments, we apply the proposed method to the automatic image annotation task. The experimental results show that our approach outperforms state-of-the-art DML algorithms.
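As background, DML methods of this kind typically learn a Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y)), where M is a positive semidefinite matrix. The sketch below only evaluates that distance; it is not the CML algorithm and performs no learning. The `mahalanobis` helper, the label names, the matrices, and the idea of keeping one matrix per label are all illustrative assumptions rather than details taken from the paper.

```python
import math

def mahalanobis(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ d, written out with plain lists.
    Md = [sum(m_ij * d_j for m_ij, d_j in zip(row, d)) for row in M]
    return math.sqrt(sum(d_i * md_i for d_i, md_i in zip(d, Md)))

# Hypothetical per-label metrics (assumed values, not learned ones).
metrics = {
    "sky": [[1.0, 0.0], [0.0, 1.0]],  # identity: reduces to Euclidean distance
    "sea": [[4.0, 0.0], [0.0, 1.0]],  # weights the first feature more heavily
}

x, y = [0.0, 0.0], [3.0, 4.0]
dist_sky = mahalanobis(x, y, metrics["sky"])
dist_sea = mahalanobis(x, y, metrics["sea"])
```

With the identity matrix the distance equals the Euclidean distance, and a larger diagonal entry makes differences in that feature count for more, which is the degree of freedom a metric learner adjusts.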


Mahalanobis Distance, Learning Framework, Semantic Context, Pairwise Constraint, Contextual Constraint
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Zuotao Liu (1)
  • Xiangdong Zhou (1)
  • Yu Xiang (1)
  • Yan-Tao Zheng (2)
  1. Fudan University, Shanghai, China
  2. Institute for Infocomm Research, Singapore
