Automatic Attribute Discovery and Characterization from Noisy Web Data

  • Tamara L. Berg
  • Alexander C. Berg
  • Jonathan Shih
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6311)


Abstract

It is common to use domain-specific terminology – attributes – to describe the visual appearance of objects. In order to scale the use of these describable visual attributes to a large number of categories, especially those not well studied by psychologists or linguists, it will be necessary to find alternative techniques for identifying attribute vocabularies and for learning to recognize attributes without hand-labeled training data. We demonstrate that both of these tasks can be accomplished automatically by mining text and image data sampled from the Internet. The proposed approach also characterizes attributes according to their visual representation (global or local) and type (color, texture, or shape). This work focuses on discovering attributes and their visual appearance, and is as agnostic as possible about the textual description.
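The abstract's central idea, identifying candidate attribute words by how predictable their presence is from image appearance alone, can be sketched in plain Python. This is an illustrative toy, not the paper's actual pipeline: the `visualness` function name, the nearest-centroid classifier, and the half/half train–test split are all assumptions made for the sketch. The intuition it demonstrates is that a word is a plausible visual attribute when a classifier trained on image features can predict, on held-out data, whether the word appears in the text accompanying an image.

```python
# Sketch: score candidate attribute words by "visual predictability".
# A word whose presence in web text can be predicted from the paired
# image's features is a candidate visual attribute; a word with no
# visual correlate scores near chance. The nearest-centroid classifier
# here is an illustrative stand-in for a real learned classifier.

import random

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def visualness(examples):
    """examples: list of (feature_vector, word_present) pairs.

    Trains a nearest-centroid classifier on half the data and returns
    its accuracy on the held-out half: a crude proxy for how visually
    predictable the word is.
    """
    random.seed(0)            # deterministic split for the sketch
    random.shuffle(examples)
    split = len(examples) // 2
    train, test = examples[:split], examples[split:]
    pos = [f for f, present in train if present]
    neg = [f for f, present in train if not present]
    if not pos or not neg:
        return 0.0            # cannot train a two-class centroid model
    cp, cn = centroid(pos), centroid(neg)
    correct = sum(1 for f, present in test
                  if (dist2(f, cp) < dist2(f, cn)) == present)
    return correct / len(test)
```

In use, one would compute `visualness` for every frequent word harvested from the web text and keep the top-ranked words as the attribute vocabulary; a word like "red" should separate cleanly in feature space, while a non-visual word like "beautiful" should hover near chance.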


Keywords

Object Category · Visual Appearance · Visual Attribute · Potential Attribute · Multiple Instance Learning

These keywords were added by machine and not by the authors. This process is experimental, and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Tamara L. Berg, Stony Brook University, Stony Brook, USA
  • Alexander C. Berg, Columbia University, New York, USA
  • Jonathan Shih, University of California, Berkeley, USA
