Describing Clothing by Semantic Attributes

  • Huizhong Chen
  • Andrew Gallagher
  • Bernd Girod
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7574)


Describing clothing appearance with semantic attributes is an appealing technique for many important applications. In this paper, we propose a fully automated system that generates a list of nameable attributes for clothes on the human body in unconstrained images. We extract low-level features in a pose-adaptive manner and combine complementary features to learn attribute classifiers. Mutual dependencies between the attributes are then explored by a Conditional Random Field to further improve the predictions of the independent classifiers. We validate the performance of our system on a challenging clothing attribute dataset, and introduce a novel application of dressing style analysis that uses the semantic attributes produced by our system.
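
As a rough illustration of how modelling mutual dependencies can revise the output of independent classifiers, the toy sketch below scores every joint labelling of two binary attributes and keeps the best one. It is not the system described above: the attribute names (`necktie`, `collar`), the probabilities, the compatibility table, and the brute-force inference are all assumptions made for this example, since the abstract specifies neither the potentials nor the inference procedure.

```python
import itertools
import math


def map_labels(unary, pairwise):
    """Return the highest-scoring joint labelling by exhaustive enumeration.

    unary:    dict name -> probability (strictly between 0 and 1) that the
              attribute is present, as given by an independent classifier.
    pairwise: dict (name_a, name_b) -> 2x2 table of positive compatibility
              weights indexed [label_a][label_b]; larger means more compatible.
    """
    names = sorted(unary)
    best_score, best_assignment = -math.inf, None
    for labels in itertools.product((0, 1), repeat=len(names)):
        assignment = dict(zip(names, labels))
        # Unary terms: log-probability of each label under its own classifier.
        score = sum(
            math.log(unary[n] if assignment[n] else 1.0 - unary[n])
            for n in names
        )
        # Pairwise terms: log-compatibility of each linked pair of labels.
        score += sum(
            math.log(table[assignment[a]][assignment[b]])
            for (a, b), table in pairwise.items()
        )
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment


# Hypothetical classifier outputs (made-up numbers).
unary = {"necktie": 0.8, "collar": 0.45}

# Hypothetical dependency: a necktie without a collar is strongly penalised.
pairwise = {("necktie", "collar"): [[1.0, 1.0], [0.1, 1.0]]}

independent = {name: int(p >= 0.5) for name, p in unary.items()}
joint = map_labels(unary, pairwise)

print("independent:", independent)
print("joint:      ", joint)
```

With these made-up numbers, thresholding each classifier on its own rejects `collar`, while the joint labelling accepts it because the confident `necktie` prediction makes a collar more plausible. Exhaustive enumeration is only workable for a handful of attributes; a larger attribute set would need approximate inference.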


Keywords (machine-generated, not supplied by the authors): Semantic Attribute, SIFT Descriptor, Attribute Prediction, Gender Recognition, Solid Pattern



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Huizhong Chen (1)
  • Andrew Gallagher (2, 3)
  • Bernd Girod (1)

  1. Department of Electrical Engineering, Stanford University, Stanford, USA
  2. Kodak Research Laboratories, Rochester, USA
  3. Cornell University, Ithaca, USA
