Sparse Representations and Distance Learning for Attribute Based Category Recognition

  • Grigorios Tsagkatakis
  • Andreas Savakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6553)


Traditional approaches to object recognition require training examples from each class and the application of class-specific classifiers, but in real-world situations the sheer number of image classes makes this task daunting. A novel approach to object recognition is attribute-based classification: instead of training classifiers to recognize instances of specific object classes, classifiers are trained on attributes of the object images, and these attributes are subsequently used for object recognition. The attribute-based paradigm offers significant advantages, including the ability to train classifiers without any visual examples. We begin by discussing a scenario for object recognition on mobile devices in which attribute prediction and attribute-to-class mapping are decoupled in order to meet the specific resource constraints of mobile systems. We then present two extensions to the attribute-based classification paradigm by introducing alternative approaches to attribute prediction and attribute-to-class mapping. For attribute prediction, we employ the recently proposed Sparse Representations Classification scheme, which offers significant benefits over previous SVM-based approaches, such as increased accuracy and elimination of the training stage. For attribute-to-class mapping, we employ a Distance Metric Learning algorithm that automatically infers the significance of each attribute instead of assuming uniform attribute importance. The benefits of the proposed extensions are validated through experimental results.
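As a rough illustration of the two stages named in the abstract, the following Python sketch predicts one binary attribute with a sparse-representation classifier and then maps a vector of predicted attributes to a class using per-attribute weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the l1 problem is solved with plain ISTA (a swapped-in solver), the attribute weights are supplied by hand rather than learned by a metric learning algorithm, and all names (`ista_lasso`, `src_predict`, `map_to_class`) and the toy data are hypothetical.

```python
import numpy as np

def ista_lasso(D, y, lam=0.05, n_iter=500):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x

def src_predict(D, labels, y, lam=0.05):
    """Return the label whose training atoms give the smallest reconstruction residual."""
    x = ista_lasso(D, y, lam)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only coefficients of label c
        residuals[int(c)] = np.linalg.norm(y - D @ x_c)
    return min(residuals, key=residuals.get)

def map_to_class(pred_attrs, class_signatures, weights):
    """Index of the nearest class signature under a diagonal weighted squared distance."""
    d = ((class_signatures - pred_attrs) ** 2 * weights).sum(axis=1)
    return int(np.argmin(d))

# Toy dictionary (assumption): columns are unit-norm training features,
# labels mark whether the attribute is present (1) or absent (0).
atoms = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.9, 0.1]]).T
D = atoms / np.linalg.norm(atoms, axis=0)
labels = np.array([1, 1, 0, 0])

# Test feature lying close to the "attribute present" atoms.
y = np.array([0.95, 0.05, 0.0])
y = y / np.linalg.norm(y)
attribute_present = src_predict(D, labels, y)

# Toy attribute-to-class mapping (assumption): two class signatures over
# three attributes, compared under uniform and non-uniform weights.
signatures = np.array([[1.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
pred = np.array([1.0, 0.0, 1.0])
class_uniform = map_to_class(pred, signatures, np.ones(3))
class_weighted = map_to_class(pred, signatures, np.array([5.0, 1.0, 1.0]))
```

In this toy setup the nearest class changes once the first attribute is given a larger weight, which is the kind of effect that non-uniform attribute importance is meant to capture. In practice the weights would come from a metric learning procedure rather than being set by hand.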


  • Attribute Based Object Recognition
  • Sparse Representations Classification
  • Distance Metric Learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Grigorios Tsagkatakis (1)
  • Andreas Savakis (2)
  1. Center for Imaging Science, Rochester Institute of Technology, Rochester, USA
  2. Department of Computer Engineering, Rochester Institute of Technology, Rochester, USA
