Multi-label Classification for Image Annotation via Sparse Similarity Voting

  • Tomoya Sakai
  • Hayato Itoh
  • Atsushi Imiya
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6469)


We present a supervised multi-label classification method for automatic image annotation. Our method estimates the annotation labels for a test image by accumulating similarities between the test image and labeled training images. The similarities are measured from a sparse representation of the test image in terms of the training images, which avoids similarity votes for irrelevant classes. In addition, our sparse-representation-based multi-label classification can estimate a suitable combination of labels even if that combination does not appear in the training data. Experimental results on the PASCAL dataset suggest that the method is effective for image annotation compared with existing SVM-based multi-labeling methods. A nonlinear mapping of the image representation using the kernel trick is also shown to improve annotation performance.
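The voting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sparse solver, how votes are weighted, or how labels are selected, so the greedy orthogonal matching pursuit routine `omp`, the use of absolute coefficients as similarity votes, the normalization, and the `threshold` value are all assumptions made for illustration. The function names and the toy data are hypothetical.

```python
import numpy as np


def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit (illustrative solver choice).

    D: (d, n) matrix whose columns are unit-norm training features.
    x: (d,) test feature. Returns an (n,) sparse coefficient vector.
    """
    x = np.asarray(x, dtype=float)
    n = D.shape[1]
    selected = np.zeros(n, dtype=bool)
    support = []
    sol = np.zeros(0)
    residual = x.copy()
    for _ in range(n_nonzero):
        # Correlate every training sample with the current residual.
        corr = np.abs(D.T @ residual)
        corr[selected] = 0.0
        k = int(np.argmax(corr))
        if corr[k] < 1e-12:
            break  # nothing left to explain
        selected[k] = True
        support.append(k)
        # Re-fit all selected coefficients by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(n)
    if support:
        coef[support] = sol
    return coef


def vote_labels(D, Y, x, n_nonzero=3, threshold=0.3):
    """Accumulate sparse-coefficient magnitudes as per-label votes.

    Y: (n, L) binary matrix, Y[i, j] = 1 if training image i has label j.
    The weighting and threshold are assumptions, not taken from the paper.
    """
    weights = np.abs(omp(D, x, n_nonzero))
    votes = weights @ Y
    total = weights.sum()
    scores = votes / total if total > 0 else votes
    return scores, scores >= threshold


# Toy data: four training images, three labels, each image has ONE label.
D_train = np.eye(4)
Y_train = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 1]])
# The test image mixes training images 0 and 1 only.
x_test = np.array([0.6, 0.8, 0.0, 0.0])

scores, predicted = vote_labels(D_train, Y_train, x_test, n_nonzero=2)
print(scores)     # vote share per label
print(predicted)  # labels 0 and 1 are selected, label 2 is not
```

In this toy case no training image carries labels 0 and 1 together, yet both are predicted, because the sparse code places weight only on the two relevant training images and none on the images of label 2. A kernelized variant would replace the inner products `D.T @ residual` and the least-squares fit with kernel evaluations; that is not shown here.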


Test Image, Training Image, Sparse Representation, Image Annotation, Orthogonal Matching Pursuit





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tomoya Sakai (1)
  • Hayato Itoh (2)
  • Atsushi Imiya (3)

  1. Faculty of Engineering, Nagasaki University, Japan
  2. Graduate School of Science and Technology, Chiba University, Japan
  3. Institute of Media and Information Technology, Chiba University, Japan
