Multimedia Tools and Applications, Volume 78, Issue 10, pp 13213–13225

Image annotation refinement via 2P-KNN based group sparse reconstruction

  • Qian Ji
  • Liyan Zhang
  • Xiangbo Shu
  • Jinhui Tang


Image annotation aims to predict labels that accurately describe the semantic content of images. Many methods have been proposed for this problem in the past few years, but the labels they predict are often incomplete, insufficient and noisy. In this paper, we propose a new method, denoted 2PKNN-GSR (Group Sparse Reconstruction), for image annotation and label refinement. First, we obtain predicted labels for the testing images using an existing method, 2PKNN, a two-step variant of the classical K-nearest neighbor algorithm. Then, according to the obtained labels, we divide the K nearest neighbors of each testing image among the training images into several groups. Finally, we apply a group sparse reconstruction algorithm to refine the labels obtained in the first step. Experimental results on three standard datasets, Corel 5K, IAPR TC12 and ESP Game, show that the proposed method outperforms state-of-the-art methods.
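The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain Euclidean distance stands in for whatever distance the paper uses, each neighbor is assigned to a single group (that of its highest-scoring label) so that groups do not overlap, and the group sparse reconstruction is solved with a generic proximal gradient loop. All function names and the parameters `k1`, `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def two_pass_knn(x, X_train, Y_train, k1=2):
    """Pass 1: per label, keep the k1 nearest training images carrying it.
    Pass 2: score each label by distance-weighted votes of the kept images."""
    dist = np.linalg.norm(X_train - x, axis=1)
    keep = set()
    for label in range(Y_train.shape[1]):
        owners = np.flatnonzero(Y_train[:, label])
        keep.update(owners[np.argsort(dist[owners])[:k1]].tolist())
    idx = np.array(sorted(keep))
    scores = np.exp(-dist[idx]) @ Y_train[idx]
    return idx, scores / scores.sum()

def group_neighbors(idx, Y_train, scores):
    """Assumption: put each neighbor in the group of its highest-scoring label."""
    return np.array([np.argmax(np.where(Y_train[i] > 0, scores, -np.inf))
                     for i in idx])

def group_sparse_reconstruct(x, A, groups, lam=0.1, n_iter=500):
    """Minimise 0.5*||x - A w||^2 + lam * sum_g ||w_g||_2 (proximal gradient)."""
    w = np.zeros(A.shape[1])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        w = w - step * (A.T @ (A @ w - x))            # gradient step
        for g in np.unique(groups):                   # block soft-thresholding
            m = groups == g
            norm = np.linalg.norm(w[m])
            w[m] *= max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
    return w

def refine_labels(x, X_train, Y_train, k1=2, lam=0.1):
    """Return (initial scores, refined scores) for one test image x."""
    idx, scores = two_pass_knn(x, X_train, Y_train, k1)
    groups = group_neighbors(idx, Y_train, scores)
    # Columns of the dictionary are the feature vectors of the kept neighbors.
    w = group_sparse_reconstruct(x, X_train[idx].T, groups, lam)
    refined = np.abs(w) @ Y_train[idx]                # labels weighted by |w|
    total = refined.sum()
    return scores, (refined / total if total > 0 else scores)

# Toy data: two images per label, test image close to the label-0 images.
X_train = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
Y_train = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
initial, refined = refine_labels(np.array([1.0, 0.05]), X_train, Y_train)
print(initial, refined)
```

In this toy case the group penalty suppresses the neighbors from the unrelated group, so the refined scores concentrate on the label whose neighbors actually reconstruct the test image. Handling overlapping groups, which arise when a neighbor carries several predicted labels, would need a different penalty and is left out of the sketch.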


Keywords: Image annotation, K nearest neighbor, Group sparsity, Sparse reconstruction



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Qian Ji (1)
  • Liyan Zhang (2, corresponding author)
  • Xiangbo Shu (1)
  • Jinhui Tang (1)

  1. Nanjing University of Science and Technology, Jiangsu, China
  2. Nanjing University of Aeronautics and Astronautics, Jiangsu, China
