Towards Training-Free Refinement for Semantic Indexing of Visual Media

  • Conference paper
MultiMedia Modeling (MMM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Abstract

Indexing of visual media based on content analysis has moved beyond individual concept detectors; the focus is now on combining concepts or post-processing the outputs of individual concept detectors. Because available training corpora are limited and are usually sparsely and imprecisely labeled, training-based refinement methods for semantic indexing of visual media struggle to capture relationships between concepts correctly, including co-occurrence and ontological relationships. In contrast to the training-dependent methods that dominate this field, this paper presents a training-free refinement (TFR) algorithm that enhances semantic indexing of visual media based purely on concept detection results, which makes refinement of the initial concept detections practical and flexible. TFR uses global and temporal neighbourhood information inferred from the original concept detections, by means of weighted non-negative matrix factorization and neighbourhood-based graph propagation, respectively. Any available ontological concept relationships can also be integrated into the model as an additional source of external a priori knowledge. Experiments on two datasets demonstrate the efficacy of the proposed TFR solution.
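
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes a shots-by-concepts score matrix `X`, a hypothetical per-entry confidence weighting `W`, standard multiplicative updates for a weighted Frobenius objective, and a simple chain graph in which each shot is linked to its immediate temporal neighbours. The function names, the rank, the weighting rule and the blending factor `alpha` are all illustrative assumptions.

```python
import numpy as np

def weighted_nmf(X, W, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate X by U @ V, minimising sum(W * (X - U @ V) ** 2).

    X and W are non-negative arrays of the same shape. The updates are the
    usual multiplicative rules for a weighted Frobenius objective.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, rank)) + eps
    V = rng.random((rank, m)) + eps
    WX = W * X
    losses = []
    for _ in range(n_iter):
        U *= (WX @ V.T) / ((W * (U @ V)) @ V.T + eps)
        V *= (U.T @ WX) / (U.T @ (W * (U @ V)) + eps)
        losses.append(float(np.sum(W * (X - U @ V) ** 2)))
    return U, V, losses

def temporal_propagate(S, alpha=0.5, n_steps=5):
    """Blend each shot (row) with the mean of its two temporal neighbours.

    Assumption: a chain graph over consecutive shots; the first and last
    shots reuse their own scores in place of the missing neighbour.
    """
    S0 = S.copy()
    out = S.copy()
    for _ in range(n_steps):
        padded = np.pad(out, ((1, 1), (0, 0)), mode="edge")
        neighbours = 0.5 * (padded[:-2] + padded[2:])
        out = (1 - alpha) * S0 + alpha * neighbours
    return out

# Synthetic data only: 30 shots, 8 concepts, detector scores in [0, 1).
rng = np.random.default_rng(1)
X = rng.random((30, 8))
# Hypothetical weighting: trust confident positive detections more.
W = np.where(X >= 0.5, 1.0, 0.2)

U, V, losses = weighted_nmf(X, W, rank=3)
refined = temporal_propagate(np.clip(U @ V, 0.0, 1.0))
```

In this sketch the low-rank reconstruction `U @ V` plays the role of the global information and the chain-graph smoothing plays the role of the temporal neighbourhood; how the two are actually combined, and how ontological relationships enter the model, is described in the full paper rather than here.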

P. Wang: This work was part-funded by the 973 Program under Grant No. 2011CB302206, the National Natural Science Foundation of China under Grant Nos. 61272231, 61472204 and 61502264, the Beijing Key Laboratory of Networked Multimedia, and Science Foundation Ireland under grant SFI/12/RC/2289. We also thank Prof. Philip S. Yu for helpful discussions.

Notes

  1. http://vireo.cs.cityu.edu.hk/research/vireo374/.

Author information

Corresponding author

Correspondence to Peng Wang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, P., Sun, L., Yang, S., Smeaton, A.F. (2016). Towards Training-Free Refinement for Semantic Indexing of Visual Media. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_21

  • DOI: https://doi.org/10.1007/978-3-319-27671-7_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27670-0

  • Online ISBN: 978-3-319-27671-7

  • eBook Packages: Computer Science (R0)
