Towards Training-Free Refinement for Semantic Indexing of Visual Media

Wang, Peng; Sun, Lifeng; Yang, Shiqang; Smeaton, Alan F.

doi:10.1007/978-3-319-27671-7_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Indexing of visual media based on content analysis has moved beyond individual concept detectors, and the focus is now on combining concepts or post-processing the outputs of individual concept detectors. Because the available training corpora are limited and are usually sparsely and imprecisely labeled, training-based refinement methods for semantic indexing of visual media struggle to capture relationships between concepts correctly, including co-occurrence and ontological relationships. In contrast to the training-dependent methods that dominate this field, this paper presents a training-free refinement (TFR) algorithm that enhances semantic indexing of visual media based purely on concept detection results, which makes semantic refinement of initial concept detections practical and flexible. TFR uses global and temporal neighbourhood information inferred from the original concept detections, by means of weighted non-negative matrix factorization and neighbourhood-based graph propagation, respectively. Any available ontological concept relationships can also be integrated into the model as an additional source of external a priori knowledge. Experiments on two datasets demonstrate the efficacy of the proposed TFR solution.
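The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the code below assumes a generic weighted NMF fitted with Lee-Seung style multiplicative updates, followed by a standard label-propagation step over temporally adjacent samples. All names and parameters (`C` for the samples-by-concepts confidence matrix, `W` for per-entry weights, `rank`, `window`, `alpha`) are hypothetical choices made for illustration.

```python
import numpy as np


def weighted_nmf(C, W, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate C by L @ R with L, R >= 0, minimising sum(W * (C - L @ R)**2).

    Assumed setup: C holds detector confidences in [0, 1] and W holds
    non-negative weights (e.g. low weight for unreliable detections).
    """
    rng = np.random.default_rng(seed)
    n, m = C.shape
    L = rng.random((n, rank)) + eps
    R = rng.random((rank, m)) + eps
    WC = W * C
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        L *= (WC @ R.T) / ((W * (L @ R)) @ R.T + eps)
        R *= (L.T @ WC) / (L.T @ (W * (L @ R)) + eps)
    return L, R


def weighted_error(C, W, L, R):
    """Weighted squared reconstruction error of the factorisation."""
    return float(np.sum(W * (C - L @ R) ** 2))


def temporal_propagation(C, window=2, alpha=0.5, n_iter=20):
    """Smooth confidences over temporally neighbouring samples.

    Rows of C are assumed to be in temporal order. Each sample is linked
    to the samples within `window` positions, weighted by cosine similarity.
    """
    n = C.shape[0]
    unit = C / (np.linalg.norm(C, axis=1, keepdims=True) + 1e-12)
    S = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                S[i, j] = unit[i] @ unit[j]
    # Row-normalise so each sample averages over its neighbours.
    S /= S.sum(axis=1, keepdims=True) + 1e-12
    F = C.copy()
    for _ in range(n_iter):
        # Blend neighbour evidence with the original detections.
        F = alpha * (S @ F) + (1 - alpha) * C
    return F
```

Under these assumptions, a refinement pass could compute `L, R = weighted_nmf(C, W, rank)` to obtain a globally smoothed estimate `L @ R`, and then apply `temporal_propagation` to that estimate. How the paper actually combines the two sources of information, and how it chooses the weights, is not stated in the abstract.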

P. Wang: This work was part-funded by the 973 Program under Grant No. 2011CB302206, the National Natural Science Foundation of China under Grant Nos. 61272231, 61472204 and 61502264, the Beijing Key Laboratory of Networked Multimedia, and Science Foundation Ireland under grant SFI/12/RC/2289. We also thank Prof. Philip S. Yu for helpful discussions.

Notes

1. http://vireo.cs.cityu.edu.hk/research/vireo374/

Author information

Authors and Affiliations

National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China
Peng Wang, Lifeng Sun & Shiqang Yang
Insight Centre for Data Analytics, Dublin City University, Glasnevin, Dublin 9, Ireland
Alan F. Smeaton

Corresponding author

Correspondence to Peng Wang.

Editor information

Editors and Affiliations

University of Texas at San Antonio, San Antonio, USA
Qi Tian
Dept. of Information Engineering, University of Trento, Povo, Trento, Italy
Nicu Sebe
EECS, University of Central Florida, Orlando, Florida, USA
Guo-Jun Qi
EURECOM, Sophia-Antipolis, France
Benoit Huet
Hefei University of Technology, Hefei, Anhui, China
Richang Hong
School of Computing and Information, Hefei University of Technology, Hefei, Anhui, China
Xueliang Liu

About this paper

Cite this paper

Wang, P., Sun, L., Yang, S., Smeaton, A.F. (2016). Towards Training-Free Refinement for Semantic Indexing of Visual Media. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_21

DOI: https://doi.org/10.1007/978-3-319-27671-7_21
Published: 03 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)
