Multiple hypergraph ranking for video concept detection

Han, Ya-hong; Shao, Jian; Wu, Fei; Wei, Bao-gang

doi:10.1631/jzus.C0910453

Published: 03 July 2010

Volume 11, pages 525–537, (2010)

Journal of Zhejiang University SCIENCE C

Abstract

This paper tackles video concept detection with a multi-modality fusion method. Motivated by multi-view learning algorithms, the multi-modality features of videos can be represented by multiple graphs, and graph-based semi-supervised learning methods can be extended to multiple graphs to predict semantic labels for unlabeled video data. However, traditional graphs represent only homogeneous pairwise linking relations, so the high-order correlations inherent in videos, such as high-order visual similarities, are ignored. In this paper we represent heterogeneous features by multiple hypergraphs, so that high-order correlated samples can be associated with hyperedges. Furthermore, we propose the multi-hypergraph ranking (MHR) algorithm, which defines a Markov random walk on each hypergraph and then forms mixture Markov chains to perform transductive learning on multiple hypergraphs. In experiments on the TRECVID dataset, a triple-hypergraph consisting of visual features, textual features, and multiple labeled tags is constructed to predict concept labels for unlabeled video shots within the MHR framework. Experimental results show that our approach is effective.
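
The idea of walking on several hypergraphs and ranking on the mixed chain can be illustrated with a minimal sketch. It is not the MHR algorithm itself, whose details are not given in this abstract. It assumes the commonly used hypergraph random walk, in which a walker picks an incident hyperedge in proportion to its weight and then a vertex uniformly within that hyperedge, and it mixes the per-hypergraph chains with fixed weights `betas`, which is a simplification. The function names, the toy incidence matrices, and the parameters `alpha` and `n_iter` are all hypothetical.

```python
import numpy as np


def hypergraph_transition(H, w):
    """Random-walk transition matrix P = Dv^-1 H W De^-1 H^T of one hypergraph.

    H is an (n_vertices, n_hyperedges) 0/1 incidence matrix and w holds one
    positive weight per hyperedge. Every vertex must belong to at least one
    hyperedge, and every hyperedge must be non-empty.
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w           # vertex degree: total weight of incident hyperedges
    d_e = H.sum(axis=0)   # hyperedge degree: number of member vertices
    return ((H * w) / d_e) @ H.T / d_v[:, None]


def mixed_ranking(transitions, betas, y, alpha=0.9, n_iter=200):
    """Iterate f <- alpha * P f + (1 - alpha) * y on the mixed chain P.

    ASSUMPTION: P is a fixed convex combination of the per-hypergraph chains;
    the weighting used in the paper may differ.
    """
    P = sum(b * P_k for b, P_k in zip(betas, transitions))
    y = np.asarray(y, dtype=float)
    f = y.copy()
    for _ in range(n_iter):
        f = alpha * (P @ f) + (1.0 - alpha) * y
    return P, f


# Toy data (hypothetical): four video shots described by two modalities.
H_visual = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # hyperedges {0,1}, {2,3}
H_text = np.array([[1, 0], [1, 0], [1, 0], [0, 1]])    # hyperedges {0,1,2}, {3}
y = [1, 0, 0, 0]                                       # shot 0 is the labeled positive

P, scores = mixed_ranking(
    [hypergraph_transition(H_visual, [1, 1]), hypergraph_transition(H_text, [1, 1])],
    betas=[0.5, 0.5],
    y=y,
)
print(scores)
```

In this toy example, shot 1 shares a hyperedge with the labeled shot in both modalities, shot 2 in only one, and shot 3 in neither, so the scores decrease in that order.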

Author information

Authors and Affiliations

Department of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
Ya-hong Han, Jian Shao, Fei Wu & Bao-gang Wei

Corresponding author

Correspondence to Jian Shao.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 60603096 and 60673088), the National High-Tech Research and Development Program (863) of China (No. 2006AA010107), and the Program for Changjiang Scholars and Innovative Research Team in University of China (No. IRT0652).

About this article

Cite this article

Han, Yh., Shao, J., Wu, F. et al. Multiple hypergraph ranking for video concept detection. J. Zhejiang Univ. - Sci. C 11, 525–537 (2010). https://doi.org/10.1631/jzus.C0910453

Received: 25 July 2009
Accepted: 30 September 2009
Published: 03 July 2010
Issue Date: July 2010
DOI: https://doi.org/10.1631/jzus.C0910453

CLC number

TP391
