Exploiting Multimodality in Video Hyperlinking to Improve Target Diversity

Bois, Rémi; Vukotić, Vedran; Simon, Anca-Roxana; Sicre, Ronan; Raymond, Christian; Sébillot, Pascale; Gravier, Guillaume

doi:10.1007/978-3-319-51814-5_16

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10133)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Video hyperlinking is the process of creating links within a collection of videos to help navigation and information seeking. Starting from a given set of video segments, called anchors, a set of related segments, called targets, must be provided. In past years, a number of content-based approaches have been proposed, with good results obtained by searching for target segments that are very similar to the anchor in terms of content and information. Unfortunately, relevance has been obtained at the expense of diversity. In this paper, we study multimodal approaches and their ability to provide a set of diverse yet relevant targets. We compare two recently introduced cross-modal approaches, namely deep autoencoders and bimodal LDA, and experimentally show that both provide significantly more diverse targets than a state-of-the-art baseline. Bimodal autoencoders offer the best trade-off between relevance and diversity, while bimodal LDA yields slightly more diverse targets at a lower precision.

Notes

1. https://github.com/Lasagne/Lasagne

Author information

Authors and Affiliations

CNRS, IRISA and Inria Rennes, Rennes, France
Rémi Bois & Guillaume Gravier
INSA Rennes, IRISA and Inria Rennes, Rennes, France
Vedran Vukotić, Christian Raymond & Pascale Sébillot
Inria, IRISA and Inria Rennes, Rennes, France
Ronan Sicre
University Rennes 1, IRISA and Inria Rennes, Rennes, France
Anca-Roxana Simon

Corresponding author

Correspondence to Rémi Bois.

Editor information

Editors and Affiliations

CNRS–IRISA, Rennes, France
Laurent Amsaleg
Reykjavík University, Reykjavik, Iceland
Gylfi Þór Guðmundsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
Reykjavik University, Reykjavik, Iceland
Björn Þór Jónsson
National Institute of Informatics, Tokyo, Japan
Shin’ichi Satoh

About this paper

Cite this paper

Bois, R. et al. (2017). Exploiting Multimodality in Video Hyperlinking to Improve Target Diversity. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10133. Springer, Cham. https://doi.org/10.1007/978-3-319-51814-5_16

DOI: https://doi.org/10.1007/978-3-319-51814-5_16
Published: 31 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51813-8
Online ISBN: 978-3-319-51814-5
eBook Packages: Computer Science, Computer Science (R0)
