Abstract
With the explosive growth of multimedia data, different types of media often coexist in web repositories, making it increasingly important to explore the intricate correlations across media in order to improve retrieval over cross-media data. However, effectively discovering the correlations between multi-modal data remains a barrier to successful cross-media information retrieval. To address this problem, we propose a novel model that projects both the text modality and the visual modality into a common semantic feature space built on convolutional neural network features. Unlike existing approaches, the proposed model learns a high-level feature representation shared by multiple modalities for cross-media information retrieval. Experiments on a public benchmark dataset demonstrate the effectiveness of our approach.
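The core idea in the abstract, mapping text features and CNN image features into one shared semantic space and ranking by similarity, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the feature dimensions (4096-d CNN features, 1000-d text features, a 128-d shared space), the linear projections `W_img` and `W_txt` (shown as random matrices; the paper learns them), and the cosine-similarity ranking are not taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 4096-d image features (e.g. a CNN fc layer),
# 1000-d text features, projected into a shared 128-d semantic space.
# W_img and W_txt stand in for learned projections; random here.
D_IMG, D_TXT, D_SHARED = 4096, 1000, 128
W_img = rng.standard_normal((D_IMG, D_SHARED)) / np.sqrt(D_IMG)
W_txt = rng.standard_normal((D_TXT, D_SHARED)) / np.sqrt(D_TXT)

def embed(x, W):
    """Project raw modality features into the shared space, L2-normalized."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def rank_images(text_feat, image_feats):
    """Text-to-image retrieval: rank gallery images by cosine similarity."""
    q = embed(text_feat, W_txt)           # query in shared space
    gallery = embed(image_feats, W_img)   # (N, D_SHARED)
    scores = gallery @ q                  # cosine similarity (unit vectors)
    return np.argsort(-scores), scores

# Toy query: one text vector ranked against five random "images".
ranking, scores = rank_images(rng.standard_normal(D_TXT),
                              rng.standard_normal((5, D_IMG)))
```

In the actual model, the projections would be trained so that matching image-text pairs land close together in the shared space; this sketch only shows the retrieval-time geometry.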
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
Cite this paper
Bai, L., Yu, T., Guo, J., Yang, Z., Xie, Y. (2016). Cross-Media Information Retrieval with Deep Convolutional Neural Network. In: Gong, M., Pan, L., Song, T., Zhang, G. (eds) Bio-inspired Computing – Theories and Applications. BIC-TA 2016. Communications in Computer and Information Science, vol 681. Springer, Singapore. https://doi.org/10.1007/978-981-10-3611-8_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3610-1
Online ISBN: 978-981-10-3611-8