Cross Modal Retrieval for Different Modalities in Multimedia

  • Conference paper
Computational Vision and Bio-Inspired Computing (ICCVBIC 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1108)

Abstract

Multimedia data such as text, images, and video is now used widely, driven by the growth of social media. Obtaining accurate multimedia information rapidly and effectively from a huge number of sources remains a challenging task. Cross-modal retrieval tries to break through the modality barrier between different media objects and can be regarded as a unified multimedia retrieval approach. It is becoming essential for many real-world applications, for example submitting an image to load the related text documents, or submitting text to select the most accurate results. Video retrieval depends on semantics, including characteristics such as graphical and notion-based video content. Because all these methodologies are exploited jointly, the cross-modal content structure of multimedia data is effectively preserved when the data is mapped into the common subspace. The aim is to group together the text, image, and video components of a multimedia document that show similar features, and to retrieve the most accurate image, text, or video for a given query based on semantics.
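
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that every item, whatever its modality, has already been mapped into a shared subspace, and it ranks items by cosine similarity to the query. The item identifiers, the three-dimensional vectors, the choice of cosine similarity, and the function names `cosine` and `retrieve` are hypothetical choices for illustration only; they are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def retrieve(query_vec, gallery, top_k=2):
    """Rank gallery items of any modality by similarity to the query.

    gallery holds (item_id, modality, vector) tuples whose vectors are
    assumed to live in the same shared subspace as query_vec.
    """
    scored = [
        (cosine(query_vec, vec), item_id, modality)
        for item_id, modality, vec in gallery
    ]
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return [(item_id, modality) for _, item_id, modality in scored[:top_k]]


# Hypothetical embeddings: in practice these would come from learned
# per-modality projections into the common subspace.
GALLERY = [
    ("img_beach", "image", [0.9, 0.1, 0.0]),
    ("txt_beach", "text", [0.8, 0.2, 0.1]),
    ("vid_city", "video", [0.0, 0.2, 0.9]),
]

if __name__ == "__main__":
    query = [1.0, 0.0, 0.0]  # hypothetical embedding of a query
    print(retrieve(query, GALLERY, top_k=2))
```

Because all modalities share one vector space in this sketch, a single ranking function serves image, text, and video queries alike; the quality of the results depends entirely on how well the learned projections preserve semantics.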

Author information

Corresponding author

Correspondence to T. J. Osheen.

Ethics declarations

✓ All authors declare that there is no conflict of interest.

✓ No humans/animals involved in this research work.

✓ We have used our own data.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Osheen, T.J., Mathew, L.S. (2020). Cross Modal Retrieval for Different Modalities in Multimedia. In: Smys, S., Tavares, J., Balas, V., Iliyasu, A. (eds) Computational Vision and Bio-Inspired Computing. ICCVBIC 2019. Advances in Intelligent Systems and Computing, vol 1108. Springer, Cham. https://doi.org/10.1007/978-3-030-37218-7_19
