An Adversarial Learning and Canonical Correlation Analysis Based Cross-Modal Retrieval Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11431)

Abstract

The key to cross-modal retrieval is finding a maximally correlated subspace across multiple datasets. This paper introduces a novel Adversarial Learning and Canonical Correlation Analysis based Cross-Modal Retrieval (ALCCA-CMR) model. For each modality, the ALCCA phase finds an effective common subspace and computes similarity through a canonical correlation analysis embedding for cross-modal retrieval. We demonstrate an application of the ALCCA-CMR model to a dataset with two modalities. Experimental results on real music data show the efficacy of the proposed method compared with existing ones.
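
To make the idea of a maximally correlated subspace concrete, the following is a minimal sketch of plain linear canonical correlation analysis used for retrieval between two modalities. It is not the ALCCA-CMR model: the adversarial learning phase is omitted entirely, and the function names `fit_cca` and `retrieve`, the regularization constant `reg`, and the use of cosine similarity for ranking are all assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-3):
    """Fit linear CCA on paired samples X (n, dx) and Y (n, dy).

    Returns (mean_x, mean_y, Wx, Wy, corrs), where Wx and Wy project each
    modality into a shared k-dimensional subspace and corrs holds the
    top-k (regularized) canonical correlations.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my

    # Covariance estimates; reg keeps the within-modality blocks invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :k]
    Wy = Ky @ Vt.T[:, :k]
    return mx, my, Wx, Wy, s[:k]


def retrieve(X_query, Y_candidates, model):
    """Rank candidates from modality Y for each query from modality X.

    Returns an (n_queries, n_candidates) array of candidate indices,
    best match first, ranked by cosine similarity in the shared subspace.
    """
    mx, my, Wx, Wy, _ = model
    a = (X_query - mx) @ Wx
    b = (Y_candidates - my) @ Wy
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(a @ b.T), axis=1)


if __name__ == "__main__":
    # Synthetic stand-in for two modalities driven by a shared latent factor.
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(200, 4))
    X = Z @ rng.normal(size=(4, 20)) + 0.05 * rng.normal(size=(200, 20))
    Y = Z @ rng.normal(size=(4, 15)) + 0.05 * rng.normal(size=(200, 15))

    model = fit_cca(X, Y, k=4)
    ranks = retrieve(X, Y, model)
    top1 = np.mean(ranks[:, 0] == np.arange(len(X)))
    print("canonical correlations:", np.round(model[-1], 3))
    print("top-1 retrieval accuracy:", top1)
```

The data here is synthetic and only checks that paired items land close together in the shared subspace; it says nothing about the music data or the results reported in the paper.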

Author information

Corresponding author

Correspondence to Thi-Hong Vuong.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Vuong, TH., Pham, TH., Nguyen, TT., Ha, QT. (2019). An Adversarial Learning and Canonical Correlation Analysis Based Cross-Modal Retrieval Model. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11431. Springer, Cham. https://doi.org/10.1007/978-3-030-14799-0_13

  • DOI: https://doi.org/10.1007/978-3-030-14799-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-14798-3

  • Online ISBN: 978-3-030-14799-0

  • eBook Packages: Computer Science, Computer Science (R0)
