Coupled Dictionary Learning with Common Label Alignment for Cross-Modal Retrieval

Tang, Xu; Yang, Yanhua; Deng, Cheng; Gao, Xinbo

doi:10.1007/978-3-319-23989-7_17

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9242)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

Cross-modal retrieval has been an active research topic in recent years. However, most existing methods neglect the common semantic relationships among different modalities, which seriously reduces retrieval accuracy. To address this problem, we propose a novel cross-modal retrieval method based on coupled dictionary learning with common label alignment. Concretely, our method first conducts coupled dictionary learning on the data from each modality separately and then projects them into a common space, where the correlation between the modalities is encouraged through common label alignment. Experimental results on two public datasets demonstrate that our method outperforms several state-of-the-art methods.
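
The two-stage idea in the abstract (learn a dictionary per modality, then align the resulting codes through shared labels) can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give: here each dictionary is learned independently with ISTA sparse coding and a least-squares update, and "label alignment" is approximated by ridge regression of the sparse codes onto one-hot labels, so that the label space serves as the common space. All function names, the synthetic data, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=50):
    """ISTA for min_A 0.5*||X - A D||_F^2 + lam*||A||_1 (rows of X are samples)."""
    L = np.linalg.norm(D @ D.T, 2) + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - grad / L, lam / L)
    return A

def learn_dictionary(X, n_atoms, lam=0.1, n_outer=10, seed=0):
    """Alternate sparse coding and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_atoms, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(X, D, lam)
        D = np.linalg.lstsq(A, X, rcond=None)[0]
        # Re-initialise atoms that no sample uses, then renormalise all atoms.
        dead = np.linalg.norm(D, axis=1) < 1e-8
        D[dead] = rng.standard_normal((int(dead.sum()), X.shape[1]))
        D /= np.linalg.norm(D, axis=1, keepdims=True)
    return D

def label_projection(A, Y, reg=1.0):
    """Ridge regression from sparse codes to one-hot labels (assumed alignment step)."""
    k = A.shape[1]
    return np.linalg.solve(A.T @ A + reg * np.eye(k), A.T @ Y)

# Synthetic paired data: two "modalities" generated from the same labels.
rng = np.random.default_rng(0)
n, n_classes = 60, 3
y = rng.integers(0, n_classes, size=n)
Y = np.eye(n_classes)[y]
X_img = Y @ rng.standard_normal((n_classes, 20)) + 0.1 * rng.standard_normal((n, 20))
X_txt = Y @ rng.standard_normal((n_classes, 12)) + 0.1 * rng.standard_normal((n, 12))

D_img = learn_dictionary(X_img, n_atoms=8, seed=1)
D_txt = learn_dictionary(X_txt, n_atoms=8, seed=2)
A_img, A_txt = sparse_codes(X_img, D_img), sparse_codes(X_txt, D_txt)
W_img, W_txt = label_projection(A_img, Y), label_projection(A_txt, Y)

# Common space = predicted label scores; rank text items for each image query.
Z_img, Z_txt = A_img @ W_img, A_txt @ W_txt
scores = Z_img @ Z_txt.T
ranking = np.argsort(-scores, axis=1)
```

Each row of `ranking` lists text items from most to least similar to the corresponding image query. A faithful implementation would presumably optimise the dictionaries and projections jointly rather than in separate stages as done here.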

Acknowledgments

This work is supported by the National High Technology Research and Development Program of China (2013AA01A602), the Program for New Century Excellent Talents in University (NCET-12-0917), the Fundamental Research Funds for the Central Universities (No. K5051302019), and the Key Science and Technology Program of Shaanxi Province, China (2014K05-16).

Author information

Authors and Affiliations

School of Electronic Engineering, Xidian University, Xi’an, 710071, China
Xu Tang, Yanhua Yang, Cheng Deng & Xinbo Gao

Corresponding author

Correspondence to Cheng Deng.

Editor information

Editors and Affiliations

Zhejiang University, Hangzhou, China
Xiaofei He
Xidian University, Xi'an, China
Xinbo Gao
Northwestern Polytechnical University, Shaanxi, China
Yanning Zhang
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Chinese Academy of Sciences, Beijing, China
Zhi-Yong Liu
Suzhou University of Science and Technology, Suzhou, China
Baochuan Fu
Suzhou University of Science and Technology, Jiangsu, China
Fuyuan Hu
Suzhou University of Science and Technology, Jiangsu, China
Zhancheng Zhang

About this paper

Cite this paper

Tang, X., Yang, Y., Deng, C., Gao, X. (2015). Coupled Dictionary Learning with Common Label Alignment for Cross-Modal Retrieval. In: He, X., et al. Intelligence Science and Big Data Engineering. Image and Video Data Engineering. IScIDE 2015. Lecture Notes in Computer Science, vol 9242. Springer, Cham. https://doi.org/10.1007/978-3-319-23989-7_17

DOI: https://doi.org/10.1007/978-3-319-23989-7_17
Published: 22 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23987-3
Online ISBN: 978-3-319-23989-7
eBook Packages: Computer Science, Computer Science (R0)
