Abstract
Cross-modal retrieval has been an active research topic in recent years. However, most existing methods neglect the common semantic relationship among different modalities, which seriously reduces retrieval accuracy. To address this problem, we propose a novel cross-modal retrieval method based on coupled dictionary learning with common label alignment. Concretely, our method first conducts coupled dictionary learning on the data from each modality separately and then projects them into a common space, where the correlation between the modalities is encouraged through common label alignment. Experimental results on two public datasets demonstrate that our method outperforms several state-of-the-art methods.
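The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' objective or optimization procedure, which the abstract does not specify. It assumes, purely for illustration, that each modality gets its own dictionary learned by alternating ISTA sparse coding with a least-squares dictionary update, and that "common label alignment" is approximated by a ridge regression from sparse codes onto one-hot class labels, so that the label space serves as the common space. All function names, hyperparameters, and the synthetic image and text features are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=50):
    # ISTA for min_A 0.5*||X - D A||_F^2 + lam*||A||_1, with X (d, n) and D (d, k).
    L = np.linalg.norm(D, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def learn_dictionary(X, k, lam=0.1, n_outer=10, seed=0):
    # Alternate sparse coding and a regularized least-squares dictionary update.
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], k))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(X, D, lam)
        D = X @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(k))
        # Rescale atoms to unit norm; unused (all-zero) atoms stay zero.
        D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
    return D, sparse_codes(X, D, lam)

def label_projection(A, Y, gamma=1e-2):
    # Assumed stand-in for label alignment: ridge regression W with W A ~ Y,
    # where A is (k, n) sparse codes and Y is (c, n) one-hot labels.
    k = A.shape[0]
    return Y @ A.T @ np.linalg.inv(A @ A.T + gamma * np.eye(k))

def rank_by_cosine(query, gallery):
    # Indices of gallery columns sorted by decreasing cosine similarity to query.
    q = query / (np.linalg.norm(query) + 1e-12)
    g = gallery / (np.linalg.norm(gallery, axis=0, keepdims=True) + 1e-12)
    return np.argsort(-(g.T @ q))

# Synthetic two-class data standing in for paired image and text features.
rng = np.random.default_rng(0)
n, c = 40, 2
labels = rng.integers(0, c, size=n)
Y = np.eye(c)[:, labels]                           # one-hot labels, shape (c, n)
X_img = rng.standard_normal((20, n)) + 2.0 * Y[0]  # class-dependent shift
X_txt = rng.standard_normal((10, n)) - 2.0 * Y[0]

# One dictionary per modality, then a projection of each code space onto labels.
D_img, A_img = learn_dictionary(X_img, k=8)
D_txt, A_txt = learn_dictionary(X_txt, k=8)
W_img = label_projection(A_img, Y)
W_txt = label_projection(A_txt, Y)

# Image-to-text retrieval: rank all texts for the first image in the common space.
ranking = rank_by_cosine(W_img @ A_img[:, 0], W_txt @ A_txt)
print(ranking[:5])
```

In this sketch the two modalities are tied together only through the shared label matrix `Y`. A coupled formulation would typically also add a term linking the two sets of sparse codes during dictionary learning, which is omitted here for brevity.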
Acknowledgments
This work is supported by the National High Technology Research and Development Program of China (2013AA01A602), the Program for New Century Excellent Talents in University (NCET-12-0917), the Fundamental Research Funds for the Central Universities (No. K5051302019), and the Key Science and Technology Program of Shaanxi Province, China (2014K05-16).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Tang, X., Yang, Y., Deng, C., Gao, X. (2015). Coupled Dictionary Learning with Common Label Alignment for Cross-Modal Retrieval. In: He, X., et al. Intelligence Science and Big Data Engineering. Image and Video Data Engineering. IScIDE 2015. Lecture Notes in Computer Science, vol 9242. Springer, Cham. https://doi.org/10.1007/978-3-319-23989-7_17
DOI: https://doi.org/10.1007/978-3-319-23989-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23987-3
Online ISBN: 978-3-319-23989-7
eBook Packages: Computer Science, Computer Science (R0)