Cross-Model Retrieval with Reconstruct Hashing

  • Yun Liu
  • Cheng YanEmail author
  • Xiao Bai
  • Jun Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11004)


Hashing has been widely used in large-scale vision problems thanks to its efficiency in both storage and speed. For fast cross-modal retrieval task, cross-modal hashing (CMH) has received increasing attention recently with its ability to improve quality of hash coding by exploiting the semantic correlation across different modalities. Most traditional CMH methods focus on designing a good hash function to use supervised information appropriately, but the performance are limited by hand-crafted features. Some deep learning based CMH methods focus on learning good features by using deep network, however, directly quantizing the feature may result in large loss for hashing. In this paper, we propose a novel end-to-end deep cross-modal hashing framework, integrating feature and hash-code learning into the same network. We keep the relationship of features between modalities. For hash process, we design a novel net structure and loss for hash learning as well as reconstruct the hash codes to features to improve the quality of codes. Experiments on standard databases for cross-modal retrieval show the proposed methods yields substantial boosts over latest state-of-the-art hashing methods.



This work was supported by the National Natural Science Foundation of China project No. 61772057, in part by Beijing Natural Science Foundation project No. 4162037, and the support funding from State Key Lab of Software Development Environment.


  1. 1.
    Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). Software:
  2. 2.
    Andrew, G., Arora, R., Bilmes, J., Livescu, K.: Deep canonical correlation analysis. In: ICML, pp. III–1247 (2013)Google Scholar
  3. 3.
    Bronstein, M.M., Bronstein, A.M., Michel, F., Paragios, N.: Data fusion through cross-modality metric learning using similarity-sensitive hashing. In: CVPR, pp. 3594–3601 (2010)Google Scholar
  4. 4.
    Cao, Y., Long, M., Wang, J., Yang, Q., Yu, P.S.: Deep visual-semantic hashing for cross-modal retrieval. In: SIGKDD, pp. 1445–1454 (2016)Google Scholar
  5. 5.
    Cao, Z., Long, M., Yang, Q.: Transitive hashing network for heterogeneous multimedia retrieval. In: AAAIGoogle Scholar
  6. 6.
    Carreira-Perpinan, M.A., Raziperchikolaei, R.: Hashing with binary autoencoders. In: CVPR, pp. 557–566 (2015)Google Scholar
  7. 7.
    Feng, F., Wang, X., Li, R.: Cross-modal retrieval with correspondence autoencoder. In: MM, pp. 7–16 (2014)Google Scholar
  8. 8.
    Gong, Y., Lazebnik, S., Gordo, A., Perronnin, F.: Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. TPAMI 35(12), 2916–2929 (2013)CrossRefGoogle Scholar
  9. 9.
    Yang, H., et al.: Maximum margin hashing with supervised information. MTAP 75, 3955–3971 (2016)Google Scholar
  10. 10.
    Heo, J.P., Lee, Y., He, J., Chang, S.F.: Spherical hashing. In: CVPR, pp. 2957–2964 (2012)Google Scholar
  11. 11.
    Huiskes, M.J., Lew, M.S.: The MIR flickr retrieval evaluation. In: SIGIR, pp. 39–43 (2008)Google Scholar
  12. 12.
    Jiang, Q.Y., Li, W.J.: Deep cross-modal hashing. In: CVPR, pp. 3232–3240 (2017)Google Scholar
  13. 13.
    Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)Google Scholar
  14. 14.
    Kumar, S., Udupa, R.: Learning hash functions for cross-view similarity search. In: IJCAI, pp. 1360–1365 (2011)Google Scholar
  15. 15.
    Lai, H., Pan, Y., Liu, Y., Yan, S.: Simultaneous feature learning and hash coding with deep neural networks. In: CVPR, pp. 3270–3278 (2015)Google Scholar
  16. 16.
    Zhou, L., Bai, X., Liu, X., Zhou, J.: Binary coding by matrix classifier for efficient subspace retrieval. In: ICMR, pp. 82–90 (2018)Google Scholar
  17. 17.
    Li, W.J., Wang, S., Kang, W.C.: Feature learning based deep supervised hashing with pairwise labels. In: IJCAI, pp. 1711–1717 (2016)Google Scholar
  18. 18.
    Lin, G., Shen, C., Shi, Q., Van den Hengel, A., Suter, D.: Fast supervised hashing with decision trees for high-dimensional data. In: CVPR, pp. 1971–1978 (2014)Google Scholar
  19. 19.
    Lin, J., Li, Z., Tang, J.: Discriminative deep hashing for scalable face image retrieval. In: IJCAI, pp. 2266–2272 (2017)Google Scholar
  20. 20.
    Lin, Z., Ding, G., Hu, M., Wang, J.: Semantics-preserving hashing for cross-view retrieval. In: CVPR, pp. 3864–3872 (2015)Google Scholar
  21. 21.
    Liong, V.E., Lu, J., Wang, G., Moulin, P., Zhou, J.: Deep hashing for compact binary codes learning. In: CVPR, pp. 2475–2483 (2015)Google Scholar
  22. 22.
    Liu, W., Wang, J., Ji, R., Jiang, Y.-G., Chang, S.-F.: Supervised hashing with kernels. In: CVPR, pp. 2074–2081 (2012)Google Scholar
  23. 23.
    Liu, X., He, J., Deng, C., Lang, B.: Collaborative hashing. In: CVPR, pp. 2147–2154 (2014)Google Scholar
  24. 24.
    Masci, J., Bronstein, M.M., Bronstein, A.M., Schmidhuber, J.: Multimodal similarity-preserving hashing. TPAMI 36(4), 824–830 (2014)CrossRefGoogle Scholar
  25. 25.
    Shen, F., Shen, C., Shi, Q., Van den Hengel, A., Tang, Z.: Inductive hashing on manifolds. In: CVPR, pp. 1562–1569 (2013)Google Scholar
  26. 26.
    Song, J., Yang, Y., Yang, Y., Huang, Z., Shen, H.T.: Inter-media hashing for large-scale retrieval from heterogeneous data sources. In: SIGMOD, pp. 785–796 (2013)Google Scholar
  27. 27.
    Strecha, C., Bronstein, A.M., Bronstein, M.M., Fua, P.: LDAHash: improved matching with smaller descriptors. TPAMI 34(1), 66–78 (2012)CrossRefGoogle Scholar
  28. 28.
    Torralba, A., Fergus, R., Weiss, Y.: Small codes and large image databases for recognition. In: CVPR, pp. 1–8 (2008)Google Scholar
  29. 29.
    Wang, D., Gao, X., Wang, X., He, L.: Semantic topic multimodal hashing for cross-media retrieval. In: AAAI, pp. 3890–3896 (2015)Google Scholar
  30. 30.
    Wang, J., Kumar, S., Chang, S.-F.: Semi-supervised hashing for large-scale search. TPAMI 34(12), 2393–2406 (2012)CrossRefGoogle Scholar
  31. 31.
    Wang, W., Ooi, B.C., Yang, X., Zhang, D., Zhuang, Y.: Effective multi-modal retrieval based on stacked auto-encoders, pp. 649–660 (2014)Google Scholar
  32. 32.
    Wu, B., Yang, Q., Zheng, W.S., Wang, Y., Wang, J.: Quantized correlation hashing for fast cross-modal search. In: AAAI, pp. 3946–3952 (2015)Google Scholar
  33. 33.
    Bai, X., Yan, C., Yang, H., Bai, L., Zhou, J., Handcock, E.R.: Adaptive hash retrieval with kernel based similarity. PR 75, 136–148 (2018)Google Scholar
  34. 34.
    Bai, X., Yang, H., Zhou, J., Ren, P., Cheng, J.: Data-dependent hashing based on p-stable distribution. TIP 23, 5033–5046 (2014)MathSciNetzbMATHGoogle Scholar
  35. 35.
    Zhen, Y., Yeung, D.Y.: Co-regularized hashing for multimodal data. In: NIPS, pp. 1376–1384 (2012)Google Scholar
  36. 36.
    Zhang, D., Li, W.J.: Large-scale supervised multimodal hashing with semantic correlation maximization. In: AAAI, pp. 2177–2183 (2014)Google Scholar
  37. 37.
    Zhu, H., Long, M., Wang, J., Cao, Y.: Deep hashing network for efficient similarity retrieval. In: AAAI, pp. 2415–2421 (2016)Google Scholar

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. 1.School of Automation Science and Electrical EngineeringBeihang UniversityBeijingChina
  2. 2.School of Computer Science and EngineeringBeihang UniversityBeijingChina
  3. 3.School of Information and Communication TechnologyGriffith UniversityNathanAustralia

Personalised recommendations