International Journal of Computer Vision, Volume 127, Issue 11–12, pp 1614–1628

Unsupervised Binary Representation Learning with Deep Variational Networks

  • Yuming Shen
  • Li Liu
  • Ling Shao

Abstract

Learning to hash is regarded as an efficient approach for image retrieval and many other big-data applications. Recently, deep learning frameworks have been adopted for image hashing, offering an alternative to conventional projections for formulating the encoding function. Although deep learning has proven successful in supervised hashing, existing unsupervised deep hashing techniques still fail to outperform non-deep methods, because it is hard to unveil the intrinsic structure of the whole sample space by simply regularizing the output codes within each single training batch. To tackle this problem, we propose a novel unsupervised deep hashing model, named deep variational binaries (DVB). Conditional auto-encoding variational Bayesian networks are introduced to exploit the feature space structure of the training data through latent variables. By integrating the probabilistic inference process with hashing objectives, the proposed DVB model estimates the statistics of data representations and thus produces compact binary codes. Experimental results on three benchmark datasets, i.e., CIFAR-10, SUN-397 and NUS-WIDE, demonstrate that DVB outperforms state-of-the-art unsupervised hashing methods by significant margins.
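
For readers unfamiliar with hashing-based retrieval, the following is a minimal sketch of the general pipeline the abstract refers to: real-valued features are mapped to compact binary codes, and retrieval ranks database items by Hamming distance to the query code. It is not the DVB model. The encoder here is a plain random linear projection followed by a sign test, standing in for the "conventional projections" mentioned above, and the function names, feature dimension, code length and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def encode(features, projection):
    """Binarise real-valued features: linear projection, then a sign test.
    (Illustrative stand-in for a learned encoder; not the DVB network.)"""
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query,
    together with the distances themselves."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable"), distances

# Synthetic data: 1000 database items with 64-d features, 32-bit codes (assumed sizes).
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 64))
projection = rng.normal(size=(64, 32))
codes = encode(database, projection)

# A slightly perturbed copy of item 42 serves as the query.
query = database[42] + 0.01 * rng.normal(size=64)
order, dist = hamming_rank(encode(query[None, :], projection)[0], codes)
```

In this setting a learned encoder such as the one proposed in the paper would replace `encode`, while the Hamming ranking step would stay the same.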

Keywords

Hashing, Unsupervised learning, Deep learning, Image retrieval



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
