Abstract
In large-scale image retrieval tasks, hashing methods based on deep convolutional neural networks (CNNs) play an important role due to their elaborate semantic feature representations. However, they usually discard information progressively during feature transformation, leading to incomplete and unsatisfactory hash codes for image retrieval. This study aims to design an invertible architecture that maintains image information while focusing on the necessary parts of image features. Consequently, in this paper, we propose a novel attention-aware invertible hashing network (AIHN) for image retrieval. Through invertible feature representations, the final hash codes can be obtained from input images without any information loss. To highlight informative regions, we present a novel attention-aware invertible block as the basic module of AIHN, which promotes generalization ability through a spatial attention mechanism. Extensive experiments conducted on benchmark datasets demonstrate the effectiveness of our invertible feature representation for hash code generation, and show the promising image retrieval performance of our method against the state-of-the-art.
S. Li, Q. Cai and Z. Li—These authors contributed equally to this paper and share the first authorship.
1 Introduction
With the explosive growth of data in practical applications such as image retrieval, approximate nearest neighbor (ANN) search has become a hot topic in recent years. Among existing ANN techniques, hashing has become one of the most popular and effective because of its fast query speed and low memory cost. Numerous studies have shown that hashing improves performance on image retrieval tasks [7, 23]. However, these methods are limited in feature representation and cannot be trained end-to-end.
Recently, convolutional neural networks (CNNs) have gradually been applied to image hashing retrieval and have achieved promising performance. Xia et al. [22] first adopted a CNN architecture in a hashing algorithm. Later, a series of CNN-based deep hashing methods [16, 17] were proposed in an end-to-end manner, showing the effectiveness of deep feature representation. The performance of these deep hashing methods has improved greatly over traditional hashing methods on many benchmarks. Moreover, it proves crucial to jointly learn similarity-preserving representations and control the quantization error of converting continuous representations into binary codes [3]. However, existing deep feature representations are generated while gradually discarding image information. This may discard representative feature variability during feature transformation, so complete image information cannot be guaranteed. In addition, informative regions of the image are not well highlighted by existing algorithms, causing poor generalization ability.
To effectively solve the above-mentioned problems, we propose a novel image retrieval framework based on an invertible network with a spatial attention mechanism. Firstly, an invertible network is proposed, which guarantees lossless representative features transformed from the original image. In this way, all the information of the image is forwarded through the network. Then, we adopt a spatial attention architecture to tell the network where to focus, which also improves the representation of regions of interest. Spatial attention effectively learns which information to emphasize or suppress during information transmission. As shown in Fig. 1, our method yields state-of-the-art retrieval performance in most settings. To summarize, the main contributions of this paper are three-fold:
-
We propose an effective invertible network with lossless image information for image retrieval, where the whole framework can be trained end-to-end;
-
To excavate informative regions of features, we adopt a spatial attention module in our invertible block to learn how to focus on objective information and suppress unnecessary information;
-
Extensive experiments on benchmark datasets show that our architecture is effective and achieves promising performance.
The rest of the paper is organized as follows: in Sect. 2, we introduce some related work about our algorithm. The proposed method is illustrated in Sect. 3, followed by the experimental results in Sect. 4. In Sect. 5, we conclude our work.
2 Related Work
2.1 Hashing Methods
Existing hashing methods [1, 25] can be roughly divided into two categories: unsupervised hashing and supervised hashing. Unsupervised hashing exploits unlabeled data to learn a set of functions that encode data into binary codes [5, 21]. Locality-Sensitive Hashing (LSH) [5] is the most representative unsupervised hashing algorithm, achieving promising performance compared with previous approaches. LSH guarantees that similar data points receive similar binary codes after the same hash mapping, and vice versa. Supervised hashing [18, 20] further exploits label information during learning to generate compact hash codes. Supervised Hashing with Kernels (KSH) [18] utilizes pair-wise labels to generate effective hash functions, minimizing the Hamming distances of similar pairs while maximizing those of dissimilar pairs.
In recent years, CNNs have shown significant success in computer vision [13,14,15, 19, 26,27,28,29,30,31, 34,35,37]. In the domain of hashing retrieval, [22] was the first deep-neural-network approach, achieving promising performance compared with conventional methods. Deep Hashing Network (DHN) [33] not only preserves pairwise similarity but also controls the quantization error. Improving on DHN, HashNet [3] balances training data consisting of positive and negative pairs and reduces quantization error via a continuation technique, thus achieving state-of-the-art performance on several benchmark datasets. However, the high-dimensional features obtained by these methods are accompanied by a gradual loss of image information, and we cannot be sure whether the discarded feature variability is significant.
2.2 Attention Mechanism
The attention mechanism can be viewed as a strategy to bias the allocation of available processing resources towards the most informative components of an input [10]. Attention modules have been widely applied in Natural Language Processing (NLP) tasks such as machine translation and sentence generation, with remarkable performance. Meanwhile, in computer vision, the attention mechanism has also demonstrated powerful capabilities. For example, Hu et al. [9] utilize attention in an object relation module, which models the relationships among a set of objects and improves object recognition. In [24], a self-attention module is introduced to better generate images. A channel-wise attention mechanism was proposed for image super-resolution [32]. In our work, the attention-aware invertible hashing network utilizes spatial attention to enhance informative features in the spatial domain, accurately indicating which information to emphasize or suppress.
3 Our Method
3.1 Overview
The architecture of AIHN is shown in Fig. 1. The pair-wise images are first fed into an invertible downsampling layer, which increases the number of output channels while decreasing the spatial resolution. The output is then split into two sublayers (\(x_1\), \(y_1\)) of equal channel dimension, which are passed into the invertible block. It is worth noting that the spatial attention and invertible downsampling modules are introduced inside the invertible block: the spatial attention module attends to the most informative components of the input, and the invertible downsampling module reduces the number of computations while maintaining good performance. More details on both are given in Sects. 3.2 and 3.4 below. After a total of 100 such blocks, invertible high-dimensional features are obtained through a subsequent concatenation operation. The invertible features are sent to an average pooling and a linear layer after a ReLU nonlinearity, and the results are quantized by the Sgn function to obtain pair-wise binary hash codes. The pairwise similarity loss is adopted for similarity-preserving learning in the Hamming space, and the quantization loss controls both the binarization error and the hash code quality. The invertible downsampling, the spatial attention module, and the invertible block are introduced in detail in the following sections.
3.2 Invertible Downsampling
In order to facilitate calculation while avoiding irreversible modules, we introduce an invertible downsampling module to our architecture instead of the max-pooling used in [6]. It not only reduces the spatial resolution of the input for the sake of simplicity but also increases the number of channels for lossless information. As shown in Fig. 2, when downsampled with scaling factor \(\theta = 4\), the number of output channels is 4 times the original, and the size of each feature map is reduced by a factor of 4. Invertible downsampling also roughly preserves the spatial ordering, thus avoiding mixing different neighborhoods in the next convolution. The invertible downsampling operation can be written as below:
where \(\theta \) represents scaling factor which determines the downsampled size directly, T is the function of downsampling operation, and Fe(c, w, h) denotes feature maps with channel c, width w, and height h.
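The operation described above is a space-to-depth rearrangement (akin to pixel unshuffle), which is trivially invertible. The following NumPy sketch, under the assumption that \(\theta = 4\) moves each non-overlapping 2 × 2 spatial neighborhood into the channel dimension, verifies the lossless round trip; the function names are illustrative, not from the paper.

```python
import numpy as np

def downsample(fe, theta=4):
    """Invertible downsampling T(theta, .): rearrange each non-overlapping
    2x2 spatial neighborhood into channels, so no information is lost."""
    c, h, w = fe.shape
    s = int(np.sqrt(theta))  # stride per spatial dimension (2 for theta = 4)
    # (c, h/s, s, w/s, s) -> (s, s, c, h/s, w/s) -> (theta*c, h/s, w/s)
    out = fe.reshape(c, h // s, s, w // s, s).transpose(2, 4, 0, 1, 3)
    return out.reshape(theta * c, h // s, w // s)

def upsample(fe, theta=4):
    """Exact inverse T^{-1}: scatter the extra channels back to their
    original spatial positions."""
    tc, h, w = fe.shape
    s = int(np.sqrt(theta))
    out = fe.reshape(s, s, tc // theta, h, w).transpose(2, 3, 0, 4, 1)
    return out.reshape(tc // theta, h * s, w * s)

fe = np.random.randn(3, 224, 224)       # input size used in the paper
down = downsample(fe)
assert down.shape == (12, 112, 112)     # 4x channels, each spatial dim halved
assert np.array_equal(upsample(down), fe)  # lossless round trip
```

This matches the paper's stated shapes: a 3 × 224 × 224 image becomes a 12 × 112 × 112 feature tensor after the first invertible downsampling.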
To reduce computational costs, invertible downsampling is designed tightly for our architecture: an invertible downsampling operator is applied at the beginning of the network and at depths d = 6, 22, 94.
3.3 Invertible Block
The invertible block is an important component of our invertible hashing network. It not only determines the reversibility of the information flow but also generates attention-weighted features with lossless information. The spatial attention and invertible downsampling modules introduced in Sects. 3.2 and 3.4 are adopted in the invertible block. In particular, the spatial attention module mines the objective information, and the invertible downsampling module allows us to reduce the number of computations while maintaining good performance. The details of the invertible block are illustrated in Fig. 4.
In detail, the sublayers \((x_i,y_i)\) obtained through the splitting operation undergo two different operations. \(x_i\) is fed directly to an invertible downsampling layer with scaling factor \(\theta = 4\), yielding \(T(4,x_i)\). \(y_i\) is sent to a bottleneck block F, mainly consisting of a succession of 3 convolutional operators. The second convolutional layer has four times fewer channels than the other two, and the corresponding filter sizes are respectively 1 \(\times \) 1, 3 \(\times \) 3, 1 \(\times \) 1. The first and the second are preceded by a spatial attention module, batch normalization (BN) and a ReLU non-linearity; note that the last convolution layer is followed by batch normalization and a ReLU non-linearity only. Adding \(F(y_i)\) to \(T(4,x_i)\) gives \(y_{i+1}\). Meanwhile, \(y_i\) is also fed to an invertible downsampling layer for convenient calculation, and \(x_{i+1}\) is set to the output \(T(4,y_i)\). In summary, the detailed operation is described as below:
and reverse propagation can be computed by the following:
where \(T^{-1}\) represents reverse calculation of T function.
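The coupling structure above can be verified numerically. In this NumPy sketch, T is the invertible downsampling from Sect. 3.2 and F is a stand-in for the bottleneck branch (here simply tanh after downsampling, so shapes match; in the paper F contains spatial attention, BN, ReLU and three convolutions). The key property is that the block is invertible for any F, because F itself is never inverted.

```python
import numpy as np

def T(fe):      # invertible downsampling with theta = 4: 2x2 blocks -> channels
    c, h, w = fe.shape
    return fe.reshape(c, h//2, 2, w//2, 2).transpose(2, 4, 0, 1, 3).reshape(4*c, h//2, w//2)

def T_inv(fe):  # exact inverse of T
    tc, h, w = fe.shape
    return fe.reshape(2, 2, tc//4, h, w).transpose(2, 3, 0, 4, 1).reshape(tc//4, 2*h, 2*w)

def F(y):       # stand-in for the bottleneck branch; invertibility of the
    return np.tanh(T(y))  # block never requires inverting F itself

def block_forward(x, y):
    # x_{i+1} = T(4, y_i);  y_{i+1} = F(y_i) + T(4, x_i)
    return T(y), F(y) + T(x)

def block_inverse(x_next, y_next):
    y = T_inv(x_next)             # y_i = T^{-1}(x_{i+1})
    x = T_inv(y_next - F(y))      # x_i = T^{-1}(y_{i+1} - F(y_i))
    return x, y

x, y = np.random.randn(6, 56, 56), np.random.randn(6, 56, 56)
xr, yr = block_inverse(*block_forward(x, y))
assert np.allclose(xr, x) and np.allclose(yr, y)  # perfect reconstruction
```

This is why the final hash representation loses no information: every block, and hence their composition, can be run backwards exactly.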
3.4 Spatial Attention
The spatial attention module aims to highlight the expression of key objects for image retrieval. Firstly, it learns a set of weight maps from the feature maps, assigning a larger weight to the informative region of each feature map and a smaller weight to the background region. Then, the learned weight maps are multiplied by the feature maps, yielding feature maps that focus on key objects and suppress background regions. More specifically, spatial attention tells which information to emphasize or suppress during feature transmission. As shown in Fig. 3, the feature maps are sent to a max-pooling and an average-pooling operation respectively, both of which prove effective in highlighting informative regions. The two outputs are then concatenated to generate a condensed feature descriptor. Next, we apply a convolution layer followed by a sigmoid operation on this feature descriptor to obtain a spatial attention map \(SA(Fe) \in R^ {H\times W}\), which tells the information flow which parts to emphasize or suppress. In short, the detailed operation is described as below:
where \(\sigma \) denotes the sigmoid operation, \(g^{7\times 7}\) denotes a convolution with kernel size \(7\times 7\), cat is the concatenation operation along the channel axis, Ap(Fe) and Mp(Fe) respectively represent the average-pooling and max-pooling operations, and Fe is a brief expression for the feature map.
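The steps above can be sketched as follows in NumPy. The (2, 7, 7) kernel stands in for the learned weights of \(g^{7\times 7}\) (an assumption for illustration; in the paper it is trained jointly with the network), and pooling is taken over the channel axis as in CBAM-style spatial attention.

```python
import numpy as np

def spatial_attention(fe, kernel, bias=0.0):
    """SA(Fe) = sigmoid(g^{7x7}(cat[Ap(Fe), Mp(Fe)])) applied to Fe.
    `kernel` plays the role of the learned 7x7 convolution weights."""
    ap = fe.mean(axis=0)          # Ap(Fe): average-pool over channels -> (H, W)
    mp = fe.max(axis=0)           # Mp(Fe): max-pool over channels     -> (H, W)
    desc = np.stack([ap, mp])     # cat along the channel axis -> (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    h, w = ap.shape
    logits = np.empty((h, w))
    for i in range(h):            # naive same-padding convolution, stride 1
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i+k, j:j+k] * kernel) + bias
    sa = 1.0 / (1.0 + np.exp(-logits))   # sigmoid -> attention map in (0, 1)
    return fe * sa                # reweight every channel by the spatial map

fe = np.random.randn(16, 14, 14)
out = spatial_attention(fe, kernel=np.random.randn(2, 7, 7) * 0.1)
assert out.shape == fe.shape
```

Note that the multiplicative reweighting lives inside the bottleneck branch F, so it does not break the invertibility of the block.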
3.5 Loss Function
In this paper, we focus on the supervised setting, utilizing label information. We can easily obtain a set of image pairs, where each pair (\(a_i,a_j\)) consists of images \(a_i\) and \(a_j\) (\(j \ne i\)). Using their category information, we obtain the similarity \(s_{ij}\) of the image pair (\(a_i,a_j\)). Following [3, 33], the similarity information is constructed directly from image labels: if two images \(a_i\) and \(a_j\) share at least one label, they are similar and \(s_{ij}=1\); otherwise, they are dissimilar and \(s_{ij}=0\).
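For multi-label data such as NUS-WIDE, this "share at least one label" rule reduces to a thresholded inner product of multi-hot label vectors, as this small sketch shows (the toy label matrix is illustrative):

```python
import numpy as np

# Multi-hot label vectors: rows = images, columns = categories. Two images
# are similar (s_ij = 1) iff they share at least one label.
labels = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [0, 1, 0]])
S = (labels @ labels.T > 0).astype(int)

assert S[0, 1] == 1   # images 0 and 1 share category 2
assert S[0, 2] == 0   # images 0 and 2 share no category
```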
Intuitively, the desired hash codes should preserve the relative similarities of the image pairs. The corresponding optimization goal is to make the Hamming distance between two similar points as small as possible while making the Hamming distance between two dissimilar points as large as possible. To this end, we define a pairwise loss that has also been successfully applied in prior research [16], over the output binary codes \((b_i,b_j)\) corresponding to the training image pair (\(a_i,a_j\)):
where \(\beta _{ij} = \frac{1}{2} b^T_i b_j\) and \(s_{ij}\) denotes the similarity of the image pair (\(a_i,a_j\)). To pursue representative hash codes, we learn our invertible hashing network by minimizing the pairwise loss. This drives our network to possess a strong capability for distinguishing images. Since the hash codes are discrete, we additionally adopt the following quantization loss for each image \(a_i\):
where \(u_i \in \mathbb {R}^{c\times 1} \), \(b_i \in \{-1,1\}^c\), and c represents the hash code length. Based on these two types of loss, we train our invertible hashing network through the following objective:
where \(\lambda \) is a hyper-parameter and \(||\cdot ||_2\) denotes the \(l_2\) norm.
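The objective can be sketched as follows. Since the display equations are not reproduced here, the exact pairwise form is an assumption: we use the likelihood-style loss of the cited pairwise methods [16, 33], \(\log(1+\exp(\beta_{ij})) - s_{ij}\beta_{ij}\), which is consistent with the definition \(\beta_{ij} = \frac{1}{2} b_i^T b_j\) above, and an \(l_2\) quantization penalty toward \(\mathrm{sgn}(u_i)\).

```python
import numpy as np

def pairwise_loss(u_i, u_j, s_ij):
    """Likelihood-style pairwise loss (assumed form, following [16, 33]):
    log(1 + exp(beta_ij)) - s_ij * beta_ij, beta_ij = 0.5 * <u_i, u_j>.
    Small for similar pairs with aligned codes, large for dissimilar ones."""
    beta = 0.5 * np.dot(u_i, u_j)
    return np.log1p(np.exp(beta)) - s_ij * beta

def quantization_loss(u_i):
    """||u_i - sgn(u_i)||_2^2: distance of the continuous output to the
    nearest binary code, controlling the binarization error."""
    return np.sum((u_i - np.sign(u_i)) ** 2)

u_i = np.array([0.9, -0.8, 0.7])
u_j = np.array([0.8, -0.9, 0.6])
lam = 10.0  # the paper's lambda (10 for CIFAR-10, 100 for NUS-WIDE)
total = pairwise_loss(u_i, u_j, s_ij=1) + lam * (
    quantization_loss(u_i) + quantization_loss(u_j))
assert total > 0
```

In training, both terms are summed over the mini-batch and minimized jointly, which is possible end-to-end because the feature extractor is invertible and fully differentiable.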
4 Experiments
4.1 Dataset and Evaluation
We evaluate the effect of the proposed AIHN with several state-of-the-art hashing methods on two benchmark datasets.
-
CIFAR-10 is a single-label dataset with 60000 images divided into 10 categories (6000 images per class). We follow [2, 33] to randomly select 100 images per class for training, 500 images per class for testing, and the rest 54000 images used as database.
-
NUS-WIDE is a multi-label dataset containing 269648 images across 81 categories. We follow experimental protocols similar to [3, 33] and randomly sample 5000 images as test images and 10000 images for training, using the remaining images as the database. To evaluate our method, the mean average precision (MAP) is used to measure the accuracy of our proposed method and the other baselines (Fig. 5).
4.2 Implementation Detail
Network Detail. The proposed network is trained specifically for image hashing retrieval. The input image size of our network is 3 \(\times \) 224 \(\times \) 224. After the invertible downsampling with \(\theta =4\), the size of the features becomes 12 \(\times \) 112 \(\times \) 112. Then, the splitting operation produces two sublayers with equal channels. Next, both sublayers pass through a total of 100 similar invertible blocks, with an invertible downsampling operator applied at blocks 6, 22 and 94. The spatial resolution of these layers is reduced by a factor of 4 while the number of channels increases respectively to 48, 192, 768 and 3072; the corresponding spatial resolutions are respectively 56 \(\times \) 56, 28 \(\times \) 28, 14 \(\times \) 14 and 7 \(\times \) 7. Last, the obtained representation is spatially averaged and projected onto a one-dimensional vector after a ReLU nonlinearity. The binary hash code is obtained by applying the Sgn function to this vector.
Training Detail. We randomly crop a set of 224 \(\times \) 224 patches for training. The training batch size is set to 64 in each back-propagation step. The network is trained in an end-to-end manner. Pairwise similarity loss and quantization loss are adopted concurrently on CIFAR-10 and NUS-WIDE, with random horizontal flips as data augmentation. SGD is adopted for optimizing our network, with an initial learning rate of 0.05 that decays by a factor of 0.1 every 50 epochs. The hyper-parameter \(\lambda \) is chosen on a validation set: 10 for CIFAR-10 and 100 for NUS-WIDE. At test time, we rescale the image to \(256\times 256\) and take a center crop of size \(224\times 224\). The convergence curve of our network on CIFAR-10 with 16-bit codes is shown below. Experiments are performed on two NVIDIA Titan XP GPUs for training and testing.
4.3 Compare with State-of-the-Arts
We use the MAP evaluation metric to compare the retrieval performance of AIHN with classical and state-of-the-art methods: the supervised shallow methods ITQ-CCA [8], BRE [11], KSH [18], SDH [20] and the supervised deep methods CNNH [22], DNNH [12], DHN [33], HashNet [3]. For fair comparison, all methods use identical training and test sets. We adopt MAP@5000 for evaluation on NUS-WIDE. For the shallow hashing methods, we use the 4096-dimensional \(DeCAF_7\) features [4] as image features. For the deep hashing methods, we use raw images as input and adopt the AlexNet architecture throughout.
Experimental results are shown in Table 1. It can be seen that our method AIHN achieves the best performance among all methods. Specifically, compared to ITQ-CCA, the best shallow hashing method using deep features as input, we achieve absolute boosts of 33.45% and 48% in average MAP over different bit lengths on CIFAR-10 and NUS-WIDE, respectively. Compared to the state-of-the-art deep hashing method HashNet, we achieve absolute boosts of 10.21% and 7.9% in average MAP on the two datasets, respectively. An interesting phenomenon is that the performance boost of AIHN over HashNet differs significantly across the two datasets: the boost on NUS-WIDE is generally much larger than that on CIFAR-10, but as the code length increases, MAP rises markedly on CIFAR-10.
4.4 Ablation Experiment
To investigate the effectiveness of the two proposed components, we study two AIHN variants: (1) AIHN-AI is an AIHN variant without the spatial attention module that replaces the invertible network with AlexNet, which gradually loses information. (2) AIHN-A is an AIHN variant that uses the invertible network for feature extraction but adopts no spatial attention module in the invertible blocks.
AIHN-A outperforms AIHN-AI by the very large margins of 10.58\(\%\), 8.69\(\%\), 10.71\(\%\) and 9\(\%\) in MAP for the 16-, 32-, 48- and 64-bit code lengths on CIFAR-10, respectively. The invertible network guarantees that the final hash codes can be completely obtained from the input images without any information loss. AIHN outperforms AIHN-A by 0.75\(\%\), 1.48\(\%\), 0.03\(\%\) and 1.77\(\%\) in MAP for the same code lengths on CIFAR-10, respectively. These results validate that the spatial attention module can enhance efficiency and improve MAP, because it better captures the objective information. As shown in Table 2, our proposed AIHN achieves the highest MAP on the CIFAR-10 dataset. Further analysis shows that the invertible network, which guarantees lossless generated features, contributes most to our method. This can be explained as follows: when learning image features, progressively discarding variability of the input image may cause effective information to be lost.
5 Conclusion
In this paper, we propose a novel attention-aware invertible hashing network for image retrieval. By invertible feature representations, the final hash codes can be completely obtained from input images without any information loss, so as to produce accurate hash codes with complete image information. For highlighting informative regions, we present a novel attention-aware invertible block as the basic module of AIHN, which can promote generalization ability by spatial attention mechanism. Extensive experiments conducted on benchmark datasets have demonstrated the state-of-the-art performance of our method.
References
Cao, Y., Long, M., Wang, J., Liu, S.: Collective deep quantization for efficient cross-modal retrieval. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)
Cao, Y., Long, M., Wang, J., Zhu, H., Wen, Q.: Deep quantization network for efficient image retrieval. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
Cao, Z., Long, M., Wang, J., Yu, P.S.: Hashnet: deep learning to hash by continuation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5608–5617 (2017)
Donahue, J., et al.: Decaf: a deep convolutional activation feature for generic visual recognition. In: International Conference on Machine Learning, pp. 647–655 (2014)
Gionis, A., Indyk, P., Motwani, R., et al.: Similarity search in high dimensions via hashing. In: VLDB, vol. 99, pp. 518–529 (1999)
Gomez, A.N., Ren, M., Urtasun, R., Grosse, R.B.: The reversible residual network: backpropagation without storing activations. In: Advances in Neural Information Processing Systems, pp. 2214–2224 (2017)
Gong, Y., Kumar, S., Rowley, H.A., Lazebnik, S.: Learning binary codes for high-dimensional data using bilinear projections. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 484–491 (2013)
Gong, Y., Lazebnik, S., Gordo, A., Perronnin, F.: Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 35(12), 2916–2929 (2013)
Hu, H., Gu, J., Zhang, Z., Dai, J., Wei, Y.: Relation networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3588–3597 (2018)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Kulis, B., Darrell, T.: Learning to hash with binary reconstructive embeddings. In: Advances in Neural Information Processing Systems, pp. 1042–1050 (2009)
Lai, H., Pan, Y., Liu, Y., Yan, S.: Simultaneous feature learning and hash coding with deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3270–3278 (2015)
Li, C., Liu, Q., Liu, J., Lu, H.: Ordinal distance metric learning for image ranking. IEEE Trans. Neural Netw. Learn. Syst. 26(7), 1551–1559 (2014)
Li, C., Wang, X., Dong, W., Yan, J., Liu, Q., Zha, H.: Joint active learning with feature selection via cur matrix decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 41(6), 1382–1396 (2018)
Li, C., Wei, F., Dong, W., Wang, X., Liu, Q., Zhang, X.: Dynamic structure embedded online multiple-output regression for streaming data. IEEE Trans. Pattern Anal. Mach. Intell. 41(2), 323–336 (2018)
Li, W.J., Wang, S., Kang, W.C.: Feature learning based deep supervised hashing with pairwise labels. arXiv preprint arXiv:1511.03855 (2015)
Liu, H., Wang, R., Shan, S., Chen, X.: Deep supervised hashing for fast image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2064–2072 (2016)
Liu, W., Wang, J., Ji, R., Jiang, Y.G., Chang, S.F.: Supervised hashing with kernels. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2074–2081. IEEE (2012)
Liu, Y., Zhu, X., Zhao, X., Cao, Y.: Adversarial learning for constrained image splicing detection and localization based on atrous convolution. IEEE Trans. Inf. Forensics Secur. (2019)
Shen, F., Shen, C., Liu, W., Tao Shen, H.: Supervised discrete hashing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 37–45 (2015)
Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. In: Advances in Neural Information Processing Systems, pp. 1753–1760 (2009)
Xia, R., Pan, Y., Lai, H., Liu, C., Yan, S.: Supervised hashing for image retrieval via image representation learning. In: Twenty-Eighth AAAI Conference on Artificial Intelligence (2014)
Yu, F., Kumar, S., Gong, Y., Chang, S.F.: Circulant binary embedding. In: International Conference on Machine Learning, pp. 946–954 (2014)
Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318 (2018)
Zhang, P., Zhang, W., Li, W.J., Guo, M.: Supervised hashing with latent factor models. In: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 173–182. ACM (2014)
Zhang, X.Y.: Simultaneous optimization for robust correlation estimation in partially observed social network. Neurocomputing 205, 455–462 (2016)
Zhang, X.Y., Shi, H., Li, C., Zheng, K., Zhu, X., Duan, L.: Learning transferable self-attentive representations for action recognition in untrimmed videos with weak supervision. arXiv preprint arXiv:1902.07370 (2019)
Zhang, X.Y., Shi, H., Zhu, X., Li, P.: Active semi-supervised learning based onself-expressive correlation with generative adversarial networks. Neurocomputing 345, 103–113 (2019)
Zhang, X.Y., Wang, S., Yun, X.: Bidirectional active learning: a two-way exploration into unlabeled and labeled data set. IEEE Trans. Neural Netw. Learn. Syst. 26(12), 3034–3044 (2015)
Zhang, X.Y., Wang, S., Zhu, X., Yun, X., Wu, G., Wang, Y.: Update vs. upgrade: modeling with indeterminate multi-class active learning. Neurocomputing 162, 163–170 (2015)
Zhang, X.: Interactive patent classification based on multi-classifier fusion and active learning. Neurocomputing 127, 200–205 (2014)
Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 294–310. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_18
Zhu, H., Long, M., Wang, J., Cao, Y.: Deep hashing network for efficient similarity retrieval. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
Zhu, X., Li, Z., Zhang, X.Y., Li, C., Liu, Y., Xue, Z.: Residual invertible spatio-temporal network for video super-resolution. In: AAAI Conference on Artificial Intelligence (2019)
Zhu, X., Li, Z., Zhang, X., Li, H., Xue, Z., Wang, L.: Generative adversarial image super-resolution through deep dense skip connections. In: Computer Graphics Forum, vol. 37, pp. 289–300. Wiley Online Library (2018)
Zhu, X., Liu, J., Wang, J., Li, C., Lu, H.: Sparse representation for robust abnormality detection in crowded scenes. Pattern Recogn. 47(5), 1791–1799 (2014)
Zhu, X., Zhang, X., Zhang, X.Y., Xue, Z., Wang, L.: A novel framework for semantic segmentation with generative adversarial network. J. Vis. Commun. Image Represent. 58, 532–543 (2019)
Acknowledgement
This work was supported by National Key R&D Program of China (2018YFB0803700) and National Natural Science Foundation of China (61602517, 61877002).
© 2019 Springer Nature Switzerland AG
Li, S., Cai, Q., Li, Z., Li, H., Zhang, N., Cao, J. (2019). Attention-Aware Invertible Hashing Network. In: Zhao, Y., Barnes, N., Chen, B., Westermann, R., Kong, X., Lin, C. (eds) Image and Graphics. ICIG 2019. Lecture Notes in Computer Science(), vol 11903. Springer, Cham. https://doi.org/10.1007/978-3-030-34113-8_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34112-1
Online ISBN: 978-3-030-34113-8