
1 Introduction

With the additional information provided by two different views, stereo images are used in a wide range of applications, including 3D model reconstruction [1] and autonomous driving [2]. Since the seminal super resolution convolutional neural network (SRCNN) [3] was proposed, learning-based methods [4, 5] have been widely adopted to improve image quality. CNN-based methods resize the input before feeding it into the network and adopt deeper and recursive networks to gain better reconstruction performance, but this demands large computational cost and memory consumption, which makes them hard to deploy on mobile phones and embedded devices. Moreover, traditional convolutional methods [6, 7] design networks by cascading layers, which leads to feature redundancy because the feature maps of each layer are sent to the following layer without discrimination. Hu et al. [8] demonstrated that the representational power of a network can be improved by recalibrating channel-wise feature responses. Recent video quality enhancement methods [9] focus on exploiting the correspondence between adjacent frames in a local region. However, video quality enhancement methods cannot be directly applied to stereo image quality enhancement, since stereo images exhibit long-range dependency and non-local characteristics. Current stereo image enhancement methods leverage stereo matching [10,11,12] to learn the correspondence between a stereo image pair and use cost volumes to model long-range dependency in the network, but these methods are insufficient for estimating accurate correspondence when the disparity is large.

To address these problems, we propose an end-to-end CNN model (DCL network) that incorporates stereo correspondence for the task of quality enhancement. Given a stereo image pair, a feature extraction block first extracts features from each input image separately. Second, we apply long information distillation blocks (LDBlock) to the LQ image to distill useful information, and information distillation blocks (DBlock) [13] to the HQ image, because the LQ image requires a deeper network to learn more features than the HQ image. The features extracted from the LQ and HQ images are then fed to an information fusion based LSTM [14] module to capture stereo correspondence. In addition, we use channel-wise attention both following and embedded within the information distillation blocks, which focuses on key information and neglects irrelevant information by considering interdependencies among channels.

The main contributions can be summarized as follows:

  1. The HQ image is used to guide the reconstruction of the LQ image within a stereo image pair in our network.

  2. The proposed long information distillation block extracts LQ features and combines them with channel-wise attention to distill and enhance useful and efficient features.

  3. We propose an information fusion based LSTM to handle the disparity variations between the two viewpoints of a stereo image pair.

2 Related Work

Stereo image quality enhancement has been extensively studied in the computer vision community. In this section, we focus on works related to quality enhancement and long-range dependency learning.

2.1 Quality Enhancement

CNNs have become the state-of-the-art methods for the task of quality enhancement over recent years. SRCNN [3], a pioneer in image reconstruction using deep learning, is an end-to-end CNN model with three stages: patch extraction and representation, non-linear mapping, and reconstruction. However, its feature extraction uses only one layer, so it has a small receptive field and captures only local features. To address this problem, Dong et al. [4] introduced the Artifacts Reduction Convolutional Neural Network (ARCNN), which adds a feature enhancement layer. Yu et al. [5] proposed a faster five-layer CNN (FastARCNN). With the employment of deeper and wider networks, CNN-based methods [7] suffer from computational complexity and memory consumption in practice. Hui et al. [13] proposed the information distillation block, which uses few filters per layer; although the block is deep, the resulting convolutional network is compact and achieves better results with higher speed and accuracy. In addition, to address the problem of noticeable visual artifacts at high compression ratios, Jin et al. [15] introduced a fully convolutional neural network for quality enhancement. They extract the corresponding high frequency information from the HQ image and fuse it with the LQ image, which enhances the LQ image quality in asymmetric stereo images by exploiting inter-view correlation.

2.2 Long-Range Dependency Learning

To leverage the disparity information from both the left and right views of stereo images, long-range dependency learning has become an important concept in deep neural networks. With the development of a great number of algorithms for stereo correspondence, related works on stereo matching mainly strive for better performance [16,17,18]. Zbontar et al. [19] concatenate the left and right features using CNNs to compute the stereo matching cost by learning a similarity measure on small image patches. To address the limitation that such networks rely on patch-based processing, later methods [10, 12] employ a 4D cost volume to effectively exploit global context information. Faced with the resulting computational complexity and memory consumption, Liang et al. [11] incorporated all steps into a single network for stereo matching, sharing the same features throughout.

Attention mechanisms, first introduced by Bahdanau et al. [23], have been widely applied in diverse prediction tasks, including localization and understanding in images [20, 21] and image captioning [22]. Visual attention can be seen as a dynamic feature extraction mechanism [24, 25], and attention-based methods [26, 27] can process data in parallel and model complex contexts. SCA-CNN [28] showed that existing visual spatial attention is usually applied only in the last conv-layer, where the receptive field is quite large and the differences between receptive field regions are quite limited; they therefore proposed to incorporate both spatial and channel-wise attention in a CNN model.

Inspired by visual attention models, and observing that a stereo image pair can be treated like consecutive video frames, we propose to combine attention with LSTM to learn the disparity variations across stereo images. In particular, our work directly extends [13, 15].

3 Proposed Method

In this section, we describe the proposed model architecture and the long information distillation module, and then introduce the loss function adopted in our network.

3.1 Network Structure

Our method takes a stereo image pair as input, which contains a LQ (right) image and a HQ (left) image, and outputs the enhanced LQ (right) image. The architecture of our network is illustrated in Fig. 1 and Table 1. The network comprises four modules: feature extraction, information distillation, information fusion and image reconstruction.

Fig. 1. The architecture of the proposed network

Table 1. The proposed network architecture includes four stages

First, we adopt two 3 \(\times \) 3 convolutions to extract the original features of the input images in the feature extraction module (FBlock) [13]. The extracted feature maps are fed to the information distillation module to distill more useful information, producing 64 feature maps. Each information distillation module is combined with a channel-wise attention module to focus on the key information. To reduce the data dimension and further distill relevant information for the following network, we use a 1 \(\times \) 1 convolutional layer; it also increases the nonlinearity while maintaining the spatial size of the features. This process can be formulated as:

$$\begin{aligned} {{D_{i} = C(D_{i-1}(f(x))) , i=1,...,n, }} \end{aligned}$$
(1)
$$\begin{aligned} {{P_{i} = P(D_{i}) }} \end{aligned}$$
(2)

where x denotes the input right LQ image and left HQ image; \( f \) represents the feature extraction operation; \(D_{i}\) indicates the i-th LDBlock or DBlock function; C and P represent the channel-wise attention and compression operations, respectively.
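As a rough illustration, the following PyTorch sketch shows one possible realization of the channel-wise attention C and the 1 \(\times \) 1 compression P in Eqs. (1)-(2), assuming an SE-style squeeze-and-excitation form in the spirit of [8]. The class name ChannelAttention, the reduction ratio and the channel count are illustrative assumptions, not the exact settings of our network.

```python
# Sketch of the C (channel-wise attention) and P (1x1 compression) operations.
# The SE-style form and the reduction ratio of 16 are assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int = 64, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global spatial average
        self.fc = nn.Sequential(                      # excitation: per-channel weights
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))              # recalibrate channel responses

compress = nn.Conv2d(64, 64, kernel_size=1)           # P: 1x1 compression layer

x = torch.randn(1, 64, 80, 80)                        # D_{i-1}(f(x)) feature map
d_i = ChannelAttention(64)(x)                         # Eq. (1): D_i = C(D_{i-1}(f(x)))
p_i = compress(d_i)                                   # Eq. (2): P_i = P(D_i)
```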

Then, the extracted features of the LQ and HQ streams are fused by a 4-layer CNN. This operation can be formulated as:

$$\begin{aligned} {{ F_{0} = F(I_{low} + I_{high}) }} \end{aligned}$$
(3)

where \(I_{low}\) and \(I_{high}\) denote the outputs of the long information distillation module and the information distillation module, respectively; F represents the information fusion of the left HQ features and right LQ features; \(F_{0}\) denotes the output of the information fusion module.
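A minimal sketch of one possible reading of Eq. (3) is given below: the two feature streams are merged (shown here by channel concatenation; the "+" in Eq. (3) may equally denote element-wise addition) and passed through a 4-layer fusion CNN F. The channel counts and activation settings are assumptions for illustration only.

```python
# Sketch of the 4-layer information fusion CNN F in Eq. (3).
import torch
import torch.nn as nn

fusion = nn.Sequential(
    nn.Conv2d(128, 64, kernel_size=3, padding=1), nn.LeakyReLU(0.05),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.LeakyReLU(0.05),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.LeakyReLU(0.05),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

I_low = torch.randn(1, 64, 80, 80)    # output of the long information distillation module
I_high = torch.randn(1, 64, 80, 80)   # output of the information distillation module
F_0 = fusion(torch.cat([I_low, I_high], dim=1))   # Eq. (3): F_0 = F(I_low, I_high)
```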

Finally, we use an LSTM network [14] that focuses on learning the corresponding information across the two views from feature similarities, keeping the left and right images of a stereo pair consistent in location. To improve feature utilization, we combine the previous feature maps from the input of the LSTM with the current information, which effectively reconstructs a HQ image. The final enhanced LQ image can be expressed as:

$$\begin{aligned} {{y = F_{0} + L( F_{0}) }} \end{aligned}$$
(4)

where L denotes the LSTM function and y represents the output of the network.
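The residual refinement in Eq. (4) can be sketched as follows. The convolutional LSTM cell below is an assumed form chosen for illustration; the paper only cites a generic LSTM [14] and does not specify the exact cell layout, so ConvLSTMCell and its gate arrangement are assumptions.

```python
# Sketch of Eq. (4): y = F_0 + L(F_0), with L modeled as a convolutional LSTM cell.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # one convolution produces the input, forget, output and candidate gates
        self.gates = nn.Conv2d(2 * channels, 4 * channels, kernel_size=3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

F_0 = torch.randn(1, 64, 80, 80)          # fused features from Eq. (3)
cell = ConvLSTMCell(64)
h0 = c0 = torch.zeros_like(F_0)
L_F0, _ = cell(F_0, (h0, c0))             # L(F_0)
y = F_0 + L_F0                            # Eq. (4): residual refinement
```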

3.2 Long Information Distillation

Motivated by the enhancement unit in the IDN [13], we use stacked information distillation blocks to effectively extract image features. Inspired by the Inception module in GoogLeNet [29], we further design a deeper and wider network to generate more feature maps. Combining these ideas, we design a deeper and wider information distillation block called the long information distillation block (LDBlock), shown in Fig. 2. Based on the enhancement unit in the DBlock, we apply stacked convolutions after slicing the feature maps to extract more information from the LQ image. To reduce the number of parameters, we use grouped convolutions with 4 groups in the second convolutional layer of each enhancement unit. Specifically, we adopt channel-wise attention to adaptively rescale features by considering interdependencies among feature channels. An illustrative sketch of one enhancement unit is given after Fig. 2.

Fig. 2. The architecture of the enhancement unit in the long information distillation block. s indicates the slice operation and c represents channel concatenation
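The sketch below illustrates one enhancement unit of the LDBlock under simplifying assumptions: features are sliced (s), one part is concatenated (c) with the block input as a shortcut, and the remainder is processed by further convolutions, with the second convolutional layer grouped into 4 groups. The channel counts, the slice width d and the class name LDEnhancementUnit are illustrative assumptions rather than the exact configuration.

```python
# Sketch of one enhancement unit in the long information distillation block (LDBlock).
import torch
import torch.nn as nn

class LDEnhancementUnit(nn.Module):
    def __init__(self, channels: int = 64, d: int = 16):
        super().__init__()
        self.d = d
        self.act = nn.LeakyReLU(0.05)
        # first stage: the second convolution is grouped (4 groups) to save parameters
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, groups=4)
        self.conv3 = nn.Conv2d(channels, channels + d, 3, padding=1)
        # second stage: processes the non-sliced part back to channels + d
        self.conv4 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv6 = nn.Conv2d(channels, channels + d, 3, padding=1)

    def forward(self, x):
        out = self.act(self.conv3(self.act(self.conv2(self.act(self.conv1(x))))))
        sliced, rest = out[:, :self.d], out[:, self.d:]          # s: slice operation
        shortcut = torch.cat([x, sliced], dim=1)                 # c: concat with input
        deep = self.act(self.conv6(self.act(self.conv5(self.act(self.conv4(rest))))))
        return shortcut + deep                                   # fuse shortcut and deep paths

x = torch.randn(1, 64, 80, 80)
out = LDEnhancementUnit()(x)   # 80-channel output, later compressed by a 1x1 convolution
```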

3.3 Loss Function

Our network is optimized with two loss functions: the total loss \(L_{total}\) and the LSTM loss \(L_{lstm}\). The total loss \(L_{total}\) measures the difference between the predicted LQ image and the corresponding uncompressed ground-truth image \(I_{GT}\). We use the mean square error (MSE), which is the most widely applied loss in image restoration. To constrain the location consistency between the left and right images, we introduce the LSTM loss \(L_{lstm}\). To improve the effectiveness of our network, we optimize the same form of loss function as previous works.

$$\begin{aligned} {{ L_{total}(\varTheta ) = \frac{1}{N} \sum \nolimits _{i=1}^{N}\Vert F(I_{low}^{i},I_{high}^{i}; \varTheta ) - I_{GT}^{i} \Vert _{2}^{2}} } \end{aligned}$$
(5)
$$\begin{aligned} {{ L_{lstm} =\frac{1}{N}\sum \nolimits _{i=1}^{N}\Vert I_{lstm}^{i} - I_{GT}^{i} \Vert _{2}^{2} } } \end{aligned}$$
(6)

where \(\varTheta \) denotes the parameter set of the network, including both weights and biases; F represents the network that generates the predicted image; \(I_{lstm}^{i}\) denotes the image reconstructed by the LSTM module. The overall loss function is therefore formulated as:

$$\begin{aligned} {{Loss = \lambda _{1}(L_{total}) + \lambda _{2}( L_{lstm})}} \end{aligned}$$
(7)

where \(\lambda _{1}\) and \(\lambda _{2}\) are the weights balancing the two losses; they are set to 0.8 and 0.2 in our experiments, respectively. More details of training are given in Sect. 4.2.
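For clarity, a minimal sketch of the overall loss in Eqs. (5)-(7) with the weights used in our experiments (\(\lambda _{1}\) = 0.8, \(\lambda _{2}\) = 0.2) is given below; the tensor names are placeholders.

```python
# Sketch of the combined loss in Eq. (7): Loss = lambda1 * L_total + lambda2 * L_lstm.
import torch
import torch.nn.functional as F

def overall_loss(pred, lstm_out, gt, lam1: float = 0.8, lam2: float = 0.2):
    l_total = F.mse_loss(pred, gt)        # Eq. (5): network output vs. ground truth
    l_lstm = F.mse_loss(lstm_out, gt)     # Eq. (6): LSTM reconstruction vs. ground truth
    return lam1 * l_total + lam2 * l_lstm  # Eq. (7): weighted combination
```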

4 Experiment

In this section, we first introduce the datasets and implementation details, and then analyze the proposed network architecture. We further compare our network with state-of-the-art networks on two multiview datasets.

4.1 Dataset

To train the proposed network, we follow [15] and adopt the Middlebury 2014 stereo image dataset, using 18 images as our training data and the remaining 5 images for testing. Considering the training complexity, we adopt a small-patch training strategy and crop the images into 300 \(\times \) 300 patches; the corresponding patches in the HQ images and ground-truth images are obtained in the same way. In total there are 942 \(\times \) 2 patches for training. To evaluate the performance of the proposed network, the JPEG quality is set to 10 and 20 to generate images of different compression qualities. For testing, large images cannot be processed directly, so we crop each test image into a set of \( l_{sub}\) \(\times \) \( l_{sub}\) sub-images with the same proportions across images of different sizes.
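As an illustrative sketch of this preparation, the snippet below JPEG-compresses a ground-truth image at quality 10 or 20 to obtain a LQ input and crops 300 \(\times \) 300 patches for training; the function names and file paths are placeholders, not part of our released code.

```python
# Sketch of the data preparation: simulate JPEG compression and crop training patches.
import io
from PIL import Image

def make_lq(gt_path: str, quality: int = 20) -> Image.Image:
    gt = Image.open(gt_path)
    buf = io.BytesIO()
    gt.save(buf, format="JPEG", quality=quality)   # introduce JPEG compression artifacts
    buf.seek(0)
    return Image.open(buf)

def crop_patches(img: Image.Image, size: int = 300):
    w, h = img.size
    return [img.crop((x, y, x + size, y + size))
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]
```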

4.2 Implementation Details

To improve the robustness and generalization ability of the model, data augmentation is adopted in four ways: (1) rotate the image randomly by \(90^{\circ }\); (2) crop 160 \(\times \) 160 patches; (3) flip images horizontally; (4) flip images vertically. In this work, our model is trained with the Adam optimizer with \(\beta _{1}\) = 0.9, \(\beta _{2}\) = 0.999 and a batch size of 12. Training runs for 800 epochs in total, since the learning rate approaches zero with too many epochs. The learning rate is initially set to 0.0001 and decreases by a factor of 10 during the fine-tuning phase. In addition, LeakyReLU is applied after each convolution, with the negative slope set to 0.05. To focus on the quality enhancement of the image luminance, we use a single-channel image. We conduct our experiments on an Nvidia GTX 1080Ti GPU, and training a model takes about half a day. We implement our network on the PyTorch platform, whose flexibility and efficiency allow us to develop the network easily.
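A minimal sketch of the optimizer and learning-rate settings listed above is shown below; the placeholder `model` stands in for the DCL network and the helper name finetune_lr is assumed for illustration.

```python
# Sketch of the training configuration: Adam with beta1=0.9, beta2=0.999, lr=1e-4,
# and a 10x learning-rate reduction for the fine-tuning phase.
import torch

model = torch.nn.Conv2d(1, 1, 3, padding=1)        # placeholder for the DCL network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

def finetune_lr(optimizer):
    # decrease the learning rate by a factor of 10 during the fine-tuning phase
    for group in optimizer.param_groups:
        group["lr"] /= 10.0
```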

4.3 Network Architecture Analysis

Stereo Image vs Single Image. To validate the effectiveness of stereo information for image quality enhancement, we conduct an experiment in which our network takes either a single image (i.e., the LQ image) or a stereo image pair (HQ and LQ from different views) as input. The results are shown in Table 2 and demonstrate that the HQ image contributes to LQ image reconstruction. Compared with using only the LQ image as input, the reconstructed image improves by 0.26 dB in terms of peak signal-to-noise ratio (PSNR) (from 41.12 dB to 41.38 dB).

Table 2. Comparative results achieved on the Middlebury 2014 stereo image dataset by our network with different inputs at q20.
Table 3. Comparative results achieved on the Middlebury 2014 by our network with the CW inside DBlock/LDBlock at q20.

Effectiveness of Channel-Wise Attention. The information distillation module is utilized to distill and enhance the feature maps from the feature extraction. More importantly, channel-wise attention is employed both inside and outside the information distillation block, which learns more representative features. To demonstrate its effectiveness, we compare several implementations that remove the channel-wise attention (CW) under different conditions. As Table 3 shows, our network achieves only 41.36 dB in PSNR and 0.9861 in structural similarity (SSIM) when the CW inside the information distillation modules of both the LQ and HQ streams is removed. After inserting CW into the DBlock or LDBlock, the performance reaches 41.47 dB and 41.48 dB, respectively. Similarly, Table 4 indicates that the LQ image performance also benefits from CW outside the information distillation block, and the increase in parameters introduced by CW is marginal. These comparisons show that CW is essential for focusing on the effective features in information distillation for deep networks. The results in Tables 3 and 4 confirm that channel-wise features really improve the performance.

Table 4. Comparative results achieved on the Middlebury 2014 by our network with the CW outside DBlock/LDBlock at q20.

Effectiveness of Long Short-Term Memory. To validate the effectiveness of the LSTM module for image quality enhancement, we conduct a comparative experiment with and without the LSTM module after information fusion. As shown in Table 5, our network with LSTM achieves better performance: its PSNR is 0.38 dB higher than that of the network without LSTM.

Table 5. Comparative results achieved on the Middlebury 2014 stereo image dataset by our network with and without the LSTM module at q20.

4.4 Comparison to State-of-the-Art Approaches

To evaluate the performance of our network, we compare it with other methods including JPEG [30], SA-DCT [31], ARCNN [4], FastARCNN [5], Fusion-4 and Fusion-8 [15]. The PSNR and SSIM comparison results on the Middlebury dataset at JPEG qualities 10 and 20 are shown in Table 6. Furthermore, the number of network parameters of the deep learning based methods is also given. From these results, it is clear that our method achieves better performance than all other methods except Fusion-8, because the inputs of the Fusion method [15] come from the same view; it [15] neglects stereo images with large disparity variations, whereas our method captures more reliable correspondence. In terms of SSIM, our method achieves the best performance. Compared with Fusion-8, our method reduces the number of parameters by a factor of three while still achieving a higher PSNR.

Table 6. Quality enhancement comparison with state-of-the-art algorithms on the Middlebury dataset.

5 Conclusion

In this paper, an efficient deep-learning-based method is proposed to enhance LQ image quality by exploiting information from a stereo image pair. We design a deeper and wider information distillation block combined with channel-wise attention to extract abundant and efficient features for LQ image reconstruction. Moreover, the information fusion based LSTM module handles the disparity between the different views of stereo images. Experiments demonstrate that our method captures correspondence in stereo images and achieves state-of-the-art performance.