An Efficient Quality Enhancement Solution for Stereo Images

  • Yingqing Peng
  • Zhi Jin
  • Wenbin Zou
  • Yi Tang
  • Xia Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11903)

Abstract

Recently, with the additional information available in the disparity between views, quality enhancement for stereo images has become an active research field. Current methods generally adopt the cost volumes of stereo matching methods to learn the correspondence between stereo image pairs. However, given the large disparity between the different viewpoints of a stereo pair, learning accurate correspondence information remains a challenge. In addition, as networks deepen, traditional convolutional neural networks (CNNs) adopt cascading structures, which results in high computational cost and memory consumption. In this paper, we propose an effective end-to-end CNN model, the DCL network, built from channel-wise attention-based information distillation and long short-term memory (LSTM) components that together contribute to reconstructing high-quality images. Within a stereo image pair, we use the high-quality (HQ) image to guide the reconstruction of the low-quality (LQ) image. To incorporate the stereo correspondence, an information fusion-based LSTM module is used to learn the disparity variation between the stereo images. In particular, to distill and enhance effective feature maps, we introduce a channel-wise attention-based information distillation module that takes the interdependencies among feature channels into consideration. Experimental results demonstrate that the proposed network achieves the best performance with comparatively fewer parameters.
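The paper's implementation is not reproduced on this page; as a rough, minimal sketch of the two components named in the abstract, the PyTorch code below combines an SE-style channel attention gate (after the squeeze-and-excitation design of Hu et al.), an information distillation block that splits features into a retained part and a further-refined part, and a convolutional LSTM cell as a stand-in for the information fusion module. All class names, the distill_ratio parameter, channel sizes, and the wiring are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze spatial context, then gate channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight channels by learned interdependencies

class AttentionDistillationBlock(nn.Module):
    """Hypothetical distillation block: keep ("distill") part of the features,
    refine the rest, then fuse both branches and reweight with channel attention."""
    def __init__(self, channels, distill_ratio=0.25):
        super().__init__()
        self.d = int(channels * distill_ratio)  # channels kept as-is
        self.r = channels - self.d              # channels refined further
        self.conv1 = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                   nn.ReLU(inplace=True))
        self.conv2 = nn.Sequential(nn.Conv2d(self.r, self.r, 3, padding=1),
                                   nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(channels, channels, 1)  # merge both branches
        self.ca = ChannelAttention(channels)

    def forward(self, x):
        y = self.conv1(x)
        kept, rest = torch.split(y, [self.d, self.r], dim=1)
        out = self.fuse(torch.cat([kept, self.conv2(rest)], dim=1))
        return x + self.ca(out)  # residual connection

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell, a plausible stand-in for the
    information fusion-based LSTM module (exact design not given here)."""
    def __init__(self, in_channels, hidden_channels):
        super().__init__()
        # one conv produces all four gates at once
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)  # update cell memory
        h = o * torch.tanh(c)          # new hidden state
        return h, (h, c)

# Toy usage: refine LQ-view features, then fuse HQ- and LQ-view features
# by feeding them to the ConvLSTM cell as a length-2 "sequence".
hq_feat = torch.randn(1, 64, 32, 32)  # placeholder HQ-view features
lq_feat = torch.randn(1, 64, 32, 32)  # placeholder LQ-view features
lq_feat = AttentionDistillationBlock(64)(lq_feat)
cell = ConvLSTMCell(64, 64)
state = (torch.zeros(1, 64, 32, 32), torch.zeros(1, 64, 32, 32))  # (h, c)
for feat in (hq_feat, lq_feat):
    fused, state = cell(feat, state)  # fused: cross-view feature map
```

Feeding the two views through a shared recurrent cell is only one plausible reading of "information fusion-based LSTM"; the paper may instead fuse warped features or use a different gating scheme.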

Keywords

Quality enhancement · Stereo image · Information distillation · Channel attention · LSTM

Acknowledgement

This work was supported in part by the NSFC Project under Grants 61771321, 61701313, and 61871273; in part by the China Postdoctoral Science Foundation under Grant 2017M622778; in part by the Key Research Platform of Universities in Guangdong under Grant 2018WCXTD015; in part by the Natural Science Foundation of Shenzhen under Grants KQJSCX20170327151357330, JCYJ20170818091621856, and JSGG20170822153717702; and in part by the Interdisciplinary Innovation Team of Shenzhen University.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yingqing Peng (1)
  • Zhi Jin (1, 2)
  • Wenbin Zou (1) (corresponding author)
  • Yi Tang (1)
  • Xia Li (1)

  1. College of Electronics and Information Engineering, Shenzhen University, Shenzhen, People’s Republic of China
  2. School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou, People’s Republic of China