Multi–Scale Recursive and Perception–Distortion Controllable Image Super–Resolution
Abstract
We describe our solution for the PIRM Super–Resolution Challenge 2018, where we achieved the \(2^{\text{nd}}\) best perceptual quality for average \(RMSE\leqslant 16\), the \(5^{\text{th}}\) best for \(RMSE\leqslant 12.5\), and the \(7^{\text{th}}\) best for \(RMSE\leqslant 11.5\). We modify a recently proposed Multi–Grid Back–Projection (MGBP) architecture to work as a generative system with an input parameter that controls the amount of artificial detail in the output. We propose a discriminator for adversarial training with the following novel properties: it is multi–scale, resembling a progressive GAN; it is recursive, balancing the architecture of the generator; and it includes a new layer that captures significant statistics of natural images. Finally, we propose a training strategy that avoids conflicts between the reconstruction and perceptual losses. Our configuration uses only 281k parameters and upscales each image of the competition in 0.2 s on average.
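The abstract describes a generator whose input parameter controls how much artificial detail is synthesized, letting the same network move along the perception–distortion trade-off at test time. The sketch below is a minimal PyTorch-style illustration of that control idea only; it is not the authors' MGBP network, and all module names, channel counts, and the specific noise-injection scheme are assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' MGBP implementation) of a
# super-resolution generator with a scalar "noise amplitude" control input.
# noise_amp = 0 gives a deterministic, distortion-oriented output;
# larger values add synthesized detail for a perception-oriented output.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ControllableSRBlock(nn.Module):
    """One refinement step: add detail driven by a noise map scaled by the control input."""

    def __init__(self, channels: int = 48):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(channels + 1, channels, 3, padding=1),  # +1 channel for the noise map
            nn.PReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, feats: torch.Tensor, noise_amp: float) -> torch.Tensor:
        # Spatial noise scaled by the control parameter; zero amplitude
        # disables the stochastic detail entirely.
        noise = noise_amp * torch.randn_like(feats[:, :1])
        return feats + self.refine(torch.cat([feats, noise], dim=1))


class ControllableSRGenerator(nn.Module):
    """Recursive multi-scale upscaler: the same block is reused at every 2x level."""

    def __init__(self, channels: int = 48, levels: int = 2):
        super().__init__()
        self.levels = levels
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.block = ControllableSRBlock(channels)  # shared (recursive) block
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, lr: torch.Tensor, noise_amp: float = 0.0) -> torch.Tensor:
        x = self.head(lr)
        for _ in range(self.levels):  # 2x upscaling per level (4x total here)
            x = F.interpolate(x, scale_factor=2, mode="nearest")
            x = self.block(x, noise_amp)
        return self.tail(x)


if __name__ == "__main__":
    g = ControllableSRGenerator()
    lr = torch.rand(1, 3, 32, 32)
    sr_plain = g(lr, noise_amp=0.0)  # distortion-oriented output
    sr_rich = g(lr, noise_amp=1.0)   # perception-oriented output
    print(sr_plain.shape, sr_rich.shape)  # both torch.Size([1, 3, 128, 128])
```

Reusing a single block across scales mirrors the recursive, multi-scale character the abstract attributes to both generator and discriminator, and sweeping `noise_amp` corresponds to trading reconstruction fidelity for perceptual quality.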
Keywords
Backprojection, Multigrid, Perceptual quality