PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report
Abstract
This paper reviews the first challenge on efficient perceptual image enhancement, with a focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first, participants solved the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track targeted real-world photo enhancement: the goal was to map low-quality photos from the iPhone 3GS to the same photos captured with a DSLR camera. The target metric combined runtime, PSNR scores, and the solutions' perceptual results measured in a user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved on the baseline results, defining the state of the art for image enhancement on smartphones.
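For readers unfamiliar with the fidelity component of the target metric, the following is a minimal sketch of how PSNR is conventionally computed between a reference image and an enhanced one. It illustrates only the standard PSNR definition; how the challenge weighted PSNR against runtime and user-study results is not stated here and is not reproduced. The function name `psnr` and the flat-list pixel representation are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    `reference` and `estimate` are flat sequences of pixel intensities
    (hypothetical representation chosen to keep the sketch dependency-free).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    # Standard definition: 10 * log10(MAX^2 / MSE).
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the estimate is numerically closer to the reference, which is why PSNR alone does not capture perceptual quality and is typically paired with a user study.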
Keywords
Image enhancement, Image super-resolution, Challenge, Efficiency, Deep learning, Mobile, Android, Smartphones
Acknowledgements
We thank the PIRM2018 sponsors: ETH Zurich (Computer Vision Lab), Huawei Inc., MediaTek Inc., and Israel Institute of Technology.