The Contextual Loss for Image Transformation with Non-aligned Data

  • Roey Mechrez
  • Itamar Talmi
  • Lihi Zelnik-Manor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)


Feed-forward CNNs trained for image transformation problems rely on loss functions that measure the similarity between the generated image and a target image. Most of the common loss functions assume that these images are spatially aligned and compare pixels at corresponding locations. However, for many tasks, aligned training pairs of images will not be available. We present an alternative loss function that does not require alignment, thus providing an effective and simple solution for a new space of problems. Our loss is based on both context and semantics – it compares regions with similar semantic meaning, while considering the context of the entire image. Hence, for example, when transferring the style of one face to another, it will translate eyes-to-eyes and mouth-to-mouth. Our code can be found at
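The contrast drawn in the abstract, between comparing features at corresponding locations and comparing them regardless of location, can be illustrated with a toy example. This is only a sketch under stated assumptions: it is not the authors' released code and it does not reproduce the loss defined in the paper. The function names (`aligned_l2`, `cosine_distance`, `nearest_neighbor_loss`) and the tiny hand-written feature vectors are hypothetical, and the alignment-free measure used here is a plain nearest-neighbor match under cosine distance, chosen for brevity.

```python
import math


def aligned_l2(generated, target):
    """Mean squared distance between features at the SAME index.

    Assumes the two feature lists are spatially aligned.
    """
    total = 0.0
    for g, t in zip(generated, target):
        total += sum((a - b) ** 2 for a, b in zip(g, t))
    return total / len(target)


def cosine_distance(u, v):
    """1 - cosine similarity, with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)


def nearest_neighbor_loss(generated, target):
    """Average distance from each target feature to its closest generated
    feature, wherever that feature is located (illustrative stand-in only).
    """
    total = 0.0
    for t in target:
        total += min(cosine_distance(g, t) for g in generated)
    return total / len(target)


# Hypothetical features: the "generated" set holds the same features as the
# target, but cyclically shifted by one position (i.e. not aligned).
target = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
generated = target[1:] + target[:1]

aligned = aligned_l2(generated, target)
unaligned = nearest_neighbor_loss(generated, target)
print("aligned L2 loss:", aligned)
print("nearest-neighbor loss:", unaligned)
```

In this toy setup the aligned loss penalizes the shift even though both sets contain identical features, while the location-independent measure stays close to zero. That is the kind of behavior a loss needs when aligned training pairs are not available.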



This research was supported by the Israel Science Foundation under Grant 1089/16 and by the Ollendorf foundation.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Technion - Israel Institute of Technology, Haifa, Israel
