Semantics Images Synthesis and Resolution Refinement Using Generative Adversarial Networks

  • Jian Han
  • Zijie Zhang
  • Ailing Mao
  • Yuan Zhou
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 516)


In this paper, we propose a method for synthesizing a super-resolution image from a given image and a text description. Our work contains two parts. First, a Wasserstein GAN generates a low-resolution image under the guidance of a novel loss function. Then, a convolutional network refines the resolution. The two parts form an end-to-end network architecture. We validate our model on the Caltech-200 bird dataset, the Oxford-102 flower dataset, and the BSD300 dataset. The experiments show that the generated images not only match the given descriptions well but also preserve the detailed features of the original images at a higher resolution.
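The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the novel loss function or the layers of the refinement network, so the sketch shows only the standard Wasserstein GAN critic and generator objectives, plus a sub-pixel (pixel-shuffle) rearrangement as one common way a convolutional network can raise resolution. All function names and the upscale factor `r` are assumptions made for illustration.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    # Standard WGAN critic objective (to minimise): E[D(fake)] - E[D(real)].
    # Assumption: this is the plain WGAN loss, not the paper's novel loss.
    return float(np.mean(fake_scores) - np.mean(real_scores))

def generator_loss(fake_scores):
    # Standard WGAN generator objective (to minimise): -E[D(fake)].
    return float(-np.mean(fake_scores))

def pixel_shuffle(x, r):
    # Hypothetical upsampling step: rearrange a (C*r*r, H, W) feature map
    # into a (C, H*r, W*r) image, as in sub-pixel convolution layers.
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)   # merge (h, i) and (w, j)

# Toy usage: critic scores for two real and two generated images,
# and a 4-channel 1x1 feature map upsampled by a factor of 2.
real = np.array([1.0, 3.0])
fake = np.array([0.0, 1.0])
d_loss = critic_loss(real, fake)
g_loss = generator_loss(fake)
upsampled = pixel_shuffle(np.arange(4).reshape(4, 1, 1), 2)
```

In a full model, the critic and generator losses would be minimised alternately on the low-resolution stage, and a rearrangement like `pixel_shuffle` would follow the last convolution of the refinement stage, so that both stages could be trained together end to end.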


Generative Adversarial Networks (GANs); Semantics images synthesis; Resolution refinement



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. School of Electrical and Information Engineering, Tianjin University, Tianjin, China
