Multimedia Tools and Applications, Volume 77, Issue 17, pp 22339–22366

A one-to-many conditional generative adversarial network framework for multiple image-to-image translations

  • Chunlei Chai
  • Jing Liao
  • Ning Zou
  • Lingyun Sun


Image-to-image translation has been proposed as a general formulation of many image learning problems. Although generative adversarial networks have been applied successfully to many image-to-image translations, most models are limited to a specific translation task and therefore struggle to meet practical needs. In this work, we introduce a One-to-Many conditional generative adversarial network that can learn from heterogeneous sources of images. This is achieved by training multiple generators against a single discriminator in a synthesized learning scheme. The framework lets the generative models produce images for each source, so that the output images follow the corresponding target patterns. We also propose two implementations of the synthesized adversarial training scheme, hybrid fake and cascading learning, and evaluate them on two benchmark datasets, UTZap50K and MVOD5K, as well as on a new high-quality dataset, BehTex7K. We consider five challenging image-to-image translation tasks: edges-to-photo and edges-to-similar-photo translation on UTZap50K, cross-view translation on MVOD5K, and grey-to-color and grey-to-Oil-Paint translation on BehTex7K. Both implementations faithfully translate one image into another in the edges-to-photo, edges-to-similar-photo, grey-to-color, and grey-to-Oil-Paint tasks. The quality of the output images in cross-view translation still needs to be improved.
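The idea of several generators competing against one shared discriminator can be illustrated with a small sketch. It assumes the standard binary cross-entropy GAN loss and simply averages the fake term over the generators; the averaging and the function names `bce`, `discriminator_loss`, and `generator_losses` are illustrative assumptions, and the sketch is not the hybrid fake or cascading learning scheme described in the paper.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one probability and a 0/1 label."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps)
             + (1.0 - label) * math.log(1.0 - prob + eps))


def discriminator_loss(d_real, d_fakes):
    """Loss of one shared discriminator facing several generators.

    d_real  : D's probability that a real (input, target) pair is real.
    d_fakes : D's probabilities for the pairs built from each generator's
              output, one entry per generator.
    Averaging the fake term is an assumption made for this sketch.
    """
    real_term = bce(d_real, 1.0)
    fake_term = sum(bce(p, 0.0) for p in d_fakes) / len(d_fakes)
    return real_term + fake_term


def generator_losses(d_fakes):
    """Each generator is rewarded when D labels its own output as real."""
    return [bce(p, 1.0) for p in d_fakes]


if __name__ == "__main__":
    # Two hypothetical generators; D is fairly confident on every pair.
    print(discriminator_loss(0.9, [0.1, 0.2]))
    print(generator_losses([0.1, 0.2]))
```

In this sketch the discriminator loss falls as it separates real pairs from every generator's output, while each generator's own loss falls only when its output is scored as real, which is what lets each generator specialize toward its own target pattern.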


Image-to-image translation; Generative adversarial network; One-to-many conditional generative adversarial network; Deep learning



This paper is supported by the National Natural Science Foundation of China (61303137), the National Science and Technology Support Program (2015BAH21F01) and the Art Project for National Social-Science Foundation (15BG084). We thank Dr. Preben Hansen from Stockholm University, Department of Computer Science, for assistance in proofreading and technical editing of the manuscript.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Chunlei Chai (1)
  • Jing Liao (1)
  • Ning Zou (1), Email author
  • Lingyun Sun (1)
  1. Laboratory of CAD&CG, Zhejiang University, Hangzhou, China
