Full-Body High-Resolution Anime Generation with Progressive Structure-Conditional Generative Adversarial Networks

  • Koichi Hamada (corresponding author)
  • Kentaro Tachibana
  • Tianqi Li
  • Hiroto Honda
  • Yusuke Uchida
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)

Abstract

We propose Progressive Structure-conditional Generative Adversarial Networks (PSGAN), a new framework that generates full-body, high-resolution character images conditioned on structural information. Recent progress in generative adversarial networks with progressive training has made it possible to generate high-resolution images. However, existing approaches struggle to achieve high image quality and structural consistency at the same time. Our method addresses this limitation by progressively increasing the resolution of both the generated images and the structural conditions during training. In this paper, we empirically demonstrate the effectiveness of the method through comparisons with existing approaches and through video generation results for diverse anime characters at 1024 × 1024 resolution, conditioned on target pose sequences. We also create a novel dataset of full-body 1024 × 1024 high-resolution images with exact 2D pose keypoints, built using Unity 3D avatar models.
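The core idea above, conditioning each progressive-growing stage on a structural map matched to that stage's resolution, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the pose condition is assumed to be a stack of 2D keypoint heatmaps, and the pooled maps would in practice be fed into the growing generator and discriminator at the corresponding layers.

```python
import numpy as np

def downsample(cond, factor):
    """Average-pool a condition map of shape (H, W, C) by an integer factor."""
    h, w, c = cond.shape
    return cond.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def condition_pyramid(pose_map, base=4):
    """Build per-stage structural conditions for progressive training.

    The full-resolution pose keypoint map is pooled to every intermediate
    resolution (base x base, 2*base x 2*base, ..., full), so that both the
    generator and the discriminator can be conditioned on structure at each
    growth stage, matching the resolution of the images at that stage.
    """
    full = pose_map.shape[0]
    pyramid = {}
    res = base
    while res <= full:
        pyramid[res] = downsample(pose_map, full // res)
        res *= 2
    return pyramid

# Example: 20 keypoint heatmaps at the target 1024 x 1024 resolution.
pose = np.zeros((1024, 1024, 20), dtype=np.float32)
pyr = condition_pyramid(pose)
print(sorted(pyr.keys()))  # stage resolutions from 4 up to 1024
```

Each entry `pyr[r]` has shape `(r, r, 20)`, i.e. the structural condition is available at exactly the resolution being trained, which is what lets image quality and structural consistency grow together rather than trading off.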

Keywords

Generative adversarial networks · Anime generation · Image generation · Video generation

Supplementary material

478822_1_En_8_MOESM1_ESM.pdf (1.9 MB)
Supplementary material 1 (PDF, 1995 KB)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Koichi Hamada (corresponding author)
  • Kentaro Tachibana
  • Tianqi Li
  • Hiroto Honda
  • Yusuke Uchida
  1. DeNA Co., Ltd., Tokyo, Japan
