
Multimodal Fusion for Traditional Chinese Painting Generation

  • Conference paper

In: Advances in Multimedia Information Processing – PCM 2018 (PCM 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11166)

Abstract

Creativity is a fundamental feature of human intelligence and a challenge for artificial intelligence (AI). In recent years, AI has made tremendous progress on many single tasks with single models, such as classification, detection, and parsing. As the field has matured, AI has increasingly been applied to more complex tasks, such as multitasking, and research on multimodal fusion has naturally become a new trend. In this paper, we propose a multimodal fusion framework and system to generate traditional Chinese paintings. We select suitable existing networks to generate the different elements of this artwork, which belongs to one of the world's oldest continuous artistic traditions, and finally fuse these networks and elements to create a complete new painting. We also propose a divide-and-conquer strategy to generate large images with limited GPU resources. In our end-to-end system, a large image becomes a traditional Chinese painting in minutes, fully automatically. The results show that our multimodal fusion framework works well and that AI methods perform well in traditional Chinese painting creation.
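To make the divide-and-conquer strategy concrete, the sketch below tiles a large input into overlapping patches, runs each patch through a generator network, and cross-fades the overlaps back together so the full image never has to fit on the GPU at once. This is a minimal PyTorch illustration, not the paper's exact procedure: the generator, tile size, overlap width, and linear blending are all assumptions, and the input is assumed to be a 1 x C x H x W float tensor at least one tile wide and tall.

    import torch

    def generate_large_image(generator, image, tile=512, overlap=64, device="cuda"):
        """Divide-and-conquer generation for limited GPU memory.

        Runs overlapping tiles of a large image tensor through
        `generator` (any image-to-image network that preserves size)
        and blends the overlaps with linear ramps to hide the seams.
        All sizes here are illustrative assumptions, not the paper's
        settings.
        """
        _, _, h, w = image.shape
        step = tile - overlap
        # Tile origins covering the image, with the last tile snapped
        # to the border so no pixels are missed.
        ys = sorted(set(list(range(0, h - tile, step)) + [h - tile]))
        xs = sorted(set(list(range(0, w - tile, step)) + [w - tile]))
        # 1-D ramps rising from ~0 to 1 over the overlap band; their
        # outer product is a per-tile blend mask that is never zero.
        ramp = torch.ones(tile)
        ramp[:overlap] = torch.linspace(0.0, 1.0, overlap + 1)[1:]
        ramp[-overlap:] = torch.linspace(1.0, 0.0, overlap + 1)[:-1]
        mask = (ramp[:, None] * ramp[None, :])[None, None]
        out = torch.zeros_like(image)
        weight = torch.zeros(1, 1, h, w)
        for y in ys:
            for x in xs:
                patch = image[:, :, y:y + tile, x:x + tile].to(device)
                with torch.no_grad():
                    styled = generator(patch).cpu()
                out[:, :, y:y + tile, x:x + tile] += styled * mask
                weight[:, :, y:y + tile, x:x + tile] += mask
        # Normalise by the accumulated blend weights.
        return out / weight

In the paper's setting, `generator` would stand in for one of the element-generation networks the authors select; the same tiling trick applies to any image-to-image model whose memory use grows with resolution.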



Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. U1536203, 61572493), IIE project (No. Y6Z0021102, No. Y7Z0241102) and CCF-Tencent Open Research Fund.

Author information

Correspondence to Tao Guo.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Luo, S., Liu, S., Han, J., Guo, T. (2018). Multimodal Fusion for Traditional Chinese Painting Generation. In: Hong, R., Cheng, W.H., Yamasaki, T., Wang, M., Ngo, C.W. (eds) Advances in Multimedia Information Processing – PCM 2018. Lecture Notes in Computer Science, vol 11166. Springer, Cham. https://doi.org/10.1007/978-3-030-00764-5_3


  • DOI: https://doi.org/10.1007/978-3-030-00764-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00763-8

  • Online ISBN: 978-3-030-00764-5

  • eBook Packages: Computer Science, Computer Science (R0)
