Abstract
Creativity is a fundamental feature of human intelligence and a challenge for artificial intelligence (AI). In recent years, AI has made tremendous progress on many single tasks with single models, such as classification, detection, and parsing. As this development continues, AI is increasingly applied to more complex tasks, multitasking for example, and research on multimodal fusion has naturally become a new trend. In this paper, we propose a multimodal fusion framework and system for generating traditional Chinese paintings. We select suitable existing networks to generate the different elements of this art form, one of the world's oldest continuous artistic traditions, and then fuse these networks and elements to create a complete new painting. Meanwhile, we propose a divide-and-conquer strategy to generate large images with limited GPU resources. In our end-to-end system, a large input image is automatically turned into a traditional Chinese painting within minutes. The results show that our multimodal fusion framework works well and that AI methods perform well in traditional Chinese painting creation.
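The abstract only names the divide-and-conquer strategy, but the general idea of such tile-based processing can be illustrated. Below is a minimal sketch, assuming overlapping fixed-size tiles that each fit in GPU memory and a placeholder stylize_tile function standing in for one of the selected generation networks; the tile size, overlap, and all function names are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Hypothetical per-tile generator: stands in for one of the selected
# networks (e.g. a style-transfer model). Identity here so the sketch runs.
def stylize_tile(tile: np.ndarray) -> np.ndarray:
    return tile.astype(np.float64)

def stylize_large_image(image: np.ndarray, tile: int = 512, overlap: int = 64) -> np.ndarray:
    """Divide-and-conquer: run the model on overlapping tiles that fit in
    GPU memory, then average the overlaps to hide seams when stitching."""
    assert 0 <= overlap < tile
    h, w, _ = image.shape
    out = np.zeros(image.shape, dtype=np.float64)
    weight = np.zeros((h, w, 1), dtype=np.float64)  # per-pixel coverage count
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            y1, x1 = min(y + tile, h), min(x + tile, w)
            out[y:y1, x:x1] += stylize_tile(image[y:y1, x:x1])
            weight[y:y1, x:x1] += 1.0
    return (out / weight).astype(image.dtype)

# Usage: a 2048x2048 RGB image processed in 512x512 tiles.
painting = stylize_large_image(np.zeros((2048, 2048, 3), dtype=np.uint8))
```

Averaging the overlapping regions is one simple way to suppress seams between tiles; the actual system may use a different blending or fusion scheme.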
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. U1536203, 61572493), IIE project (No. Y6Z0021102, No. Y7Z0241102) and CCF-Tencent Open Research Fund.
Cite this paper
Luo, S., Liu, S., Han, J., Guo, T.: Multimodal fusion for traditional Chinese painting generation. In: Hong, R., Cheng, W.-H., Yamasaki, T., Wang, M., Ngo, C.-W. (eds.) Advances in Multimedia Information Processing – PCM 2018. Lecture Notes in Computer Science, vol. 11166. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00764-5_3