Locally controllable neural style transfer on mobile devices

Abstract

Mobile expressive rendering has gained increasing popularity among users seeking casual creativity through image stylization, and it supports the development of mobile artists as a new user group. In particular, neural style transfer has advanced as a core technology for emulating the characteristics of manifold artistic styles. However, when it comes to creative expression, the technology still faces inherent limitations in providing low-level controls for localized image stylization. In this work, we first propose a problem characterization of interactive style transfer that represents a trade-off between visual quality, run-time performance, and user control. We then present MaeSTrO, a mobile app for the orchestration of neural style transfer techniques using iterative, multi-style generative, and adaptive neural networks that can be locally controlled through on-screen painting metaphors. To this end, we enhance state-of-the-art neural style transfer techniques with mask-based loss terms that can be interactively parameterized through a generalized user interface to facilitate a creative and localized editing process. We report on a usability study and an online survey that demonstrate the ability of our app to transfer styles with improved semantic plausibility.
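
As a rough illustration of what a mask-based loss term can look like, the sketch below restricts Gram-matrix style statistics to a painted image region. It is a minimal pure-Python sketch under stated assumptions and not the formulation used in MaeSTrO: the function names (`masked_gram`, `masked_style_loss`, `total_style_loss`), the normalization by the mask sum, and the per-region weights are hypothetical choices made for illustration only.

```python
def masked_gram(features, mask):
    """Gram matrix of feature channels weighted by a spatial mask.

    features: list of C channels, each a flat list of N activations.
    mask:     flat list of N weights in [0, 1], e.g. painted by a user.
    The normalization by the mask sum is an assumption of this sketch.
    """
    weight = sum(mask)
    if weight == 0:
        raise ValueError("mask selects no pixels")
    # Suppress activations outside the masked region.
    masked = [[f * m for f, m in zip(channel, mask)] for channel in features]
    # Channel-by-channel inner products over all spatial positions.
    return [
        [sum(a * b for a, b in zip(ci, cj)) / weight for cj in masked]
        for ci in masked
    ]


def masked_style_loss(output_feats, style_feats, output_mask, style_mask):
    """Squared difference between the masked Gram matrices of two images."""
    g_out = masked_gram(output_feats, output_mask)
    g_sty = masked_gram(style_feats, style_mask)
    return sum(
        (o - s) ** 2
        for row_o, row_s in zip(g_out, g_sty)
        for o, s in zip(row_o, row_s)
    )


def total_style_loss(output_feats, regions):
    """Weighted sum of masked style losses, one term per region.

    regions: list of (style_feats, output_mask, style_mask, weight) tuples
    (a hypothetical structure; each tuple ties one style to one region).
    """
    return sum(
        w * masked_style_loss(output_feats, sf, om, sm)
        for sf, om, sm, w in regions
    )
```

In such a setup, each brush stroke would update one region's output mask or weight, and the optimization would be re-run with the changed terms, so that different styles apply to different parts of the image.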

Notes

  1. Mainly recruited through volunteers on r/samplesize.

  2. http://www.mut1ny.com/face-headsegmentation-dataset.

Acknowledgements

We would like to thank the anonymous reviewers for their valuable feedback. This work was funded by the Federal Ministry of Education and Research (BMBF), Germany, for the AVA project 01IS15041.

Author information

Corresponding author

Correspondence to Max Reimann.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (mp4 129663 KB)

Supplementary material 1 (pdf 755 KB)

About this article

Cite this article

Reimann, M., Klingbeil, M., Pasewaldt, S. et al. Locally controllable neural style transfer on mobile devices. Vis Comput 35, 1531–1547 (2019). https://doi.org/10.1007/s00371-019-01654-1

Keywords

  • Non-photorealistic rendering
  • Style transfer
  • Neural networks
  • Mobile devices
  • Interactive control
  • Expressive rendering