Joint learning of visual and spatial features for edit propagation from a single image

  • Yan Gui
  • Guang Zeng
Original Article


In this paper, we regard edit propagation as a multi-class classification problem and solve it with a deep neural network (DNN). We design a shallow, fully convolutional DNN that can be trained end-to-end. As input, the DNN takes combinations of low-level visual features, extracted from the input image, and spatial features, computed by transforming user interactions, so that it efficiently performs joint learning of visual and spatial features. We train the DNN on many such combinations to build a DNN-based pixel-level classifier. The DNN also uses patch-by-patch training and whole-image estimation, which speeds up learning and inference. Finally, we improve the classification accuracy of the DNN by employing a fully connected conditional random field. Experimental results show that our method responds well to user interactions and generates precise results compared with state-of-the-art edit propagation approaches. Furthermore, we demonstrate our method on various applications.
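The joint input described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the low-level visual features are simply normalized RGB values and that the spatial features are per-class maps of the distance to the nearest user stroke; the abstract does not specify either choice. The function names `stroke_distance_maps`, `build_input`, and `sample_labeled_patches`, as well as the patch size, are hypothetical.

```python
import numpy as np

def stroke_distance_maps(strokes, num_classes):
    """Assumed spatial feature: one map per edit class holding the
    normalized Euclidean distance to the nearest stroke pixel of that class.
    strokes: (H, W) int array, 0 = unlabeled, k = stroke of class k."""
    h, w = strokes.shape
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)
    diag = np.hypot(h - 1, w - 1) or 1.0   # image diagonal, used to normalize
    maps = np.ones((h, w, num_classes))    # 1.0 = "no stroke of this class"
    for k in range(1, num_classes + 1):
        pts = np.argwhere(strokes == k).astype(np.float64)
        if len(pts) == 0:
            continue
        # Brute-force nearest-stroke distance; fine for a small example.
        diff = grid[:, None, :] - pts[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
        maps[:, :, k - 1] = dist.reshape(h, w) / diag
    return maps

def build_input(image, strokes, num_classes):
    """Concatenate visual (RGB) and spatial (stroke distance) channels."""
    visual = image.astype(np.float64) / 255.0
    spatial = stroke_distance_maps(strokes, num_classes)
    return np.concatenate([visual, spatial], axis=2)

def sample_labeled_patches(features, strokes, patch=9):
    """Cut one training patch around every stroke pixel (patch-by-patch
    training); the label is the zero-based class of the center pixel."""
    r = patch // 2
    padded = np.pad(features, ((r, r), (r, r), (0, 0)), mode="reflect")
    coords = np.argwhere(strokes > 0)
    if len(coords) == 0:
        raise ValueError("no user strokes to sample from")
    patches = np.stack([padded[y:y + patch, x:x + patch] for y, x in coords])
    labels = strokes[strokes > 0] - 1
    return patches, labels
```

Under these assumptions, a pixel-level classifier would be trained on the sampled patches and then applied to the whole feature map at once. The conditional random field refinement step is not shown.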


Image editing · Edit propagation · Deep neural network · Fully connected conditional random field



We would like to thank Prof. Yiyu Cai and Dr. Zhifeng Xie for proofreading our paper. We would also like to thank the reviewers for their valuable comments. This study was funded by the National Natural Science Foundations of P. R. China (Grant Nos. 61402053; 61602059; 61772087; 61802031) and the Scientific Research Fund of Education Department of Hunan Province (Grant Nos. 16C0046; 16A008).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

Supplementary material 1: 371_2019_1633_MOESM1_ESM.pdf (PDF, 568 KB)
Supplementary material 2: 371_2019_1633_MOESM2_ESM.pdf (PDF, 710 KB)
Supplementary material 3: 371_2019_1633_MOESM3_ESM.pdf (PDF, 2543 KB)
Supplementary material 4: 371_2019_1633_MOESM4_ESM.pdf (PDF, 263 KB)



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, People’s Republic of China
  2. Changsha University of Science and Technology, Changsha, People’s Republic of China
