Part-Level Sketch Segmentation and Labeling Using Dual-CNN

  • Xianyi Zhu
  • Yi Xiao (corresponding author)
  • Yan Zheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)


Part-level sketch segmentation and labeling refers to segmenting an object sketch into semantic component parts. It is a hard task because sketches carry far fewer features than natural images. Inspired by neural networks used in sketch classification, which show that network performance is significantly affected by kernel size, we propose a dual-convolutional neural network (dual-CNN) method for automatic sketch segmentation and labeling. The dual-CNN model contains two CNNs: one with large convolutional kernels to process long sketches, the other with small kernels to process short ones. Both CNNs have three convolutional layers and three fully connected layers, and their configurations are identical except for the first convolutional layer. To further improve performance, we model position and orientation as a triple-channel input to our networks by fusing the minimal oriented rectangle bounding boxes (MORBB) of a stroke and its host sketch as masks. Extensive experimental results validate our method and demonstrate that it outperforms the state of the art.
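The abstract does not spell out how a MORBB is computed or rasterized into a mask. As a rough illustration only, the sketch below assumes that a MORBB is the minimum-area rectangle enclosing a stroke's sample points, and finds it by testing the orientation of every convex hull edge. The function names `convex_hull` and `min_area_oriented_bbox` are hypothetical and are not taken from the paper.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def min_area_oriented_bbox(points):
    """Return (area, angle, width, height) of a minimum-area enclosing rectangle.

    Assumption: this stands in for the MORBB of a stroke given as (x, y) samples.
    A minimum-area enclosing rectangle has a side collinear with a hull edge,
    so it is enough to try each hull edge direction. `angle` is in radians.
    """
    hull = convex_hull(points)
    if not hull:
        raise ValueError("at least one point is required")
    if len(hull) == 1:
        return (0.0, 0.0, 0.0, 0.0)

    best = None
    n = len(hull)
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Rotate the hull by -theta so the current edge lies along the x-axis.
        xs = [x * c + y * s for x, y in hull]
        ys = [-x * s + y * c for x, y in hull]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        area = width * height
        if best is None or area < best[0]:
            best = (area, theta, width, height)
    return best
```

For a diamond-shaped stroke with sample points at (1, 0), (0, 1), (-1, 0), and (0, -1), the oriented box has an area of about 2, whereas the axis-aligned box has an area of 4. An oriented box therefore fits a tilted stroke more tightly, and its angle gives a direct orientation cue.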


Keywords: Part-level sketch segmentation, Sketch labeling, Stroke classification, Dual convolutional neural networks



This work is supported by the National Key Research & Development Program of China (Grant No. 2018YFB0203904), the NSFC of the PRC (Grant Nos. 61872137, 61502158, 61803150), the Hunan NSF (Grant Nos. 2017JJ3042, 2018JJ3067), and the China Postdoctoral Foundation (Grant No. 2016M590740).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, People’s Republic of China
  2. College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
