Real-Time Multi-class Instance Segmentation with One-Time Deep Embedding Clustering

  • Yu-Chi Chen
  • Chia-Yuan Chang
  • Pei-Yung Hsiao
  • Li-Chen Fu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12046)

Abstract

In recent years, instance segmentation has been treated as an extension of object detection and semantic segmentation that provides pixel-level annotations for detected objects. Several instance segmentation approaches use an object detection network to generate bounding boxes and then segment each box with a segmentation network. However, these approaches are slow because their framework consists of two independent networks. Clustering-based approaches, on the other hand, transform each pixel into a unique representation and produce instance masks by postprocessing. Nevertheless, most of them have to cluster the instances of each class separately, which adds further time consumption. In this research, we propose a fast clustering method with a single network, called one-time clustering, that aims to reduce the time consumption of multi-class instance segmentation. Moreover, we present a class-sensitive loss function that allows the network to generate unique embeddings containing both class and instance information. With these informative embeddings, we can cluster all pixels only once instead of clustering once per class as other clustering approaches do. Our approach is up to 6x faster than the state-of-the-art UPSNet [1], which appeared in CVPR 2019, at the cost of about 25% lower AP on the Cityscapes dataset. It thus achieves significantly faster speed and great segmentation quality while maintaining an acceptable AP.
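
The difference between per-class clustering and a single clustering pass can be illustrated with a small sketch. The abstract does not specify the clustering algorithm or the form of the class-sensitive loss, so everything below is an assumption made for illustration: the helper names (`greedy_cluster`, `per_class_cluster`, `same_partition`), the radius-threshold clustering rule, and the synthetic 2-D embeddings are hypothetical and are not the authors' implementation. The sketch only shows that when embeddings of different instances are well separated across classes as well as within a class, one pass over all pixels yields the same grouping as one pass per class.

```python
import numpy as np

def greedy_cluster(emb, radius):
    """Greedy radius-threshold clustering (an assumed stand-in, not the paper's method)."""
    labels = np.full(len(emb), -1, dtype=int)
    next_id = 0
    while True:
        unassigned = np.flatnonzero(labels == -1)
        if unassigned.size == 0:
            break
        pts = emb[unassigned]
        # Start from the first unassigned embedding and refine to the local mean.
        near = np.linalg.norm(pts - pts[0], axis=1) < radius
        center = pts[near].mean(axis=0)
        near = np.linalg.norm(pts - center, axis=1) < radius
        near[0] = True  # always assign the seed so the loop makes progress
        labels[unassigned[near]] = next_id
        next_id += 1
    return labels

def per_class_cluster(emb, cls, radius):
    """Baseline: run the clustering separately for every semantic class."""
    labels = np.full(len(emb), -1, dtype=int)
    offset = 0
    for c in np.unique(cls):
        idx = np.flatnonzero(cls == c)
        sub = greedy_cluster(emb[idx], radius)
        labels[idx] = sub + offset
        offset += sub.max() + 1
    return labels

def same_partition(a, b):
    """True if two labelings group the points identically (up to renaming)."""
    pairs = set(zip(a.tolist(), b.tolist()))
    return len(pairs) == len(set(a.tolist())) == len(set(b.tolist()))

# Synthetic embeddings: four instances (two per class), 50 pixels each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [0.0, 5.0], [5.0, 0.0], [5.0, 5.0]])
inst = np.repeat(np.arange(4), 50)      # ground-truth instance id per pixel
cls = np.array([0, 0, 1, 1])[inst]      # ground-truth class id per pixel
emb = centers[inst] + rng.normal(scale=0.1, size=(len(inst), 2))

one_pass = greedy_cluster(emb, radius=1.0)             # single pass over all pixels
per_class = per_class_cluster(emb, cls, radius=1.0)    # one pass per class
print(len(np.unique(one_pass)), same_partition(one_pass, per_class))
```

In this toy setting the single pass needs no class labels at clustering time, which is the property the abstract attributes to class-sensitive embeddings; how the class of each cluster is recovered is not covered by this sketch.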

Keywords

  • Deep learning
  • Real-time instance segmentation
  • Multi-class instance segmentation
  • One-time clustering

References

  1. Xiong, Y., et al.: UPSNet: a unified panoptic segmentation network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8818–8826 (2019)
  2. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
  3. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
  4. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
  5. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2481–2495 (2017)
  6. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
  7. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
  8. Huang, Z., Huang, L., Gong, Y., Huang, C., Wang, X.: Mask scoring R-CNN. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6409–6418 (2019)
  9. Watanabe, T., Wolf, D.: Distance to center of mass encoding for instance segmentation. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 3825–3831. IEEE (2018)
  10. Bai, M., Urtasun, R.: Deep watershed transform for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5221–5229 (2017)
  11. De Brabandere, B., Neven, D., Van Gool, L.: Semantic instance segmentation with a discriminative loss function (2017)
  12. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)
  13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
  14. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
  15. Romera-Paredes, B., Torr, P.H.S.: Recurrent instance segmentation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 312–329. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_19
  16. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–1780 (1997)
  17. Ren, M., Zemel, R.S.: End-to-end instance segmentation with recurrent attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6656–6664 (2017)
  18. Romera, E., Alvarez, J.M., Bergasa, L.M., Arroyo, R.: ERFNet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Trans. Intell. Transp. Syst. 19, 263–272 (2017)
  19. Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9404–9413 (2019)
  20. van den Brand, J., Ochs, M., Mester, R.: Instance-level segmentation of vehicles by deep contours. In: Chen, C.-S., Lu, J., Ma, K.-K. (eds.) ACCV 2016. LNCS, vol. 10116, pp. 477–492. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54407-6_32

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Yu-Chi Chen (1)
  • Chia-Yuan Chang (1)
  • Pei-Yung Hsiao (2)
  • Li-Chen Fu (1, corresponding author)
  1. National Taiwan University, Taipei, Taiwan, ROC
  2. National University of Kaohsiung, Kaohsiung, Taiwan, ROC