Abstract
In recent years, instance segmentation research has been considered as an extension of object detection and semantic segmentation, which can provide pixel-level annotations on detected objects. Several approaches for instance segmentation exploit object detection network to generate bounding box and segment each bounding box with segmentation network. However, these approaches need more time consumption due to two independent networks as their framework. On the other hand, some approaches based on clustering transform each pixel into unique representation and produce instance mask by postprocessing. Nevertheless, most clustering approaches have to cluster all instances of each class individually, which contribute to additional time consumption. In this research, we propose a fast clustering method called one-time clustering with single network aiming at reducing time consumption on multi-class instance segmentation. Moreover, we present a class-sensitive loss function that allows the network to generate unique embedding which contains class and instance information. With the informative embeddings, we can cluster them only once instead of clustering for each class in other clustering approaches. Our approach is up to 6x faster than the state-of-the-art UPSNet [1], which appeared in CVPR 2019, and get about 25% lower AP performance on Cityscape dataset. It achieves significantly faster speed and great segmentation quality while having an acceptable AP performance.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Xiong, Y., et al.: UPSNet: a unified panoptic segmentation network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8818–8826 (2019)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2481–2495 (2017)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
Huang, Z., Huang, L., Gong, Y., Huang, C., Wang, X.: Mask scoring R-CNN. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6409–6418 (2019)
Watanabe, T., Wolf, D.: Distance to center of mass encoding for instance segmentation. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 3825–3831. IEEE (2018)
Bai, M., Urtasun, R.: Deep watershed transform for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5221–5229 (2017)
De Brabandere, B., Neven, D., Van Gool, L.: Semantic instance segmentation with a discriminative loss function (2017)
Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
Romera-Paredes, B., Torr, P.H.S.: Recurrent instance segmentation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 312–329. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_19
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–1780 (1997)
Ren, M., Zemel, R.S.: End-to-end instance segmentation with recurrent attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6656–6664 (2017)
Romera, E., Alvarez, J.M., Bergasa, L.M., Arroyo, R.: ERFNet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Trans. Intell. Transp. Syst. 19, 263–272 (2017)
Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9404–9413 (2019)
van den Brand, J., Ochs, M., Mester, R.: Instance-level segmentation of vehicles by deep contours. In: Chen, C.-S., Lu, J., Ma, K.-K. (eds.) ACCV 2016. LNCS, vol. 10116, pp. 477–492. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54407-6_32
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, YC., Chang, CY., Hsiao, PY., Fu, LC. (2020). Real-Time Multi-class Instance Segmentation with One-Time Deep Embedding Clustering. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science(), vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_16
Download citation
DOI: https://doi.org/10.1007/978-3-030-41404-7_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer ScienceComputer Science (R0)