Real-Time Multi-class Instance Segmentation with One-Time Deep Embedding Clustering

Chen, Yu-Chi; Chang, Chia-Yuan; Hsiao, Pei-Yung; Fu, Li-Chen

doi:10.1007/978-3-030-41404-7_16

Conference paper
First Online: 23 February 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Abstract

In recent years, instance segmentation has been regarded as an extension of object detection and semantic segmentation that provides pixel-level annotations for detected objects. Several instance segmentation approaches use an object detection network to generate bounding boxes and then segment each bounding box with a segmentation network. However, these approaches are slower because their framework consists of two independent networks. On the other hand, some clustering-based approaches transform each pixel into a unique representation and produce instance masks by postprocessing. Nevertheless, most clustering approaches have to cluster the instances of each class separately, which adds further time consumption. In this research, we propose a fast clustering method, called one-time clustering, that uses a single network and aims to reduce the time consumption of multi-class instance segmentation. Moreover, we present a class-sensitive loss function that allows the network to generate unique embeddings containing both class and instance information. With these informative embeddings, we can cluster them only once instead of clustering once per class as in other clustering approaches. Our approach is up to 6x faster than the state-of-the-art UPSNet [1], which appeared in CVPR 2019, and obtains about 25% lower AP on the Cityscapes dataset. It achieves significantly faster speed and good segmentation quality while maintaining an acceptable AP.
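The difference between per-class clustering and one-time clustering can be illustrated with a toy sketch. The abstract does not specify the clustering algorithm or the form of the class-sensitive loss, so everything below is an assumption made for illustration: `greedy_cluster` is a simple stand-in threshold clustering, the function names are hypothetical, and the 2-D embeddings are hand-written so that one axis separates classes and the other separates instances, mimicking what class-sensitive embeddings are meant to provide.

```python
import math


def greedy_cluster(points, radius):
    """Stand-in clustering (an assumption, not the paper's method):
    assign each point to the first center within `radius`,
    otherwise open a new cluster centered on that point."""
    centers, labels = [], []
    for p in points:
        for idx, c in enumerate(centers):
            if math.dist(p, c) <= radius:
                labels.append(idx)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels


def per_class_clustering(embeddings, class_ids, radius):
    """Baseline: run one clustering pass for every class present."""
    result = [None] * len(embeddings)
    passes = 0
    for cls in sorted(set(class_ids)):
        idxs = [i for i, c in enumerate(class_ids) if c == cls]
        labels = greedy_cluster([embeddings[i] for i in idxs], radius)
        for i, lab in zip(idxs, labels):
            result[i] = (cls, lab)
        passes += 1
    return result, passes


def one_time_clustering(embeddings, radius):
    """Single pass over all pixels; relies on embeddings that already
    keep different classes far apart."""
    return greedy_cluster(embeddings, radius), 1


# Hypothetical pixel embeddings: x separates classes, y separates instances.
embeddings = [(0.0, 0.1), (0.1, 0.0), (0.0, 5.0),
              (10.0, 0.0), (10.1, 0.1), (10.0, 5.1)]
class_ids = [0, 0, 0, 1, 1, 1]

baseline, baseline_passes = per_class_clustering(embeddings, class_ids, 1.0)
single, single_passes = one_time_clustering(embeddings, 1.0)
print(baseline_passes, single_passes)
print(single)
```

On this toy input both strategies recover the same four instances, but the baseline needs one pass per class while the one-time variant needs a single pass. The saving depends on the embeddings keeping classes apart, which is the role the abstract assigns to the class-sensitive loss.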

References

1. Xiong, Y., et al.: UPSNet: a unified panoptic segmentation network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8818–8826 (2019)
2. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
3. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
4. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
5. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2481–2495 (2017)
6. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
7. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
8. Huang, Z., Huang, L., Gong, Y., Huang, C., Wang, X.: Mask scoring R-CNN. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6409–6418 (2019)
9. Watanabe, T., Wolf, D.: Distance to center of mass encoding for instance segmentation. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 3825–3831. IEEE (2018)
10. Bai, M., Urtasun, R.: Deep watershed transform for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5221–5229 (2017)
11. De Brabandere, B., Neven, D., Van Gool, L.: Semantic instance segmentation with a discriminative loss function (2017)
12. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
14. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
15. Romera-Paredes, B., Torr, P.H.S.: Recurrent instance segmentation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 312–329. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_19
16. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–1780 (1997)
17. Ren, M., Zemel, R.S.: End-to-end instance segmentation with recurrent attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6656–6664 (2017)
18. Romera, E., Alvarez, J.M., Bergasa, L.M., Arroyo, R.: ERFNet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Trans. Intell. Transp. Syst. 19, 263–272 (2017)
19. Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9404–9413 (2019)
20. van den Brand, J., Ochs, M., Mester, R.: Instance-level segmentation of vehicles by deep contours. In: Chen, C.-S., Lu, J., Ma, K.-K. (eds.) ACCV 2016. LNCS, vol. 10116, pp. 477–492. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54407-6_32

Author information

Authors and Affiliations

National Taiwan University, Taipei, Taiwan, ROC
Yu-Chi Chen, Chia-Yuan Chang & Li-Chen Fu
National University of Kaohsiung, Kaohsiung, Taiwan, ROC
Pei-Yung Hsiao

Corresponding author

Correspondence to Li-Chen Fu.

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
Consiglio Nazionale delle Ricerche, ICAR, Naples, Italy
Gabriella Sanniti di Baja
Chinese Academy of Sciences, Beijing, China
Liang Wang
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

About this paper

Cite this paper

Chen, YC., Chang, CY., Hsiao, PY., Fu, LC. (2020). Real-Time Multi-class Instance Segmentation with One-Time Deep Embedding Clustering. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_16

DOI: https://doi.org/10.1007/978-3-030-41404-7_16
Published: 23 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer Science, Computer Science (R0)
