Instance-Level Segmentation of Vehicles by Deep Contours

  • Jan van den Brand
  • Matthias Ochs
  • Rudolf Mester
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10116)

Abstract

Recognizing individual object instances in a single monocular image remains an incompletely solved task. In this work, we propose a new approach for detecting and separating vehicles in the context of autonomous driving. Our method uses a fully convolutional network (FCN) both for semantic labeling and for estimating the boundary of each vehicle. A contour is in general a one-pixel-wide structure that a CNN cannot learn directly; our network addresses this by providing areas around the contours. Based on these areas, we separate the individual vehicle instances. In experiments on two challenging datasets (Cityscapes and KITTI), we achieve state-of-the-art performance despite subsampling by a factor of two, and our approach outperforms all recent works with respect to several evaluation scores.
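The idea of widening a thin contour into an area and then using that area to split a vehicle mask can be illustrated with a small sketch. The abstract does not specify how the separation step is implemented, so the code below substitutes a plain square dilation and 4-connected component labeling as an assumption; the function names `dilate` and `separate_instances` and the toy masks are hypothetical and not taken from the paper.

```python
from collections import deque


def dilate(mask, radius=1):
    """Widen a binary mask by `radius` pixels using a square neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def separate_instances(vehicle_mask, contour_area):
    """Label 4-connected vehicle regions after removing contour-area pixels.

    Assumption: this connected-component step stands in for the paper's
    actual separation procedure, which the abstract does not describe.
    """
    h, w = len(vehicle_mask), len(vehicle_mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if not vehicle_mask[y][x] or contour_area[y][x] or labels[y][x]:
                continue
            # Start a new instance and flood-fill it breadth-first.
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and vehicle_mask[ny][nx]
                            and not contour_area[ny][nx]
                            and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


# Toy example: one 3x9 "vehicle" blob that is really two touching cars,
# with a one-pixel-wide contour between them at column 4.
vehicle = [[1] * 9 for _ in range(3)]
thin = [[1 if x == 4 else 0 for x in range(9)] for _ in range(3)]
area = dilate(thin, radius=1)          # contour area covers columns 3 to 5
labels, count = separate_instances(vehicle, area)
print(count)
```

In this toy case the single semantic blob splits into two labeled regions, one on each side of the contour area. Pixels inside the contour area stay unlabeled here; assigning them to the nearest instance would be a further step that this sketch leaves out.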

Keywords

Markov Random Field, Convolutional Neural Network, Conditional Random Field, Individual Instance, Object Instance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jan van den Brand (1), email author
  • Matthias Ochs (1)
  • Rudolf Mester (1, 2)
  1. VSI Lab, Goethe University, Frankfurt am Main, Germany
  2. Computer Vision Laboratory, ISY, Linköping University, Linköping, Sweden
