Adaptive Attention Model for Lidar Instance Segmentation

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11844)

Abstract

Detecting and categorizing object instances in Lidar scans is of critical importance for highly autonomous vehicles, which are expected to maneuver safely and swiftly through complex urban streets without the intervention of human drivers. In contrast to recent detection-based approaches [6, 10], we formulate the problem as point-wise segmentation and focus on improving the recognition of small objects, which is very challenging due to the low resolution of commercial Lidar systems. Specifically, we propose a novel end-to-end convolutional neural network (CNN) that encapsulates adaptive attention information and achieves instance segmentation by fusing multiple auxiliary tasks. We evaluated our algorithms on 2D projection data derived from the KITTI 3D object detection dataset [8] and achieved at least a 14.6% improvement in Intersection over Union (IoU), with a faster inference time (25.3 ms per Lidar scan), compared with state-of-the-art algorithms.
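
For readers unfamiliar with the metric, the following is a minimal sketch of how a per-class, point-wise Intersection over Union could be computed from predicted and ground-truth labels. The function name `pointwise_iou`, the integer class labels, and the toy data are all hypothetical; this is not the authors' evaluation code, and the abstract does not specify how their IoU is aggregated.

```python
def pointwise_iou(pred, target, cls):
    """Point-wise IoU for a single class.

    pred, target: equal-length sequences of per-point class labels
                  (hypothetical integer encoding).
    cls: the class label to score.
    Returns NaN when the class is absent from both sequences.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Points labelled `cls` in both prediction and ground truth.
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    # Points labelled `cls` in either prediction or ground truth.
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else float("nan")


# Toy example: six Lidar points, three classes (assumed labels 0, 1, 2).
pred = [0, 1, 1, 2, 1, 0]
target = [0, 1, 2, 2, 1, 1]
iou_class_1 = pointwise_iou(pred, target, 1)  # 2 shared points / 4 in union
```

Under this definition, an improvement in IoU means that a larger fraction of the points assigned to a class, by either the model or the annotation, are assigned to it by both.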

P. Xiong and X. Hao—Both authors contributed equally to this research.

Notes

  1. Society of Automotive Engineers.

References

  1. Anguelov, D., et al.: Discriminative learning of Markov random fields for segmentation of 3D scan data. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 169–176, June 2005. https://doi.org/10.1109/CVPR.2005.133

  2. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2011). https://doi.org/10.1109/TPAMI.2010.161

  3. Beltran, J., Guindel, C., Moreno, F.M., Cruzado, D., Garcia, F., de la Escalera, A.: BirdNet: a 3D object detection framework from LiDAR information. arXiv preprint arXiv:1805.01195, May 2018

  4. Bischke, B., Helber, P., Folz, J., Borth, D., Dengel, A.: Multi-task learning for segmentation of building footprints with deep neural networks. CoRR abs/1709.05932 (2017). http://arxiv.org/abs/1709.05932

  5. Chen, L., Hermans, A., Papandreou, G., Schroff, F., Wang, P., Adam, H.: MaskLab: instance segmentation by refining object detection with semantic and direction features. CoRR abs/1712.04837 (2017)

  6. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: IEEE CVPR, vol. 1, p. 3 (2017)

  7. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981). https://doi.org/10.1145/358669.358692

  8. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  9. Jiang, W., Ma, L., Jiang, Y., Liu, W., Zhang, T.: Recurrent fusion network for image captioning. CoRR abs/1807.09986 (2018)

  10. Ku, J., Mozifian, M., Lee, J., Harakeh, A., Waslander, S.: Joint 3D proposal generation and object detection from view aggregation. In: IROS (2018)

  11. Liang, X., Wei, Y., Shen, X., Yang, J., Lin, L., Yan, S.: Proposal-free network for instance-level object segmentation. CoRR abs/1509.02636 (2015)

  12. Lin, T., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object detection. In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017, pp. 2999–3007 (2017). https://doi.org/10.1109/ICCV.2017.324

  13. Luo, W., Yang, B., Urtasun, R.: Fast and furious: real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018

  14. Nguyen, A., Le, B.: 3D point cloud segmentation: a survey. In: RAM, pp. 225–230. IEEE (2013)

  15. Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3D object detection from RGB-D data. arXiv preprint arXiv:1711.08488 (2017)

  16. Shin, M., Oh, G., Kim, S., Seo, S.: Real-time and accurate segmentation of 3-D point clouds based on Gaussian process regression. IEEE Trans. Intell. Transp. Syst. 18(12), 3363–3377 (2017)

  17. Tarsha-Kurdi, F., Landes, T., Grussenmeyer, P.: Hough-transform and extended RANSAC algorithms for automatic detection of 3D building roof planes from LiDAR data. In: ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, Finland, vol. XXXVI, pp. 407–412, September 2007. https://halshs.archives-ouvertes.fr/halshs-00264843

  18. Uhrig, J., Cordts, M., Franke, U., Brox, T.: Pixel-level encoding and depth layering for instance-level semantic labeling. In: Rosenhahn, B., Andres, B. (eds.) GCPR 2016. LNCS, vol. 9796, pp. 14–25. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45886-1_2

  19. Wang, D.Z., Posner, I., Newman, P.: What could move? Finding cars, pedestrians and bicyclists in 3D laser data. In: ICRA, pp. 4038–4044. IEEE (2012)

  20. Wang, W., Yu, R., Huang, Q., Neumann, U.: SGPN: similarity group proposal network for 3D point cloud instance segmentation. In: CVPR (2018)

  21. Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. arXiv preprint arXiv:1710.07368 (2017)

  22. Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. CoRR abs/1710.07368 (2017). http://arxiv.org/abs/1710.07368

  23. Xu, Y., et al.: Gland instance segmentation using deep multichannel neural networks. CoRR abs/1611.06661 (2016)

  24. Yang, B., Luo, W., Urtasun, R.: PIXOR: real-time 3D object detection from point clouds. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018

Author information

Corresponding author

Correspondence to Peixi Xiong.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Xiong, P., Hao, X., Shao, Y., Yu, J. (2019). Adaptive Attention Model for Lidar Instance Segmentation. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_11

  • DOI: https://doi.org/10.1007/978-3-030-33720-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33719-3

  • Online ISBN: 978-3-030-33720-9

  • eBook Packages: Computer Science, Computer Science (R0)
