Adaptive Attention Model for Lidar Instance Segmentation

Xiong, Peixi; Hao, Xuetao; Shao, Yunming; Yu, Jerry

doi:10.1007/978-3-030-33720-9_11

Conference paper
First Online: 21 October 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11844)

Abstract

Detecting and categorizing object instances in Lidar scans is of critical importance for highly autonomous vehicles, which are expected to maneuver safely and swiftly through complex urban streets without the intervention of human drivers. In contrast to recent detection-based approaches [6, 10], we formulate the problem as point-wise segmentation and focus on improving the recognition of small objects, which is very challenging because of the low resolution of commercial Lidar systems. Specifically, we propose a novel end-to-end convolutional neural network (CNN) that encapsulates adaptive attention information and achieves instance segmentation by fusing multiple auxiliary tasks. We evaluated our algorithms on 2D projection data derived from the KITTI 3D object detection dataset [8] and achieved at least a 14.6% improvement in Intersection over Union (IoU), with a faster inference time (25.3 ms per Lidar scan), compared with state-of-the-art algorithms.
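For readers unfamiliar with the metric, the Intersection over Union reported in the abstract can be illustrated with a minimal sketch over point-wise labels. The function name `class_iou`, the label encoding, and the six-point example are hypothetical and chosen only for illustration; the paper's exact evaluation protocol is not described on this page.

```python
from typing import Sequence


def class_iou(pred: Sequence[int], truth: Sequence[int], cls: int) -> float:
    """IoU for one class over point-wise labels: |P and G| / |P or G|."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must label the same points")
    # Points assigned to the class by both the prediction and the ground truth.
    intersection = sum(1 for p, g in zip(pred, truth) if p == cls and g == cls)
    # Points assigned to the class by either the prediction or the ground truth.
    union = sum(1 for p, g in zip(pred, truth) if p == cls or g == cls)
    # Assumed convention: a class absent from both inputs scores 0.0.
    return intersection / union if union else 0.0


# Hypothetical labels for six Lidar points (0 = background, 1 = car, 2 = pedestrian).
truth = [0, 1, 1, 1, 2, 2]
pred = [1, 1, 1, 0, 2, 2]

car_iou = class_iou(pred, truth, cls=1)
pedestrian_iou = class_iou(pred, truth, cls=2)
```

In this toy case two of the four points labelled car by either side agree, so the car IoU is 0.5, while both pedestrian points match. Small objects cover few points, so a single mislabelled point moves their IoU sharply, which is one reason the metric is demanding for the small-object setting the abstract emphasizes.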

P. Xiong and X. Hao contributed equally to this research.

Notes

1. Society of Automotive Engineers.

References

1. Anguelov, D., et al.: Discriminative learning of Markov random fields for segmentation of 3D scan data. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 169–176, June 2005. https://doi.org/10.1109/CVPR.2005.133
2. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2011). https://doi.org/10.1109/TPAMI.2010.161
3. Beltran, J., Guindel, C., Moreno, F.M., Cruzado, D., Garcia, F., de la Escalera, A.: BirdNet: a 3D object detection framework from LiDAR information. arXiv preprint arXiv:1805.01195, May 2018
4. Bischke, B., Helber, P., Folz, J., Borth, D., Dengel, A.: Multi-task learning for segmentation of building footprints with deep neural networks. CoRR abs/1709.05932 (2017). http://arxiv.org/abs/1709.05932
5. Chen, L., Hermans, A., Papandreou, G., Schroff, F., Wang, P., Adam, H.: Masklab: instance segmentation by refining object detection with semantic and direction features. CoRR abs/1712.04837 (2017)
6. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: IEEE CVPR, vol. 1, p. 3 (2017)
7. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981). https://doi.org/10.1145/358669.358692
8. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
9. Jiang, W., Ma, L., Jiang, Y., Liu, W., Zhang, T.: Recurrent fusion network for image captioning. CoRR abs/1807.09986 (2018)
10. Ku, J., Mozifian, M., Lee, J., Harakeh, A., Waslander, S.: Joint 3D proposal generation and object detection from view aggregation. In: IROS (2018)
11. Liang, X., Wei, Y., Shen, X., Yang, J., Lin, L., Yan, S.: Proposal-free network for instance-level object segmentation. CoRR abs/1509.02636 (2015)
12. Lin, T., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object detection. In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017, pp. 2999–3007 (2017). https://doi.org/10.1109/ICCV.2017.324
13. Luo, W., Yang, B., Urtasun, R.: Fast and furious: real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
14. Nguyen, A., Le, B.: 3D point cloud segmentation: a survey. In: RAM, pp. 225–230. IEEE (2013)
15. Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3D object detection from RGB-D data. arXiv preprint arXiv:1711.08488 (2017)
16. Shin, M., Oh, G., Kim, S., Seo, S.: Real-time and accurate segmentation of 3-D point clouds based on Gaussian process regression. IEEE Trans. Intell. Transp. Syst. 18(12), 3363–3377 (2017)
17. Tarsha-Kurdi, F., Landes, T., Grussenmeyer, P.: Hough-transform and extended RANSAC algorithms for automatic detection of 3D building roof planes from LiDAR data. In: ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, Finland, vol. XXXVI, pp. 407–412, September 2007. https://halshs.archives-ouvertes.fr/halshs-00264843
18. Uhrig, J., Cordts, M., Franke, U., Brox, T.: Pixel-level encoding and depth layering for instance-level semantic labeling. In: Rosenhahn, B., Andres, B. (eds.) GCPR 2016. LNCS, vol. 9796, pp. 14–25. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45886-1_2
19. Wang, D.Z., Posner, I., Newman, P.: What could move? Finding cars, pedestrians and bicyclists in 3D laser data. In: ICRA, pp. 4038–4044. IEEE (2012)
20. Wang, W., Yu, R., Huang, Q., Neumann, U.: SGPN: similarity group proposal network for 3D point cloud instance segmentation. In: CVPR (2018)
21. Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. arXiv preprint arXiv:1710.07368 (2017)
22. Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. CoRR abs/1710.07368 (2017). http://arxiv.org/abs/1710.07368
23. Xu, Y., et al.: Gland instance segmentation using deep multichannel neural networks. CoRR abs/1611.06661 (2016)
24. Yang, B., Luo, W., Urtasun, R.: PIXOR: real-time 3D object detection from point clouds. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018

Author information

Authors and Affiliations

Northwestern University, Evanston, IL, 60208, USA
Peixi Xiong
University of Southern California, Los Angeles, CA, 90089, USA
Xuetao Hao
SAIC Innovation Center, San Jose, CA, 95134, USA
Yunming Shao & Jerry Yu

Corresponding author

Correspondence to Peixi Xiong.

Editor information

Editors and Affiliations

University of Nevada, Reno, NV, USA
George Bebis
NASA Ames Research Center, Moffett Field, CA, USA
Richard Boyle
University of Nevada, Reno, NV, USA
Bahram Parvin
Desert Research Institute, Reno, NV, USA
Darko Koracin
Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Daniela Ushizima
Latent AI, Palo Alto, CA, USA
Sek Chai
Texas A&M University, College Station, TX, USA
Shinjiro Sueda
Louisiana State University, Baton Rouge, LA, USA
Xin Lin
University of North Carolina at Charlotte, Charlotte, NC, USA
Aidong Lu
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Daniel Thalmann
Notre Dame University, Notre Dame, IN, USA
Chaoli Wang
Bosch Research North America, Palo Alto, CA, USA
Panpan Xu

About this paper

Cite this paper

Xiong, P., Hao, X., Shao, Y., Yu, J. (2019). Adaptive Attention Model for Lidar Instance Segmentation. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_11

DOI: https://doi.org/10.1007/978-3-030-33720-9_11
Published: 21 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33719-3
Online ISBN: 978-3-030-33720-9
eBook Packages: Computer Science, Computer Science (R0)
