Abstract
Detecting and categorizing object instances in Lidar scans is critically important for highly autonomous vehicles, which are expected to maneuver safely and swiftly through complex urban streets without human intervention. In contrast to recent detection-based approaches [6, 10], we formulate the task as point-wise segmentation and focus on improving the recognition of small objects, which is very challenging because of the low resolution of commercial Lidar systems. Specifically, we propose a novel end-to-end convolutional neural network (CNN) that encapsulates adaptive attention information and achieves instance segmentation by fusing multiple auxiliary tasks. We evaluated our algorithms on 2D projection data derived from the KITTI 3D object detection dataset [8] and achieved at least a 14.6% improvement in Intersection over Union (IoU), with a faster inference time (25.3 ms per Lidar scan), compared with state-of-the-art algorithms.
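For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a per-class IoU can be computed from point-wise labels. It is an illustration only, not the evaluation code used in the paper: the function name `class_iou`, the integer label encoding, and the toy label sequences are all assumptions made for this example.

```python
from typing import Sequence


def class_iou(pred: Sequence[int], truth: Sequence[int], cls: int) -> float:
    """Intersection over Union of one class over per-point labels.

    pred and truth hold one integer class label per Lidar point
    (hypothetical encoding chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must label the same points")
    # Points labelled `cls` in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    # Points labelled `cls` in either the prediction or the ground truth.
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    # The class is absent from both label sets: IoU is undefined.
    return intersection / union if union else float("nan")


# Toy labels for six points (made-up data, not taken from KITTI).
truth = [0, 1, 1, 1, 0, 2]
pred = [0, 1, 1, 0, 1, 2]
print(class_iou(pred, truth, cls=1))  # 2 shared points / 4 points in the union = 0.5
```

Because the score divides by the union, a small object covered by only a few points loses a large fraction of its IoU for every mislabelled point, which is one reason small classes are hard to score well on.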
P. Xiong and X. Hao—Both authors contributed equally to this research.
Notes
1. Society of Automotive Engineers.
References
Anguelov, D., et al.: Discriminative learning of Markov random fields for segmentation of 3D scan data. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 169–176, June 2005. https://doi.org/10.1109/CVPR.2005.133
Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2011). https://doi.org/10.1109/TPAMI.2010.161
Beltran, J., Guindel, C., Moreno, F.M., Cruzado, D., Garcia, F., de la Escalera, A.: BirdNet: a 3D object detection framework from LiDAR information. arXiv preprint arXiv:1805.01195, May 2018
Bischke, B., Helber, P., Folz, J., Borth, D., Dengel, A.: Multi-task learning for segmentation of building footprints with deep neural networks. CoRR abs/1709.05932 (2017). http://arxiv.org/abs/1709.05932
Chen, L., Hermans, A., Papandreou, G., Schroff, F., Wang, P., Adam, H.: Masklab: instance segmentation by refining object detection with semantic and direction features. CoRR abs/1712.04837 (2017)
Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: IEEE CVPR, vol. 1, p. 3 (2017)
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981). https://doi.org/10.1145/358669.358692
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Jiang, W., Ma, L., Jiang, Y., Liu, W., Zhang, T.: Recurrent fusion network for image captioning. CoRR abs/1807.09986 (2018)
Ku, J., Mozifian, M., Lee, J., Harakeh, A., Waslander, S.: Joint 3D proposal generation and object detection from view aggregation. In: IROS (2018)
Liang, X., Wei, Y., Shen, X., Yang, J., Lin, L., Yan, S.: Proposal-free network for instance-level object segmentation. CoRR abs/1509.02636 (2015)
Lin, T., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object detection. In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017, pp. 2999–3007 (2017). https://doi.org/10.1109/ICCV.2017.324
Luo, W., Yang, B., Urtasun, R.: Fast and furious: real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Nguyen, A., Le, B.: 3D point cloud segmentation: a survey. In: RAM, pp. 225–230. IEEE (2013)
Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3D object detection from RGB-D data. arXiv preprint arXiv:1711.08488 (2017)
Shin, M., Oh, G., Kim, S., Seo, S.: Real-time and accurate segmentation of 3-D point clouds based on Gaussian process regression. IEEE Trans. Intell. Transp. Syst. 18(12), 3363–3377 (2017)
Tarsha-Kurdi, F., Landes, T., Grussenmeyer, P.: Hough-transform and extended RANSAC algorithms for automatic detection of 3D building roof planes from LiDAR data. In: ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, Finland, vol. XXXVI, pp. 407–412, September 2007. https://halshs.archives-ouvertes.fr/halshs-00264843
Uhrig, J., Cordts, M., Franke, U., Brox, T.: Pixel-level encoding and depth layering for instance-level semantic labeling. In: Rosenhahn, B., Andres, B. (eds.) GCPR 2016. LNCS, vol. 9796, pp. 14–25. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45886-1_2
Wang, D.Z., Posner, I., Newman, P.: What could move? Finding cars, pedestrians and bicyclists in 3D laser data. In: ICRA, pp. 4038–4044. IEEE (2012)
Wang, W., Yu, R., Huang, Q., Neumann, U.: SGPN: similarity group proposal network for 3D point cloud instance segmentation. In: CVPR (2018)
Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. arXiv preprint arXiv:1710.07368 (2017)
Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. CoRR abs/1710.07368 (2017). http://arxiv.org/abs/1710.07368
Xu, Y., et al.: Gland instance segmentation using deep multichannel neural networks. CoRR abs/1611.06661 (2016)
Yang, B., Luo, W., Urtasun, R.: PIXOR: real-time 3D object detection from point clouds. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiong, P., Hao, X., Shao, Y., Yu, J. (2019). Adaptive Attention Model for Lidar Instance Segmentation. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_11
DOI: https://doi.org/10.1007/978-3-030-33720-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33719-3
Online ISBN: 978-3-030-33720-9
eBook Packages: Computer Science (R0)