PointNet Evaluation for On-Road Object Detection Using a Multi-resolution Conditioning

  • Jose Pamplona
  • Carlos Madrigal
  • Arturo de la Escalera
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11401)


On-road object detection is a central problem in the development of autonomous vehicles. Class diversity, pose changes, occlusions, and low resolution make the task challenging. Most object detection techniques based on RGB images are limited by their sensitivity to environmental lighting conditions; consequently, other sources of information have attracted interest. This paper proposes an on-road object detection method that uses 3D information acquired by a high-definition (HD) LiDAR sensor. We evaluate a neural network architecture based on PointNet on 3D objects captured at multiple resolutions. To this end, we propose a multi-resolution conditioning stage that improves the performance of the PointNet architecture on LiDAR data. Both training and evaluation use the KITTI dataset. For on-road object segmentation, our approach relies on low-computational-cost algorithms based on occupancy grid maps. The experiments show that the proposed method achieves better results than PointNet evaluated on a single resolution.
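The two ideas named in the abstract, occupancy-grid segmentation and conditioning clusters of differing point density for a fixed-size network input, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `occupancy_grid` and `condition_resolution`, the cell size, and the pad-by-repetition strategy are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def occupancy_grid(points, cell_size=0.5):
    """Bin (x, y, z) points into ground-plane cells keyed by (ix, iy).

    Assumption: a flat 2D grid over x and y; the cell size is illustrative.
    """
    grid = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size))
        grid[key].append((x, y, z))
    return grid


def condition_resolution(points, n_points, rng=None):
    """Return exactly n_points points from a cluster of any size.

    Assumption: dense clusters are randomly subsampled and sparse ones are
    padded by repeating existing points; the paper may do this differently.
    """
    if not points:
        raise ValueError("cannot resample an empty cluster")
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    if len(points) >= n_points:
        return rng.sample(points, n_points)
    extra = [rng.choice(points) for _ in range(n_points - len(points))]
    return list(points) + extra


# Toy cloud: two nearby points and one far away.
cloud = [(0.1, 0.2, 0.0), (0.3, 0.4, 1.0), (5.2, 5.1, 0.5)]
grid = occupancy_grid(cloud, cell_size=0.5)
fixed = condition_resolution(cloud, n_points=8)
```

In this toy example the two nearby points share a grid cell while the third falls in a separate one, and `fixed` always has the requested length regardless of how many points the original cluster contained.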


Keywords: Pedestrian detection; LiDAR; Deep learning; Resolution conditioning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jose Pamplona (1, corresponding author)
  • Carlos Madrigal (1)
  • Arturo de la Escalera (2)

  1. Artificial Vision and Photonics Lab, Instituto Tecnológico Metropolitano, Medellín, Colombia
  2. Intelligent Systems Lab, Universidad Carlos III de Madrid, Leganés, Spain
