Object Recognition in 3D Point Cloud of Urban Street Scene

Babahajiani, Pouria; Fan, Lixin; Gabbouj, Moncef

doi:10.1007/978-3-319-16628-5_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9008)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

In this paper we present a novel street scene semantic recognition framework that takes advantage of 3D point clouds captured by a high-definition LiDAR laser scanner. An important problem in object recognition is the need for sufficient labeled training data to learn robust classifiers. We show how to significantly reduce the need for manually labeled training data by reducing scene complexity with unsupervised ground and building segmentation. Our system first automatically segments the ground points, because the ground connects almost all other objects and a connected-component-based algorithm is later used to oversegment the point clouds. Building facades are then detected using binary range image processing. The remaining points are grouped into voxels, which are then transformed into super voxels. Local 3D features extracted from the super voxels are classified by trained boosted decision trees and labeled with semantic classes, e.g. tree, pedestrian, car. The proposed method is evaluated both quantitatively and qualitatively on a challenging fixed-position Terrestrial Laser Scanning (TLS) Velodyne data set and on two Mobile Laser Scanning (MLS) databases, Paris-rue-Madame and NAVTEQ True. Robust scene parsing results are reported.
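
As a rough illustration of the ground removal and connected-component grouping mentioned in the abstract, the following Python sketch drops points below a height threshold, bins the rest into a voxel grid, and labels occupied voxels that touch each other. It is not the authors' implementation: the fixed height threshold `ground_z`, the `voxel_size`, the use of 26-connectivity, and all function names are assumptions chosen for illustration, and the paper's range-image facade detection, super voxels, and boosted decision trees are not covered.

```python
from collections import deque
from itertools import product

import numpy as np


def voxelize(points, voxel_size):
    """Return the unique occupied voxel indices and, per point, its voxel row."""
    idx = np.floor(points / voxel_size).astype(np.int64)
    voxels, inverse = np.unique(idx, axis=0, return_inverse=True)
    # reshape guards against NumPy versions that return a 2-D inverse
    return voxels, inverse.reshape(-1)


def connected_components(voxels):
    """Label occupied voxels by 26-connectivity using breadth-first search."""
    cells = [tuple(v) for v in voxels.tolist()]
    index_of = {cell: i for i, cell in enumerate(cells)}
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    labels = np.full(len(cells), -1, dtype=np.int64)
    current = 0
    for start in range(len(cells)):
        if labels[start] != -1:
            continue  # already reached from an earlier seed
        labels[start] = current
        queue = deque([start])
        while queue:
            x, y, z = cells[queue.popleft()]
            for dx, dy, dz in offsets:
                j = index_of.get((x + dx, y + dy, z + dz))
                if j is not None and labels[j] == -1:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


def segment_objects(points, ground_z=0.2, voxel_size=0.5):
    """Assumed simplification: treat everything below ground_z as ground."""
    above = points[points[:, 2] > ground_z]
    voxels, point_to_voxel = voxelize(above, voxel_size)
    voxel_labels = connected_components(voxels)
    return above, voxel_labels[point_to_voxel]
```

Calling `segment_objects(points)` on an `(N, 3)` array returns the non-ground points together with one cluster label per point. A flat height cut only makes sense for level terrain; it stands in here for the unsupervised ground segmentation the abstract describes.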

Author information

Authors and Affiliations

Nokia Research Center, Tampere, Finland
Pouria Babahajiani & Lixin Fan
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj

Corresponding author

Correspondence to Pouria Babahajiani.

Editor information

Editors and Affiliations

Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
C.V. Jawahar
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Shiguang Shan

About this paper

Cite this paper

Babahajiani, P., Fan, L., Gabbouj, M. (2015). Object Recognition in 3D Point Cloud of Urban Street Scene. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9008. Springer, Cham. https://doi.org/10.1007/978-3-319-16628-5_13

DOI: https://doi.org/10.1007/978-3-319-16628-5_13
Published: 12 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16627-8
Online ISBN: 978-3-319-16628-5
eBook Packages: Computer Science, Computer Science (R0)