Geometry Driven Semantic Labeling of Indoor Scenes

Khan, Salman Hameed; Bennamoun, Mohammed; Sohel, Ferdous; Togneri, Roberto

doi:10.1007/978-3-319-10590-1_44

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8689)

Included in the following conference series: European Conference on Computer Vision

Abstract

We present a discriminative graphical model that integrates geometrical information from RGBD images in its unary, pairwise and higher-order components. We propose an improved geometry estimation scheme that is robust to erroneous sensor inputs. At the unary level, we combine appearance-based beliefs defined on pixels and planes using a hybrid decision fusion scheme. Our proposed location potential gives an improved representation of the planar classes. At the pairwise level, we learn a balanced combination of various boundaries to account for spatial discontinuities. Finally, we treat planar regions as higher-order cliques and use graph cuts to perform efficient inference. In our model-based formulation, we use structured learning to fine-tune the model parameters. We test our approach on two RGBD datasets and demonstrate significant improvements over state-of-the-art scene labeling techniques.
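
The three levels named in the abstract (unary, pairwise, higher order) can be illustrated with a generic energy function over a labeling. The following is a minimal sketch under stated assumptions, not the paper's formulation: the function name `crf_energy`, the Potts-style pairwise penalty, the truncated-linear clique penalty, and the parameters `gamma_max` and `truncation` are all hypothetical choices made for illustration. It only evaluates the energy of a given labeling and does not perform graph-cut inference or structured learning.

```python
import numpy as np

def crf_energy(labels, unary, edges, edge_weights, cliques,
               gamma_max=1.0, truncation=0.5):
    """Evaluate a generic unary + pairwise + higher-order energy.

    All names and the specific potential forms are illustrative
    assumptions; they are not taken from the paper.
    """
    labels = np.asarray(labels)
    unary = np.asarray(unary, dtype=float)
    edges = np.asarray(edges)
    edge_weights = np.asarray(edge_weights, dtype=float)

    # Unary term: cost of assigning each node its chosen label.
    e_unary = unary[np.arange(len(labels)), labels].sum()

    # Pairwise term: weighted Potts penalty, paid only where
    # neighbouring nodes take different labels.
    i, j = edges[:, 0], edges[:, 1]
    e_pair = (edge_weights * (labels[i] != labels[j])).sum()

    # Higher-order term: for each clique (e.g. a planar region),
    # a penalty that grows linearly with the number of nodes that
    # disagree with the clique's majority label, capped at gamma_max.
    e_high = 0.0
    for clique in cliques:
        counts = np.bincount(labels[clique])
        n_disagree = len(clique) - counts.max()
        q = truncation * len(clique)
        e_high += gamma_max * min(n_disagree / q, 1.0)

    return e_unary + e_pair + e_high


# Toy example: 4 nodes in a chain, 2 labels, one clique covering all nodes.
unary = [[0, 1], [0, 1], [1, 0], [0, 1]]
edges = [[0, 1], [1, 2], [2, 3]]
weights = [1.0, 1.0, 1.0]
cliques = [[0, 1, 2, 3]]

consistent = crf_energy([0, 0, 0, 0], unary, edges, weights, cliques)
mixed = crf_energy([0, 0, 1, 0], unary, edges, weights, cliques)
print(consistent, mixed)
```

In this toy setting the label-consistent assignment has the lower energy even though one node's unary cost prefers the other label, which is the kind of smoothing effect that pairwise and clique terms are meant to provide.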

Author information

Authors and Affiliations

School of CSSE, The University of Western Australia, 35 Stirling Highway, Crawley, WA, 6009, Australia
Salman Hameed Khan, Mohammed Bennamoun & Ferdous Sohel
School of EECE, The University of Western Australia, 35 Stirling Highway, Crawley, WA, 6009, Australia
Roberto Togneri

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
PSI, iMinds, KU Leuven, ESAT, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

Cite this paper

Khan, S.H., Bennamoun, M., Sohel, F., Togneri, R. (2014). Geometry Driven Semantic Labeling of Indoor Scenes. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8689. Springer, Cham. https://doi.org/10.1007/978-3-319-10590-1_44

DOI: https://doi.org/10.1007/978-3-319-10590-1_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10589-5
Online ISBN: 978-3-319-10590-1
eBook Packages: Computer Science, Computer Science (R0)
