Discriminative Learning with Latent Variables for Cluttered Indoor Scene Understanding

Wang, Huayan; Gould, Stephen; Koller, Daphne

doi:10.1007/978-3-642-15552-9_32

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6312)

Abstract

We address the problem of understanding an indoor scene from a single image by recovering the layouts of the room faces (floor, ceiling, walls) and the furniture. A major challenge is that most indoor scenes are cluttered by furniture and decorations whose appearance varies drastically across scenes and can hardly be modeled (or even hand-labeled) consistently. In this paper we tackle this problem by introducing latent variables to account for clutter, so that the observed image is jointly explained by the face and clutter layouts. Model parameters are learned in a maximum-margin formulation that is constrained by extra prior energy terms defining the role of the latent variables. Our approach accounts for and infers indoor clutter layouts without requiring hand-labeled clutter in the training set, yet it outperforms the state-of-the-art method of Hedau et al. [4], which requires clutter labels.
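The abstract does not state the learning objective, so the following is only a minimal sketch of a generic latent-variable max-margin loss in the style of the latent structural SVM work cited in the references (Yu and Joachims), not the paper's actual formulation. The feature table `FEATS`, the label names, the binary latent "clutter" variable, and the 0/1 loss `zero_one` are all hypothetical toy choices, and the prior energy terms mentioned in the abstract are omitted because their form is not given here.

```python
from itertools import product

# Hypothetical toy problem: two candidate face layouts, one binary latent
# variable standing in for "clutter absent / present".
LABELS = ["layout_a", "layout_b"]
LATENTS = [0, 1]

# Hypothetical joint features phi(x, y, h) for a single image x.
FEATS = {
    ("layout_a", 0): [1.0, 0.0],
    ("layout_a", 1): [1.0, 1.0],
    ("layout_b", 0): [0.0, 0.0],
    ("layout_b", 1): [0.0, 1.0],
}


def zero_one(y_true, y_pred):
    """Task loss Delta(y, y'): 0 if the layouts agree, 1 otherwise (assumption)."""
    return 0.0 if y_true == y_pred else 1.0


def score(w, phi):
    """Linear score w . phi."""
    return sum(wi * fi for wi, fi in zip(w, phi))


def latent_hinge_loss(w, feats, y_true, labels, latents, delta):
    """Generic latent structured hinge loss for one example.

    loss = max_{y', h'} [score(y', h') + delta(y, y')] - max_h score(y, h)
    """
    # Loss-augmented inference over every (label, latent) pair.
    augmented = max(
        score(w, feats[(y, h)]) + delta(y_true, y)
        for y, h in product(labels, latents)
    )
    # Best latent completion of the ground-truth label.
    completed = max(score(w, feats[(y_true, h)]) for h in latents)
    return max(0.0, augmented - completed)


if __name__ == "__main__":
    for w in ([1.0, 0.5], [0.0, 1.0]):
        loss = latent_hinge_loss(w, FEATS, "layout_a", LABELS, LATENTS, zero_one)
        print(w, loss)
```

In this toy setup the latent variable is never labeled: the loss only asks that the ground-truth layout, paired with its best latent completion, outscore every other layout by a margin, which mirrors the abstract's point that clutter need not be hand-labeled in the training set.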

References

1. Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Transactions on PAMI 24(5) (2002)
2. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Transactions on PAMI (to appear, 2010)
3. Gould, S., Fulton, R., Koller, D.: Decomposing a scene into geometric and semantically consistent regions. In: ICCV (2009)
4. Hedau, V., Hoiem, D., Forsyth, D.: Recovering the spatial layout of cluttered room. In: ICCV (2009)
5. Heitz, G., Koller, D.: Learning spatial context: Using stuff to find things. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 30–43. Springer, Heidelberg (2008)
6. Hoiem, D., Efros, A., Hebert, M.: Recovering surface layout from an image. IJCV 75(1) (2007)
7. Joachims, T., Finley, T., Yu, C.-N.: Cutting-Plane Training of Structural SVMs. Machine Learning 77(1), 27–59 (2009)
8. Rother, C.: A new approach to vanishing point detection in architectural environments. IVC 20 (2002)
9. Shotton, J., Winn, J., Rother, C., Criminisi, A.: TextonBoost for image understanding: multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV (2007)
10. Tsochantaridis, I., Joachims, T., Hofmann, T., Altun, Y., Singer, Y.: Large margin methods for structured and interdependent output variables. JMLR 6, 1453–1484 (2005)
11. Vedaldi, A., Zisserman, A.: Structured output regression for detection with partial occlusion. In: NIPS (2009)
12. Yu, C.-N., Joachims, T.: Learning structural SVMs with latent variable. In: ICML (2009)
13. Besag, J.: On the statistical analysis of dirty pictures (with discussions). Journal of the Royal Statistical Society, Series B 48, 259–302 (1986)

Author information

Authors and Affiliations

Computer Science Department, Stanford University, CA, USA
Huayan Wang & Daphne Koller
Electrical Engineering Department, Stanford University, CA, USA
Stephen Gould

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

About this paper

Cite this paper

Wang, H., Gould, S., Koller, D. (2010). Discriminative Learning with Latent Variables for Cluttered Indoor Scene Understanding. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15552-9_32

DOI: https://doi.org/10.1007/978-3-642-15552-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15551-2
Online ISBN: 978-3-642-15552-9
eBook Packages: Computer Science, Computer Science (R0)
