A Three-Layered Approach to Facade Parsing

  • Anđelo Martinović
  • Markus Mathias
  • Julien Weissenberg
  • Luc Van Gool
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7578)


We propose a novel three-layered approach for semantic segmentation of building facades. In the first layer, starting from an oversegmentation of a facade, we employ the recently introduced machine learning technique Recursive Neural Networks (RNN) to obtain a probabilistic interpretation of each segment. In the second layer, the initial labeling is augmented with information from specialized facade component detectors. The two sources of information are merged using a Markov Random Field. In the third layer, we introduce weak architectural knowledge, which forces the final reconstruction to be architecturally plausible and consistent. Rigorous tests performed on two existing datasets of building facades demonstrate that we significantly outperform the current state of the art, even when using outputs from earlier layers of the pipeline. We also show how the final output of the third layer can be used to create a procedural reconstruction.
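To make the second layer more concrete, the following is a minimal sketch of merging per-pixel class probabilities from a segment classifier with detector evidence in a grid-structured Markov Random Field. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `merge_labels_icm`, the Potts smoothness term, the weights `det_weight` and `smooth`, and the use of iterated conditional modes (ICM) as a simple stand-in solver.

```python
import math


def merge_labels_icm(seg_probs, det_probs, det_weight=1.0, smooth=0.5, sweeps=5):
    """Approximately minimise a Potts MRF energy with ICM (illustrative only).

    seg_probs, det_probs: H x W x K nested lists of per-pixel class
    probabilities (hypothetical inputs: segment classifier and detectors).
    Returns an H x W grid of integer labels.
    """
    eps = 1e-9  # avoids log(0)
    h, w, k = len(seg_probs), len(seg_probs[0]), len(seg_probs[0][0])

    def unary(y, x, c):
        # Negative log-likelihood of both evidence sources (assumed form).
        return (-math.log(seg_probs[y][x][c] + eps)
                - det_weight * math.log(det_probs[y][x][c] + eps))

    # Start from the label with the lowest unary cost at each pixel.
    labels = [[min(range(k), key=lambda c: unary(y, x, c)) for x in range(w)]
              for y in range(h)]

    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                def cost(c):
                    e = unary(y, x, c)
                    # Potts penalty for each 4-neighbour with a different label.
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != c:
                            e += smooth
                    return e

                best = min(range(k), key=cost)
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break  # reached a local minimum
    return labels
```

With uninformative detector scores, the smoothness term removes isolated label noise; a confident detection at a pixel can override both the classifier and the smoothing. ICM only finds a local minimum, so a graph-cut style solver would be the usual choice for real images.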


Keywords: Parse Tree, Window Detector, Shape Grammar, Semantic Vector, Building Facade



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Anđelo Martinović (1)
  • Markus Mathias (1)
  • Julien Weissenberg (2)
  • Luc Van Gool (1, 2)

  1. ESAT-PSI/VISICS, KU Leuven, Belgium
  2. Computer Vision Laboratory, ETH Zurich, Switzerland
