SuperParsing: Scalable Nonparametric Image Parsing with Superpixels

  • Joseph Tighe
  • Svetlana Lazebnik
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6315)


This paper presents a simple and effective nonparametric approach to the problem of image parsing, or labeling image regions (in our case, superpixels produced by bottom-up segmentation) with their categories. This approach requires no training, and it can easily scale to datasets with tens of thousands of images and hundreds of labels. It works by scene-level matching with global image descriptors, followed by superpixel-level matching with local features and efficient Markov random field (MRF) optimization for incorporating neighborhood context. Our MRF setup can also compute a simultaneous labeling of image regions into semantic classes (e.g., tree, building, car) and geometric classes (sky, vertical, ground). Our system outperforms the state-of-the-art nonparametric method based on SIFT Flow on a dataset of 2,688 images and 33 labels. In addition, we report per-pixel rates on a larger dataset of 15,150 images and 170 labels. To our knowledge, this is the first complete evaluation of image parsing on a dataset of this size, and it establishes a new benchmark for the problem.
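The three stages named in the abstract (scene-level matching, superpixel-level matching, MRF smoothing) can be illustrated with a small pure-Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the squared Euclidean distance, the smoothed k-nearest-neighbor vote used as the unary cost, the Potts penalty `smooth`, the use of iterated conditional modes (ICM) in place of whatever MRF optimizer the paper uses, and all function names and toy data.

```python
import math
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def retrieval_set(query_global, train_globals, k):
    """Scene-level matching: indices of the k training images whose
    global descriptors are closest to the query's (assumed metric)."""
    order = sorted(range(len(train_globals)),
                   key=lambda i: sq_dist(query_global, train_globals[i]))
    return order[:k]

def unary_costs(query_feats, pool, labels, k_nn):
    """Superpixel-level matching: each query superpixel votes over its
    k_nn nearest labeled superpixels in the pool. The cost is the negative
    log of the add-one-smoothed vote fraction (an illustrative choice)."""
    costs = []
    for feat in query_feats:
        nearest = sorted(pool, key=lambda sp: sq_dist(feat, sp[0]))[:k_nn]
        votes = Counter(label for _, label in nearest)
        total = len(nearest) + len(labels)
        costs.append({l: -math.log((votes[l] + 1) / total) for l in labels})
    return costs

def icm(costs, edges, labels, smooth, iters=10):
    """Approximate MRF inference with a Potts pairwise term, using
    iterated conditional modes (a simple stand-in optimizer)."""
    labeling = [min(labels, key=c.get) for c in costs]
    neighbors = {i: [] for i in range(len(costs))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iters):
        changed = False
        for i, c in enumerate(costs):
            def energy(l):
                # Unary cost plus a penalty per disagreeing neighbor.
                return c[l] + smooth * sum(labeling[j] != l for j in neighbors[i])
            best = min(labels, key=energy)
            if best != labeling[i]:
                labeling[i] = best
                changed = True
        if not changed:
            break
    return labeling

def parse(query_global, query_feats, edges, training, labels,
          k_images=1, k_nn=1, smooth=0.5):
    """Run the three stages in sequence on one query image."""
    idx = retrieval_set(query_global, [t["global"] for t in training], k_images)
    pool = [sp for i in idx for sp in training[i]["superpixels"]]
    return icm(unary_costs(query_feats, pool, labels, k_nn), edges, labels, smooth)

# Hypothetical toy data: two training images with 1-D superpixel features.
TRAINING = [
    {"global": [0.0, 0.0],
     "superpixels": [([0.0], "sky"), ([0.1], "sky"), ([0.9], "tree"), ([1.0], "tree")]},
    {"global": [10.0, 10.0], "superpixels": [([0.5], "wall")]},
]
LABELS = ["sky", "tree", "wall"]
QUERY_GLOBAL = [0.5, 0.2]
QUERY_FEATS = [[0.0], [0.55], [0.05]]   # three superpixels in a chain
EDGES = [(0, 1), (1, 2)]                # adjacency between superpixels
```

On this toy input, `parse(QUERY_GLOBAL, QUERY_FEATS, EDGES, TRAINING, LABELS, smooth=0.0)` labels the ambiguous middle superpixel from its nearest neighbor alone, while `smooth=0.5` lets its two neighbors pull it to their label. That is the kind of neighborhood context the MRF stage is meant to supply.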


Keywords (machine-generated, not supplied by the authors): Training Image; Query Image; Markov Random Field; Semantic Label; Boost Decision Tree



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Joseph Tighe (1)
  • Svetlana Lazebnik (1)
  1. Dept. of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill
