CollageParsing: Nonparametric Scene Parsing by Adaptive Overlapping Windows

  • Frederick Tung
  • James J. Little
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8694)

Abstract

Scene parsing is the problem of assigning a semantic label to every pixel in an image. Though the task is ambitious, impressive advances have been made in recent years, particularly in scalable nonparametric techniques suitable for open-universe databases. This paper presents the CollageParsing algorithm for scalable nonparametric scene parsing. In contrast to common practice in recent nonparametric approaches, CollageParsing reasons about mid-level windows that are designed to capture entire objects, instead of low-level superpixels that tend to fragment objects. On a standard benchmark consisting of outdoor scenes from the LabelMe database, CollageParsing achieves state-of-the-art nonparametric scene parsing results, with 7 to 11% higher average per-class accuracy than recent nonparametric approaches.
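
The abstract reports average per-class accuracy. As a rough illustration, and assuming the usual definition in the scene parsing literature (the abstract itself does not spell it out), the sketch below computes per-pixel accuracy and average per-class accuracy for a pair of label maps. The function name `parsing_accuracies`, the `ignore_label` parameter, and the toy label maps are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def parsing_accuracies(predicted, ground_truth, ignore_label=None):
    """Return (per_pixel_accuracy, average_per_class_accuracy).

    Both arguments are 2D lists (rows of labels) with the same shape.
    Pixels whose ground-truth label equals ignore_label are skipped.
    """
    correct = defaultdict(int)  # correctly labeled pixels per ground-truth class
    total = defaultdict(int)    # evaluated pixels per ground-truth class
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if gt == ignore_label:
                continue
            total[gt] += 1
            if pred == gt:
                correct[gt] += 1
    n_pixels = sum(total.values())
    if n_pixels == 0:
        raise ValueError("no labeled pixels to evaluate")
    # Per-pixel accuracy: fraction of all evaluated pixels labeled correctly.
    per_pixel = sum(correct.values()) / n_pixels
    # Average per-class accuracy: mean of each class's own recall, so a rare
    # class counts as much as a frequent one.
    per_class = sum(correct[c] / total[c] for c in total) / len(total)
    return per_pixel, per_class


# Toy example (hypothetical data): the single "car" pixel is mislabeled as "road".
gt = [["sky", "sky", "sky", "sky"],
      ["sky", "sky", "sky", "sky"],
      ["road", "road", "road", "car"]]
pred = [["sky", "sky", "sky", "sky"],
        ["sky", "sky", "sky", "sky"],
        ["road", "road", "road", "road"]]

per_pixel, per_class = parsing_accuracies(pred, gt)
print(f"{per_pixel:.3f} {per_class:.3f}")  # prints "0.917 0.667"
```

In this toy case a single missed pixel barely moves per-pixel accuracy but costs a third of the per-class average, which is why per-class accuracy is the more sensitive measure when small objects get fragmented or missed.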

Keywords

image parsing, semantic segmentation, scene understanding

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Frederick Tung (1)
  • James J. Little (1)
  1. Department of Computer Science, University of British Columbia, Vancouver, Canada