
Perceptually Inspired Layout-Aware Losses for Image Segmentation

  • Anton Osokin
  • Pushmeet Kohli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8690)

Abstract

Interactive image segmentation is an important computer vision problem with numerous real-world applications. Models for image segmentation are generally trained to minimize the Hamming error in pixel labeling. The Hamming loss does not ensure that the topology/structure of the object being segmented is preserved, and it is therefore not a strong indicator of segmentation quality as perceived by users. It nevertheless remains ubiquitous for training models because it decomposes over pixels and thus enables efficient learning. In this paper, we propose a novel family of higher-order loss functions that encourage segmentations whose layout is similar to the ground-truth segmentation. Unlike the Hamming loss, these loss functions do not decompose over pixels and therefore cannot be used directly for loss-augmented inference. We show how our loss functions can be transformed to allow efficient learning, demonstrate the effectiveness of our method on a challenging segmentation dataset, and validate the results with a user study. Our experimental results reveal that training with our layout-aware loss functions produces better segmentations, which users prefer over segmentations obtained with conventional loss functions.
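The distinction the abstract draws can be illustrated with a toy example (this is a sketch for intuition only, not the loss family proposed in the paper): the Hamming loss is a sum of independent per-pixel terms, whereas a layout-sensitive penalty, such as a mismatch in the number of connected foreground regions, depends on pixels jointly and cannot be written as such a sum.

```python
import numpy as np

def hamming_loss(pred, gt):
    # Decomposes over pixels: a sum of independent per-pixel terms,
    # which is what makes loss-augmented inference tractable.
    return int(np.sum(pred != gt))

def count_components(mask):
    # 4-connected components of the foreground via flood fill (toy helper).
    seen = np.zeros_like(mask, dtype=bool)
    H, W = mask.shape
    n = 0
    for i in range(H):
        for j in range(W):
            if mask[i, j] and not seen[i, j]:
                n += 1
                stack = [(i, j)]
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        v, u = y + dy, x + dx
                        if 0 <= v < H and 0 <= u < W and mask[v, u] and not seen[v, u]:
                            seen[v, u] = True
                            stack.append((v, u))
    return n

def toy_layout_loss(pred, gt):
    # A higher-order penalty: mismatch in the number of connected
    # foreground regions. It depends on all pixels jointly and
    # cannot be expressed as a sum over individual pixels.
    return abs(count_components(pred) - count_components(gt))

gt = np.array([[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]], dtype=bool)

# Two predictions with the SAME Hamming loss but different layout:
a = gt.copy(); a[1, 1] = 0   # shrinks the object slightly
b = gt.copy(); b[3, 3] = 1   # adds a spurious disconnected speck

print(hamming_loss(a, gt), hamming_loss(b, gt))        # both 1
print(toy_layout_loss(a, gt), toy_layout_loss(b, gt))  # 0 vs 1
```

Both predictions incur the same per-pixel error, but only the layout-sensitive penalty distinguishes the perceptually worse one (the spurious extra region).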

Keywords

structured prediction · image segmentation · loss-based learning · max-margin learning · perceptual error metrics



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Anton Osokin — Moscow State University, Moscow, Russia
  • Pushmeet Kohli — Microsoft Research, Cambridge, UK