Abstract
Interactive image segmentation is an important computer vision problem with numerous real-world applications. Models for image segmentation are generally trained to minimize the Hamming error in pixel labeling. The Hamming loss does not ensure that the topology or structure of the segmented object is preserved, and is therefore not a strong indicator of segmentation quality as perceived by users. It is nevertheless ubiquitous in training because it decomposes over pixels and thus enables efficient learning. In this paper, we propose a novel family of higher-order loss functions that encourage segmentations whose layout is similar to that of the ground-truth segmentation. Unlike the Hamming loss, these loss functions do not decompose over pixels and therefore cannot be used directly for loss-augmented inference. We show how they can be transformed to allow efficient learning, demonstrate the effectiveness of our method on a challenging segmentation dataset, and validate the results with a user study. Our experiments show that training with the layout-aware loss functions yields segmentations that users prefer over those obtained with conventional loss functions.
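To make the decomposition argument concrete, the following is a minimal sketch of why the Hamming loss is convenient for loss-augmented inference. It assumes binary labels and an energy-minimization formulation; the function names and the layout of the `unary` array are hypothetical and are not taken from the paper. Because the loss is a sum of per-pixel terms, it can be folded into the per-pixel label costs, leaving any pairwise terms untouched.

```python
from typing import List

Labeling = List[List[int]]          # binary mask: 1 = object, 0 = background
Unaries = List[List[List[float]]]   # unary[i][j][k]: cost of label k at pixel (i, j)


def hamming_loss(pred: Labeling, truth: Labeling) -> int:
    """Count the pixels whose predicted label differs from the ground truth."""
    return sum(
        int(p != t)
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )


def loss_augmented_unaries(unary: Unaries, truth: Labeling) -> Unaries:
    """Fold the Hamming loss into the per-pixel label costs.

    Loss-augmented inference minimizes energy(y) - loss(y, truth).
    Since the Hamming loss is a sum over pixels, subtracting 1 from the
    cost of every wrong label gives exactly that objective.
    """
    return [
        [
            [cost - (1.0 if label != truth[i][j] else 0.0)
             for label, cost in enumerate(pixel)]
            for j, pixel in enumerate(row)
        ]
        for i, row in enumerate(unary)
    ]


# Illustrative 2x2 example with all-zero unary costs (made-up data).
truth = [[0, 1], [1, 1]]
pred = [[0, 1], [0, 1]]
zero_unary = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
augmented = loss_augmented_unaries(zero_unary, truth)
```

A loss that depends on the layout of the whole segmentation cannot be written as such a per-pixel correction, which is why the abstract says the proposed losses must be transformed before they can be used for efficient learning.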
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Osokin, A., Kohli, P. (2014). Perceptually Inspired Layout-Aware Losses for Image Segmentation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8690. Springer, Cham. https://doi.org/10.1007/978-3-319-10605-2_43
DOI: https://doi.org/10.1007/978-3-319-10605-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10604-5
Online ISBN: 978-3-319-10605-2
eBook Packages: Computer Science (R0)