Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions

  • Roman Shapovalov
  • Dmitry Vetrov
  • Anton Osokin
  • Pushmeet Kohli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8932)


Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for inferring the loss functions most suitable for quantifying the consistency of solutions with the given weak annotation. We demonstrate the effectiveness of our framework on the challenging semantic image segmentation problem for which a wide variety of annotations can be used. For instance, the popular training datasets for semantic segmentation are composed of images with hard-to-generate full pixel labellings, as well as images with easy-to-obtain weak annotations, such as bounding boxes around objects, or image-level labels that specify which object categories are present in an image. Experimental evaluation shows that the use of annotation-specific loss functions dramatically improves segmentation accuracy compared to the baseline system where only one type of weak annotation is used.
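As a rough illustration of what "annotation-specific loss functions" could mean, the sketch below scores one predicted labelling against three kinds of annotation: a full pixel labelling, a set of image-level labels, and a bounding box. It is not the paper's formulation. The function names, the list-of-lists labelling format, the box convention and the particular penalties are all assumptions chosen for readability, and the paper's point is that suitable losses are inferred rather than hand-written like this.

```python
from typing import List, Set, Tuple

Labelling = List[List[int]]          # hypothetical format: grid of class ids
Box = Tuple[int, int, int, int]      # hypothetical: (top, left, bottom, right), end-exclusive


def hamming_loss(pred: Labelling, truth: Labelling) -> float:
    """Full annotation: fraction of pixels whose predicted class is wrong."""
    total = sum(len(row) for row in truth)
    wrong = sum(p != t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return wrong / total


def image_level_loss(pred: Labelling, present: Set[int]) -> float:
    """Image-level labels: penalise pixels of classes not listed for the image,
    plus the share of listed classes that never appear in the prediction."""
    pixels = [p for row in pred for p in row]
    absent_pixels = sum(p not in present for p in pixels)
    missing = len(present - set(pixels))
    return absent_pixels / len(pixels) + missing / len(present)


def bounding_box_loss(pred: Labelling, label: int, box: Box) -> float:
    """Bounding box: penalise pixels of `label` outside the box, and add 1
    if no pixel inside the box carries `label` at all."""
    top, left, bottom, right = box
    total = outside = inside = 0
    for r, row in enumerate(pred):
        for c, p in enumerate(row):
            total += 1
            if p != label:
                continue
            if top <= r < bottom and left <= c < right:
                inside += 1
            else:
                outside += 1
    return outside / total + (1.0 if inside == 0 else 0.0)


# Toy 2x3 prediction with classes 0 and 1.
pred = [[0, 0, 1],
        [0, 1, 1]]
truth = [[0, 0, 1],
         [0, 0, 1]]

full = hamming_loss(pred, truth)                   # one wrong pixel out of six
tags = image_level_loss(pred, {0, 1})              # both listed classes appear
box = bounding_box_loss(pred, 1, (0, 2, 2, 3))     # one class-1 pixel outside the box
```

Each function returns zero when the prediction is fully consistent with its annotation, so training instances with different kinds of supervision can contribute to a single objective through their own loss.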


semantic image segmentation, structured-output learning, weakly-supervised learning, loss functions





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Roman Shapovalov (1)
  • Dmitry Vetrov (1)
  • Anton Osokin (1, 2)
  • Pushmeet Kohli (3)

  1. Lomonosov Moscow State University, Russia
  2. INRIA, SIERRA Project Team, Paris, France
  3. Microsoft Research, Cambridge, UK
