International Journal of Computer Vision, Volume 100, Issue 1, pp 38–58

Minimizing Energies with Hierarchical Costs

  • Andrew Delong
  • Lena Gorelick
  • Olga Veksler
  • Yuri Boykov

Abstract

Computer vision is full of problems elegantly expressed in terms of energy minimization. We characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm. Hierarchical costs are natural for modeling an array of difficult problems. For example, in semantic segmentation one could rule out unlikely object combinations via hierarchical context. In geometric model estimation, one could penalize the number of unique model families in a solution, not just the number of models—a kind of hierarchical MDL criterion. Hierarchical fusion uses the well-known α-expansion algorithm as a subroutine, and offers a much better approximation bound in important cases.
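To make the hierarchical MDL idea concrete, the following is a minimal sketch of how such an energy could be evaluated for a given labeling. It is an illustration under stated assumptions, not the authors' implementation: the function name `hierarchical_energy`, the Potts smoothness term, the two-level hierarchy (labels grouped into families), and all cost values are hypothetical choices made for this example. It only evaluates the energy and does not perform hierarchical fusion or α-expansion.

```python
from typing import Dict, Sequence, Tuple


def hierarchical_energy(
    labeling: Sequence[int],
    data_cost: Sequence[Sequence[float]],
    edges: Sequence[Tuple[int, int]],
    smooth_weight: float,
    label_cost: Dict[int, float],
    family_of: Dict[int, str],
    family_cost: Dict[str, float],
) -> float:
    """Evaluate a toy energy with a two-level hierarchy of label costs.

    Assumed form (hypothetical, for illustration only):
    data term + Potts smoothness + cost per used label + cost per used family.
    """
    # Data term: cost of assigning each variable its label.
    energy = sum(data_cost[p][label] for p, label in enumerate(labeling))
    # Potts smoothness: pay smooth_weight for each edge whose ends disagree.
    energy += smooth_weight * sum(1 for p, q in edges if labeling[p] != labeling[q])
    # Label costs: pay once for every distinct label (model) in the solution.
    used_labels = set(labeling)
    energy += sum(label_cost[label] for label in used_labels)
    # Family costs: pay once for every distinct family of used labels.
    used_families = {family_of[label] for label in used_labels}
    energy += sum(family_cost[family] for family in used_families)
    return energy


# Hypothetical toy problem: 4 variables on a chain, 3 labels in 2 families.
data_cost = [
    [0.0, 5.0, 1.0],
    [0.0, 5.0, 1.0],
    [5.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]
edges = [(0, 1), (1, 2), (2, 3)]
label_cost = {0: 2.0, 1: 2.0, 2: 2.0}
family_of = {0: "line", 1: "line", 2: "circle"}
family_cost = {"line": 3.0, "circle": 3.0}

# Labeling A uses three labels from two families.
energy_a = hierarchical_energy(
    [0, 0, 1, 2], data_cost, edges, 1.0, label_cost, family_of, family_cost
)
# Labeling B uses two labels from a single family.
energy_b = hierarchical_energy(
    [0, 0, 1, 1], data_cost, edges, 1.0, label_cost, family_of, family_cost
)
print(energy_a, energy_b)  # prints "14.0 9.0"
```

In this toy setting, labeling B fits the data slightly worse than labeling A but has lower total energy, because it uses fewer labels and only one family. This is the kind of trade-off a penalty on the number of model families is meant to express.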

Keywords

  • Energy minimization
  • Hierarchical models
  • Graph cuts
  • Markov random fields (MRFs)
  • Segmentation

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Andrew Delong (1)
  • Lena Gorelick (1)
  • Olga Veksler (1)
  • Yuri Boykov (1)

  1. Department of Computer Science, University of Western Ontario, London, Canada
