Minimizing Energies with Hierarchical Costs

Delong, Andrew; Gorelick, Lena; Veksler, Olga; Boykov, Yuri

doi:10.1007/s11263-012-0531-x

Published: 09 May 2012

International Journal of Computer Vision, Volume 100, pages 38–58 (2012)

Abstract

Computer vision is full of problems elegantly expressed in terms of energy minimization. We characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm. Hierarchical costs are natural for modeling an array of difficult problems. For example, in semantic segmentation one could rule out unlikely object combinations via hierarchical context. In geometric model estimation, one could penalize the number of unique model families in a solution, not just the number of models—a kind of hierarchical MDL criterion. Hierarchical fusion uses the well-known α-expansion algorithm as a subroutine, and offers a much better approximation bound in important cases.

Notes

1. Note that α-expansion itself does not require $D_p(\cdot) \geq 0$; this assumption is only needed for the analysis of worst-case bounds.
2. A tree is irreducible if all its internal nodes have at least two children, i.e. there are no ‘redundant’ parent nodes, and so for each internal node i there exist some γ,ζ such that lca(γ,ζ)=i.
3. Because V is assumed to be a semi-metric, so that V(ℓ,ℓ)=0, we can simply sum over all $pq \in \mathcal{A}_{j}$ instead of only those where $f^{*}_{p} \neq f^{*}_{q}$.

References

Aggarwal, C. C., Orlin, J. B., & Tai, R. P. (1997). Optimized crossover for the independent set problem. Operations Research, 45(2), 226–234.
Ahuja, R. K., Ergun, Ö., Orlin, J. B., & Punnen, A. P. (2002). A survey of very large-scale neighborhood search techniques. Discrete Applied Mathematics, 123(1–3), 75–202.
Barinova, O., Lempitsky, V., & Kohli, P. (2010). On the detection of multiple object instances using Hough transforms. In IEEE conference on computer vision and pattern recognition (CVPR), June 2010.
Bartal, Y. (1998). On approximating arbitrary metrics by tree metrics. In ACM symposium on theory of computing (STOC).
Birchfield, S., & Tomasi, C. (1999). Multiway cut for stereo and motion with slanted surfaces. In International conference on computer vision (ICCV).
Boros, E., & Hammer, P. L. (2002). Pseudo-boolean optimization. Discrete Applied Mathematics, 123(1–3), 155–225.
Boykov, Y., & Jolly, M.-P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In International conference on computer vision (ICCV).
Boykov, Y., & Kolmogorov, V. (2004). An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9), 1124–1137.
Boykov, Y., & Veksler, O. (2006). Graph cuts in vision and graphics: theories and applications. In N. Paragios, Y. Chen, & O. Faugeras (Eds.), Handbook of mathematical models in computer vision (pp. 79–96). New York: Springer.
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Choi, M. J., Lim, J. J., Torralba, A., & Willsky, A. S. (2010). Exploiting hierarchical context on a large database of object categories. In IEEE conference on computer vision and pattern recognition (CVPR), June 2010.
Cunningham, W., & Tang, L. (1999). Optimal 3-terminal cuts and linear programming. In LNCS, Vol. 1610: Integer programming and combinatorial optimization (pp. 114–125).
Delong, A. (2011). Advances in graph-cut optimization. PhD thesis, University of Western Ontario.
Delong, A., Gorelick, L., Schmidt, F. R., Veksler, O., & Boykov, Y. (2011). Interactive segmentation with super-labels. In Energy minimization methods in computer vision and pattern recognition (EMMCVPR), July 2011.
Delong, A., Osokin, A., Isack, H. N., & Boykov, Y. (2012). Fast approximate energy minimization with label costs. International Journal of Computer Vision, 96(1), 1–27 (Earlier version in CVPR 2010).
Feige, U. (1998). A threshold of ln n for approximating set cover. Journal of the ACM, 45(4), 634–652.
Felzenszwalb, P. F., Pap, G., Tardos, É., & Zabih, R. (2010). Globally optimal pixel labeling algorithms for tree metrics. In IEEE conference on computer vision and pattern recognition (CVPR).
Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
Givoni, I. E., Chung, C., & Frey, B. J. (2011). Hierarchical affinity propagation. In Uncertainty in artificial intelligence (UAI), July 2011.
Goldberg, A. V., & Tarjan, R. E. (1988). A new approach to the maximum-flow problem. Journal of the Association for Computing Machinery, 35(4), 921–940.
Gorelick, L., Delong, A., Veksler, O., & Boykov, Y. (2011). Recursive MDL via graph cuts: application to segmentation. In International conference on computer vision (ICCV), November 2011.
Greig, D., Porteous, B., & Seheult, A. (1989). Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society B, 51(2), 271–279.
Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge: Cambridge University Press.
Hochbaum, D. S. (1982). Heuristics for the fixed cost median problem. Mathematical Programming, 22(1), 148–162.
Isack, H. N., & Boykov, Y. (2012). Energy-based geometric multi-model fitting. International Journal of Computer Vision, 97(2), 123–147.
Kalogerakis, E., Hertzmann, A., & Singh, K. (2010). Learning 3D mesh segmentation and labeling. In ACM SIGGRAPH.
Kantor, E., & Peleg, D. (2009). Approximate hierarchical facility location and applications to the bounded depth Steiner tree and range assignment problems. Journal of Discrete Algorithms, 7(3), 341–362.
Kleinberg, J., & Tardos, É. (2002). Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields. Journal of the ACM, 49, 5.
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.
Kolmogorov, V., & Rother, C. (2007). Minimizing non-submodular functions with graph cuts: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1274–1279.
Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.
Kumar, M. P., & Koller, D. (2009). MAP estimation of semi-metric MRFs via hierarchical graph cuts. In Conference on uncertainty in artificial intelligence (pp. 313–320), June 2009.
Ladický, L., Russell, C., Kohli, P., & Torr, P. H. S. (2010a). Graph cut based inference with co-occurrence statistics. In European conference on computer vision (ECCV), September 2010.
Ladický, L., Sturgess, P., Russell, C., Sengupta, S., Bastanlar, Y., Clocksin, W., & Torr, P. H. S. (2010b). Joint optimisation for object class segmentation and dense stereo reconstruction. In British machine vision conference (BMVC).
Lazic, N., Givoni, I., Frey, B. J., & Aarabi, P. (2009). FLoSS: facility location for subspace segmentation. In International conference on computer vision (ICCV).
Lempitsky, V., Rother, C., Roth, S., & Blake, A. (2010). Fusion moves for Markov random field optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1392–1405.
Li, S. Z. (1994). Markov random field modeling in image analysis. Berlin: Springer.
Li, H. (2007). Two-view motion segmentation from linear programming relaxation. In IEEE conference on computer vision and pattern recognition (CVPR).
Meyers, C., & Orlin, J. B. (2007). Very large-scale neighborhood search techniques in timetabling problems. In Practice and theory of automated timetabling (Vol. VI, p. 24).
Olsson, C., Byröd, M., Overgaard, N. C., & Kahl, F. (2009). Extending continuous cuts: anisotropic metrics and expansion moves. In International conference on computer vision, October 2009.
Pock, T., Schoenemann, T., Graber, G., Bischof, H., & Cremers, D. (2008). A convex formulation of continuous multi-label problems. In European conference on computer vision (ECCV), October 2008.
Pock, T., Chambolle, A., Bischof, H., & Cremers, D. (2009). A convex relaxation approach for computing minimal partitions. In IEEE conference on computer vision and pattern recognition (CVPR), June 2009.
Potts, R. B. (1952). Some generalized order-disorder transformations. Mathematical Proceedings of the Cambridge Philosophical Society, 48, 106–109.
Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry. In IEEE conference on computer vision and pattern recognition (CVPR).
Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In IEEE conference on computer vision and pattern recognition (CVPR), June 2007.
Sahin, G., & Süral, H. (2007). A review of hierarchical facility location models. Computers and Operations Research, 34(8), 2310–2331.
Sefer, E., & Kingsford, C. (2011). Metric labeling and semi-metric embedding for protein annotation prediction. In Research in computational molecular biology.
Shmoys, D. B., Tardos, É., & Aardal, K. (1998). Approximation algorithms for facility location problems. In ACM symposium on theory of computing (STOC) (pp. 265–274).
Strandmark, P., & Kahl, F. (2010). Parallel and distributed graph cuts by dual decomposition. In IEEE conference on computer vision and pattern recognition (CVPR), June 2010.
Svitkina, Z., & Tardos, É. (2006). Facility location with hierarchical facility costs. In ACM-SIAM symposium on discrete algorithms (SODA).
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., & Rother, C. (2006). A comparative study of energy minimization methods for Markov random fields. In European conference on computer vision (ECCV) (pp. 16–29).
Torr, P. H. S. (1998). Geometric motion segmentation and model selection. In Philosophical transactions of the royal society A (pp. 1321–1340).
Torr, P. H. S., & Murray, D. (1994). Stochastic motion clustering. In European conference on computer vision (ECCV).
Veksler, O. (1999). Efficient graph-based energy minimization methods in computer vision. PhD thesis, Cornell University.
Werner, T. (2008). High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF). In IEEE conference on computer vision and pattern recognition (CVPR), June 2008.
Woodford, O. J., Rother, C., & Kolmogorov, V. (2009). A global perspective on map inference for low-level vision. In International conference on computer vision (ICCV), October 2009.
Yuan, J., & Boykov, Y. (2010). TV-based multi-label image segmentation with label cost prior. In British machine vision conference (BMVC), September 2010.
Zhou, Q., Wu, T., Liu, W., & Zhu, S.-C. (2011). Scene parsing by data-driven cluster sampling. International Journal of Computer Vision.
Zhu, S.-C., & Yuille, A. L. (1996). Region competition: unifying snakes, region growing, and Bayes/MDL for multiband image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(9), 884–900.

Acknowledgements

We wish to thank the anonymous reviewers for careful reading and helpful comments. This work was supported by NSERC Discovery Grant R3584A02, the Canadian Foundation for Innovation (CFI), and the Early Researcher Award (ERA) program.

Author information

Authors and Affiliations

Department of Computer Science, University of Western Ontario, London, ON, Canada, N6A 5B7
Andrew Delong, Lena Gorelick, Olga Veksler & Yuri Boykov

Corresponding author

Correspondence to Andrew Delong.

Appendices

Appendix A: Proof of Metric Relationships

A pair (V,π) forms a tree metric if V represents an edge-weighted distance in tree π. This means that V(α,β)=d(α,β), where d(α,β) is the sum of edge weights $d_{ij} \geq 0$ along the path from leaf α to leaf β. A tree metric (V,π) is therefore entirely parameterized by its edge weights $d_{ij}$ where j=π(i). An r-hst metric is just a tree metric where edge costs get cheaper by a factor of $\frac{1}{r}<1$ as we descend the tree, i.e. $d_{ij} \leq\frac{1}{r} d_{jk}$ for j=π(i), k=π(j). So r-hst metrics are a subclass of tree metrics by definition.
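
The two definitions above can be illustrated with a minimal Python sketch. The four-leaf tree, its node names, and its edge weights are hypothetical values chosen only for illustration. `tree_distance` sums edge weights along the leaf-to-leaf path, and `is_r_hst` checks the condition $d_{ij} \leq\frac{1}{r} d_{jk}$ on every pair of consecutive edges.

```python
# Hypothetical toy tree, stored as child -> (parent, edge weight d_ij).
# Node names and weights are illustrative assumptions, not taken from the paper.
parent = {
    "a": ("s1", 1.0), "b": ("s1", 1.0),
    "c": ("s2", 1.0), "d": ("s2", 1.0),
    "s1": ("r", 2.0), "s2": ("r", 2.0),
}

def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]][0])
    return path

def lca(alpha, beta):
    """Lowest common ancestor of two nodes."""
    seen = set(path_to_root(alpha))
    for node in path_to_root(beta):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def cost_up(node, ancestor):
    """Sum of edge weights on the path from `node` up to `ancestor`."""
    total = 0.0
    while node != ancestor:
        up, weight = parent[node]
        total += weight
        node = up
    return total

def tree_distance(alpha, beta):
    """d(alpha, beta): total edge weight on the path between two leaves."""
    top = lca(alpha, beta)
    return cost_up(alpha, top) + cost_up(beta, top)

def is_r_hst(r):
    """Check d_ij <= d_jk / r for every edge (i, j) whose parent j has a parent k."""
    for i, (j, d_ij) in parent.items():
        if j in parent and d_ij > parent[j][1] / r:
            return False
    return True

same_parent = tree_distance("a", "b")    # path a - s1 - b
across_root = tree_distance("a", "c")    # path a - s1 - r - s2 - c
```

In this toy tree the weights halve from one level to the next, so the r-hst check passes for r = 2 but fails for r = 3.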

[Tree metrics ⊂ h-metrics]: For a tree metric to be an h-metric, d must satisfy (according to Definition 3, p. 7)

(19)

For each $i \in\mathcal{L}\cup\mathcal{S}$ use shorthand j=π(i) and consider that

(20)

(21)

(22)

(23)

Use inequalities (20) and (21) to replace the left-hand side of (19) and cancel terms with (22) and (23) to get $d(\alpha_1,i)+d(i,\alpha_2) \leq d(\alpha_1,j)+d(j,\alpha_2)$, which is clearly satisfied since $d_{ij} \geq 0$. To see that some (non-h-Potts) h-metrics are not tree metrics, consider the tree and symmetric smoothness cost V(⋅,⋅) below.

[(h-Potts ∩ h-metrics) ⊈ tree metrics]: The example below is a simple h-Potts potential which is also an h-metric but is not a tree metric.

The fact that it is not a tree metric can be verified by setting up a linear program relating edge costs $d_{ij}$ to node costs $w_i$, and noting that the system is infeasible if $d_{ij} \geq 0$.

[(h-Potts with $w_i \leq w_{\pi(i)}$) ⊂ (h-Potts ∩ tree metrics)]: If node costs $\{w_{i}\}_{i \in\mathcal{S}\cup\{r\}}$ are non-negative and do not increase as we descend the tree (i.e. $w_i \leq w_{\pi(i)}$) then we can construct a tree metric by induction. Given some node $j \in\mathcal{S}\cup\{r\}$, assume we have non-negative edge costs so that, for each child $i \in\mathcal{I}(j)$, $d(\alpha,i) = \frac {1}{2}w_{i}$ for all $\alpha\in\mathcal{L}_{i}$. Then we can assign cost $d_{ij} = \frac{1}{2}(w_{j} - w_{i})$ to each child edge of j to get $d(\alpha,j) = \frac{1}{2}w_{j}$ for all $\alpha\in\mathcal{L}_{j}$. Since $w_i \leq w_j$, these edge costs are non-negative, so we also have a tree metric for subtree j. It is not necessary to assume $w_i \leq w_{\pi(i)}$ for an h-Potts potential to be a tree metric, as the example below demonstrates (edge costs are shown on the tree).
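
The inductive construction can be checked numerically with a short sketch. It assumes that the h-Potts cost between two leaves is the node cost w at their lowest common ancestor, and that leaves carry cost 0 so that V(ℓ,ℓ)=0. The tree and the values in `w` are hypothetical.

```python
# Hypothetical tree: child -> parent. Leaves a, b, c, d; internal s1, s2; root r.
parent = {"a": "s1", "b": "s1", "c": "s2", "d": "s2", "s1": "r", "s2": "r"}
# Assumed node costs: leaves cost 0, and costs never increase going down the tree.
w = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0, "s1": 2.0, "s2": 3.0, "r": 5.0}

def edge_costs(parent, w):
    """Assign d_ij = (w_j - w_i) / 2 to the edge from each child i to its parent j."""
    return {i: 0.5 * (w[j] - w[i]) for i, j in parent.items()}

def ancestors(node):
    """Nodes from `node` up to the root, inclusive."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def lca(alpha, beta):
    """Lowest common ancestor of two nodes."""
    seen = set(ancestors(alpha))
    return next(n for n in ancestors(beta) if n in seen)

def tree_distance(alpha, beta, d):
    """Sum of edge costs on the path between two leaves."""
    top = lca(alpha, beta)
    total = 0.0
    for node in (alpha, beta):
        while node != top:
            total += d[node]
            node = parent[node]
    return total

d = edge_costs(parent, w)
leaves = ["a", "b", "c", "d"]
# Pairs where the tree distance disagrees with the assumed h-Potts cost w[lca].
mismatches = [(x, y) for x in leaves for y in leaves
              if abs(tree_distance(x, y, d) - w[lca(x, y)]) > 1e-12]
```

With these values every edge cost is non-negative and `mismatches` is empty, so the tree distance reproduces the assumed h-Potts cost on every leaf pair.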

[(h-Potts ∩ r-hst metrics) ⊂ (h-Potts with $w_i \leq w_{\pi(i)}$)]: As described by Kumar and Koller (2009), an r-hst metric has a constant edge cost $d_{ij}$ between node j and all of its children $i \in\mathcal{I}(j)$. In other words, an r-hst metric is actually parameterized by one common ‘edge’ cost per parent node $\{d_{j}\}_{j \in\mathcal{S}\cup\{r\} }$, where $0 \leq d_{i} \leq\frac{1}{r} d_{\pi(i)}$ for all $i \in \mathcal{S}$. It is easy to see that, for an h-Potts potential to be an r-hst metric, it must have $w_i = w_j - 2d_j$ where j=π(i). Thus $d_j \geq 0$ implies $w_i \leq w_j$. Also note that r>1 means the quantity $w_j - w_i$ must decrease at a rate of $\frac {1}{r}$ as we descend the tree. □

Appendix B: Proof of Theorem 5

Proof

Without loss of generality we assume that all weights $w_{pq}=1$. Consider any local minimum $\hat{f}^{j}$ computed by h-fusion at internal node j, and let us choose some child node $i \in\mathcal{I}(j)$. We first define a useful set of pixels for i with respect to a global optimum $f^{*}$:

$$ \mathcal{P}_i = \bigl\{ p : f_p^{*} \in\mathcal{L}_i \bigr\}. $$

This set contains all pixels assigned a label within subtree i, and so for any other child i′≠i we know that $\mathcal{P}_{i} \cap \mathcal{P}_{i'} = \emptyset$.

We can produce a labeling $\hat{f}^{j \otimes i}$ within one h-fusion move from local minimum $\hat{f}^{j}$ as follows:

$$ \hat{f}^{j \otimes i}_p = \left\{ \begin{array}{l@{\quad}l} \hat{f}^i_p & \text{if $p \in\mathcal{P}_{i}$}\\ [3pt] \hat{f}^j_p & \text{otherwise}. \end{array} \right. $$
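
As a minimal sketch of this construction, the snippet below builds $\mathcal{P}_{i}$ and the fused labeling for a toy problem. The five pixels, the label names, and all three labelings are hypothetical. Labelings are plain dictionaries from pixel to leaf label.

```python
def pixels_in_subtree(f_star, subtree_labels):
    """P_i: pixels whose optimal label f*_p lies in the leaf set L_i of subtree i."""
    return {p for p, label in f_star.items() if label in subtree_labels}

def fuse(f_hat_j, f_hat_i, P_i):
    """Take the child's labeling on P_i and the parent's labeling elsewhere."""
    return {p: (f_hat_i[p] if p in P_i else f_hat_j[p]) for p in f_hat_j}

# Hypothetical 5-pixel example; subtree i is assumed to contain leaves a and b.
f_star = {0: "a", 1: "b", 2: "c", 3: "d", 4: "c"}    # assumed global optimum
f_hat_i = {0: "a", 1: "a", 2: "b", 3: "b", 4: "a"}   # assumed labeling at child i
f_hat_j = {0: "c", 1: "a", 2: "c", 3: "c", 4: "d"}   # assumed local minimum at node j

P_i = pixels_in_subtree(f_star, {"a", "b"})
fused = fuse(f_hat_j, f_hat_i, P_i)
```

Here only pixels 0 and 1 have an optimal label inside subtree i, so the fused labeling copies $\hat{f}^{i}$ on those two pixels and keeps $\hat{f}^{j}$ on the rest.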

Since each $\hat{f}^{j}$ is known to be a local optimum w.r.t. expansion moves for each $i \in\mathcal{I}(j)$ we know that

$$ E\bigl(\hat{f}^j\bigr) \leq E\bigl( \hat{f}^{j \otimes i}\bigr). $$

(24)

The general strategy is to use (24) for different i to build an inequality that is ultimately of the form $E(\hat{f}^{j}) \leq E(f^{*}) + \mathrm{error}$. This will be achieved by breaking the energy terms in E into parts in such a way that a recursive inequality can be established. The recursive inequality will then be expanded until all terms can be bounded relative to $E(f^{*})$.

Let $E(\cdot)|_{\mathcal{A}}$ denote a restriction of the summands of energy (1) to only the following terms:

$$ E(f)|_\mathcal{A} = \sum_{p \in\mathcal{A}} D_p( f_p ) + \sum _{pq \in\mathcal{A}} V( f_p, f_q). $$
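
The restriction operator can be sketched in a few lines of Python. The sketch assumes unit edge weights, as in the proof, and uses a plain Potts cost as a stand-in for V. The data costs `D`, the labeling `f`, and the three-pixel chain are hypothetical.

```python
def restricted_energy(f, pixels, edges, D, V):
    """E(f)|_A: unary terms over `pixels` plus pairwise terms over `edges`."""
    unary = sum(D[p][f[p]] for p in pixels)
    pairwise = sum(V(f[p], f[q]) for p, q in edges)
    return unary + pairwise

def potts(a, b):
    """Stand-in smoothness cost (an assumption): 0 if the labels agree, else 1."""
    return 0.0 if a == b else 1.0

# Hypothetical data costs D[p][label] on a 3-pixel chain 0 - 1 - 2.
D = {0: {"a": 1.0, "b": 4.0},
     1: {"a": 2.0, "b": 0.5},
     2: {"a": 3.0, "b": 1.0}}
f = {0: "a", 1: "b", 2: "b"}
all_edges = [(0, 1), (1, 2)]

total = restricted_energy(f, D.keys(), all_edges, D, potts)
part_a = restricted_energy(f, [0], [], D, potts)             # pixel 0 only
part_b = restricted_energy(f, [1, 2], [(1, 2)], D, potts)    # pixels 1, 2 and their edge
crossing = restricted_energy(f, [], [(0, 1)], D, potts)      # the edge between the parts
```

Because the three restrictions cover every pixel and every edge exactly once, their sum equals the full energy. The proof relies on this kind of additive split.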

We separate the unary and pairwise terms of E(f) via interior, exterior, and boundary sets with respect to pixels $\mathcal{P}_{i}$:

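Let $E_H(f)$ denote the total label cost incurred by a labeling f, i.e. the sum of label cost terms. The following facts now hold:

$$ E\bigl(\hat{f}^{j \otimes i}\bigr)|_{\mathcal{A}_i} = E\bigl(\hat{f}^{i}\bigr)|_{\mathcal{A}_i} $$

(25)

$$ E\bigl(\hat{f}^{j \otimes i}\bigr)|_{\overline{\mathcal{A}}_i} = E\bigl(\hat{f}^{j}\bigr)|_{\overline{\mathcal{A}}_i} $$

(26)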

We have not accounted for the label costs yet, but for simplicity we break this proof into two parts: part 1 derives the coefficient c related to smoothness costs V, and part 2 derives the coefficient $c_2$ related to label costs H. For part 1 we can assume there are no label costs at all.

Part 1. Derive Coefficient c for Smoothness Cost Bound

Using (25) and (26) we can cancel out all the $\overline{\mathcal{A}}_{i}$ terms and rewrite (24) as

$$ E\bigl(\hat{f}^j\bigr)|_{\mathcal{A}_i \cup \partial\mathcal{A}_i} \leq E \bigl(\hat{f}^i\bigr)|_{\mathcal{A}_i} + E\bigl(\hat{f}^{j \otimes i}\bigr)|_{\partial\mathcal{A}_i}. $$

(27)

For each $i \in\mathcal{I}(j)$ inequality (27) contains a subset of all the energy terms in $E(\hat{f}^{j})|_{\mathcal{A}_{j}}$ pertaining to pixels $\mathcal{P}_{i}$. Let $\mathcal{I}^{*} = \{ i \in\mathcal{I}(j) : \mathcal{P}_{i} \neq \emptyset\}$ be the set of children whose sub-trees contain a label used by $f^{*}$. If we sum inequality (27) over all $i \in\mathcal{I}^{*}$, the left-hand side will contain all the terms in $E(\hat{f}^{j})|_{\mathcal{A}_{j}}$ (and more). Adding up all the left-hand sides we have

(28)

Using (28) and likewise adding up the right-hand sides of (27) we have

(29)

The first important observation about (29) is that each $E(\hat{f}^{i})|_{\mathcal{A}_{i}}$ term on the right-hand side can be substituted by recursively applying the inequality itself. We can recursively substitute, branching further and further down the tree, until the path finally stops at a leaf $\ell\in\mathcal{L}$, giving us the base case $E(\hat{f}^{\ell})|_{\mathcal{A}_{\ell}} = \sum_{p \in \mathcal{P}_{\ell}} D_{p}(f_{p}^{*})$. The sets $\{\mathcal{P}_{\ell}\}_{\ell\in\mathcal{L}}$ must be disjoint and their union is $\mathcal{P}_{j}$, so expression (29), when fully expanded, becomes roughly

(30)

The second observation about (29) is that each edge pq on an outer boundary $\partial\mathcal{A}_{i} \cap\partial \mathcal{A}_{j}$ appears once in the sum over $\mathcal{I}^{*}$ whereas each edge on an interior boundary $\partial\mathcal{A}_{i} \setminus\partial\mathcal{A}_{j}$ appears twice: once for $p \in\mathcal{A}_{i}$ and once for some $q \in\mathcal{A}_{i'}$. By careful accounting we collect all the $V(\hat{f}^{i}_{p},\hat{f}^{\pi(i)}_{q})$ terms generated by the recursive substitution and express (29) as (see Note 3)

(31)

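where we define $\mathcal{J}(\ell;\ell')$ to be the set of nodes along the path from a label $\ell\in\mathcal{L}$ up to, but not including, the lowest common ancestor of ℓ and ℓ′, namely

$$ \mathcal{J}(\ell;\ell') = \bigl\{ i : \ell\in\mathcal{L}_i,\ \ell' \notin\mathcal{L}_i \bigr\}. $$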

All that remains is to bound each $V(\hat{f}^{i}_{p},\hat{f}^{\pi(i)}_{q})$ in terms of $V(f^{*}_{p},f^{*}_{q})$ using $b_i$ described in Definition 7. From now on we use $a_{i} = V^{\max}_{i}$ and $d_{i} = V^{\min}_{i}$ as shorthand. For a particular edge pq shown in (31) we must have each $V(\hat{f}^{i}_{p},\hat{f}^{\pi(i)}_{q}) \leq a_{\pi(i)}$ and so their sum is

(32)

We also know that $V(f^{*}_{p},f^{*}_{q}) \geq d_{\mathrm{lca}(f^{*}_{p},f^{*}_{q})}$ so we can use ratio $\frac{b_{\mathrm{lca}(f^{*}_{p},f^{*}_{q})}}{d_{\mathrm {lca}(f^{*}_{p},f^{*}_{q})}}$ to bound the approximation error at each edge pq appearing in (31), giving upper-bound

(33)

If j is the root of the tree, then $\{p \in\mathcal{A}_{j} \} = \mathcal{P}$ and $\{ pq \in\mathcal{A}_{j} \} = \mathcal{N}$. Using the fact that any ratio $\frac{b_{i}}{d_{i}}$ is bounded from above by quantity c (Definition 7) we arrive at

(34)

(35)

(36)

This completes the proof of Part 1. When there are only smoothness costs, $E(\hat{f}) \leq2c E(f^{*})$ where $\hat{f}$ is the labeling generated at the root of the tree.

Part 2. Derive Coefficient $c_2$ for Label Cost Bound

We now revisit from (27) onward but with the assumption that there are hierarchical label costs.

Let $E_H(f)$ denote the total label cost incurred by a labeling f, i.e. the sum of label cost terms. We can bound the label cost $E_{H}(\hat{f}^{j \otimes i})$ of our fused labeling by

$$ E_{H}\bigl(\hat{f}^{j \otimes i}\bigr) \leq E_{H}\bigl(\hat{f}^j\bigr) + \sum_{\substack{L \subseteq \mathcal{L}\setminus\hat{\mathcal{L}}_j\\{L}\cap\hat{\mathcal {L}}_i \neq \emptyset}} H(L) $$

(37)

where $\hat{\mathcal{L}}_{j}$ and $\hat{\mathcal{L}}_{i}$ are the sets of unique labels appearing in $\hat{f}^{j}$ and $\hat{f}^{i}$ respectively.
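
Bound (37) can be sanity-checked on a toy instance. The sketch below assumes the usual label-cost semantics, in which H(L) is paid once whenever a labeling uses at least one label in L. The subsets in `H`, their costs, and the three label sets are hypothetical.

```python
def label_cost(used, H):
    """E_H: total cost of every subset L that contains at least one used label."""
    return sum(cost for L, cost in H.items() if L & used)

def new_subset_costs(used_j, used_i, H):
    """Sum in (37): subsets L disjoint from used_j that intersect used_i."""
    return sum(cost for L, cost in H.items()
               if not (L & used_j) and (L & used_i))

# Hypothetical hierarchical label costs over leaves a, b, c, d.
H = {frozenset("a"): 1.0, frozenset("b"): 1.0, frozenset("c"): 1.0,
     frozenset("ab"): 2.0, frozenset("cd"): 2.0}

used_j = {"c"}           # labels appearing in the parent's labeling
used_i = {"a", "b"}      # labels appearing in the child's labeling
used_fused = {"a", "c"}  # a fused labeling can only use labels from the union

lhs = label_cost(used_fused, H)
rhs = label_cost(used_j, H) + new_subset_costs(used_j, used_i, H)
```

The bound holds with slack in this instance because the fused labeling happens not to use label b, whose cost is still counted on the right-hand side.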

Recall from Part 1 that, looking at the key inequality (24), we can break the energy terms on each side into parts based on sets $\mathcal{A}_{i}, \overline{\mathcal{A}}_{i}$, and $\partial\mathcal{A}_{i}$. Because $E(\hat{f}^{j \otimes i})|_{\overline{\mathcal{A}}_{i}} = E(\hat{f}^{j})|_{\overline{\mathcal{A}}_{i}}$ these terms cancel out, and we can substitute $E(\hat{f}^{j \otimes i})|_{\mathcal{A}_{i}} = E(\hat{f}^{i})|_{\mathcal{A}_{i}}$. Along with bound (37) and canceling the $E_{H}(\hat{f}^{j})$ terms we can now rewrite (24) as

$$ E\bigl(\hat{f}^j\bigr)|_{\mathcal{A}_i \cup \partial\mathcal{A}_i} \leq E\bigl(\hat{f}^i\bigr)|_{\mathcal{A}_i} + E\bigl( \hat{f}^{j \otimes i}\bigr)|_{\partial\mathcal{A}_i} + \sum _{\substack{L \subseteq\mathcal{L}\setminus\hat{\mathcal{L}}_j\\ {L}\cap \hat{\mathcal{L}}_i \neq\emptyset}} H(L). $$

(38)

Again, let $\mathcal{I}^{*} = \{ i \in\mathcal{I}(j) : \mathcal{P}_{i} \neq\emptyset\}$ be the set of child nodes that contain a label used by $f^{*}$ in their subtree. We sum inequality (38) over all $i \in\mathcal{I}^{*}$ to arrive at a recursive expression, this time incorporating errors incurred by label costs. The key observation is that a particular label cost H(L) appears once on the right-hand side for each element in the set $\mathcal{I}^{*}_{L} = \{ i \in\mathcal{I}^{*} : L \cap\hat{\mathcal{L}}_{i} \neq\emptyset\}$. The sum of inequalities (38) thus implies

(39)

where the quantity in parentheses is identical to that of Part 1.

The above inequality can be recursively expanded for each $E(\hat{f}^{i})|_{\mathcal{A}_{i}}$ until the recursion stops at a label used by $f^{*}$. We already know that, after recursive substitution, the quantity in parentheses is bounded by (33). We now must bound the total label cost accumulated by recursive application of (39). The central question is whether a particular subset L that appears in (39) with $|\mathcal{I}^{*}_{L}|>0$ for node j can appear again when we recursively substitute the children $i \in\mathcal{I}^{*}$. If the answer were ‘yes’ then each label cost H(L) could appear more than $|\mathcal{I}^{*}_{L}|$ total times by the end of recursive expansion, leading to a worse bound. Fortunately, Lemma 1 (after this proof) says that this is not the case; each L appearing in the sum of (38) for node j and child i can never reappear in the sums for i or its descendants.

From now on we assume j is the root of the tree structure, and so $\hat{f}^{j} = \hat{f}$, i.e. the final labeling output by h-fusion. If we let $\mathcal{H}^{*}$ denote the set of all subsets L generated by recursive substitution of (39), we can thereby write

(40)

Note that the left-hand side of (40) is still $E(\hat{f}^{j})|_{\mathcal{A}_{j}}$ which does not include the label costs incurred by $\hat{f}^{j}$. By adding $E_{H}(\hat{f}^{j})$ to both sides we have $E(\hat{f} ^{j})|_{\mathcal{A}_{j}} + E_{H}(\hat{f}^{j}) = E(\hat{f})$ on the left side, giving a new inequality below.

(41)

All that is left is to re-group the summands in the last three terms (the label cost terms) in a way that proves our theorem. First we rewrite the three sums more explicitly, using $\hat{\mathcal{L}}$ and $L^{*}$ to denote the unique labels used by $\hat{f}= \hat{f}^{j}$ and $f^{*}$ respectively.

(42)

First note that if $|\mathcal{I}^{*}_{L}|>1$ then this means $L \supset \mathcal{L}_{i}$ for some $\mathcal{L}_{i} \cap L^{*} \neq\emptyset$ and so $L \cap L^{*} \neq\emptyset$ also. We can break the last sum in (42) into two parts based on whether $L \cap L^{*} \neq\emptyset$.

(43)

We can also show that $L \in\mathcal{H}^{*} \Rightarrow L \cap \hat{\mathcal{L}}= \emptyset$ as follows. If $L \in\mathcal{H}^{*}$ then there must be some node i such that $L \cap\hat{\mathcal{L}}_{i} = \emptyset$ and $L \subset\mathcal{L}_{i}$. We know from (52) in Lemma 1 that $\hat{\mathcal{L}}\cap\mathcal{L}_{i} \subseteq\hat{\mathcal{L}}_{i}$, so this implies $\emptyset= L \cap\hat{\mathcal{L}}_{i} \supseteq L \cap(\hat{\mathcal{L}}\cap \mathcal{L}_{i}) = L \cap\hat{\mathcal{L}}$. This means the two leftmost sums of (43) range over disjoint sets of L and can be bounded by simply $\sum_{L \in\mathcal{H}} H(L)$. It furthermore implies that, for every L appearing in the rightmost sum of (43), the same L must appear in the negative sum. Putting these together we have an upper bound on the label costs

(44)

We can therefore revise bound (41) to

(45)

Inequality (45) is the main result of our theorem. □

Lemma 1

If label subset L appears in the summand of (38) for node j and child i, then L does not appear in the summands of (38) for any k∈subtree(i).

Proof

To be clear, let us restate the claim more formally. Let $\mathcal{H}^{j \otimes i}$ denote all subsets L appearing in the label cost summands of (38) when applied to node j and child i, i.e.

$$ \mathcal{H}^{j \otimes i} \stackrel{\mathrm{def}}{=} \bigl\{ L : L \cap \hat{\mathcal{L}}_j = \emptyset,\ L \cap\hat{\mathcal{L}}_i \neq \emptyset \bigr\}. $$

(46)

We must prove that $L \in\mathcal{H}^{j \otimes i} \Rightarrow L \notin \mathcal{H}^{k \otimes l}$ for any k∈subtree(i) and $l \in \mathcal{I}(k)$.

First note that for each $L \in\mathcal{H}^{j \otimes i}$ we have

(47)

(48)

By the hierarchical label cost assumption (Definition 4) we can use (47) and (48) to conclude that $L \in \mathcal{H}^{j \otimes i} \Rightarrow L \subset\mathcal{L}_{j}$.

Now consider the set $\mathcal{H}^{j \otimes i} \cap\mathcal{H}^{k \otimes l}$. By the definition (46) an element L of this joint set must satisfy at least the following conditions:

(49)

(50)

(51)

However, no subset L can satisfy all three conditions, as we now show. In the h-fusion algorithm, if $\hat{f}^{i}$ contains a label $\ell\in\mathcal{L}_{k}$, then $\hat{f}^{k}$ must contain ℓ as well—after all, there is no other way that a label in $\mathcal{L}_{k}$ could have propagated up to $\hat{f}^{i}$. This relation can be restated as

$$ \hat{\mathcal{L}}_i \cap\mathcal{L}_k \subseteq\hat{\mathcal{L}}_k \quad\forall k \in\mathrm{subtree}(i). $$

(52)

Starting from (49) we can say

which contradicts requirement (50). Thus $\mathcal{H}^{j \otimes i} \cap\mathcal{H}^{k \otimes l} = \emptyset$ for all k∈subtree(i) and so L cannot reappear. □

About this article

Cite this article

Delong, A., Gorelick, L., Veksler, O. et al. Minimizing Energies with Hierarchical Costs. Int J Comput Vis 100, 38–58 (2012). https://doi.org/10.1007/s11263-012-0531-x

Received: 15 October 2011
Accepted: 21 April 2012
Published: 09 May 2012
Issue Date: October 2012
DOI: https://doi.org/10.1007/s11263-012-0531-x
