Abstract
Euclidean optimization problems such as TSP and minimum-length matching admit fast partitioning algorithms that compute near-optimal solutions on typical instances.
In order to explain this performance, we develop a general framework for the application of smoothed analysis to partitioning algorithms for Euclidean optimization problems. Our framework can be used to analyze both the running-time and the approximation ratio of such algorithms. We apply our framework to obtain smoothed analyses of Dyer and Frieze’s partitioning algorithm for Euclidean matching, Karp’s partitioning scheme for the TSP, a heuristic for Steiner trees, and a heuristic for degree-bounded minimum-length spanning trees.
Introduction
Euclidean optimization problems are a natural class of combinatorial optimization problems. In a Euclidean optimization problem, we are given a set X of points in \(\mathbb{R}^{2}\). The underlying graph is the complete graph on all points, where the length of the edge connecting two points x,y∈X is their Euclidean distance ∥x−y∥.
Many such problems, like the Euclidean traveling salesman problem [22] or the Euclidean Steiner tree problem [14], are NP-hard. For others, like minimum-length perfect matching, there exist polynomial-time algorithms. However, these polynomial-time algorithms are sometimes too slow to solve large instances. Thus, fast heuristics that find near-optimal solutions for Euclidean optimization problems are needed.
A generic approach to designing heuristics for Euclidean optimization problems is given by partitioning algorithms: They divide the Euclidean plane into a number of cells such that each cell contains only a small number of points. This makes it possible to quickly compute an optimal solution for the optimization problem restricted to the points within each cell. Finally, the solutions of all cells are joined in order to obtain a solution for the whole point set.
Although this is a rather simple ad-hoc approach, it works surprisingly well and fast in practice [16, 24]. This is in stark contrast to the worst-case performance of partitioning algorithms: They can be both very slow and output solutions that are far from optimal. Thus, as is often the case, worst-case analysis is too pessimistic to explain the performance of partitioning algorithms. The reason for this is that worst-case analysis is dominated by artificially constructed instances that often do not resemble practical instances.
Both to explain the performance of partitioning algorithms and to gain probabilistic insights into the structure and value of optimal solutions of Euclidean optimization problems, the average-case performance of partitioning algorithms has been studied extensively. In particular, Steele [31] proved complete convergence of Karp’s partitioning algorithm [18] for Euclidean TSP. Strong central limit theorems are also known for a wide range of optimization problems. We refer to Steele [32] and Yukich [35] for comprehensive surveys.
However, average-case analysis also has its drawbacks: Random instances usually have very specific properties with overwhelming probability. This is often exploited in average-case analysis: One shows that the algorithm at hand performs very well if the input has some of these properties. But this does not mean that typical instances share these properties. Thus, although a good average-case performance can be an indicator that an algorithm performs well, it often fails to explain the performance convincingly.
In order to explain the performance of partitioning schemes for Euclidean optimization problems, we provide a smoothed analysis. Smoothed analysis was introduced by Spielman and Teng [27] in order to explain the performance of the simplex method for linear programming. It is a hybrid of worst-case and average-case analysis: An adversary specifies an instance, and this instance is then slightly randomly perturbed. The perturbation can, for instance, model noise from measurement. Since its invention in 2001, smoothed analysis has been applied in a variety of contexts [3, 4, 6, 12, 26]. We refer to two recent surveys [20, 28] for a broader picture.
We develop a general framework for smoothed analysis of partitioning algorithms for optimization problems in the Euclidean plane (Sect. 3). We consider a very general probabilistic model where the adversary specifies n density functions f _{1},…,f _{ n }:[0,1]^{2}→[0,ϕ], one for each point. Then the actual point set is obtained by drawing each x _{ i } independently of the others according to f _{ i }. The parameter ϕ controls the adversary’s power: The larger ϕ, the more powerful the adversary. (See Sect. 2.2 for a formal explanation of the model.) We analyze the expected running-time and approximation performance of a generic partitioning algorithm under this model. The smoothed analysis of the running-time for partitioning algorithms depends crucially on the convexity of the worst-case bound of the running-time of the problem under consideration. The main tool for the analysis of the expected approximation ratio is Rhee’s isoperimetric inequality [25]. Let us note that, even in the average case, convergence to the optimal value for large n does not imply a bound on the expected approximation ratio. The reason is that if we compute a very bad solution with very small probability, then this allows convergence results but it deteriorates the expected approximation ratio.
We apply the general framework to obtain smoothed analyses of partitioning algorithms for Euclidean matching (Sect. 4), Karp’s partitioning scheme for the TSP (Sect. 5), Steiner trees (Sect. 6), and degree-bounded minimum spanning trees (Sect. 7) in the Euclidean plane. Table 1 shows an overview. To summarize, for ϕ≤log^{O(1)} n, Dyer and Frieze’s partitioning algorithm for computing matchings [10] has an almost linear running-time, namely O(nlog^{O(1)} n). For ϕ∈o(log^{2} n), its expected approximation ratio tends to 1 as n increases. The approximation ratios of the partitioning algorithms for TSP and Steiner trees tend to 1 for ϕ∈o(logn). For degree-bounded spanning trees, this is the case for ϕ∈o(logn/loglogn). Our general framework is applicable to many other partitioning algorithms as well, but we focus on the aforementioned problems in this work.
Preliminaries
For \(n \in\mathbb{N}\), let [n]={1,2,…,n}. We denote probabilities by \(\mathbb{P}\) and expected values by \(\mathbb{E}\).
Euclidean Functionals
A Euclidean functional is a function \(\mathsf {F}:([0,1]^{2})^{\star}\to \mathbb{R}\) that maps a finite point set X⊆[0,1]^{2} to a real number \(\mathsf{F}(X)\). The following are examples of Euclidean functionals:

\(\mathsf{MM}\) maps a point set to the length of its minimum-length perfect matching (length means Euclidean distance; one point is left out if the cardinality of the point set is odd).

\(\mathsf{TSP}\) maps a point set to the length of its shortest Hamiltonian cycle, i.e., to the length of its optimal traveling salesman tour.

\(\mathsf{MST}\) maps a point set to the length of its minimum-length spanning tree.

\(\mathsf{ST}\) maps a point set to the length of its shortest Steiner tree.

\(\mathsf{dbMST}\) maps a point set to the length of its minimum-length spanning tree, restricted to trees of maximum degree at most b for some given bound b.
The Euclidean functionals that we consider in this paper are all associated with an underlying combinatorial optimization problem. Thus, the function value \(\mathsf{F}(X)\) is associated with an optimal solution (minimum-length perfect matching, optimal TSP \(\mathrm{tour},\ldots\)) to the underlying combinatorial optimization problem. In this sense, we can design approximation algorithms for \(\mathsf{F}\): Compute a (near-optimal) solution (where it depends on the functional what a solution actually is; for instance, a perfect matching), and compare its objective value (for instance, the sum of the lengths of its edges) to the function value.
We follow the notation of Frieze and Yukich [13, 35]. A Euclidean functional \(\mathsf{F}\) is called smooth [25, 35] if there is a constant c such that
\[ \bigl|\mathsf{F}(X \cup Y) - \mathsf{F}(X)\bigr| \leq c \cdot \sqrt{|Y|} \]
for all finite X,Y⊆[0,1]^{2}. The constant c may depend on the functional \(\mathsf{F}\), but not on the sets X and Y or their cardinalities.
Let C _{1},…,C _{ s } be a partition of [0,1]^{2} into rectangles. We call each C _{ ℓ } a cell. Note that the cells are not necessarily of the same size. For a finite set X⊆[0,1]^{2} of n points, let X _{ ℓ }=X∩C _{ ℓ } be the points of X in cell C _{ ℓ }. Let n _{ ℓ }=|X _{ ℓ }| be the number of points of X in cell C _{ ℓ }. Let \(\mathop{\mathrm{diameter}}(C_{\ell})\) be the diameter of cell C _{ ℓ }.
We call \(\mathsf{F}\) subadditive if
\[ \mathsf{F}(X) \leq \sum_{\ell=1}^{s} \mathsf{F}(X_{\ell}) + O\Biggl(\sum_{\ell=1}^{s} \mathop{\mathrm{diameter}}(C_{\ell})\Biggr) \]
for all finite X⊆[0,1]^{2} and all partitions of the unit square. \(\mathsf{F}\) is called superadditive if
\[ \mathsf{F}(X) \geq \sum_{\ell=1}^{s} \mathsf{F}(X_{\ell}) \]
for all finite X⊆[0,1]^{2} and all partitions of the unit square. A combination of subadditivity and superadditivity for a Euclidean functional \(\mathsf{F}\) is a sufficient (but not a necessary) condition for the existence of a partitioning heuristic for approximating \(\mathsf{F}\). We will present such a generic partitioning heuristic in Sect. 3.
Following Frieze and Yukich [13], we define a slightly weaker additivity condition that is sufficient for the performance analysis of partitioning algorithms. Frieze and Yukich [13] call a Euclidean functional \(\mathsf{F}\) near-additive if, for all partitions C _{1},…,C _{ s } of [0,1]^{2} into cells and for all finite X⊆[0,1]^{2}, we have
\[ \Biggl|\mathsf{F}(X) - \sum_{\ell=1}^{s} \mathsf{F}(X_{\ell})\Biggr| \leq O\Biggl(\sum_{\ell=1}^{s} \mathop{\mathrm{diameter}}(C_{\ell})\Biggr). \]
If \(\mathsf{F}\) is subadditive and superadditive, then \(\mathsf{F}\) is also near-additive.
Unfortunately, the Euclidean functionals \(\mathsf{TSP}\), \(\mathsf{MM}\), and \(\mathsf{MST}\) are smooth and subadditive, but not superadditive [31, 32, 35]. However, these functionals can be approximated by their corresponding canonical boundary functionals, which are superadditive [13, 35]. We obtain the canonical boundary functional of a Euclidean functional by treating the boundary of the domain as a single point [35]. This means that two points can either be connected directly or via a detour along the boundary. In the latter case, only the lengths of the two edges connecting the two points to the boundary count; walking along the boundary is free of charge. Yukich [35] has shown that approximability by a superadditive boundary functional is a sufficient condition for a Euclidean functional to be near-additive.
Proposition 2.1
(Yukich [35, Lemma 5.7])
Let \(\mathsf{F}\) be a subadditive Euclidean functional. Let \(\mathsf{F}_{\mathop{\mathrm{B}}}\) be a superadditive functional that well-approximates \(\mathsf{F}\). (This means that \(|\mathsf{F}(X)-\mathsf{F}_{\mathop{\mathrm{B}}}(X)| = O(1)\) for all finite X⊆[0,1]^{2}.) Then \(\mathsf{F}\) is near-additive.
The functionals \(\mathsf{MM}\), \(\mathsf{TSP}\), \(\mathsf{MST}\), \(\mathsf{ST}\), and \(\mathsf{dbMST}\) are near-additive.
Limit theorems are powerful tools for the analysis of Euclidean functionals. Rhee [25] proved the following limit theorem for smooth Euclidean functionals over [0,1]^{2}. We will mainly use it to bound the probability that \(\mathsf{F}\) assumes too small a function value.
Theorem 2.2
(Rhee [25])
Let X be a set of n points drawn independently according to identical distributions from [0,1]^{2}. Let \(\mathsf{F}\) be a smooth Euclidean functional. Then there exist constants c and c′ such that for all t>0, we have
\[ \mathbb{P}\bigl[\bigl|\mathsf{F}(X) - \mathbb{E}[\mathsf{F}(X)]\bigr| > t\bigr] \leq c \cdot \exp\biggl(-\frac{c' t^{4}}{n}\biggr). \]
Remark 2.3
Rhee proved Theorem 2.2 for the case that x _{1},…,x _{ n } are identically distributed. However, as pointed out by Rhee herself [25], the proof carries over to the case when x _{1},…,x _{ n } are drawn independently but their distributions are not necessarily identical.
Smoothed Analysis
In the classical model of smoothed analysis [27], an adversary specifies a point set \(\bar{X}\), and then this point set is perturbed by independent identically distributed random variables in order to obtain the input set X. A different viewpoint is that the adversary specifies the means of the probability distributions according to which the point set is drawn. This model has been generalized as follows [4]: Instead of only specifying the mean, the adversary can specify a density function for each point, and then we draw the points independently according to their density functions. In order to limit the power of the adversary, we have an upper bound ϕ for the densities: The adversary is allowed to specify any density function [0,1]^{2}→[0,ϕ]. If ϕ=1, then this boils down to the uniform distribution on the unit square [0,1]^{2}. If ϕ gets larger, the adversary becomes more powerful and can specify the location of the points more and more precisely. The role of ϕ is the same as the role of 1/σ in classical smoothed analysis, where σ is the standard deviation of the perturbation. We summarize this model formally in the following assumption.
Assumption 2.4
Let ϕ≥1. An adversary specifies n probability density functions f _{1},…,f _{ n }:[0,1]^{2}→[0,ϕ]. We write f=(f _{1},…,f _{ n }) for short. Let x _{1},…,x _{ n }∈[0,1]^{2} be n random vectors where x _{ i } is drawn according to f _{ i }, independently from the other points. Let X={x _{1},…,x _{ n }}.
If the actual density functions f matter and are not clear from the context, we write X∼f to denote that X is drawn as described above. If we have a performance measure P for an algorithm (P will be either runningtime or approximation ratio in this paper), then the smoothed performance is \(\max_{f} (\mathbb{E}_{X \sim f} [P(X)])\). Note that the smoothed performance is a function of the number n of points and the parameter ϕ.
Let \(\mathsf{F}\) be a Euclidean functional. For the rest of this paper, let \(\mu_{\mathsf{F}}(n,\phi)\) be a lower bound for the expected value of \(\mathsf{F}\) if X is drawn according to the probabilistic model described above. More precisely, \(\mu_{\mathsf{F}}\) is some function that fulfills \(\mu_{\mathsf{F}}(n, \phi) \leq\min_{f} (\mathbb{E}_{X \sim f} [\mathsf{F}(X)])\). The function \(\mu_{\mathsf{F}}\) comes into play when we have to bound the objective value of an optimal solution, i.e., \(\mathsf{F}(X)\), from below in order to analyze the approximation ratio.
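To make Assumption 2.4 concrete, the following sketch (our own illustration, not part of the formal model) implements one natural adversarial strategy: each density f _{ i } is the uniform density on an axis-aligned square of area 1/ϕ around an adversary-chosen center, which is the most concentrated density the model allows. The function name and the choice of square supports are assumptions of this sketch.

```python
import random

def sample_smoothed_points(n, phi, centers=None, rng=None):
    """Draw n points from [0,1]^2 under the phi-bounded-density model.

    Each point i is drawn uniformly from an axis-aligned square of area
    1/phi around an adversarial center (density exactly phi on the square,
    0 elsewhere), shifted so its support stays inside the unit square.
    """
    rng = rng or random.Random(0)
    side = (1.0 / phi) ** 0.5          # square of area 1/phi => density phi
    if centers is None:                # default adversary: all mass centered
        centers = [(0.5, 0.5)] * n
    points = []
    for (cx, cy) in centers:
        # shift the support square so it lies inside [0,1]^2
        lx = min(max(cx - side / 2, 0.0), 1.0 - side)
        ly = min(max(cy - side / 2, 0.0), 1.0 - side)
        points.append((lx + rng.random() * side, ly + rng.random() * side))
    return points
```

For ϕ=1 the support square is the whole unit square, recovering the uniform distribution; as ϕ grows, the adversary pins each point down more precisely.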
Framework
In this section, we present our framework for the performance analysis of partitioning heuristics for Euclidean functionals. Let \(\mathsf{A}_{\mathop{\mathrm{opt}}}\) be an optimal algorithm for some smooth and near-additive Euclidean functional \(\mathsf{F}\), and let \(\mathsf{A}_{\mathop{\mathrm{join}}}\) be an algorithm that combines the solutions for the cells into a global solution. We assume that \(\mathsf{A}_{\mathop{\mathrm{join}}}\) runs in time linear in the number of cells. Then we obtain the following algorithm, which we call \(\mathsf{A}\).
Algorithm 3.1
(Generic algorithm \(\mathsf{A}\))
Input: set X⊆[0,1]^{2} of n points.

1.
Divide [0,1]^{2} into s cells C _{1},…,C _{ s }.

2.
Compute optimal solutions for each cell using \(\mathsf{A}_{\mathop{\mathrm{opt}}}\).

3.
Join the s partial solutions to a solution for X using \(\mathsf{A}_{\mathop{\mathrm{join}}}\).
The cells in the first step of Algorithm 3.1 are rectangles. They are not necessarily of the same size (in this paper, only the algorithm for matching divides the unit square into cells of exactly the same size; the other algorithms choose the division into squares depending on the actual point set). We use the following assumptions in our analysis and mention explicitly whenever they are used.
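For concreteness, the three steps of Algorithm 3.1 can be sketched as follows for the special case of a fixed grid of s equal cells; `solve_cell` and `join` stand in for \(\mathsf{A}_{\mathop{\mathrm{opt}}}\) and \(\mathsf{A}_{\mathop{\mathrm{join}}}\) and must be supplied per problem (the function names are ours).

```python
import math

def generic_partition(points, s, solve_cell, join):
    """Generic partitioning scheme (in the spirit of Algorithm 3.1),
    sketched for a fixed grid of s = k*k equal cells. `solve_cell` plays
    the role of A_opt and `join` the role of A_join."""
    k = int(round(math.sqrt(s)))       # assumes s is (close to) a square
    cells = [[] for _ in range(k * k)]
    for (x, y) in points:
        # index of the grid cell containing (x, y); clamp for x == 1, y == 1
        i = min(int(x * k), k - 1)
        j = min(int(y * k), k - 1)
        cells[i * k + j].append((x, y))
    partial = [solve_cell(c) for c in cells]   # optimal solution per cell
    return join(partial)                       # stitch cell solutions together
```

With `solve_cell = len` and `join = sum`, for example, the scheme simply counts the points, which is a convenient way to check the bucketing logic.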
Assumption 3.2

1.
ϕ∈O(s). This basically implies that the adversary cannot concentrate all points in too few cells.

2.
ϕ∈ω(slogn/n). This provides a lower bound for the probability mass in a “full” cell, where full is defined in Sect. 3.1.

3.
\(\phi\in o(\sqrt{n/\log n})\). With this assumption, the tail bound of Theorem 2.2 becomes subpolynomial.
These assumptions are not too restrictive: For the partitioning algorithms we analyze here, we have s=O(n/log^{O(1)} n) (for matching, we could also use smaller s while maintaining polynomial, albeit worse, running-time; for the other problems, we even need s=O(n/log^{O(1)} n)). Ignoring polylogarithmic terms, the first and third assumption translate roughly to ϕ=O(n) and \(\phi= o(\sqrt{n})\), respectively. The second assumption roughly says ϕ=ω(1). But for ϕ=O(1), we can expect roughly average-case behavior because the adversary has only little influence on the positions of the points.
Smoothed Running-Time
Many of the schemes that we analyze choose the partition in such a way that we have a worst-case upper bound on the number of points in each cell. Other algorithms, like the one for matching [10], have a fixed partition independent of the input points. In the latter case, the running-time also depends on ϕ.
Let T(n) denote the worst-case running-time of \(\mathsf{A}_{\mathop{\mathrm{opt}}}\) on n points. Then the running-time of \(\mathsf{A}\) is bounded by \(\sum_{\ell=1}^{s} T(n_{\ell}) + O(s)\), where n _{ ℓ } is the number of points in cell C _{ ℓ }. The expected running-time of \(\mathsf{A}\) is thus bounded by
\[ \sum_{\ell=1}^{s} \mathbb{E}\bigl[T(n_{\ell})\bigr] + O(s). \tag{1} \]
For the following argument, we assume that T (the running-time of \(\mathsf{A}_{\mathop{\mathrm{opt}}}\)) is a monotonically increasing convex function and that the locations of the cells are fixed and all their volumes are equal. (The assumption about the cells is not fulfilled for all partitioning heuristics. For instance, Karp’s partitioning scheme [18] chooses the cells not in advance but based on the actual point set. However, in Karp’s scheme, the cells are chosen in such a way that there is a good worst-case upper bound for the number of points per cell, so there is no need for a smoothed analysis.) By slightly abusing notation, let \(f_{i}(C_{\ell}) = \int_{C_{\ell}} f_{i}(x)\,\mathrm{d}x\) be the cumulative density of f _{ i } in the cell C _{ ℓ }. Since f _{ i } is bounded from above by ϕ, we have f _{ i }(C _{ ℓ })≤ϕ/s (this requires that the cells are of equal size, thus their area is 1/s). Let \(f(C_{\ell}) = \sum_{i=1}^{n} f_{i}(C_{\ell})\). Note that \(f_{i}(C_{\ell}) = \mathbb{P}[x_{i} \in C_{\ell}]\) and \(f(C_{\ell}) = \mathbb{E}[n_{\ell}]\).
We call a cell C _{ ℓ } full with respect to f if f(C _{ ℓ })=nϕ/s. We call C _{ ℓ } empty if f(C _{ ℓ })=0. Our bound (1) on the running-time depends only on the values f _{1}(C _{ ℓ }),…,f _{ n }(C _{ ℓ }), but not on where exactly within the cells the probability mass is placed.
The goal of the adversary is to cause the partitioning algorithm to be slow. We will show that, in order to do this, the adversary will make as many cells as possible full. Note that there are at most ⌊s/ϕ⌋ full cells. Assume that we have ⌊s/ϕ⌋ full cells and at most one cell that is neither empty nor full. Then the number of points in any full cell is a binomially distributed random variable B with parameters n and ϕ/s. By linearity of expectation, the expected running-time is bounded by
\[ \biggl(\biggl\lfloor\frac{s}{\phi}\biggr\rfloor + 1\biggr) \cdot \mathbb{E}\bigl[T(B)\bigr] + O(s). \]
Since ϕ=O(s) by Assumption 3.2(1), this is bounded by \(O(\frac{s}{\phi}\cdot\mathbb{E}[T(B)] + s)\). If T is bounded by a polynomial, then this evaluates to \(O(\frac{s}{\phi}\cdot T(n\phi/s) + s)\) by the following Lemma 3.3. This lemma can be viewed as “Jensen’s inequality in the other direction” with p=ϕ/s for ϕ∈ω(slogn/n). The latter is satisfied by Assumption 3.2(2).
Lemma 3.3
(Inverse Jensen’s inequality)
Let T be any convex, monotonically increasing function that is bounded by a polynomial, and let B be a binomially distributed random variable with parameters \(n \in\mathbb{N}\) and p∈[0,1] with p∈ω(logn/n). Then \(\mathbb{E}[T(B)] = \varTheta(T(\mathbb{E}[B]))\).
Proof
We have \(\mathbb{E}[B] = np\). Jensen’s inequality yields \(\mathbb{E}[T(B)] \geq T(np)\). Thus, what remains to be proved is \(\mathbb{E}[T(B)] = O(T(np))\). Chernoff’s bound [21, Theorem 4.4] says
\[ \mathbb{P}[B \geq 2np] \leq (e/4)^{np}. \]
This allows us to bound
\[ \mathbb{E}\bigl[T(B)\bigr] \leq T(2np) + \mathbb{P}[B \geq 2np] \cdot T(n) \leq T(2np) + (e/4)^{np} \cdot T(n). \]
Since T is bounded by a polynomial, we have T(2np)=O(T(np)). Since p∈ω(logn/n) and T is bounded by a polynomial, we have (e/4)^{np}⋅T(n)∈o(1). Thus, \(\mathbb{E}[T(B)] = O(T(np))\), which proves the lemma. □
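Lemma 3.3 can be checked numerically by computing \(\mathbb{E}[T(B)]\) exactly from the binomial probability mass function. The convex test function T(x)=x³ and the parameter choices below are our own illustration; they satisfy the lemma’s hypotheses (convex, monotone, polynomially bounded, and p well above log n/n).

```python
from math import comb

def expected_T_binomial(n, p, T):
    """Exact E[T(B)] for B ~ Bin(n, p), summed over the binomial pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * T(k)
               for k in range(n + 1))

T = lambda x: x ** 3   # convex, monotone, bounded by a polynomial
n, p = 200, 0.2        # p is well above log(n)/n here
lower = T(n * p)       # Jensen: E[T(B)] >= T(E[B]) = T(np)
exact = expected_T_binomial(n, p, T)
# the lemma asserts E[T(B)] = Theta(T(np)); here the ratio is close to 1
assert lower <= exact <= 1.5 * lower
```

For these parameters the ratio \(\mathbb{E}[T(B)]/T(np)\) is only a few percent above 1, illustrating that the binomial concentrates sharply enough for the upper bound.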
What remains to be done is to show that the adversary will indeed make as many cells as possible full. This follows essentially from the convexity of the runningtime. In the following series of three lemmas, we make the argument rigorous.
The first lemma basically says that we maximize a convex function of a sum of independent 0/1 random variables if we balance the probabilities of the random variables. This is similar to a result by León and Perron [19]. But when we apply Lemma 3.4 in the proof of Lemma 3.5, we have to deal with the additional constraint p _{ i }∈[ε _{ i },1−ε _{ i }]. This makes León and Perron’s result [19] inapplicable.
Lemma 3.4
Let p∈(0,1). Let X _{1},X _{2} be independent 0/1 random variables with \(\mathbb{P}[X_{1} =1] = p\delta\) and \(\mathbb{P}[X_{2} =1] = p+\delta \). Let X=X _{1}+X _{2}. Let f be any convex function, and let \(g(\delta) = \mathbb{E}[f(X)]\).
Then g is monotonically decreasing in δ for δ>0 and monotonically increasing for δ<0 and has a global maximum at δ=0.
Proof
A short calculation shows that
\[ g(\delta) = f(0) \cdot \bigl((1-p)^{2}-\delta^{2}\bigr) + f(1) \cdot \bigl(1-(1-p)^{2}-p^{2}+2\delta^{2}\bigr) + f(2) \cdot \bigl(p^{2}-\delta^{2}\bigr). \]
Abbreviating all terms that do not involve δ by z yields
\[ g(\delta) = z + \delta^{2} \cdot \bigl(2f(1) - f(0) - f(2)\bigr). \]
The lemma follows now by the convexity of f. □
With Lemma 3.4 above, we can show the following lemma: If we maximize a convex function of n 0/1 random variables and this function is symmetric around n/2, then we should make all probabilities as small as possible (or all as large as possible) in order to maximize the function.
Lemma 3.5
Let f be an arbitrary convex function. Let X _{1},X _{2},…,X _{ n } be independent 0/1 random variables with \(\mathbb{P}[X_{i} = 1] = p_{i} \in[\varepsilon_{i}, 1\varepsilon_{i}]\), and let \(X = \sum _{i=1}^{n} X_{i}\). Let \(g(p_{1},\ldots, p_{n}) = \mathbb{E}[f(X) + f(nX)]\). Then g has a global maximum at (ε _{1},…,ε _{ n }).
Proof
In the following, let \(X' = \sum_{i=1}^{n1} X_{i}\). Without loss of generality, we can assume that \(\sum_{i=1}^{n} p_{i} \leq n/2\). Otherwise, we replace p _{ i } by 1−p _{ i }, which does not change the function value of g by symmetry.
First, we want to eliminate p _{ i } with p _{ i }>1/2. If there is a p _{ i }>1/2, then there must be a p _{ i′}<1/2 since \(\sum_{i=1}^{n} p_{i} \leq n/2\). Let i=n and i′=n−1 without loss of generality. Our goal is to shift “probability mass” from X _{ n } to X _{ n−1}. To do this, let q=(p _{ n−1}+p _{ n })/2. We consider two new functions \(\tilde{g}\) and h. The function \(\tilde{g}\) is defined by
where the expected value is taken only over X _{1},…,X _{ n−2}. The function h is defined by
By definition, we have \(h(\frac{p_{n} - p_{n-1}}{2}) = g(p_{1},\ldots, p_{n})\). The function h is convex, and we can apply Lemma 3.4: We should choose δ as small as possible in order to maximize it. We decrease δ from (p _{ n }−p _{ n−1})/2>0 until q−δ or q+δ becomes 1/2. Then we set p _{ n−1} and p _{ n } accordingly. In this way, we guarantee that p _{ n−1}∈[ε _{ n−1},1−ε _{ n−1}] and p _{ n }∈[ε _{ n },1−ε _{ n }]. We iterate this process until we have p _{ i }≤1/2 for all i∈[n]. This only increases g.
Now we can assume that p _{1},…,p _{ n }≤1/2. We finish the proof by showing that decreasing any p _{ i } as much as possible only increases g(p _{1},…,p _{ n }). Let Δ(x)=f(x+1)−f(x). Since f is convex, Δ is nondecreasing. By symmetry, it suffices to consider p _{ n }. We have
Only the term in the last line depends on p _{ n }. Since p _{ i }≤1/2 for all i∈[n−1], X′ is stochastically dominated by n−X′−1. Since Δ is nondecreasing, this yields
Hence, decreasing p _{ n } will never decrease the value of g. □
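As a quick numerical sanity check of Lemma 3.5 (our own, not part of the proof), we can compute g exactly from the Poisson-binomial distribution of X and verify that pushing the probabilities to their extremes only increases g. The convex test function and the concrete probabilities are arbitrary choices.

```python
def poisson_binomial_pmf(ps):
    """Exact pmf of X = sum of independent Bernoulli(p_i), by dynamic
    programming over the points one at a time."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)      # X_i = 0
            new[k + 1] += q * p        # X_i = 1
        pmf = new
    return pmf

def g(ps, f):
    """g(p_1, ..., p_n) = E[f(X) + f(n - X)] as in Lemma 3.5."""
    n = len(ps)
    return sum(q * (f(k) + f(n - k))
               for k, q in enumerate(poisson_binomial_pmf(ps)))

f = lambda x: x ** 2                   # an arbitrary convex test function
ps = [0.3, 0.45, 0.5, 0.2]
eps = 0.1                              # common lower bound epsilon_i
# pushing every p_i down to its extreme eps can only increase g
assert g([eps] * len(ps), f) >= g(ps, f) - 1e-12
# with all p_i <= 1/2, decreasing a single p_i also never decreases g
assert g([0.1, 0.45, 0.5, 0.2], f) >= g(ps, f) - 1e-12
```

The exact dynamic program keeps the check free of sampling noise, so the inequalities of the lemma can be tested directly.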
Lemma 3.5 above is the main ingredient for the proof that the adversary wants as many full cells as possible. Lemma 3.6 below makes this rigorous.
Lemma 3.6
Let C _{ ℓ′} and C _{ ℓ″} be any two cells. Let f _{1},…,f _{ n }:[0,1]^{2}→[0,ϕ] be any density functions. Let \(\tilde{f}_{1}, \ldots, \tilde{f}_{n}: [0,1]^{2} \to[0, \phi]\) be density functions with the following properties for all i∈[n]:

1.
\(\tilde{f}_{i}(C_{\ell'}) = \min (\phi/s, f_{i}(C_{\ell'}) +f_{i}(C_{\ell''}) )\).

2.
\(\tilde{f}_{i}(C_{\ell''}) =(f_{i}(C_{\ell'}) + f_{i}(C_{\ell''}) )  \tilde{f}_{i}(C_{\ell'})\).
(Note that there are densities \(\tilde{f}_{1}, \ldots, \tilde{f}_{n}\) with these properties: First, all \(\tilde{f}_{i}\) are nonnegative and, second, \(\int_{[0,1]^{2}} \tilde{f}_{i}(x) \,\mathrm{d}x = 1\). Furthermore, \(\tilde{f}_{1}, \ldots, \tilde{f}_{n}\) can be chosen such that they are bounded by ϕ since we have \(\tilde{f}_{i}(C_{\ell'}), \tilde{f}_{i}(C_{\ell''}) \leq\phi/s\) by construction.) Let n _{ ℓ } be the (random) number of points in X _{ ℓ } with respect to f=(f _{1},…,f _{ n }), and let \(\tilde{n}_{\ell}\) be the (random) number of points in X _{ ℓ } with respect to \(\tilde{f} = (\tilde{f}_{1}, \ldots, \tilde{f}_{n})\). Then
\[ \sum_{\ell=1}^{s} \mathbb{E}\bigl[T(\tilde{n}_{\ell})\bigr] \geq \sum_{\ell=1}^{s} \mathbb{E}\bigl[T(n_{\ell})\bigr]. \]
Proof
First, we note that \(\mathbb{E}[T(n_{\ell})] = \mathbb{E}[T(\tilde{n}_{\ell})]\) for ℓ≠ℓ′,ℓ″. Without loss of generality, let ℓ′=1 and ℓ″=2. Thus, we have to prove
\[ \mathbb{E}\bigl[T(\tilde{n}_{1})\bigr] + \mathbb{E}\bigl[T(\tilde{n}_{2})\bigr] \geq \mathbb{E}\bigl[T(n_{1})\bigr] + \mathbb{E}\bigl[T(n_{2})\bigr]. \]
Let M={i∣x _{ i }∈C _{1}∪C _{2}} be the (random) set of indices of points in the two cells. To prove this, we prove the inequality
\[ \mathbb{E}\bigl[T(\tilde{n}_{1}) + T(\tilde{n}_{2}) \mid M = I\bigr] \geq \mathbb{E}\bigl[T(n_{1}) + T(n_{2}) \mid M = I\bigr] \]
for any set I⊆[n]. This is equivalent to
\[ \mathbb{E}\bigl[T(\tilde{n}_{1}) + T\bigl(|I| - \tilde{n}_{1}\bigr) \mid M = I\bigr] \geq \mathbb{E}\bigl[T(n_{1}) + T\bigl(|I| - n_{1}\bigr) \mid M = I\bigr]. \]
Without loss of generality, we restrict ourselves to the case I=[n]. This gives us the following setting: Any point x _{ i } is either in C _{1} or in C _{2}. Under this condition, the probability that x _{ i } is in C _{1} is \(p_{i} = \frac{f_{i}(C_{1})}{f_{i}(C_{1} \cup C_{2})}\), and the probability that x _{ i } is in C _{2} is \(1p_{i} = \frac{f_{i}(C_{2})}{f_{i}(C_{1} \cup C_{2})}\). We can choose p _{ i } arbitrarily such that \(p_{i} \leq\min\{1, \frac{\phi/s}{f_{i}(C_{1})+f_{i}(C_{2})}\} =1\varepsilon _{i}\) and \(p_{i} \geq\max\{0, 1\frac{\phi/s}{f_{i}(C_{1})+f_{i}(C_{2})}\}= \varepsilon_{i}\). This is precisely the setting that we need to apply Lemma 3.5. □
Let f _{1},…,f _{ n }:[0,1]^{2}→[0,ϕ] be the given distributions. By applying Lemma 3.6 repeatedly to pairs of non-full, non-empty cells C _{ ℓ′} and C _{ ℓ″}, we obtain distributions \(\tilde{f}_{1},\ldots, \tilde{f}_{n}\) with the following properties:

1.
\(\tilde{f}_{1},\ldots, \tilde{f}_{n}\) have ⌊s/ϕ⌋ full cells and at most one cell that is neither full nor empty.

2.
The expected value of T on X sampled according to \(\tilde{f}_{1},\ldots\tilde{f}_{n}\) is not smaller than the expected value of T on X sampled according to f _{1},…,f _{ n }.
This shows that the adversary, in order to slow down our algorithm, will concentrate the probability in as few cells as possible. Thus, we obtain the following theorem.
Theorem 3.7
Assume that the running-time of \(\mathsf{A}_{\mathop{\mathrm{opt}}}\) can be bounded from above by a convex function T that is bounded by a polynomial. Then, under Assumptions 2.4, 3.2(1), and 3.2(2), the expected running-time of \(\mathsf{A}\) on input X is bounded from above by
\[ O\biggl(\frac{s}{\phi} \cdot T\biggl(\frac{n\phi}{s}\biggr) + s\biggr). \]
Proof
The expected running-time is maximized if we have ⌊s/ϕ⌋ cells that are full plus possibly one cell containing all the remaining probability mass. The expected running-time for each such cell is O(T(nϕ/s)) by Lemma 3.3 and because of Assumption 3.2(2). Thus, the expected running-time of \(\mathsf{A}\) is bounded from above by
\[ \biggl(\biggl\lfloor\frac{s}{\phi}\biggr\rfloor + 1\biggr) \cdot O\biggl(T\biggl(\frac{n\phi}{s}\biggr)\biggr) + O(s). \]
The theorem follows as ϕ=O(s) by Assumption 3.2(1). □
Smoothed Approximation Ratio
The value computed by \(\mathsf{A}\) can be bounded from above by
\[ \mathsf{A}(X) \leq \sum_{\ell=1}^{s} \mathsf{F}(X_{\ell}) + J', \]
where J′ is an upper bound for the cost incurred by joining the solutions for the cells. Since \(\mathsf{F}\) is a near-additive Euclidean functional, we have \(\mathsf{A}(X) \leq\mathsf{F}(X) + J\) for \(J = J' + O(\sum_{\ell=1}^{s} \mathop{\mathrm{diameter}}(C_{\ell}))\). Dividing by \(\mathsf{F}(X)\) yields
\[ \frac{\mathsf{A}(X)}{\mathsf{F}(X)} \leq 1 + \frac{J}{\mathsf{F}(X)}. \]
Together with \(\mathbb{E}[\mathsf{F}(X)] \geq\mu_{\mathsf{F}}(n,\phi)\), we obtain a generic upper bound of
\[ 1 + \frac{J}{\mu_{\mathsf{F}}(n, \phi)} \tag{2} \]
for the ratio of expected output of \(\mathsf{A}\) and expected function value of \(\mathsf{F}\). While this provides some guarantee on the approximation performance, it does not provide a bound on the expected approximation ratio, which is in fact our goal.
For estimating the expected approximation ratio \(\mathbb{E}[\mathsf{A}(X)/\mathsf{F}(X)]\) for some algorithm \(\mathsf{A}\), the main challenge is that \(\mathsf{F}(X)\) stands in the denominator. Thus, even if we have a good (deterministic) upper bound for \(\mathsf{A}(X)\) that we can plug into the expected ratio in order to get an upper bound for the ratio that only depends on \(\mathsf{F}(X)\), we are basically left with the problem of estimating \(\mathbb{E}[1/\mathsf{F}(X)]\). Jensen’s inequality yields \(\mathbb{E}[1/\mathsf{F}(X)] \geq1/\mathbb{E}[\mathsf{F}(X)]\). But this does not help, as we need upper bounds for \(\mathbb{E}[1/\mathsf{F}(X)]\). Unfortunately, such upper bounds cannot be derived easily from \(1/\mathbb{E}[\mathsf{F}(X)]\). The problem is that we need strong upper bounds for the probability that \(\mathsf{F}(X)\) is close to 0. Theorem 2.2 is too weak for this. This problem of bounding the expected value of the inverse of the optimal objective value arises frequently in bounding expected approximation ratios [11, 12].
There are two ways to attack this problem: The first and easier way applies if \(\mathsf{A}\) comes with a worst-case guarantee α(n) on its approximation ratio for instances of n points. Then we can apply Theorem 2.2 to bound \(\mathsf{F}(X)\) from below. If \(\mathsf{F}(X) \geq\mu _{\mathsf{F}}(n,\phi)/2\), then we can use (2) to obtain a ratio of \(1+O(\frac{J}{\mu_{\mathsf{F}}(n, \phi)})\). Otherwise, we obtain a ratio of α(n). If α(n) is not too large compared to the tail bound obtained from Theorem 2.2, then this contributes only little to the expected approximation ratio. The following theorem formalizes this.
Theorem 3.8
Assume that \(\mathsf{A}\) has a worst-case approximation ratio of α(n) for any instance consisting of n points. Then, under Assumption 2.4, the expected approximation ratio of \(\mathsf{A}\) is
\[ 1 + O\biggl(\frac{J}{\mu_{\mathsf{F}}(n, \phi)} + \alpha(n) \cdot \exp\biggl(-\frac{c \cdot \mu_{\mathsf{F}}(n, \phi)^{4}}{n}\biggr)\biggr) \]
for some positive constant c>0.
Proof
We have
\[ \mathbb{E}\biggl[\frac{\mathsf{A}(X)}{\mathsf{F}(X)}\biggr] \leq 1 + \frac{2J}{\mu_{\mathsf{F}}(n, \phi)} + \alpha(n) \cdot \mathbb{P}\biggl[\mathsf{F}(X) < \frac{\mu_{\mathsf{F}}(n, \phi)}{2}\biggr]. \tag{3} \]
By Theorem 2.2 and Remark 2.3, we have
\[ \mathbb{P}\biggl[\mathsf{F}(X) < \frac{\mu_{\mathsf{F}}(n, \phi)}{2}\biggr] \leq c' \cdot \exp\biggl(-\frac{c \cdot \mu_{\mathsf{F}}(n, \phi)^{4}}{n}\biggr) \]
for some constants c,c′>0. Together with (3), this allows us to bound the expected approximation ratio as
\[ 1 + \frac{2J}{\mu_{\mathsf{F}}(n, \phi)} + c' \cdot \alpha(n) \cdot \exp\biggl(-\frac{c \cdot \mu_{\mathsf{F}}(n, \phi)^{4}}{n}\biggr) = 1 + O\biggl(\frac{J}{\mu_{\mathsf{F}}(n, \phi)} + \alpha(n) \cdot \exp\biggl(-\frac{c \cdot \mu_{\mathsf{F}}(n, \phi)^{4}}{n}\biggr)\biggr), \]
which completes the proof. □
Now we turn to the case that the worst-case approximation ratio of \(\mathsf{A}\) cannot be bounded by some α(n). In order to be able to bound the expected approximation ratio, we need an upper bound on \(\mathbb{E}[1/\mathsf{F}(X)]\). Note that we do not explicitly provide an upper bound for \(\mathbb{E}[1/\mathsf {F}(X)]\), but only a sufficiently strong tail bound h _{ n } for \(1/\mathsf{F}(X)\).
Theorem 3.9
Assume that there exist β≤J and a function h _{ n } such that \(\mathbb{P}[\mathsf{F}(X) \leq x] \leq h_{n}(x)\) for all x∈[0,β]. Then, under Assumption 2.4, the expected approximation ratio of \(\mathsf{A}\) is
Proof
If \(\mathsf{F}(X) \geq\mu_{\mathsf{F}}(n, \phi)/2\), then the approximation ratio is
which is good. By Theorem 2.2, the probability that this does not hold is bounded from above by \(\exp (\frac{\mu_{\mathsf{F}}(n, \phi)^{4}}{Cn})\) for some constant C>0. If we still have \(\mathsf{F}(X) \geq\beta\), then we can bound the ratio from above by
This contributes
to the expected value, where the inequality follows from β≤J. We are left with the case that \(\mathsf{F}(X) \leq\beta\). This case contributes
to the expected value. By definition, we have
which completes the proof. □
Matching
As a first example, we apply our framework to the matching functional \(\mathsf{MM}\) defined by the Euclidean minimumlength perfect matching problem. A partitioning algorithm for approximating \(\mathsf{MM}\) was proposed by Dyer and Frieze [10]. For completeness, let us describe their algorithm.
Algorithm 4.1
(\(\mathsf{DF}\); Dyer, Frieze [10])
Input: set X⊆[0,1]^{2} of n points, n is even.

1.
Partition [0,1]^{2} into s=k ^{2} equal-sized subsquares \(C_{1},\ldots,C_{k^{2}}\), each of side length 1/k, where \(k=\frac{\sqrt{n}}{\log n}\).

2.
Compute minimumlength perfect matchings for X _{ ℓ } for each ℓ∈[k ^{2}].

3.
Compute a matching for the unmatched points from the previous step using the strip heuristic [33].
Let \(\mathsf{DF}(X)\) be the cost of the matching computed by the algorithm above on input X={x _{1},…,x _{ n }}, and let \(\mathsf{MM}(X)\) be the cost of a perfect matching of minimum total length. Dyer and Frieze showed that \(\mathsf{DF}(X)\) converges to \(\mathsf{MM}(X)\) with probability 1 if the points in X are drawn according to the uniform distribution on [0,1]^{2} (this corresponds to Assumption 2.4 with ϕ=1). We extend this to the case when X is drawn as described in Assumption 2.4.
Smoothed RunningTime
A minimum-length perfect matching can be found in time O(n^{3}) [1]. By Theorem 3.7, we get the following corollary.
Corollary 4.2
Under Assumptions 2.4, 3.2(1), and 3.2(2), the expected running-time of \(\mathsf{DF}\) on input X is at most
If we plug in \(k = \sqrt{n}/\log n\), we obtain an expected running-time of at most
Smoothed Approximation Ratio
To estimate the approximation performance, we have to specify the function \(\mu_{\mathsf{MM}}(n,\phi)\). To obtain a lower bound for \(\mu_{\mathsf{MM}}(n, \phi)\), let \(\mathsf{NN}(X)\) denote the total edge length of the nearest-neighbor graph for the point set X⊆[0,1]^{2}. This means that \(\mathsf{NN}(X) = \sum_{x \in X} \min_{y \in X \setminus \{x\}} \|x - y\|\).
We use \(\mathsf{NN}\) to bound \(\mathsf{MM}\) from below: First, we have \(\mathsf{MM}(X) \geq\mathsf{NN}(X)/2\). Second, \(\mathbb {E} [\mathsf{NN}(X) ]\) is easier to analyze than \(\mathbb{E} [\mathsf{MM}(X) ]\). Thus, according to the following lemma, we can choose \(\mu_{\mathsf {MM}}(n,\phi ) = \varOmega (\sqrt{n/\phi} )\).
Lemma 4.3
Under Assumption 2.4, we have
Proof
By linearity of expectation, we have \(\mathbb{E} [\mathsf{NN}(X) ] = n \cdot \mathbb{E} [\min_{i \geq 2} \|x_{1} - x_{i}\| ]\). Thus, we have to prove \(\mathbb{E} [\min_{i \geq 2} \|x_{1} - x_{i}\| ] = \varOmega (1/\sqrt{n\phi} )\). To bound this quantity from below, we assume that x_{1} is fixed by an adversary and that only x_{2},…,x_{n} are drawn independently according to their density functions. Then we obtain
The probability that ∥x _{1}−x _{ i }∥≤r can be bounded from above by ϕ times the area of a circle of radius r, which is ϕπr ^{2}. Thus,
The second inequality holds because \(1 - \phi\pi r^{2} \geq 1 - \frac{1}{n}\) for \(r \in [0, 1/\sqrt{\phi\pi n}]\). The third inequality exploits \((1-\frac{1}{n})^{n-1} \geq 1/e\). □
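The chain of inequalities indicated in the proof can be reconstructed as follows; this is a reconstruction from the surrounding text (using \(\mathbb{E}[Z] = \int_0^\infty \mathbb{P}[Z \geq r]\,\mathrm{d}r\) for the nonnegative random variable \(Z = \min_{i \geq 2} \|x_1 - x_i\|\)), not the original display:

```latex
\begin{align*}
\mathbb{E}\Bigl[\min_{i \geq 2} \|x_1 - x_i\|\Bigr]
  &= \int_0^\infty \mathbb{P}\Bigl[\min_{i \geq 2} \|x_1 - x_i\| \geq r\Bigr]\,\mathrm{d}r
   \geq \int_0^{1/\sqrt{\phi \pi n}} \bigl(1 - \phi \pi r^2\bigr)^{n-1}\,\mathrm{d}r \\
  &\geq \int_0^{1/\sqrt{\phi \pi n}} \Bigl(1 - \frac{1}{n}\Bigr)^{n-1}\,\mathrm{d}r
   \geq \frac{1}{e \sqrt{\phi \pi n}}
   = \varOmega\biggl(\frac{1}{\sqrt{n \phi}}\biggr).
\end{align*}
```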
Since \(\mathsf{MM}\) is near-additive and the diameter of each cell is O(1/k), we can use
Unfortunately, we cannot bound the worst-case approximation ratio of Dyer and Frieze’s partitioning algorithm. Thus, we cannot apply Theorem 3.8, but have to use Theorem 3.9. To do so, we first need a tail bound for \(1/\mathsf{MM}(X)\). The bound in the following lemma suffices for our purposes.
Lemma 4.4
Under Assumption 2.4, we have
for all \(c \leq\frac{1}{2\pi}\).
Proof
Let us first analyze the probability that a specific fixed matching M has a length of at most c. We let an adversary fix one endpoint of each edge. Then the probability that a specific edge of M has a length of at most c is bounded from above by ϕπc^{2}. Thus, the density of the length of a particular edge is bounded from above by 2ϕπc ≤ ϕ as \(c \leq \frac{1}{2\pi}\). Furthermore, the lengths of the edges of M are independent random variables. Thus, the probability that the sum of the edge lengths of all n/2 edges of M is bounded from above by c is at most \(\frac{(\phi\pi c)^{n/2}}{(n/2)!}\), which can be proved by the following induction: Let m=n/2, and let a_{1},…,a_{m} be the (random) edge lengths of the edges of M. For m=1, the statement follows from \(\mathbb{P}[a_{1} \leq c] \leq \phi c\). For larger m, assume that the claim holds for m−1, and let h be the density of a_{m}. This density is bounded by ϕ as argued above. Thus,
The number of perfect matchings of a complete graph on n vertices is (n−1)!!=(n−1)⋅(n−3)⋅(n−5)⋯ (“!!” denotes the double factorial). A union bound over all matchings yields
which completes the proof. □
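The final union bound can be made explicit as follows; this reconstruction combines the (n−1)!! matchings with the per-matching bound \((\phi\pi c)^{n/2}/(n/2)!\), and uses \((n-1)!! = \frac{n!}{2^{n/2} (n/2)!}\) and \(\binom{n}{n/2} \leq 2^n\) for even n:

```latex
\begin{align*}
\mathbb{P}\bigl[\mathsf{MM}(X) \leq c\bigr]
  &\leq (n-1)!! \cdot \frac{(\phi \pi c)^{n/2}}{(n/2)!}
   = \frac{n!}{2^{n/2}\,\bigl((n/2)!\bigr)^2} \cdot (\phi \pi c)^{n/2} \\
  &= \binom{n}{n/2} \Bigl(\frac{\phi \pi c}{2}\Bigr)^{n/2}
   \leq 2^n \Bigl(\frac{\phi \pi c}{2}\Bigr)^{n/2}
   = (2 \phi \pi c)^{n/2}.
\end{align*}
```

This is consistent with the choice \(h_{n}(x) = (2\phi\pi x)^{n/2}\) made in the proof of Corollary 4.5.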
With this tail bound for \(1/\mathsf{MM}(X)\), we can prove the following bound on the smoothed approximation ratio.
Corollary 4.5
Under Assumptions 2.4 and 3.2(3), the expected approximation ratio of \(\mathsf{DF}\) is \(1 + O(\frac{\sqrt{\phi}}{\log n})\).
Proof
We apply Theorem 3.9. To do this, let \(\beta= \frac{1}{2\pi\phi}\) (this is exactly the value at which Lemma 4.4 becomes nontrivial). Lemma 4.4 allows us to choose h _{ n }(x)=(2ϕπx)^{n/2} and yields
Assumption 3.2(3) together with (4) yields
We can choose \(\mu_{\mathsf{MM}}(n, \phi) = \varOmega(\sqrt{n/\phi})\) as \(\mathsf{MM}(X) \geq\mathsf{NN}(X)/2 = \varOmega(\sqrt{n/\phi})\) by Lemma 4.3. Theorem 2.2 together with Assumption 3.2(3) thus yields that the probability that \(\mathsf{MM}(X) < \mu_{\mathsf {MM}}(n,\phi)/2\) is bounded from above by
This bound decreases faster than any polynomial in n. Thus, also by Assumption 3.2(3),
decreases faster than any polynomial in n.
Altogether, Theorem 3.9 yields a bound of
for the expected approximation ratio. □
Remark 4.6

1.
There exist other partitioning schemes for Euclidean matching [2], which can be analyzed in a similar way.

2.
Instead of a standard cubic-time algorithm, we can use Varadarajan’s matching algorithm [34] for computing the optimal matchings within each cell. This algorithm has a running-time of O(m^{1.5} log^{5} m) for m points, which improves the running-time bound to \(O(n\sqrt{\phi}\log(n)\log^{5}(\phi\log n))\).
Karp’s Partitioning Scheme for Euclidean TSP
Karp’s partitioning scheme [18] is a heuristic for Euclidean TSP that computes nearoptimal solutions on average. It proceeds as follows:
Algorithm 5.1
(\(\mathsf{KP}\), Karp’s partitioning scheme)
Input: set X⊆[0,1]^{2} of n points.

1.
Partition [0,1]^{2} into \(k = \sqrt{n/\log n}\) stripes such that each stripe contains exactly \(n/k = \sqrt{n \log n}\) points.

2.
Partition each stripe into k cells such that each cell contains exactly n/k ^{2}=logn points.

3.
Compute optimal TSP tours for each cell.

4.
Join the tours to obtain a TSP tour for X.
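The four steps can be sketched as follows; this is a toy version under two simplifying assumptions: cells are solved by exhaustive search (adequate only for the Θ(log n) points per cell), and the cell tours are simply concatenated in serpentine order instead of being patched together as in [18].

```python
import math
from itertools import permutations

def cell_tour(pts):
    """Optimal visiting order for a tiny cell, by exhaustive O(m!) search."""
    if len(pts) <= 2:
        return list(pts)
    best_len, best = float("inf"), None
    for perm in permutations(pts[1:]):       # fix pts[0] to break rotational symmetry
        tour = [pts[0]] + list(perm)
        length = sum(math.dist(tour[i], tour[(i + 1) % len(tour)])
                     for i in range(len(tour)))
        if length < best_len:
            best_len, best = length, tour
    return best

def karp_partition(points, k):
    """Toy KP: k stripes by x, k cells per stripe by y, cell tours concatenated."""
    n = len(points)
    pts = sorted(points)                     # step 1: stripes by x-coordinate
    m = -(-n // k)                           # points per stripe (rounded up)
    order = []
    for j in range(k):
        stripe = sorted(pts[j * m:(j + 1) * m], key=lambda p: p[1])
        if j % 2 == 1:                       # serpentine order keeps the joins short
            stripe.reverse()
        c = -(-len(stripe) // k)             # step 2: cells within the stripe
        for i in range(k):                   # steps 3-4: solve cells, concatenate
            cell = stripe[i * c:(i + 1) * c]
            if cell:
                order += cell_tour(cell)
    length = sum(math.dist(order[i], order[(i + 1) % len(order)])
                 for i in range(len(order)))
    return length, order
```

The serpentine traversal of stripes is one natural way to keep the joining edges short; Karp's analysis of the joining cost carries over to any traversal that visits adjacent cells consecutively.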
We remark that the choice of k in Karp’s partitioning scheme is optimal in the following sense: On the one hand, more than Θ(log n) points per cell would yield a super-polynomial running-time, as the running-time is exponential in the number of points per cell. On the other hand, fewer than Θ(log n) points per cell would yield a worse approximation ratio, as the approximation ratio deteriorates with increasing k.
For a point set X⊆[0,1]^{2}, let \(\mathsf{KP}(X)\) denote the cost of the tour through X computed by Karp’s scheme. Steele [31] has proved complete convergence of \(\mathsf{KP}(X)\) to \(\mathsf{TSP}(X)\) if the points are chosen uniformly and independently. Using the framework developed in Sect. 3, we extend the analysis of \(\mathsf{KP}\) to the case of non-uniform and non-identical distributions.
Since Karp’s scheme chooses the cells adaptively based on the point set X, our framework for the analysis of the running-time cannot be applied. However, the total running-time of the algorithm is \(T(n)=2^{n/k^{2}} \mathop{\mathrm{poly}}(n/k^{2})+ O(k^{2})\), which is, independent of the randomness, polynomial in n for k^{2}=n/\log n.
The nearestneighbor functional \(\mathsf{NN}\) is a lower bound for \(\mathsf{TSP}\). Thus, we can use Lemma 4.3 to obtain \(\mu_{\mathsf{TSP}}(n, \phi) = \varOmega(\sqrt{n/\phi})\). We can use the bound [18, 30]
to obtain \(J = O(\sqrt{n/\log n})\).
The nice thing about the TSP is that every tour has a worstcase approximation guarantee: Consider any two points x,y∈X. Since any tour must visit both x and y, its length is at least 2∥x−y∥ by the triangle inequality. Since a tour consists of n edges, any tour has a length of at most \(\frac{n}{2} \cdot\mathsf{TSP}(X)\). Thus, we can use Theorem 3.8 together with α(n)=n/2 and obtain the following result.
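In symbols, the argument reads as follows, where \(\ell(T)\) denotes the length of an arbitrary tour T on X:

```latex
\mathsf{TSP}(X) \geq 2 \max_{x, y \in X} \|x - y\|
\qquad \Longrightarrow \qquad
\ell(T) \leq n \cdot \max_{x, y \in X} \|x - y\| \leq \frac{n}{2} \cdot \mathsf{TSP}(X).
```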
Corollary 5.2
Under Assumptions 2.4 and 3.2(3), the expected approximation ratio of \(\mathsf{KP}\) is \(\mathbb{E}[ \frac{\mathsf{KP}(X)}{\mathsf{TSP}(X)}]\leq 1 +O(\sqrt{\phi/\log n})\).
Proof
We plug \(J = O(\sqrt{n/\log n})\), \(\mu_{\mathsf{TSP}}(n, \phi)= \varTheta(\sqrt{n/\phi})\), and α(n)=n/2 into the bound of Theorem 3.8 and obtain an upper bound of
for the expected approximation ratio. By Assumption 3.2(3), the exponential term decreases faster than any polynomial. Thus, \(O(\sqrt{\phi/\log n})\) is an upper bound for the last term. □
Euclidean Steiner Trees
Kalpakis and Sherman [17] proposed a partitioning algorithm for the Euclidean minimum Steiner tree problem analogous to Karp’s partitioning scheme for Euclidean TSP. The solution produced by their algorithm converges to the optimal value with probability 1−o(1). Their algorithm is also known to produce near-optimal solutions in practice [24]. Let us now describe Kalpakis and Sherman’s algorithm [17].
Algorithm 6.1
(\(\mathsf{KS}\), Kalpakis, Sherman [17])
Input: set X⊆[0,1]^{2} of n points.

1.
Let s=n/logn. Partition [0,1]^{2} into Θ(s) cells such that each cell contains at most n/s=logn points.

2.
Solve the Steiner tree problem optimally within each cell.

3.
Compute a minimum-length spanning tree to connect the forest thus obtained.
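A sketch of the scheme, with two deviations from Algorithm 6.1 flagged explicitly: the cells form a fixed grid rather than the adaptive partition of [17], and the exact per-cell Steiner computation is replaced by a per-cell minimum spanning tree as a placeholder (an exact computation would use, e.g., the Dreyfus–Wagner algorithm [8]).

```python
import math

def mst(pts):
    """Prim's algorithm in O(m^2); returns (length, list of index pairs)."""
    if len(pts) < 2:
        return 0.0, []
    total, edges = 0.0, []
    dist = {i: (math.dist(pts[0], pts[i]), 0) for i in range(1, len(pts))}
    while dist:
        i = min(dist, key=lambda j: dist[j][0])
        d, par = dist.pop(i)
        total += d
        edges.append((par, i))
        for j in dist:                        # relax towards the new tree vertex
            dj = math.dist(pts[i], pts[j])
            if dj < dist[j][0]:
                dist[j] = (dj, i)
    return total, edges

def ks_sketch(points, k):
    """Toy KS (distinct points assumed): per-cell trees, then connect the forest."""
    cells = {}
    for p in points:                          # step 1: fixed grid of k*k cells
        key = (min(int(p[0] * k), k - 1), min(int(p[1] * k), k - 1))
        cells.setdefault(key, []).append(p)
    total = 0.0
    parent = {p: p for p in points}           # union-find over the points
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]     # path halving
            p = parent[p]
        return p
    for pts in cells.values():                # step 2 (stand-in): tree per cell
        length, edges = mst(pts)
        total += length
        for i, j in edges:
            parent[find(pts[i])] = find(pts[j])
    # step 3: Kruskal-style pass that connects the per-cell trees
    for d, p, q in sorted((math.dist(p, q), p, q)
                          for i, p in enumerate(points) for q in points[i + 1:]):
        if find(p) != find(q):
            parent[find(p)] = find(q)
            total += d
    return total
```

Since a minimum spanning tree is within a constant factor of the optimal Steiner tree, this placeholder preserves the qualitative behavior of the scheme, though not the exact constants of [17].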
The running-time of this algorithm is polynomial for the choice of s=n/\log n [8]. For the same reason as for Karp’s partitioning scheme, we cannot use our framework to estimate the running-time, because the choice of cells depends on the actual point set.
Let \(\mathsf{KS}(X)\) denote the cost of the Steiner tree computed by Kalpakis and Sherman’s algorithm [17]. For the analysis of the approximation performance, let \(\mathsf{ST}(X)\) denote the cost of a minimum Steiner tree for the point set X, and let \(\mathsf{MST}(X)\) denote the cost of a minimum-length spanning tree of X. Kalpakis and Sherman [17] have shown that
Thus, \(J = O (\sqrt{n/\log n})\).
Since minimum spanning trees are \(2/\sqrt{3}\)-approximations for Euclidean Steiner trees [9], we have \(\mathsf{ST}(X) \geq \frac{\sqrt{3}}{2}\cdot\mathsf{MST}(X)\). Furthermore, we have \(\mathsf{MST}(X) \geq \frac{1}{2} \cdot \mathsf{NN}(X)\). Thus, we can choose \(\mu_{\mathsf{ST}}(n,\phi) = \varTheta(\sqrt{n/\phi})\) by Lemma 4.3.
Like \(\mathsf{KP}\) for the traveling salesman problem, \(\mathsf{KS}\) comes with a worst-case approximation ratio of α(n)=O(n). The reason is that, for any two points x,y∈X, we have \(\|x-y\| \leq \mathsf{ST}(X)\). Since Kalpakis and Sherman’s partitioning algorithm [17] outputs at most a linear number of edges, we have \(\mathsf{KS}(X) \leq O (n \cdot \mathsf{ST}(X) )\). This gives us a worst-case approximation ratio of O(n) and yields the following corollary of Theorem 3.8.
Corollary 6.2
Under Assumptions 2.4 and 3.2(3), the expected approximation ratio of \(\mathsf{KS}\) is
Proof
The proof is almost identical to the proof of Corollary 5.2. □
DegreeBounded Minimum Spanning Tree
A b-degree-bounded minimum spanning tree of a given set of points in [0,1]^{2} is a spanning tree in which the degree of every point is bounded by b. For 2≤b≤4, this problem is NP-hard, and it is solvable in polynomial time for b≥5 [23]. Let \(\mathsf{dbMST}\) denote the Euclidean functional that maps a point set to the length of its shortest b-degree-bounded spanning tree.
Proposition 7.1
\(\mathsf{dbMST}\) is a smooth, subadditive, and near-additive Euclidean functional.
Proof
The smoothness and subadditivity properties have been proved by Srivastav and Werth [29]. They have also defined a canonical superadditive boundary functional that well-approximates \(\mathsf{dbMST}\) [29, Lemmas 3 and 4]. This, together with Proposition 2.1, proves that \(\mathsf{dbMST}\) is near-additive. □
Naturally, near-additivity implies that Karp’s partitioning scheme can be extended to the b-degree-bounded minimum spanning tree problem. Let \(\mathsf{PbMST}\) be the adaptation of Karp’s partitioning algorithm to \(\mathsf{dbMST}\) with parameter \(k^{2}=\frac{n \log\log n}{\log n}\). With this choice of k, \(\mathsf{PbMST}\) runs in polynomial time, as a degree-bounded minimum-length spanning tree on m nodes can be found in time 2^{O(m log m)} using brute-force search. Then, for any X, we have
which yields \(J = O(\sqrt{n \log\log n/\log n})\).
Again, we have \(\|x-y\| \leq \mathsf{dbMST}(X)\) for all X and x,y∈X, which implies that any possible tree is at most a factor n worse than the optimal tree. In particular, the worst-case approximation ratio of \(\mathsf{PbMST}\) is O(n): \(\mathsf{PbMST}(X) = O(n \cdot \mathsf{dbMST}(X))\). Furthermore, we can use \(\mu_{\mathsf{dbMST}}(n, \phi) = \varOmega(\sqrt{n/\phi})\) by Lemma 4.3, as \(\mathsf{dbMST}(X) = \varOmega(\mathsf{NN}(X))\).
We can apply Theorem 3.8 to obtain the following result.
Corollary 7.2
Under Assumptions 2.4 and 3.2(3), the expected approximation ratio is
Proof
The proof is almost identical to the proof of Corollary 5.2. The only difference is that we now have to use \(J=O(\sqrt{n\log\log n/\log n})\), which leads to the slightly worse bound for the approximation ratio. □
Again, we cannot use our framework for the running-time, but the running-time is guaranteed to be bounded by a polynomial.
Concluding Remarks
We have provided a smoothed analysis of partitioning algorithms for Euclidean optimization problems. The results can be extended to distributions over \(\mathbb{R}^{2}\) by scaling down the instance so that the inputs lie inside [0,1]^{2}. The analysis can also be extended to higher dimensions. However, the value of ϕ for which our results are applicable will depend on the dimension d.
Even though the solutions computed by most partitioning algorithms converge to the corresponding optimal value with probability 1 under uniform samples, in practice they achieve approximation ratios close to 1 [16, 24]. Our results show that the function values computed by partitioning algorithms approach optimality in expectation not only under uniform, identical distributions, but also under non-uniform, non-identical distributions, provided that the distributions are not sharply concentrated.
One prominent open problem for which our approach does not work is the functional defined by the total edge weight of a minimum-weight triangulation in the Euclidean plane. The main obstacles are that, first, the functional corresponding to minimum-weight triangulation is not smooth and, second, the value computed by the partitioning heuristic depends on the number of points on the convex hull of the point set [15]. Damerow and Sohler [7] provide a bound on the smoothed number of points on the convex hull. However, their bound is not strong enough for analyzing triangulations.
References
 1.
Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Englewood Cliffs (1993)
 2.
Anthes, B., Rüschendorf, L.: On the weighted Euclidean matching problem in \(\mathbb{R}^{d}\). Appl. Math. 28(2), 181–190 (2001)
 3.
Arthur, D., Manthey, B., Röglin, H.: Smoothed analysis of the k-means method. J. ACM 58(5), 19 (2011)
 4.
Beier, R., Vöcking, B.: Random knapsack in expected polynomial time. J. Comput. Syst. Sci. 69(3), 306–329 (2004)
 5.
Bläser, M., Manthey, B., Rao, B.V.R.: Smoothed analysis of partitioning algorithms for Euclidean functionals. In: Dehne, F., Iacono, J., Sack, J.R. (eds.) Proc. of the 12th Algorithms and Data Structures Symposium (WADS). Lecture Notes in Computer Science, vol. 6844, pp. 110–121. Springer, Berlin (2011)
 6.
Damerow, V., Manthey, B., Meyer auf der Heide, F., Räcke, H., Scheideler, C., Sohler, C., Tantau, T.: Smoothed analysis of left-to-right maxima with applications. ACM Trans. Algorithms (to appear)
 7.
Damerow, V., Sohler, C.: Extreme points under random noise. In: Albers, S., Radzik, T. (eds.) Proc. of the 12th Ann. European Symp. on Algorithms (ESA). Lecture Notes in Computer Science, vol. 3221, pp. 264–274. Springer, Berlin (2004)
 8.
Dreyfus, S.E., Wagner, R.A.: The Steiner problem in graphs. Networks 1(3), 195–207 (1971)
 9.
Du, D.Z., Hwang, F.K.: A proof of the Gilbert-Pollak conjecture on the Steiner ratio. Algorithmica 7(2&3), 121–135 (1992)
 10.
Dyer, M.E., Frieze, A.M.: A partitioning algorithm for minimum weighted Euclidean matching. Inf. Process. Lett. 18(2), 59–62 (1984)
 11.
Engels, C., Manthey, B.: Averagecase approximation ratio of the 2opt algorithm for the TSP. Oper. Res. Lett. 37(2), 83–84 (2009)
 12.
Englert, M., Röglin, H., Vöcking, B.: Worst case and probabilistic analysis of the 2-Opt algorithm for the TSP. In: Proc. of the 18th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA), pp. 1295–1304. SIAM, Philadelphia (2007)
 13.
Frieze, A.M., Yukich, J.E.: Probabilistic analysis of the traveling salesman problem. In: Gutin, G., Punnen, A.P. (eds.) The Traveling Salesman Problem and Its Variations, pp. 257–308. Kluwer Academic, Dordrecht (2002). Chapter 7
 14.
Garey, M.R., Graham, R.L., Johnson, D.S.: The complexity of computing Steiner minimal trees. SIAM J. Appl. Math. 32(4), 835–859 (1977)
 15.
Golin, M.J.: Limit theorems for minimum-weight triangulations, other Euclidean functionals, and probabilistic recurrence relations. In: Proc. of the 7th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA), pp. 252–260. SIAM, Philadelphia (1996)
 16.
Johnson, D.S., McGeoch, L.A.: Experimental analysis of heuristics for the STSP. In: Gutin, G., Punnen, A.P. (eds.) The Traveling Salesman Problem and Its Variations, pp. 369–443. Kluwer Academic, Dordrecht (2002). Chapter 9
 17.
Kalpakis, K., Sherman, A.T.: Probabilistic analysis of an enhanced partitioning algorithm for the Steiner tree problem in R ^{d}. Networks 24(3), 147–159 (1994)
 18.
Karp, R.M.: Probabilistic analysis of partitioning algorithms for the traveling-salesman problem in the plane. Math. Oper. Res. 2(3), 209–224 (1977)
 19.
León, C.A., Perron, F.: Extremal properties of sums of Bernoulli random variables. Stat. Probab. Lett. 62(4), 345–354 (2003)
 20.
Manthey, B., Röglin, H.: Smoothed analysis: Analysis of algorithms beyond worst case. it–Inf. Technol. 53(6), 280–286 (2011)
 21.
Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, Cambridge (2005)
 22.
Papadimitriou, C.H.: The Euclidean traveling salesman problem is NP-complete. Theor. Comput. Sci. 4(3), 237–244 (1977)
 23.
Papadimitriou, C.H., Vazirani, U.V.: On two geometric problems related to the traveling salesman problem. J. Algorithms 5(2), 231–246 (1984)
 24.
Ravada, S., Sherman, A.T.: Experimental evaluation of a partitioning algorithm for the Steiner tree problem in R ^{2} and R ^{3}. Networks 24(8), 409–415 (1994)
 25.
Rhee, W.T.: A matching problem and subadditive Euclidean functionals. Ann. Appl. Probab. 3(3), 794–801 (1993)
 26.
Röglin, H., Teng, S.H.: Smoothed analysis of multiobjective optimization. In: Proc. of the 50th Ann. IEEE Symp. on Foundations of Computer Science (FOCS), pp. 681–690. IEEE Press, New York (2009)
 27.
Spielman, D.A., Teng, S.H.: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM 51(3), 385–463 (2004)
 28.
Spielman, D.A., Teng, S.H.: Smoothed analysis: An attempt to explain the behavior of algorithms in practice. Commun. ACM 52(10), 76–84 (2009)
 29.
Srivastav, A., Werth, S.: Probabilistic analysis of the degree bounded minimum spanning tree problem. In: Arvind, V., Prasad, S. (eds.) Proc. of the 27th Int. Conf. on Foundations of Software Technology and Theoretical Computer Science (FSTTCS). Lecture Notes in Computer Science, vol. 4855, pp. 497–507. Springer, Berlin (2007)
 30.
Steele, J.M.: Complete convergence of short paths in Karp’s algorithm for the TSP. Math. Oper. Res. 6, 374–378 (1981)
 31.
Steele, J.M.: Subadditive Euclidean functionals and nonlinear growth in geometric probability. Ann. Probab. 9(3), 365–376 (1981)
 32.
Steele, J.M.: Probability Theory and Combinatorial Optimization. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 69. SIAM, Philadelphia (1987)
 33.
Supowit, K.J., Reingold, E.M.: Divide and conquer heuristics for minimum weighted Euclidean matching. SIAM J. Comput. 12(1), 118–143 (1983)
 34.
Varadarajan, K.R.: A divide-and-conquer algorithm for min-cost perfect matching in the plane. In: Proc. of the 39th Ann. Symp. on Foundations of Computer Science (FOCS), pp. 320–331. IEEE Press, New York (1998)
 35.
Yukich, J.E.: Probability Theory of Classical Euclidean Optimization Problems. Lecture Notes in Mathematics, vol. 1675. Springer, Berlin (1998)
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
A preliminary version has been presented at the 12th Algorithms and Data Structures Symposium (WADS 2011) [5]. Supported by DFG grant BL 511/7-1.
Bläser, M., Manthey, B. & Rao, B.V.R. Smoothed Analysis of Partitioning Algorithms for Euclidean Functionals. Algorithmica 66, 397–418 (2013). https://doi.org/10.1007/s00453-012-9643-5
Keywords
 Spanning Tree
 Approximation Ratio
 Minimum Spanning Tree
 Steiner Tree
 Steiner Tree Problem