Cover-Encodings of Fitness Landscapes
Abstract
The traditional way of tackling discrete optimization problems is by using local search on suitably defined cost or fitness landscapes. Such approaches are, however, limited by the slowing down that occurs when the local minima that are a feature of the typically rugged landscapes encountered arrest the progress of the search process. Another way of tackling optimization problems is by the use of heuristic approximations to estimate a global cost minimum. Here, we present a combination of these two approaches by using cover-encoding maps which map processes from a larger search space to subsets of the original search space. The key idea is to construct cover-encoding maps with the help of suitable heuristics that single out near-optimal solutions and result in landscapes on the larger search space that no longer exhibit trapping local minima. We present cover-encoding maps for the problems of the traveling salesman, number partitioning, maximum matching and maximum clique; the practical feasibility of our method is demonstrated by simulations of adaptive walks on the corresponding encoded landscapes which find the global minima for these problems.
Keywords
Adaptive walk · Coarse-graining · Oracle function · Genotype–phenotype map · Combinatorial optimization
1 Introduction
Fitness landscapes have proved to be a valuable concept in the understanding of adaptation in evolutionary biology and beyond, by visualizing the relationships between genotypes and effective reproductive success (Wright 1932, 1967). This concept has been taken forward in the field of evolutionary computation, where the performance of optimization algorithms utilizing local search has often been described as dynamics on a fitness landscape, see, e.g., the book by Engelbrecht and Richter (2014).
However, fitness functions alone do not determine the performances of local search algorithms, which depend also on the structure of the search spaces involved. These in turn are determined by two largely independent ingredients: (1) the concrete representations of the configurations that are to be optimized, referred to as encodings, and (2) locality in the search space, referred to as a move set.
For many well-studied combinatorial optimization problems and related models from statistical physics (such as spin glasses), there is a natural encoding. For instance, tours of a traveling salesperson problem (TSP) are naturally encoded as permutations of the cities concerned, while spin configurations are encoded as strings over the alphabet \(\{+,-\}\) with each letter referring to a fixed spin variable. This natural encoding is usually free of redundancy; any residual redundancies that occur usually arise from simple symmetries of the problem which can easily be factored out. For instance, TSP tours can start at any city so that they are invariant under rotations, while many spin glass models are invariant under simultaneous flipping of all spins. This natural or “direct” encoding is often referred to as the phenotype space, see, e.g., (Rothlauf 2006; Neumann and Witt 2010; Rothlauf 2011; Borenstein and Moraglio 2014).
In biology, fitness is conceptually understood as a property (function) of the genotype. It depends, however, on properties of higher-level structures such as molecular structure, gene-regulatory networks, tissues, or organs, i.e., on a phenotype. The relationship of genotype and fitness, therefore, is a composition of a genotype–phenotype map and a phenotype-dependent fitness function. This decomposition has been studied extensively in several distinct model systems, including RNA secondary structures (Schuster et al. 1994), gene-regulatory networks (Ciliberti et al. 2007), and metabolic networks (Dykhuizen et al. 1987; Flamm et al. 2010). Here, we focus on the abstract structure rather than the specifics of such models.
For a given encoding, irrespective of whether it is genotypic or phenotypic, the performance of search crucially depends on the move set. Here, we will consider only reversible, mutation-like moves. The search space therefore is modeled as an undirected graph. More general settings are discussed, e.g., by Flamm et al. (2007). The cost function assigned to a specific search space defines a fitness landscape. Evolutionary algorithms can thus be viewed as dynamical systems operating on landscapes, whose structure has, as a consequence, been studied extensively in the field (Reidys and Stadler 2002; Østman et al. 2010; Engelbrecht and Richter 2014).
Continuing the analogy with biology in evolutionary computation, an additional encoding Y, the so-called genotype space, is often used (Rothlauf and Goldberg 2003; Rothlauf 2006). The genotype–phenotype relation is determined by a map \(\alpha :Y\rightarrow X\cup \{\varnothing \}\), where \(\varnothing \) represents phenotypic configurations that do not occur in the original problem, i.e., \(y\in Y\) does not encode a feasible solution of the original problem whenever \(\alpha (y)=\varnothing \). For example, a frequently used genotypic encoding for TSP tours consists of binary strings with one entry for each pair of cities, which records the presence (1) or absence (0) of the corresponding adjacency in the tour (Applegate et al. 2006). Most binary strings, however, do not correspond to TSP tours.
In practice, genotypic representations used to tackle optimization problems are usually chosen with a high degree of redundancy, which often also introduces neutrality, i.e., the appearance of adjacent configurations with the same value of the cost function. Detailed investigations of fitness landscapes from molecular biology have shown that degrees of neutrality can facilitate optimization (Schuster et al. 1994; Reidys and Stadler 2002) due to the inclusion of extensive neutral paths which prevent trapping in metastable states (Schuster et al. 1994; Fernández and Solé 2007; Yu and Miller 2002; Banzhaf and Leier 2006). On the other hand, “synonymous encodings” where genotypes mapping to the same phenotype form tight clusters in the genotype space have been advocated for the design of evolutionary algorithms (Rothlauf 2006; Choi and Moon 2008; Rothlauf 2011). Rather than having neutral paths connecting remote areas of the landscape, cost-equivalent configurations are locally clustered in synonymous encodings.
What is clear is that, empirically, the introduction of arbitrary redundancy (by means of random Boolean network mapping) does not increase the performance of mutation-based search (Knowles and Watson 2002), suggesting that the inclusion of redundancy should be suitably designed in order to facilitate optimization. One such approach was that of Klemm et al. (2012), which emphasized the utility of such inhomogeneous genotype–phenotype maps via the idea that low-cost solutions could be enriched and optimization made more efficient in genotype space if the size of the preimage \(\alpha ^{-1}(x)\) of the phenotypes were anticorrelated with the cost function f(x). Of course, for such anticorrelations to be imposed, \(\alpha \) needs to become explicitly dependent on the cost function.
2 Simplifying Landscape Structure by Encoding
Before delving into the technicalities, we present a conceptual outline of the key ideas of this contribution. Our starting point is the twenty-year-old observation by Ruml et al. (1996) that certain redundant encodings of the Number-Partitioning Problem (NPP) allow simple, generic optimization heuristics to find dramatically improved solutions. In previous work (Klemm et al. 2012) we found that this approach was not limited to the NPP, but that suitably chosen redundant encodings also improved the performance of heuristics on several other combinatorial optimization problems. In the present work, our objectives are to understand (a) why the particular method used by Ruml et al. (1996) works so well and (b) how it can be generalized to essentially arbitrary combinatorial optimization problems in a principled way.
We focus in this contribution on black-box-type optimization scenarios in which the information on the cost function f(x) is exclusively obtained by evaluating it for specific configurations \(x\in X\) in the search space X. The sequence of these function evaluations is determined by the optimization heuristic. Practical algorithms of this type propose candidates \(x\in X\) for evaluation based on past evaluation results. These candidates are chosen locally in the vicinity of past successful candidates with the help of rules that depend on the representation of X. This explicitly or implicitly defines a topological structure on X. For the purpose of the present contribution, we assume that the topology of the search space X is expressed by a notion of adjacency that is respected by the search process.
Ideally, we would like to find an encoding Y of the search space X with the following two properties:
 (i) neighborhoods in Y are small enough to be searched in practice.
 (ii) for every starting point there is a path to the global optimum such that the cost function is decreasing, or at least nonincreasing.
Is it possible at least in principle to construct such an encoding? The prepartition encoding, which performed best for the NPP (Ruml et al. 1996), provides an important hint. Each particular encoding \(y\in Y\) corresponds to a restricted version of the original optimization problem, i.e., it can be seen as constraining the original search space X to a subset \(\varphi (y)\subseteq X\). A deterministic approximation is then used to solve the restricted problem on \(\varphi (y)\). For every \(y\in Y\), this provides an upper bound \(\tilde{f}(y)\) on the optimal cost of the restricted problem. Since the encoding is chosen such that there is also a code \(\hat{y}\) for the global optimum \(\hat{x}\in X\), i.e., \(\varphi (\hat{y})=\{\hat{x}\}\), the task now becomes to find \(\hat{y}\), which minimizes \(\tilde{f}\) by construction. The numerical results of Ruml et al. (1996) suggest that this auxiliary problem of minimizing the cost function of the encoding is much easier than the original, despite the fact that the search space is much larger. Below we show that this is the case because (1) \(\tilde{f}\) does a good job at approximating the true solution \(\tilde{F}(y)\) of the restricted optimization problem on \(\varphi (y)\) and (2) the perfect solutions \(\tilde{F}(y)\) give rise to landscapes with the desired properties mentioned above.
This observation suggests a general construction for “good” landscape encodings. The first step is the construction of a genotype space Y and an encoding scheme \(\varphi \) that maps genotypes to restrictions of the original problem rather than to a particular phenotype \(x\in X\). This map has to satisfy certain conditions discussed in detail in Sect. 3.2 to be a good choice. The cost function then enters by guiding, for every genotype \(y\in Y\), a heuristic that solves the restricted problem \(\varphi (y)\).
Following the formal introduction of the general concepts, we construct landscape encodings explicitly for several well-known examples. In Sect. 4, we focus on a particularly useful construction that makes use of the fact that the restricted subproblems on \(\varphi (y)\) can be seen as smaller instances of the same type of optimization problem, or alternatively, as coarse-grained problems. We show in particular that the NPP heuristic that motivated our approach is also of this type. In Sect. 5, finally, we use numerical experiments to show that the encoding scheme proposed here also works well in practice.
3 A Theory of Encoding Representations
3.1 Landscapes
Formally, an instance (X, f) of a combinatorial optimization problem consists of a finite set X and a cost function \(f:X\rightarrow {\mathbb {R}}\) on X. The task of the combinatorial optimization problem (X, f) is to find a global minimum \(\hat{x}\in X\) so that \(f(\hat{x})\le f(x)\) for all \(x\in X\).
A landscape \((X,\sim ,f)\) consists of a finite set X endowed with a symmetric and irreflexive (adjacency) relation \(\sim \) and a cost function \(f:X\rightarrow {\mathbb {R}}\). A point \(x^*\in X\) is a strict local minimum in \((X,\sim ,f)\) if (i) \(f(x^*)>f(\hat{x})\) and (ii) there is no \(x'\in X\) with \(f(x')<f(x^*)\) and an f-nonincreasing path \(x^*=x_0,x_1,\dots ,x_k=x'\), that is, \(x_{i-1}\sim x_{i}\) and \(f(x_{i-1})\ge f(x_i)\) holds for \(0<i\le k\). Note that a global minimum \(\hat{x}\) is not a strict local minimum as defined above.
For any \(X'\subseteq X\), the restricted problem \((X',f_{X'})\), where \(f_{X'}(x)=f(x)\) for all \(x\in X'\), consists in finding a \(\hat{x}'\in X'\) so that \(f(\hat{x}')\le f(x')\) for all \(x'\in X'\). A restricted landscape \((X',\sim ,f_{X'})\) can be defined analogously.
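The definition of a strict local minimum can be checked mechanically on small landscapes. The following is a minimal sketch, assuming the landscape is given as a Python dictionary of neighbor lists and a dictionary of costs; the function name `strict_local_minima` and the toy instance are illustrative choices and not part of the original text.

```python
from collections import deque

def strict_local_minima(neighbors, f):
    """Return the points that are strict local minima in the sense of Sect. 3.1.

    neighbors: dict mapping each point to a list of adjacent points
               (assumed symmetric and irreflexive).
    f:         dict mapping each point to its cost.
    """
    f_min = min(f.values())
    result = []
    for start in neighbors:
        if f[start] == f_min:
            continue  # condition (i): global minima are never strict local minima
        # Breadth-first search restricted to f-nonincreasing steps.
        seen = {start}
        queue = deque([start])
        escapes = False
        while queue and not escapes:
            x = queue.popleft()
            for z in neighbors[x]:
                if z in seen or f[z] > f[x]:
                    continue
                if f[z] < f[start]:
                    escapes = True  # condition (ii) is violated
                    break
                seen.add(z)
                queue.append(z)
        if not escapes:
            result.append(start)
    return result

# Toy landscape (hypothetical): path graph 0 - 1 - 2 - 3 - 4 with a barrier at node 2.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
f = {0: 1.0, 1: 1.0, 2: 3.0, 3: 2.0, 4: 0.0}
print(strict_local_minima(neighbors, f))
```

In this toy instance, the nodes 0 and 1 form a plateau that is separated from the global minimum at node 4 by the barrier at node 2, so both are reported as strict local minima.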
3.2 Oracle Function and CoverEncoding Map
A key ingredient in our reasoning is to consider the global solutions of restricted optimization problems. This is formalized as follows:
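Definition 1
The oracle function of the optimization problem (X, f) is the map \(F:2^X\setminus \{\emptyset \}\rightarrow {\mathbb {R}}\) that assigns to each nonempty subset \(X'\subseteq X\) the optimal cost of the restricted problem \((X',f_{X'})\),
\[ F(X') = \min _{x\in X'} f(x). \qquad (1) \]
In particular, \(F(\{x\})=f(x)\) and \(F(X)=f(\hat{x})\). The oracle function is monotone with respect to set inclusion:
\[ X'\subseteq X'' \quad \text {implies}\quad F(X')\ge F(X''). \qquad (2) \]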
We start by formalizing the idea of an encoding of a landscape.
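Definition 2
Let X and Y be finite sets and let \(\varphi :Y\rightarrow 2^X\) be a map. We consider the following properties of \(\varphi \):
 (Y1) \(\bigcup _{y\in Y} \varphi (y) = X\).
 (Y0) \(\varphi (y)\ne \emptyset \) for all \(y\in Y\).
 (Y2) For every \(x\in X\) there is a \(y\in Y\) such that \(\varphi (y)=\{x\}\).
 (Y3) There is \(y\in Y\) such that \(\varphi (y)=X\).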
It is not hard to see that cover-encoding maps always exist. In particular, consider any subset \(Y\subseteq \mathfrak {P}_0(X) = 2^X\setminus \{\emptyset \}\), the set of nonempty subsets of X, such that (i) the singletons \(\{x\}\in Y\) for all \(x\in X\) and (ii) \(X\in Y\). Then the identity \(\iota \) is obviously a cover-encoding map that satisfies (Y0), (Y1), (Y2), and (Y3).
Now consider an optimization problem (X, f) and let \(\varphi :Y\rightarrow 2^X\) be a cover-encoding map for X. We define \(\tilde{F}: Y\rightarrow {\mathbb {R}}\) as the composition of \(\varphi \) with the oracle function of (X, f), i.e., \(\tilde{F}(y) = F(\varphi (y))\). In the following, we will be interested in the relationship between the “encoded” optimization problem \((Y,\tilde{F})\) and the original problem (X, f).
 (F0) There is \(\hat{y}\in Y\) so that (i) \(|\varphi (\hat{y})|=1\) and (ii) \(F(\varphi (\hat{y}))=f(\hat{x})\).
The identity cover-encodings from \(Y_{\max }:=\mathfrak {P}_0(X)\) and \(Y_{\min }:=\{ \{x\}\mid x\in X\} \cup \{ X\}\) are the extreme cases. \(Y_{\max }\) encodes all possible subproblems, while \(Y_{\min }\) only encodes the singletons, i.e., the evaluation of the cost function f for every \(x\in X\), as well as the full optimization problem.
 (R1) For every \(y\in Y\) with \(\tilde{F}(y)=\tilde{F}(\hat{y})\) there is a sequence \(y=y_0,y_1,\dots ,y_k=\hat{y}\) such that \(y_i\sim y_{i-1}\) for \(0<i\le k\) and \(\tilde{F}(y_i)=\tilde{F}(\hat{y})\).
 (R2) For every \(y\in Y\) with \(\tilde{F}(y)>\tilde{F}(\hat{y})\) there is a sequence \(y=y_0,y_1,\dots ,y_k=\hat{y}\) such that \(y_i\sim y_{i-1}\) for \(0<i\le k\), \(\tilde{F}(y_k)=\tilde{F}(\hat{y})\) and \(\tilde{F}(y_{i-1})\ge \tilde{F}(y_i)\).
 (R3) Every y with \(\varphi (y)\ne X\) has a neighbor \(y'\sim y\) with \(\varphi (y)\subset \varphi (y')\).
For the identity cover-encodings introduced above, a natural definition of adjacency is to set \(y\sim y'\) and \(y'\sim y\) whenever (i) \(y\subseteq y'\), (ii) \(y\ne y'\), and (iii) if \(y\subseteq y'' \subseteq y'\) then \(y''=y\) or \(y''=y'\). That is, two sets are adjacent if they are adjacent in the Hasse diagram for set inclusion. By construction, every \(y\in Y\) is connected by a sequence of adjacent sets to all singletons \(\{x\}\) with \(x\in y\) and to the full set X. Since \(\varphi \) is the identity, (R3) holds. Using that \(y\subseteq y'\) implies \(\tilde{F}(y)\ge \tilde{F}(y')\), properties (R1) and (R2) also follow immediately.
Taken together, the identity cover-encodings demonstrate that cover-encodings and associated adjacencies satisfying (Y0) through to (Y3) as well as (R1), (R2), and (R3) always exist.
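The argument for the identity cover-encoding on \(Y_{\max }\) can be illustrated on a toy instance. The sketch below assumes that the oracle function is the minimum of f over a subset, that X is small enough to enumerate all nonempty subsets, and that the global minimum is unique; the helper names `oracle` and `path_to_optimum` and the cost values are hypothetical. The path first grows a subset to X one element at a time and then shrinks it to \(\{\hat{x}\}\), so every step is an edge of the Hasse diagram.

```python
from itertools import combinations

def oracle(subset, f):
    """Oracle value of a nonempty subset: the minimal cost it contains."""
    return min(f[x] for x in subset)

def path_to_optimum(y, f):
    """Hasse-diagram path from y to {x_hat}: grow y to X, then shrink to {x_hat}."""
    X = frozenset(f)
    x_hat = min(f, key=f.get)          # assumes a unique global minimum
    path = [frozenset(y)]
    for x in sorted(X - path[-1]):     # growing phase: oracle value cannot increase
        path.append(path[-1] | {x})
    for x in sorted(X - {x_hat}):      # shrinking phase: x_hat is always retained
        path.append(path[-1] - {x})
    return path

# Toy instance (hypothetical cost values).
f = {"a": 4.0, "b": 2.0, "c": 7.0, "d": 3.0}
Y_max = [frozenset(c) for r in range(1, len(f) + 1) for c in combinations(f, r)]

for y in Y_max:
    costs = [oracle(s, f) for s in path_to_optimum(y, f)]
    # Every such path is nonincreasing in the encoded cost.
    assert all(a >= b for a, b in zip(costs, costs[1:]))
```

The growing phase corresponds to (R3) and hence (R2), while the shrinking phase is the neutral path required by (R1).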
Lemma 1
(R3) implies (R2) for any oracle function F.
Proof
If \(\varphi (y)=X\), then \(\tilde{F}(y)=F(X)=f(\hat{x})=\tilde{F}(\hat{y})\) by construction. Now consider an arbitrary starting point y. By (R3), there is a neighbor \(y'\sim y\) such that \(\varphi (y)\subset \varphi (y')\), and by Eq. (2), we therefore have \(\tilde{F}(y')\le \tilde{F}(y)\). Repeating the argument, we obtain a \(\tilde{F}\)-nonincreasing sequence \(y,y',y'',\dots ,y^{(k)},\dots \) along which \(\varphi \) is strictly increasing in each step. Since X is finite, there is a finite k so that \(\varphi (y^{(k)})=X\) and thus \(\tilde{F}(y^{(k)})=\tilde{F}(\hat{y})\), i.e., (R2) is satisfied. \(\square \)
The importance of conditions (R1) and (R2) stems from the following observation:
Theorem 1
Suppose (X, f), \(\varphi :Y\rightarrow 2^X\), and the relation \(\sim \) on Y are chosen such that (Y1), (F0), (R1), and (R2) are satisfied. Then the landscape \((Y,\sim ,\tilde{F})\) has no strict local optimum.
Proof
Let \(y\in Y\) be an arbitrary starting point. If \(\tilde{F}(y)=\tilde{F}(\hat{y})\) then y, by (R1), is not a local optimum but part of a connected neutral network that contains the global optimum \(\hat{y}\). If \(\tilde{F}(y)\ne \tilde{F}(\hat{y})\), then \(\tilde{F}(y)>\tilde{F}(\hat{y})\). By (R2), there is a path with nonincreasing values of \(\tilde{F}\) that connects y to a point \(y'\) with \(\tilde{F}(y')=\tilde{F}(\hat{y})\). We already know that there is a path with constant values of \(\tilde{F}\) leading from \(y'\) to the global optimum \(\hat{y}\). Thus y is connected by a \(\tilde{F}\)-nonincreasing path to \(\hat{y}\). Hence y is, by definition, not a strict local optimum. \(\square \)
In particular, the identity cover-encodings satisfy the conditions of Theorem 1 and thus their landscapes have no strict local optima. There are, however, also very different general constructions with this property. In the remainder of this section, we consider one example.
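Definition 3
Let \((X,\sim _X,f)\) be a landscape whose graph \((X,\sim _X)\) is connected. We set \(Y=X\times X\) and \(\varphi ((x_1,x_2))=\{x_1,x_2\}\), so that \(\tilde{F}((x_1,x_2))=\min \{f(x_1),f(x_2)\}\). Two codes are adjacent, \((x_1,x_2)\sim _Y(\xi _1,\xi _2)\), if and only if either \(x_1=\xi _1\) and \(x_2\sim _X\xi _2\), or \(x_1\sim _X\xi _1\) and \(x_2=\xi _2\).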
The graph \((Y,\sim _Y)\) is the Cartesian square of the graph \((X,\sim _X)\) (Hammack et al. 2016). The idea behind this construction is to allow a local search algorithm to keep track of the best solution so far in one variable and use the other variable for exploration. Figure 1 shows an example.
Lemma 2
The landscape \((X\times X,\sim _Y,\tilde{F})\) satisfies (Y0), (Y2), (F0), (R1), and (R2). In particular it has no strict local optima.
Proof
Considering the properties of \(\varphi \), (Y0) is obtained with \(|\varphi (y)|>0\) for all \(y \in Y\); (Y2) is fulfilled choosing \(y=(x,x)\) for any \(x \in X\). This implies (Y1), so \(\varphi \) is a cover-encoding map. We have (Y3) only in the trivial case \(|X| \le 2\). Property (F0) is fulfilled with \(\hat{y} = (\hat{x},\hat{x})\).
For \(y,y' \in Y\), we write \(d_Y(y,y')\) for the standard graph distance, the length of a shortest path, between y and \(y'\); analogous notation for the distance \(d_X\) on \((X,\sim _X)\). For \((x_1,x_2) \in Y\) and \((\xi _1,\xi _2) \in Y\), we have \(d_Y((x_1,x_2),(\xi _1,\xi _2)) = d_X(x_1,\xi _1)+d_X(x_2,\xi _2)\).
Now let \((x_1,x_2) = y \in Y \setminus \{(\hat{x}, \hat{x})\}\). Then \(x_1 \ne \hat{x} \ne x_2\). We assume, without loss of generality, \(f(x_1) \ge f(x_2)\) (otherwise swap \(x_1\) and \(x_2\)). Because \((X,\sim _X)\) is connected, we find a neighbor \(x' \sim _X x_1\) with \(d_X(x',\hat{x})=d_X(x_1,\hat{x})-1\). With \(y' = (x',x_2)\), we have \(\tilde{F}(y') = \min \{ f(x'), f(x_2) \} \le f(x_2) = \tilde{F}(y)\) and \(d_Y(y',\hat{y}) = d_Y(y,\hat{y}) - 1\). For each element \(y \in Y\) we thus find a \(y' \in Y\) that (i) is strictly closer to \(\hat{y}\) than y is; and (ii) does not evaluate at a higher value than y under \(\tilde{F}\). Using the argument inductively at most \(d_Y(y,\hat{y})\) times, the desired sequences in (R1) and (R2) are constructed. Therefore properties (R1) and (R2) are fulfilled by \((Y,\sim _Y,\tilde{F})\). Theorem 1 now implies that there are no strict local minima. \(\square \)
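The constructive step of this proof can be traced on a small example. The sketch below rests on several assumptions that are not spelled out in the text above: codes are pairs \((x_1,x_2)\in X\times X\) with \(\varphi ((x_1,x_2))=\{x_1,x_2\}\), adjacency in Y changes exactly one coordinate along an edge of \((X,\sim _X)\), and the global minimum \(\hat{x}\) is unique. The function names and the toy landscape are hypothetical. In each step, the coordinate with the larger cost (among those not yet at \(\hat{x}\)) is moved one step closer to \(\hat{x}\).

```python
from collections import deque
from itertools import product

def bfs_distances(neighbors, source):
    """Graph distances d_X(., source) by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for z in neighbors[x]:
            if z not in dist:
                dist[z] = dist[x] + 1
                queue.append(z)
    return dist

def descent_path(y, neighbors, f):
    """Path in X x X from y to (x_hat, x_hat) with nonincreasing min(f(x1), f(x2))."""
    x_hat = min(f, key=f.get)            # assumes a unique global minimum
    dist = bfs_distances(neighbors, x_hat)
    x1, x2 = y
    path = [(x1, x2)]
    while (x1, x2) != (x_hat, x_hat):
        # Move the worse coordinate; a coordinate already at x_hat stays put.
        move_first = x1 != x_hat and (x2 == x_hat or f[x1] >= f[x2])
        if move_first:
            x1 = next(z for z in neighbors[x1] if dist[z] == dist[x1] - 1)
        else:
            x2 = next(z for z in neighbors[x2] if dist[z] == dist[x2] - 1)
        path.append((x1, x2))
    return path

def encoded_cost(y, f):
    return min(f[y[0]], f[y[1]])

# Toy landscape (hypothetical): path graph 0 - 1 - 2 - 3 - 4 with a barrier at node 2.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
f = {0: 1.0, 1: 1.0, 2: 3.0, 3: 2.0, 4: 0.0}

for y in product(f, repeat=2):
    costs = [encoded_cost(s, f) for s in descent_path(y, neighbors, f)]
    assert all(a >= b for a, b in zip(costs, costs[1:]))
```

On the original path graph, nodes 0 and 1 cannot reach node 4 without crossing the barrier at node 2. In the encoded landscape, one coordinate can cross the barrier while the other coordinate holds the encoded cost at its current value.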
3.3 Adaptive Walks
An adaptive walk on a fitness landscape \((Y,\sim _Y,\tilde{F})\) is a Markov chain on the state space Y with transition probabilities \(\pi _{y \rightarrow z} = 1/d_y\) for \(y \sim _Y z\) and \(\tilde{F}(z) \le \tilde{F}(y)\). Otherwise \(\pi _{y \rightarrow z} = 0\), except for \(y=z\) where \(\pi _{y \rightarrow y}\) is obtained by normalization of probability. The degree \(d_y\) of state y is the number of its neighbors, \(d_y=|\{ z \in Y : z \sim _Y y \}|\). Formulated as a stochastic search algorithm, a neighbor z of the current (time t) configuration y is drawn uniformly at random. If \(\tilde{F}(z) \le \tilde{F}(y)\), the walk proceeds to configuration z at time \(t+1\); otherwise it remains at configuration y.
Call \(\hat{Y}\) the set of global minima of the landscape \((Y,\sim _Y, \tilde{F})\). Assume that this landscape does not have a strict local minimum. Then each realization of an adaptive walk eventually hits a global minimum. Due to the absence of strict local minima, the adaptive walk is trapped only at global minima. Each invariant measure of the adaptive walk therefore evaluates to zero on all configurations with nonminimum cost. Property (R2) clearly is a necessary condition for an optimization problem to be solvable by adaptive walks alone. The conditions of Theorem 1 are already sufficient, as they exclude strict local optima.
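The adaptive walk translates directly into a few lines of code. The following sketch compares a walk on a small landscape with a walk on an encoded version of it; the toy landscape, the pair encoding with cost \(\min \{f(x_1),f(x_2)\}\), the step count, and all function names are assumptions made for illustration only.

```python
import random

def adaptive_walk(start, neighbors, cost, steps, rng):
    """Run an adaptive walk and return the list of visited configurations."""
    y = start
    trajectory = [y]
    for _ in range(steps):
        z = rng.choice(neighbors(y))   # uniform proposal among the d_y neighbors
        if cost(z) <= cost(y):         # accept only nonincreasing moves
            y = z
        trajectory.append(y)
    return trajectory

# Toy landscape (hypothetical): path graph 0 - 1 - 2 - 3 - 4 with a barrier at node 2.
f = {0: 1.0, 1: 1.0, 2: 3.0, 3: 2.0, 4: 0.0}

def path_neighbors(x):
    return [z for z in (x - 1, x + 1) if z in f]

# Encoded landscape on pairs: one coordinate changes along an edge of the path graph.
def square_neighbors(y):
    x1, x2 = y
    return ([(z, x2) for z in path_neighbors(x1)] +
            [(x1, z) for z in path_neighbors(x2)])

def square_cost(y):
    return min(f[y[0]], f[y[1]])

rng = random.Random(0)
direct = adaptive_walk(0, path_neighbors, f.get, 100_000, rng)
encoded = adaptive_walk((0, 0), square_neighbors, square_cost, 100_000, rng)
print(f[direct[-1]], square_cost(encoded[-1]))
```

On this instance the direct walk started at node 0 can never leave the plateau \(\{0,1\}\), since the only exit leads uphill. The encoded landscape has no strict local minima, so the walk on pairs is expected to reach a configuration of minimal cost after sufficiently many steps.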
3.4 Examples of CoverEncoding Maps
Let us now turn to constructing some problem-specific examples of cover-encoding maps. We will then use some of these examples to show that some cover-encoding maps are useful to construct good heuristic search algorithms for several well-studied combinatorial optimization problems.
3.4.1 Prepartition Encoding for the NPP
The most natural choice of an adjacency \(\sim \) on Y is to define \(y\sim y'\) if and only if \(y(i)\ne y'(i)\) for exactly one \(i\in [n]\). Unless y is a bijection, there is at least one unused value \(k\in [n]\setminus y([n])\) and at least one pair \(j',j''\in [n]\) with \(y(j')=y(j'')\). The neighbor \(y'\) of y with \(y'(i)=y(i)\) for \(i\ne j''\) and \(y'(j'')=k\) corresponds to a refinement of the partition \(\varPi _y\) because \([j']_{\varPi _{y'}}=[j']_{\varPi _y}\setminus \{j''\}\), \([j'']_{\varPi _{y'}}=\{j''\}\), and all other classes of \(\varPi _{y'}\) and \(\varPi _y\) are the same. Thus \((Y,\sim )\) satisfies (R3).
An optimal solution \(\hat{x}\) of the NPP (X, f) is a partition \(\hat{\Omega }\) of [n] into exactly two classes \(Q_+\) and \(Q_-\) so that \(x_i=+1\) for \(i\in Q_+\) and \(x_i=-1\) for \(i\in Q_-\). A code \(y\in Y\) is good if there is a configuration in \(\varphi (y)\) in which the signs can be assigned in exactly this manner, i.e., if \(\varPi _y\) is a refinement of \(\hat{\Omega }\). Conversely, y is good only if \(\varPi _y\) is a refinement of a bipartition \(\Omega \) that represents a global minimum. Generically, \(\hat{\Omega }\) is unique. Now consider two classes \(Q_1\) and \(Q_2\) in \(\varPi _y\) that are contained in the same class of \(\Omega \), i.e., \(Q_1,Q_2\subseteq Q\) for some \(Q\in \Omega \). Reassigning one element at a time from \(Q_2\) to \(Q_1\) thus corresponds to a sequence of codes \(y=y_1,y_2,\dots ,y_{|Q_2|}\), all of which encode refinements of \(\Omega \). Furthermore, \(y_{|Q_2|}\) has one class fewer than y. Repeating this step at most \(n-2\) times eventually results in \(\Omega \). Intermediate codes \(y_i\) and \(y_{i-1}\) are adjacent by construction and satisfy \(\tilde{F}(y_i)=\tilde{F}(\hat{y})\), i.e., condition (R1) is satisfied. Thus, we conclude that the “oracle landscape” \((Y,\sim ,\tilde{F})\) has no strict local minima.
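The refinement move behind (R3) can be made concrete with a small sketch. It assumes the following reading of the prepartition encoding, which is not spelled out above: a code y assigns a label to each of the n numbers, numbers with equal labels form one class of \(\varPi _y\) and must receive the same sign, and the cost of a sign assignment is the absolute value of the signed sum. Labels are taken from 0 to n-1 here, the oracle is evaluated by brute force rather than by a heuristic, and the instance and function names are hypothetical.

```python
from itertools import product

def class_sums(y, a):
    """Sum the numbers a[i] within each class of the prepartition defined by y."""
    sums = {}
    for label, value in zip(y, a):
        sums[label] = sums.get(label, 0) + value
    return list(sums.values())

def oracle_cost(y, a):
    """Brute-force oracle: best discrepancy when each class gets a single sign."""
    sums = class_sums(y, a)
    return min(abs(sum(s * v for s, v in zip(signs, sums)))
               for signs in product((1, -1), repeat=len(sums)))

def refine(y):
    """Neighbor of y that moves one element of a shared class to an unused label.

    Returns None if y is a bijection, i.e., if no label is unused.
    """
    unused = sorted(set(range(len(y))) - set(y))
    if not unused:
        return None
    seen = set()
    for j, label in enumerate(y):
        if label in seen:              # position j shares its class with an earlier one
            return y[:j] + (unused[0],) + y[j + 1:]
        seen.add(label)

# Toy instance (hypothetical numbers); start from the single-class code.
a = (8, 7, 6, 5, 4)
y = (0, 0, 0, 0, 0)
costs = []
while y is not None:
    costs.append(oracle_cost(y, a))
    y = refine(y)
print(costs)
```

Each refinement enlarges \(\varphi (y)\), so the recorded oracle values cannot increase along the sequence, and the final bijective code corresponds to the unrestricted problem.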
3.4.2 Prepartition Encoding for the TSP
Klemm et al. (2012) introduced the following version of a prepartition encoding. Here, an arbitrary function \(y:C\rightarrow [n]\) is used to restrict the possible orderings of the cities along the tour as follows: For all cities \(c,d\in C\), the condition \(y(c)<y(d)\) implies \(\pi ^{-1}(c)<\pi ^{-1}(d)\). Again this defines a subset \(X_y\) of the search space X for each y. We use the same definition of adjacency in Y as for the NPP. Here, constant functions y impose no restrictions on \(\pi \), i.e., \(\varphi (y)=X\) whenever \(y(c)=y(d)\) for all \(c,d\in C\). On the other hand, if y is bijective then \(X_y\) consists only of a single tour since in this case \(y(c)=\pi ^{-1}(c)\) for all \(c\in C\), i.e., \(\pi =y^{-1}\). Thus, (Y2) and (Y3) are satisfied.
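The ordering constraint can be tested by brute force on a very small instance. In the sketch below, tours are represented as permutations of the cities 0, ..., n-1 without factoring out rotations, labels are taken from 0 to n-1, and the distance matrix and function names are hypothetical. The constraint "\(y(c)<y(d)\) implies that c precedes d" is equivalent to the labels being nondecreasing along the tour, which is what `consistent` checks.

```python
from itertools import permutations

def consistent(tour, y):
    """True if y(c) < y(d) implies that city c precedes city d in the tour."""
    labels = [y[c] for c in tour]
    return all(p <= q for p, q in zip(labels, labels[1:]))

def phi(y):
    """All tours (as permutations of the cities) compatible with the code y."""
    return [t for t in permutations(range(len(y))) if consistent(t, y)]

def tour_length(tour, dist):
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def oracle_cost(y, dist):
    """Brute-force oracle: length of the shortest tour in phi(y)."""
    return min(tour_length(t, dist) for t in phi(y))

# Toy instance (hypothetical): four cities on the corners of a unit square,
# with Manhattan distances.
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]

print(len(phi((0, 0, 0, 0))), len(phi((0, 0, 1, 1))), phi((2, 0, 3, 1)))
print(oracle_cost((0, 0, 0, 0), dist), oracle_cost((0, 1, 0, 1), dist))
```

A constant code leaves all permutations admissible, while a bijective code singles out exactly one of them, in line with (Y3) and (Y2).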
To address properties (R2) and (R1), we first observe that given an encoding y, we can always move one city c with \(y(c)=k\) to one of the classes defined by y with an adjacent value \(k'\). More precisely, suppose \(k'\) is such that (a) there is a city d so that \(y(d)=k'\) and (b) there are no cities e with \(y(e)=k''\), for any \(k''\) between k and \(k'\). If \(k'>k\), the city which we can move is the one with \(y(c)=k\) that appears last in the optimal tour \(\omega \in \varphi (y)\); similarly, if \(k'<k\), we can move the city c with \(y(c)=k\) that appears first in the optimal tour \(\omega \in \varphi (y)\). In the first case, we can set \(k < y'(c)\le k'\), while in the second case, we can choose \(k' \le y'(c) < k\). By construction \(\omega \in \varphi (y')\), and therefore \(\tilde{F}(y')\le \tilde{F}(y)\). It is also clear from the construction that the step from y to \(y'\) can always be chosen so that the number of classes \(|y^{-1}([n])|\) remains constant, increases by one, or decreases by one, unless we already have \(|y^{-1}([n])|=n\), in which case only a decrease is possible, or \(|y^{-1}([n])|=1\), in which case only an increase is possible. Thus, we can always find a path along which \(\tilde{F}(y')\) does not increase and along which \(|y^{-1}([n])|\) is nonincreasing or nondecreasing, respectively. Note that the moves keeping \(|y^{-1}([n])|\) constant might be necessary to move the values y(c) stepwise around in [n] to have enough “space” to break up individual classes of \(y^{-1}\), so that its members in the end have consecutive values of y. It is not hard to convince oneself that this is always possible. As a consequence, we can always connect any y to a code with a single class (for which \(\varphi (y)=X\)). For two adjacent classes, we simply join, one by one, the cities of the smaller class to the larger one. Furthermore, the single-class code can be broken up by pulling out one city at a time so that (R1) also holds.
Note that (R3) is not necessarily satisfied, however.
In contrast to the previous example of the NPP, here the paths are much more involved and often longer. We therefore conjecture that the prepartition encoding is less efficient for the TSP than for the NPP.
3.4.3 Spanning Forest Encoding for the NPP
(R3) holds since removing an edge from the spanning forest y yields another spanning forest \(y'\) that imposes fewer restrictions and thus corresponds to a larger subset of X. In general, write \(y'\prec y\) if \(y'\) is a subforest of y. Then \(\varphi (y)\subset \varphi (y')\). The unconstrained search space corresponds to the spanning forest \(y_0\) without edges. Conversely, every spanning tree \(\hat{t}\) that defines the bipartition of the globally minimal solution of the original NPP encodes exactly this solution. Every sequence \(\hat{t} = y_{n-1} \succ y_{n-2} \succ \dots \succ y_1 \succ y_0\) of spanning forests obtained by successive edge deletions from \(\hat{t}\) connects \(y_0\) and \(\hat{t}\) and each \(\varphi (y_i)\) also contains the global minimum encoded by \(\hat{t}\). Thus (R1) holds.
3.4.4 Subdivision Encoding for the TSP
If \(\varPi \) is the discrete partition, then we obviously have \(\varphi (y)=X\), while the indiscrete partition uniquely specifies the tour \(\psi \). The encoding therefore satisfies (F0), (Y0), (Y1), (Y2), and (Y3). Consider any adjacency relation \(\sim \) on Y so that \(y\sim y'\) if \(\varPi '\) is obtained by splitting a class (interval) into two or merging two intervals. Then (R3) is clearly satisfied.
In order to consider (R1), we specify the adjacency relation \(\sim \) more stringently. If \(y\sim y'\), then either (i) y is obtained from \(y'\) by splitting exactly one class of \(y'\) into two nonempty parts or vice versa, or (ii) y and \(y'\) exhibit the same partition of the cities, i.e., \(\varPi =\varPi '\). In case (i), the ordering within each class is maintained. For the split interval \(I_u'=[\psi '(i_{u-1}+1),\dots ,\psi '(i_{u})]\), this means that an index \(j\in [i_{u-1}+1,i_u-1]\) is chosen and the resulting intervals become \(I_{u_1}=[\psi '(i_{u-1}+1),\dots ,\psi '(j-1)]\) and \(I_{u_2}=[\psi '(j),\dots ,\psi '(i_{u})]\). The ordering between intervals (classes of \(\varPi \)) remains fixed. In case (ii), the partition and the ordering within the intervals both remain unchanged, but the ordering of the intervals (classes of \(\varPi \)) changes. For our purposes, it is not important which types of permutations between intervals are allowed, as long as they form an ergodic set. Plausible choices are transpositions, canonical transpositions, reversals, or even all permutations.
Now consider an encoded configuration \(\hat{y}\) with \(\hat{x}\in \varphi (\hat{y})\). The intervals specified by \(\hat{y}\) are partial tours of the globally optimal solution. Moves on Y can now be performed so that a new encoding \(y'\) is obtained in a stepwise fashion that uses the same intervals and brings two partial tours that are consecutive in \(\hat{x}\) into the desired order. During this stepwise change of \(\psi \), the encoded sets \(\varphi (y)\) stay the same, and thus \(\varphi (y')=\varphi (\hat{y})\). Now the two appropriate consecutive intervals can be merged. This reduces m by 1 and makes \(\varphi (y)\) smaller, but the globally optimal solution is still retained, i.e., \(\hat{x}\in \varphi (y)\). The procedure can be repeated at most \(m-1\) times to reach the indiscrete partition, which fully specifies the globally optimal tour. Thus, (R1) holds for all choices of neighborhoods that allow merging/splitting of adjacent intervals and an ergodic permutation of the intervals.
3.4.5 Sparse Subgraph Encoding for the Maximum Matching Problem
Now consider an edge subset \(S \subseteq E\). In the present context, we call S sparse if the graph (V, S) has maximum degree 2, so each connected component of (V, S) is a cycle or path (including isolated nodes as trivial paths). Denote by Y the set of all sparse subsets of E. Since a matching M is also a sparse subset of E, we have \(X \subseteq Y\).
The cover-encoding map \(\varphi :Y \rightarrow 2^X\) assigns each \(S \in Y\) the set of maximum matchings of the graph (V, S). Now with S sparse, the maximum matching problem on (V, S) is trivially solved separately on each connected component being a path or cycle. For a path of odd length k, the maximum matching is unique with \((k+1)/2\) edges; a path or cycle of even length k has exactly two disjoint maximum matchings of cardinality k/2. A cycle of odd length k has exactly k pairwise different maximum matchings of cardinality \((k-1)/2\).
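Because every component of a sparse subgraph is a path or a cycle, the size of a maximum matching follows from the edge counts of the components alone. The following sketch computes it this way, assuming a simple graph given as a list of node pairs without loops or repeated edges; the function names and the example edge list are hypothetical.

```python
from collections import defaultdict

def is_sparse(edges):
    """True if every node has degree at most 2 in the graph formed by edges."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return all(d <= 2 for d in degree.values())

def max_matching_size(edges):
    """Size of a maximum matching of a sparse edge set, component by component."""
    if not is_sparse(edges):
        raise ValueError("edge set is not sparse")
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    total = 0
    for start in list(adj):
        if start in seen:
            continue
        # Depth-first search to collect one connected component.
        stack = [start]
        seen.add(start)
        nodes = 0
        degree_sum = 0
        while stack:
            u = stack.pop()
            nodes += 1
            degree_sum += len(adj[u])
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        k = degree_sum // 2            # number of edges in the component
        if k == nodes:                 # cycle of length k: floor(k/2) edges
            total += k // 2
        else:                          # path of length k: ceil(k/2) edges
            total += (k + 1) // 2
    return total

# Hypothetical sparse edge set: a path 1-2-3-4 and a triangle 5-6-7.
S = [(1, 2), (2, 3), (3, 4), (5, 6), (6, 7), (7, 5)]
print(max_matching_size(S))
```

The path of length 3 contributes two edges and the triangle contributes one, in agreement with the counts \((k+1)/2\) and \((k-1)/2\) stated above for odd k.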
For each matching \(x \in X\), we have \(\varphi (x) = \{x\}\) so property (Y2) holds. Properties (Y0) and (Y1) are fulfilled. With the choice \(\hat{y} = \hat{x}\), (F0) is fulfilled. Property (Y3) holds if and only if (V, E) is sparse itself.
We consider sparse subsets D and \(D^\prime \) as adjacent, \(D \sim D^\prime \), if they differ in exactly one edge, \(|(D\cup D^{\prime }) \setminus (D\cap D^{\prime })|=1\).
In order to demonstrate properties (R1) and (R2), let \(y \in Y \setminus \{ \hat{y} \}\). We show that there is \(y' \sim _Y y\) with \(\tilde{F}(y') \le \tilde{F}(y)\) and \(|(y' \cup \hat{y}) \setminus (y' \cap \hat{y})| \le |(y \cup \hat{y}) \setminus (y \cap \hat{y})|\). Thus, neighbor \(y'\) is obtained from y either by adding an edge contained in \(\hat{y}\) or removing an edge not contained in \(\hat{y}\). If \(y \supset \hat{x}\), find an edge \(e \in y \setminus \hat{x}\) and set \(y'=y \setminus \{e\}\), and we are done. Otherwise, since \(y \ne \hat{y}\), there is an edge \(\{v,w\}=e \in \hat{x} \setminus y\). If \(y \cup \{e\}=:z\) is sparse, we are done using \(y' = z\). Otherwise at least one of the nodes v and w has degree 3 in the graph (V, z); suppose node v has degree 3. Find a maximum matching \(x \in \varphi (y)\). Since v has degree 2 in the graph (V, y), there is an edge \(e'\in y \setminus x\) incident to v. Set \(y'=y\setminus \{e'\}\). We easily confirm \(\tilde{F}(y') \le \tilde{F}(y)\) in each of the cases above. Sequences for properties (R1) and (R2) are obtained by induction.
3.4.6 String Encoding for the Maximum Clique Problem
In order to prove properties (R1) and (R2), we first observe that there is a nonincreasing sequence of strings from any \(y \in Y\) to a pure \(y^{(\mathrm p)} \in Y\) with \(\varphi (y^{(\mathrm p)}) \subseteq \varphi (y)\) and \(\tilde{F}(y^{(\mathrm p)}) = \tilde{F}(y)\). The sequence is obtained by finding a maximal \(C \in \varphi (y)\). If y is not pure, there is \(i \in [n]\) with \(y_i \notin C\). The next string in the sequence can be obtained by replacing the entry \(y_i\) with an arbitrary element from C.
If \(y, z \in Y\) are pure with \(\varphi (y)=\varphi (z)=\{C\}\) and \(|C|<n\), there is a nonincreasing sequence from y to z. It may be constructed by stepwise swapping operations. Since \(|C|<n\), there is at least one element in C found at two distinct positions in y so one of these can be used as a temporary variable in the swap.
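The swap with a temporary variable can be made concrete as follows. This is a sketch under assumptions: strings are modeled as Python tuples, a move is assumed to replace a single entry, and the invariant checked here is only that the set of symbols occurring in the string never changes (taken as a stand-in for "each intermediate string still generates C"). The function name `swap_steps` is hypothetical.

```python
from collections import Counter


def swap_steps(y, i, j):
    """Swap y[i] and y[j] by single-entry replacements.

    Returns the list of strings visited; every string in the list contains
    exactly the same set of symbols as y. Requires that some symbol occurs
    at two distinct positions of y (the case |C| < n).
    """
    y = list(y)
    steps = [tuple(y)]
    if y[i] == y[j]:
        return steps  # nothing to do

    def put(pos, symbol):
        y[pos] = symbol
        steps.append(tuple(y))

    counts = Counter(y)
    if counts[y[i]] > 1:
        # y[i] survives elsewhere, so position i can be overwritten directly.
        a = y[i]
        put(i, y[j])
        put(j, a)
    elif counts[y[j]] > 1:
        b = y[j]
        put(j, y[i])
        put(i, b)
    else:
        duplicates = [k for k, s in enumerate(y) if counts[s] > 1]
        if not duplicates:
            raise ValueError("no repeated symbol available as temporary")
        k = duplicates[0]       # position used as the temporary variable
        d, a = y[k], y[i]
        put(k, a)               # park y[i] in the temporary position
        put(i, y[j])
        put(j, a)
        put(k, d)               # restore the temporary position
    return steps


for s in swap_steps(("a", "b", "c", "a"), 1, 2):
    print(s)
```

At most four single-entry replacements are needed per swap, so any rearrangement of a pure string costs a number of steps proportional to the number of swaps.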
Now let \(y, y' \in Y\) with \(\tilde{F}(y') \le \tilde{F}(y)\). Find a maximal clique \(C \in \varphi (y)\) and a maximal clique \(C' \in \varphi (y')\). We construct a nonincreasing sequence from y to \(y'\) by concatenating the following sequences. First, a nonincreasing sequence from y to a pure \(y^{(\mathrm p)} \in Y\) with \(\tilde{F}(y^{(\mathrm p)}) = \tilde{F}(y)\). Second, a nonincreasing sequence from \(y^{(\mathrm p)}\) to a pure \(z \in Y\) with \(\{ z_1,z_2, \dots , z_{|C|}\}=C\) and \(\{ z_1,z_2, \dots , z_{|C \setminus C'|}\}=C \setminus C'\), and arbitrary \(z_{|C|+1},z_{|C|+2}, \dots , z_n \in C\). Third, a sequence from z to a string \(z'\) is obtained by assigning, step by step, nodes in \(C'\setminus C\) to entries from \(z_{|C|+1}\) to \(z_n\). The sequence is nonincreasing because each of its strings generates C under \(\varphi \). On the other hand, \(\gamma _G((z')_{(|C \setminus C'|+1)\dashv }) = C'\) so \(\tilde{F} (z') = \tilde{F}(y')\). Now again by swap steps, we transform \(z'\) into \(y'\).
4 Coarse-Graining
Some of the restricted search spaces \(\varphi (y)\) introduced above can also be thought of as coarse-grainings of the original problem. In the following subsections, we show this for the prepartition and spanning forest encodings of the NPP, as well as for the TSP.
4.1 Prepartition Encoding of the NPP
4.2 Travelling Salesman Problems
Note that we naturally obtain an asymmetric TSP even if the original problem was symmetric: in general \(d'_{pq}\ne d'_{qp}\), because \(d_{\pi (i_p)\pi (i_{q-1}+1)}\ne d_{\pi (i_q)\pi (i_{p-1}+1)}\).
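A small numerical sketch shows how the asymmetry arises. It rests on assumptions not spelled out in the text above: the tour \(\pi \) is cut into consecutive blocks, block q occupies positions \(i_{q-1}+1,\dots ,i_q\), and the coarse-grained distance \(d'_{pq}\) is taken to be the distance from the last city of block p to the first city of block q. The function name `coarse_distances` and the 0-based indexing are choices made for this illustration only.

```python
def coarse_distances(d, tour, ends):
    """Block-to-block distances d'[p][q] = d[last city of p][first city of q].

    d    : symmetric distance matrix (list of lists)
    tour : permutation of the cities (0-based list)
    ends : 0-based position in `tour` of the last city of each block
    """
    firsts = [0] + [e + 1 for e in ends[:-1]]
    m = len(ends)
    return [
        [0 if p == q else d[tour[ends[p]]][tour[firsts[q]]] for q in range(m)]
        for p in range(m)
    ]


# Four cities on a line; the original distances are symmetric.
x = [0, 1, 3, 6]
d = [[abs(a - b) for b in x] for a in x]

tour = [0, 1, 2, 3]
ends = [1, 3]  # blocks (0, 1) and (2, 3)
d_coarse = coarse_distances(d, tour, ends)
print(d_coarse[0][1], d_coarse[1][0])
```

Leaving block (0, 1) from city 1 and entering block (2, 3) at city 2 costs 2, whereas the reverse direction leaves from city 3 and enters at city 0 and costs 6, so the reduced instance is asymmetric although d is symmetric.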
4.3 Spanning Forest Representation of the NPP
4.4 Some Remarks on Coarse-Grainings: Analogies with the Renormalization Group?
It is tempting to speculate that the coarse-grainings we have observed above are analogous to those observed in renormalization group theory, well known for its use in analyzing spin glasses and related disordered systems (Rosten 2012). In our context, it can be described as follows. For a given type of problem, such as the NPP or the TSP, consider the space \(\mathfrak {X}\) of all possible instances of all sizes. A particular instance (e.g., the NPP with n numbers \(a=\{a_1,a_2,\dots ,a_n\}\)) is a point \(\mathbf {x}\in \mathfrak {X}\). Now we define a set \(\mathscr {R}\) of maps \(r:\mathfrak {X}\rightarrow \mathfrak {X}\) that map larger instances to strictly smaller ones. Of interest in this context are in particular those maps r that (approximately) preserve salient properties. Since \(r(\mathbf {x})\) is a smaller instance than \(\mathbf {x}\), the map r is not invertible. The maps in \(\mathscr {R}\) can of course be composed, and thus form a semigroup which is known as the renormalization group (Wilson and Kogut 1974; Wilson 1971). Of course, while renormalization groups in statistical physics are used to analyze the typical behavior of large systems near criticality, our focus in the present optimization context is on particular instances of systems that are typically large. This does not yet rule out an analogy, assuming that something like an ergodic hypothesis applies, where the behavior of typical instances is indeed that of the average. Thus, starting from \(\mathbf {x}=(X,f)\), or more precisely, an encoding y so that \(\varphi (y)=\mathbf {x}\), we can think of adjacent encodings \(y'\sim y\) with \(|\varphi (y')|<|\varphi (y)|\) as “renormalized” versions of \(\varphi (y)\). A path in \((Y,\sim )\) leading from \(\mathbf {x}\) to the trivial instance thus can be seen as the iteration of progressively renormalized samples.
A positive example of this analogy could be that of the spanning forest encoding of the NPP with real-space renormalization schemes for Ising spins: an example of an \(\mathscr {R}\) could be a so-called block spin transformation (Kadanoff 1966), where suitable averages are taken over small local subsets of spins, which are then progressively scaled up to larger system sizes to explore their critical behavior. Only certain block variables will work for such schemes, depending on the underlying symmetries of the problem, just as, in the earlier subsection, only the sums of numbers \(a_i\) preserve the optimal solutions. Such simple real-space scalings do not, however, always exist for our optimization schemes: the prepartition encoding of the TSP, for example, cannot be rephrased as a coarse-grained (i.e., reduced-size) TSP. To see this, simply observe that the evaluation of a tour in the restricted model still requires an optimization over multiple incoming and outgoing connections (roads) for every city, i.e., the information of intercity distances cannot be collapsed in any way upon the transition from a larger (less restricted) to a smaller (more restricted) problem. This does not, however, rule out the possibility of, say, a renormalization-type scaling in some sort of generalized Fourier space. In the case of landscapes on permutation spaces, the characters of the symmetric group provide a suitable Fourier-like basis (Rockmore et al. 2002), which seems to be applicable to the TSP and certain assignment problems. These and other possibilities are currently being explored, since it seems that deep similarities may underlie relatively superficial differences in the nature of the transformations involved in renormalization groups and the optimization-facilitating encodings that are the subject of this paper.
5 Heuristic Optimization over Y
5.1 General Considerations
So far, we were only concerned with the abstract structure of cover-encoding maps \(\varphi :Y\rightarrow 2^X\) and the adjacencies \(\sim \) in their encodings Y. On this theoretical basis, we can now construct a search-based optimization heuristic that generalizes the approaches of Ruml et al. (1996) and our earlier work (Klemm et al. 2012). The idea is very simple: If we have an accurate and efficiently computable heuristic, we can quickly obtain good upper bounds \(\alpha _f(y)\ge \tilde{F}(y)\) for each of the restricted problems \((\varphi (y),f)\). The properties (R1) and (R2) guarantee the existence of nonincreasing paths from an arbitrary initial encoding \(y_0\) down to a final encoding \(\hat{y}\). Steps to adjacent encodings that decrease \(\alpha _f\) therefore will have a bias toward the optimal solution of the original problem.
The fact that we have to rely on the quality of the estimate \(\alpha _f(y)\approx \tilde{F}(y)\) also suggests that it should be more efficient to restart the search often rather than try to overcome barriers of local minima in the landscape \((Y,\alpha _f)\). In the examples above, local minima in \((Y,\alpha _f)\) can, as we have proved, appear only due to insufficient accuracy of the heuristic solutions \(\alpha _f(y)\) for some encodings.
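The search scheme described in the two paragraphs above can be sketched generically. The code below is an illustrative skeleton, not the simulation code used for the figures: the function names, the acceptance rule (accept a randomly chosen neighbor if \(\alpha _f\) does not increase), the fixed step budget, and the toy landscape on the integers 0 to 20 are all assumptions made for this example.

```python
import random


def adaptive_walk(y0, neighbors, alpha, max_steps, rng):
    """Walk on (Y, ~), accepting a random neighbor if alpha does not increase."""
    y, cost = y0, alpha(y0)
    for _ in range(max_steps):
        candidate = rng.choice(neighbors(y))
        candidate_cost = alpha(candidate)
        if candidate_cost <= cost:  # neutral steps are accepted as well
            y, cost = candidate, candidate_cost
    return y, cost


def restarted_search(initial, neighbors, alpha, restarts, max_steps, seed=0):
    """Run several short adaptive walks and keep the best final encoding."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        result = adaptive_walk(initial(rng), neighbors, alpha, max_steps, rng)
        if best is None or result[1] < best[1]:
            best = result
    return best


# Toy landscape without local minima: encodings 0..20, optimum at y = 7.
def toy_neighbors(y):
    return [z for z in (y - 1, y + 1) if 0 <= z <= 20]


def toy_alpha(y):
    return abs(y - 7)


best_y, best_cost = restarted_search(
    lambda rng: rng.randint(0, 20), toy_neighbors, toy_alpha,
    restarts=5, max_steps=200,
)
print(best_y, best_cost)
```

Restarting from fresh initial encodings, rather than adding a mechanism for escaping local minima, reflects the observation above that any trap in \((Y,\alpha _f)\) is an artifact of the heuristic estimate and not of the encoded landscape itself.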
 1. The cover-encoding map \(\varphi :Y\rightarrow 2^X\) should be of a form that guarantees that \((Y,\sim ,\tilde{F})\) has no local optima, i.e., the properties (R1), (R2), (Y1), and (Y2) should hold.
 2. The paths in \((Y,\sim )\) connecting large sets \(\varphi (y)\) to smaller ones should not contain many steps along which the sets do not shrink. For instance, while the prepartition encoding for the NPP always has a strictly coarse-grained neighbor, this is not the case for the prepartition encoding for the TSP. We therefore suspect that other encodings for the TSP will work better in general.
 3. The heuristic producing \(\alpha _f(y)\) needs to be efficient, ideally not much slower than the function evaluations for the initial cost function f.
We select the MMP and the MCP as examples because (1) oracle functions and encodings can be constructed that guarantee the absence of strict local minima; and (2) there is a simple and efficient algorithm for exact computation of \(\tilde{F}(y)\) for each \(y \in Y\). So we do not require heuristics. We leave the combination of cover-encoding maps with nontrivial heuristics for a future manuscript.
5.2 Maximum Matching Problems
5.3 Maximum Clique Problems
Figure 4 shows the time evolution of the cost of adaptive walks on the encoded landscapes of graph cliques encoded by node sequences. The figure caption contains details on the instances and relevant definitions can be found in Sect. 3.4.6. We plot the difference with the minimum cost \(\tilde{F}(y)\), so that a plotted value of 0 means the global optimum has been found.
Our tentative conclusion is that the time to reach the optimal solution scales moderately with problem size. The standard deviation over realizations (error bars in the plot) also indicates a moderate variation of optimization time across these randomly generated instances.
6 Discussion and Conclusions
In this contribution we have shown that, in principle, it is possible to construct a genotypic encoding for any given phenotypically encoded combinatorial optimization problem with the property that the encoded landscape has no strict local minima. The construction hinges on three ingredients: a cover-encoding map \(\varphi :Y\rightarrow 2^X\) that satisfies a few additional conditions, a suitable adjacency relation on Y, and an oracle function that (miraculously) returns the optimal cost value on the restrictions of the original problem to the covering sets \(\varphi (y)\). Of course, if we had such an oracle function in practice, we would not need a search heuristic in the first place.
Nevertheless, the concepts of oracle functions and cover-encoding maps are not just an empty exercise. We have seen that cover-encoding maps \(\varphi \) give rise to practically useful encodings provided there is a good deterministic heuristic for the restriction of the optimization problem to \(\varphi (y)\). For the NPP, it turns out that the Karmarkar–Karp differencing algorithm (Karmarkar and Karp 1982; Boettcher and Mertens 2008) provides a very good approximation to the oracle function. The prepartition encoding proposed by Ruml et al. (1996), on the other hand, ensures that the landscape of the oracle function is of the desirable type that has no local minima. Together these two facts make the work of Ruml et al. (1996) a showcase application of the theory developed here.
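For readers unfamiliar with the differencing heuristic, the following sketch shows its standard form on a plain NPP instance: the two largest numbers are repeatedly replaced by their difference, and the last remaining number is the discrepancy of the partition found. The function names are chosen for this illustration, and the brute-force routine is included only to compare against the exact optimum on a tiny instance; applying the heuristic to a restricted problem \(\varphi (y)\) would additionally require the encoding-specific preprocessing, which is not shown here.

```python
import heapq
from itertools import product


def karmarkar_karp(numbers):
    """Discrepancy found by the differencing heuristic (an upper bound)."""
    heap = [-a for a in numbers]  # max-heap via negated values
    heapq.heapify(heap)
    while len(heap) > 1:
        largest = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        # Committing the two largest numbers to opposite sides leaves
        # their difference as a single new number.
        heapq.heappush(heap, -(largest - second))
    return -heap[0] if heap else 0


def optimal_discrepancy(numbers):
    """Exact minimum discrepancy by exhaustive enumeration (tiny inputs only)."""
    return min(
        abs(sum(s * a for s, a in zip(signs, numbers)))
        for signs in product((1, -1), repeat=len(numbers))
    )


a = [8, 7, 6, 5, 4]
print(karmarkar_karp(a), optimal_discrepancy(a))
```

On this instance the heuristic returns a discrepancy of 2 while the optimum is 0 (8 + 7 = 6 + 5 + 4), which illustrates that the heuristic value is an upper bound on the oracle value rather than the oracle value itself.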
The numerical simulations of Sect. 5 strongly suggest that encodings with local-minima-free landscapes indeed admit efficient optimization by local search-based methods also for other optimization problems. Hence the theoretical results obtained here are of practical relevance provided a sufficiently accurate approximation to the oracle function can be computed. The precise meaning of the phrase “sufficiently accurate approximation” remains an open question for future research. We suspect, however, that the main problem arises when the approximation claims \(\alpha _f(y')<\alpha _f(y)\), suggesting that a step from y to \(y'\) be accepted, while \(\tilde{F}(y')>\tilde{F}(y)\) holds, suggesting the step to \(y'\) should not be taken.
The construction of encodings for several well-known optimization problems also highlights the connections between encodings and a natural notion of coarse-graining for optimization problems. This also suggests a link to renormalization group methods commonly used in statistical physics. While it is clear that there is not a trivial correspondence, and that real-space coarse-grainings are just a particular subclass of encodings, this connection certainly deserves further study. The formalism laid out here at least provides a promising starting point.
An important issue in biology is the fact that encodings as symbolized by the genotype–phenotype map are themselves subject to evolutionary changes because the mechanisms of development evolve. It is well known that features of the genotype–phenotype map, such as robustness (Wagner 2005) and accessibility (Fontana and Schuster 1998; Ndifon et al. 2009), have a key influence on evolution in the long term. Mathematical approaches that focus on the properties of encodings thus may become a very useful component in formal theories of evolvability and developmental evolution.
Acknowledgements
Open access funding provided by Max Planck Society. KK acknowledges funding from MINECO through the Ramón y Cajal program and through project SPASIMM, FIS201680067P (AEI/FEDER, EU). This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 694925).
References
 Applegate DL, Bixby RM, Chvátal V, Cook WJ (2006) The traveling salesman problem. Princeton University Press, Princeton
 Banzhaf W, Leier A (2006) Evolution on neutral networks in genetic programming. In: Yu T, Riolo R, Worzel B (eds) Genetic programming theory and practice III. Springer, New York, pp 207–221
 Boettcher S, Mertens S (2008) Analysis of the Karmarkar-Karp differencing algorithm. Eur Phys J B 65:131–140
 Bomze IM, Budinich M, Pardalos PM, Pelillo M (1999) The maximum clique problem. In: Du DZ, Pardalos PM (eds) Handbook of combinatorial optimization, supplement volume A. Kluwer Academic Publishers, Dordrecht, pp 1–74
 Borenstein Y, Moraglio A (eds) (2014) Theory and principled methods for designing metaheuristics. Springer, Berlin
 Choi SS, Moon BR (2008) Normalization for genetic algorithms with nonsynonymously redundant encodings. IEEE Trans Evol Comp 12:604–616
 Ciliberti S, Martin OC, Wagner A (2007) Innovation and robustness in complex regulatory gene networks. Proc Natl Acad Sci USA 104:13591–13596. https://doi.org/10.1073/pnas.0705396104
 Dykhuizen DE, Dean AM, Hartl DL (1987) Metabolic flux and fitness. Genetics 115:25–31
 Engelbrecht A, Richter H (eds) (2014) Recent advances in the theory and application of fitness landscapes. Springer, Berlin
 Fernández P, Solé RV (2007) Neutral fitness landscapes in signalling networks. J R Soc Interface 4:41–47
 Flamm C, Stadler BMR, Stadler PF (2007) Saddles and barrier in landscapes of generalized search operators. In: Stephens CR, Toussaint M, Whitley D, Stadler PF (eds) 9th international workshop on foundations of genetic algorithms IX, FOGA 2007, Mexico City, Mexico, January 8–11, 2007. Lecture notes in computer science, vol 4436. Springer, Berlin, pp 194–212
 Flamm C, Ullrich A, Ekker H, Mann M, Högerl D, Rohrschneider M, Sauer S, Scheuermann G, Klemm K, Hofacker IL, Stadler PF (2010) Evolution of metabolic networks: a computational framework. J Syst Chem 1:4
 Fontana W, Schuster P (1998) Continuity in evolution: on the nature of transitions. Science 280:1451–1455
 Gutin G, Punnen AP (eds) (2007) The traveling salesman problem and its variations, combinatorial optimization, vol 12. Springer, Berlin
 Hammack R, Imrich W, Klavžar S (2016) Handbook of product graphs, 2nd edn. CRC Press, Boca Raton
 Kadanoff LP (1966) Scaling laws for Ising models near \(t_c\). Physics 2:263–272
 Karmarkar N, Karp RM (1982) The differencing method of set partitioning. Computer Science Division (EECS), University of California, Berkeley, CA
 Klemm K, Mehta A, Stadler PF (2012) Landscape encodings enhance optimization. PLoS ONE 7:e34780
 Knowles JD, Watson RA (2002) On the utility of redundant encodings in mutation-based evolutionary search. In: Guervós JJM, Adamidis P, Beyer HG, Schwefel HP, Fernández-Villacañas JL (eds) Parallel problem solving from nature - PPSN VII, vol 2439. Lecture notes in computer science. Springer, Berlin, pp 88–98
 Lovász L, Plummer MD (1986) Matching theory, annals of discrete mathematics, vol 29. North-Holland, Amsterdam
 Mertens S (2006) The easiest hard problem: number partitioning. In: Percus A, Istrate G, Moore C (eds) Computational complexity and statistical physics. Oxford University Press, Oxford, pp 125–140
 Ndifon W, Plotkin JB, Dushoff J (2009) On the accessibility of adaptive phenotypes of a bacterial metabolic network. PLoS Comput Biol 5:e1000472. https://doi.org/10.1371/journal.pcbi.1000472
 Neumann F, Witt C (2010) Bioinspired computation in combinatorial optimization. Springer, Berlin
 Östergård PRJ (2002) A fast algorithm for the maximum clique problem. Discr Appl Math 120:197–207
 Østman B, Hintze A, Adami C (2010) Critical properties of complex fitness landscapes. In: Fellermann H, Dörr M, Hanczyc MM, Laursen LL, Maurer SE, Merkle D, Monnard PA, Støy K, Rasmussen S (eds) Artificial life XII. MIT Press, Cambridge, pp 126–132
 Reidys CM, Stadler PF (2002) Combinatorial landscapes. SIAM Rev 44:3–54
 Rockmore D, Kostelec P, Hordijk W, Stadler PF (2002) Fast Fourier transform for fitness landscapes. Appl Comput Harm Anal 12:57–76
 Rosten OJ (2012) Fundamentals of the exact renormalization group. Phys Rep 511:177–272
 Rothlauf F (2006) Representations for genetic and evolutionary algorithms, 2nd edn. Springer, Heidelberg
 Rothlauf F (2011) Design of modern heuristics: principles and application. Springer, Heidelberg
 Rothlauf F, Goldberg DE (2003) Redundant representations in evolutionary computation. Evol Comput 11:381–415
 Ruml W, Ngo J, Marks J, Shieber S (1996) Easily searched encodings for number partitioning. J Optim Theory Appl 89:251–291
 Schuster P, Fontana W, Stadler PF, Hofacker IL (1994) From sequences to shapes and back: a case study in RNA secondary structures. Proc R Soc Lond B 255:279–284
 Teranishi Y (2005) The number of spanning forests of a graph. Discrete Math 290:259–267
 Wagner A (2005) Robustness, evolvability, and neutrality. FEBS Lett 579:1772–1778
 Wilson KG (1971) Renormalization group and critical phenomena. I. Renormalization group and the Kadanoff scaling picture. Phys Rev B 4:3174–3183
 Wilson KG, Kogut J (1974) The renormalization group and the \(\epsilon \) expansion. Phys Rep 12:75–199
 Wright S (1932) The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: Jones DF (ed) Proceedings of the sixth international congress on genetics, vol 1, pp 356–366
 Wright S (1967) “Surfaces” of selective value. Proc Nat Acad Sci USA 58:165–172
 Yu T, Miller JF (2002) Finding needles in haystacks is not hard with neutrality. Lect Notes Comp Sci 2278:13–25, EuroGP 2002
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.