1 Introduction

Evolutionary algorithms (EAs) [1] are problem-independent and have been reported to perform relatively well on problems in search, optimization, and artificial intelligence.

The analysis of problem hardness (difficulty) has been studied for decades [2]. The hardness of an instance for a particular algorithm is typically measured by comparing the optimization precision reached after a certain number of iterations against that of other algorithms, and/or by comparing the number of iterations taken to reach the best solution [3]. There have been numerous efforts identifying challenges such as isolation, deception, multi-modality, and the size of basins of attraction [4], as well as landscape metrics [5], autocorrelation structures, and distributions of local minima [6].

Performance measurement is an important topic for optimization algorithms, and effectiveness and efficiency are two aspects of performance. Effectiveness refers to the quality of the obtained solution; most of its indicators are directly related to the optimized objective(s) [7], and there are also effective metrics of solution (population) diversity [8]. Efficiency is usually characterized by runtime behavior, i.e., the order of an algorithm's computation time and its memory requirements [9], and its indicators mainly concern the algorithm itself. Efficiency can be increased drastically by embedding domain knowledge in the algorithm, while effectiveness depends highly on the properties of the problem [10]. To our knowledge, the relationship between efficiency and effectiveness, as well as the relationships among efficiency metrics, have not yet been analyzed deeply. That is the purpose of this paper.

2 Efficiency Metrics of Evolutionary Algorithms

2.1 Population Size and Termination Generation Number

To calculate the efficiency, only individuals whose fitness is evaluated by the environment are counted toward the population size, not those evaluated by a surrogate; in other words, only individuals with real rather than virtual evaluations. As shown in Fig. 1, there are two kinds of fitness evaluation for individuals: directly from the environment and from the surrogate. Generally, a surrogate provides a cheaper fitness function, but at the cost of accuracy in the fitness estimate [11].

Fig. 1. Individual evaluation structure in EAs.

The product of the population size and the number of generations is the number of individuals whose information is used by the algorithm, i.e., the number of individuals evaluated by the environment rather than estimated by a surrogate. For the same obtained best solution \( {\text{X}}_{best}^{real} \), the smaller this product is, the fewer resources the algorithm costs and the higher its efficiency.

Label the number of individuals with real evaluations up to the current evolutionary generation t as \( N_{ind}(t) \), which is simply the product of the population size |P| and t:

$$ N_{ind} (t) = \left| P \right| \times t $$
(1)

\( N_{ind} (t) \) should not exceed |S|. It can be taken for granted that EAs are better at finding optima than traversal methods, since the latter must visit every individual in the search space. Therefore, for the termination generation T of each algorithm, there should be:

$$ \left| S \right| \ge N_{ind} (T) $$
(2)
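Eqs. (1) and (2) can be sketched as follows; this is a minimal illustration, and all function and variable names (`n_ind`, `within_budget`, etc.) are assumptions, not from the paper.

```python
# Sketch of Eqs. (1)-(2): counting individuals with real (environment,
# non-surrogate) evaluations, and the search-space bound on termination.

def n_ind(pop_size: int, t: int) -> int:
    """N_ind(t) = |P| * t, Eq. (1)."""
    return pop_size * t

def within_budget(pop_size: int, t_term: int, space_size: int) -> bool:
    """Termination bound |S| >= N_ind(T), Eq. (2)."""
    return n_ind(pop_size, t_term) <= space_size

# Example: 50 individuals for 200 generations on a space of 10**6 points.
assert n_ind(50, 200) == 10_000
assert within_budget(50, 200, 10**6)
```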

The relationship among \( \left| P \right|,\,T,\,N_{ind} (T)\,{\text{and}}\,\left| S \right| \) is illustrated in Fig. 2, in which the outer curve of the shaded area is |P| × T = |S|, which is also the upper bound of \( N_{ind}(T) \). The lower bounds for T and |P| are labeled \( T_0 \) and \( |P|_0 \), respectively.

Fig. 2. Relationship of population size and generation number.

According to (2), it is easy to see that points A, B and C in Fig. 2 are feasible examples of \( N_{ind}(T) \), while point D is not. The product |P| × T decreases in the order D, A, B, C.

In fact, \( N_{ind}(T) \) has been used as an efficiency metric many times [12, 13], but the relationship between |P| and T was not given.

2.2 Space Complexity and Time Complexity

The definition of \( N_{ind}(T) \) does not include enough elements. Although \( N_{ind}(T) \) captures time complexity through T and space complexity through |P|, these are only the basic complexities.

The space complexity of an algorithm is related to the population size, but cannot simply be represented by it. Many memory-based algorithms use the same population size as other algorithms, yet their space complexity is higher.

Similarly, the time complexity of an algorithm is related to the termination generation number, but cannot simply be represented by it. Many data-mining-embedded algorithms spend extra time extracting rules that guide the algorithm to explore/exploit promising regions of the search space. In addition to the search process, establishing a high-quality model in EAs may require a long time [14]. In a competing-heuristics setting [15], time complexity is also heavy, because solutions are repeatedly generated and evaluated until one dominates the worst solution in the current population.

In order to compare space complexity, the memory cost per individual is considered. Label the memory cost of an algorithm during the evolutionary process as \( M_C \), and define the memory cost per individual as:

$$ m_{C} = \frac{{M_{C} }}{\left| S \right|} $$
(3)

Similarly, in order to compare time complexity, the time cost per individual is considered. Label the time cost of an algorithm during the evolutionary process as \( T_C \), and define the time cost per individual as:

$$ t_{C} = \frac{{T_{C} }}{\left| S \right|} $$
(4)
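The per-individual normalization in Eqs. (3) and (4) can be sketched as below; \( M_C \) and \( T_C \) would be measured quantities, and the numbers used here are purely illustrative.

```python
# Sketch of Eqs. (3)-(4): memory and time cost normalized by the size of
# the search space. Names and values are illustrative.

def per_individual(total_cost: float, space_size: int) -> float:
    """m_C = M_C / |S| (Eq. 3), or t_C = T_C / |S| (Eq. 4)."""
    return total_cost / space_size

m_c = per_individual(2.0e6, 10**6)  # e.g. total memory cost M_C
t_c = per_individual(3.0e3, 10**6)  # e.g. total time cost T_C
assert m_c == 2.0
assert abs(t_c - 0.003) < 1e-12
```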

It has been noted many times in the literature that, under certain efficiency conditions, time complexity is inversely proportional to space complexity. Sometimes, in order to save time, an algorithm must run with high space complexity; conversely, in order to save memory it must run for a long time. The relationship between \( m_C \) and \( t_C \) is shown in Fig. 3.

Fig. 3. Relationship between \( m_C \) and \( t_C \).

For a given problem, an algorithm with high \( m_C \) will have low \( t_C \). Therefore, the product of \( m_C \) and \( t_C \) also has an upper bound. With this idea, a complexity metric can be defined as:

$$ p_{C} = \frac{\left| S \right|}{{\left| S \right| + \omega_{P1} M_{C} \times T_{C} }} $$
(5)

where \( \omega_{P1} = e^{\left\lfloor {\ln \left| S \right|} \right\rfloor - \left\lfloor {\ln \left( {M_{C} \times T_{C} } \right)} \right\rfloor } \) balances the magnitudes. Therefore, \( p_{C} \in \left( {0,1} \right) \).
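The complexity metric of Eq. (5), including the magnitude-balancing weight \( \omega_{P1} \), can be sketched as below; the input values are illustrative assumptions.

```python
import math

# Sketch of Eq. (5): complexity metric p_C with the balancing weight
# w_P1 = exp(floor(ln|S|) - floor(ln(M_C * T_C))).

def p_complexity(space_size: int, m_cost: float, t_cost: float) -> float:
    w_p1 = math.exp(math.floor(math.log(space_size))
                    - math.floor(math.log(m_cost * t_cost)))
    return space_size / (space_size + w_p1 * m_cost * t_cost)

# With illustrative costs, the metric stays in the open interval (0, 1).
p_c = p_complexity(10**6, 2.0e6, 3.0e3)
assert 0.0 < p_c < 1.0
```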

2.3 Efficiency of Evolutionary Algorithms

The algorithm should not visit every individual. It is assumed that increasing \( M_C \) and \( T_C \) decreases |P| × T; that is, \( m_C \) and \( t_C \) are inversely proportional to \( N_{ind}(T) \) for the same algorithm and problem. Therefore:

$$ \frac{{\partial N_{ind} \left( T \right)}}{{\partial M_{C} }} < 0,\,\frac{{\partial N_{ind} \left( T \right)}}{{\partial T_{C} }} < 0 $$
(6)

Based on the relationship of |P| and T, one of the metrics of the efficiency of algorithm can be defined as:

$$ p_{N1} = 1 - \omega_{P2} \frac{{N_{ind} \left( T \right)}}{\left| S \right|} = 1 - \omega_{P2} \frac{\left| P \right| \times T}{\left| S \right|} $$
(7)

where \( \omega_{P2} = e^{\left\lfloor {\ln \left| S \right|} \right\rfloor - \left\lfloor {\ln \left( {\left| P \right| \times T} \right)} \right\rfloor } \) balances the magnitudes. It is easy to see that \( p_{N1} \in \left[ {0,1} \right] \); the bigger \( p_{N1} \) is, the better the efficiency.
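Eq. (7) can be sketched in the same way; again, the balancing weight follows the definition above, and the sample inputs are assumptions.

```python
import math

# Sketch of Eq. (7): p_N1 = 1 - w_P2 * (|P|*T)/|S|, with
# w_P2 = exp(floor(ln|S|) - floor(ln(|P|*T))).

def p_n1(pop_size: int, t_term: int, space_size: int) -> float:
    n = pop_size * t_term
    w_p2 = math.exp(math.floor(math.log(space_size)) - math.floor(math.log(n)))
    return 1.0 - w_p2 * n / space_size

# Illustrative run: 50 individuals, 200 generations, |S| = 10**6.
val = p_n1(50, 200, 10**6)
assert 0.0 <= val <= 1.0
```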

Based on the analysis of \( M_C \), \( T_C \) and \( N_{ind}(T) \), the following basic logical relations can be given:

$$ \begin{array}{*{20}c} {M_{C} \uparrow \Rightarrow T_{C} \downarrow ,\,T_{C} \uparrow \Rightarrow M_{C} \downarrow ,\,M_{C} \uparrow \Rightarrow N_{ind} \left( T \right) \downarrow ,\,} \\ {T_{C} \uparrow \Rightarrow N_{ind} \left( T \right) \downarrow ,\,N_{ind} \left( T \right) \downarrow \Rightarrow p_{N1} \uparrow ,\,M_{C} \uparrow \Rightarrow p_{C} \downarrow ,\,T_{C} \uparrow \Rightarrow p_{C} \downarrow } \\ \end{array} $$
(8)

Based on the above relations, the efficiency metric can be defined as:

$$ p_{e1} = \omega_{1} p_{N1} + \omega_{2} p_{C} $$
(9)

where \( \omega_{1} + \omega_{2} = 1 \).

This efficiency metric includes the population size, the number of termination generations, the time complexity and the memory complexity. All of these parameter values can be determined before running the algorithm; compared with effectiveness, this is the advantage of efficiency metrics.
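Eq. (9) is a simple convex combination; a minimal sketch, with illustrative component values for \( p_{N1} \) and \( p_C \):

```python
# Sketch of Eq. (9): efficiency p_e1 = w1 * p_N1 + w2 * p_C, with w1 + w2 = 1.

def p_e1(p_n1_val: float, p_c_val: float, w1: float = 0.5) -> float:
    w2 = 1.0 - w1  # enforce the constraint w1 + w2 = 1
    return w1 * p_n1_val + w2 * p_c_val

# Illustrative values: p_N1 = 0.45, p_C = 0.57, equal weights.
assert abs(p_e1(0.45, 0.57) - 0.51) < 1e-9
```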

3 Relationship of Efficiency and Effectiveness of Evolutionary Algorithms

3.1 Effectiveness in EAs

Clearly, of two algorithms, the one that obtains a solution nearer to the real optimum has higher effectiveness. Therefore, a measurement of effectiveness based on the distance of the obtained solution from the real optimum can be given. Generally, the real optimum is unknown, but a solution better than or equal to the best obtained one can be given. Label this solution \( {\text{X}}_{best}^{virtual} \) and the precision \( \delta \). Label \( d_{b} = \left| {{\text{X}}_{best}^{real} - {\text{X}}_{best}^{virtual} } \right| \); then \( d_b \) can be regarded as a metric of the population's improvement. According to Deb's necessary properties for metrics [8], the effectiveness \( p_{e2} \) should satisfy:

$$ p_{e2} \in \left[ {0,1} \right],\frac{{\partial p_{e2} }}{{\partial d_{b} }} < 0 $$
(10)

This means that as the distance decreases, the effectiveness increases. The progress rate [16] is one metric that satisfies this property, but it is complex to calculate.

For simplicity, the following expression is used:

$$ p_{e2} = \frac{\delta }{{\left| {\left| {{\text{X}}_{best}^{real} - {\text{X}}_{best}^{virtual} } \right|} \right| + \delta }} = \frac{\delta }{{d_{b} + \delta }} $$
(11)
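Eq. (11) can be sketched directly; the distance and precision values below are illustrative.

```python
# Sketch of Eq. (11): effectiveness from the distance d_b between the best
# obtained solution and the reference (virtual) best, at precision delta.

def p_e2(d_b: float, delta: float) -> float:
    return delta / (d_b + delta)

assert p_e2(0.0, 1e-3) == 1.0        # optimum reached: full effectiveness
assert p_e2(1.0, 1e-3) < p_e2(0.5, 1e-3)  # larger distance, lower effectiveness
```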

3.2 Relationship Between Efficiency and Effectiveness

Usually, the longer an EA runs (with a sufficient population size), the higher the solution quality [14]. However, in real-world scenarios computational resources are often limited, which leads to a tradeoff between efficiency and effectiveness. Thus, although efficiency \( p_{e1} \) and effectiveness \( p_{e2} \) are two elements of the performance p, they are not independent. If an algorithm is assigned a large value of |P| or T, it can be expected to have higher effectiveness, but a large |P| or T also means low efficiency; in other words, an algorithm with low \( p_{e1} \) should have high \( p_{e2} \).

Assume that, for a given algorithm, \( p_{e1} \) is inversely proportional to \( p_{e2} \): when efficiency improves, effectiveness decreases, and vice versa. There should therefore be a tradeoff between \( p_{e1} \) and \( p_{e2} \).

But different algorithm variants perform differently: some variants increase efficiency, some increase effectiveness, and some increase both. The last kind is the one that is preferred.

The inverse relationship between \( p_{e1} \) and \( p_{e2} \) can be seen from the aspect of the space precision δ. Denote the dimension (the number of variables) as d, so that \( X = \{ x_i \mid i = 1, 2, \ldots , d\} \), and let \( D_i \) be the scope of the i-th dimension. Label the upper and lower bounds of each dimension \( u_{Di} \) and \( l_{Di} \), and let \( d_{D} = \prod\nolimits_{i = 1}^{d} {\left( {u_{Di} - l_{Di} } \right)} \); then:

$$ \left| S \right| = \prod\limits_{i = 1}^{d} {\left| {D_{i} } \right|} = \prod\limits_{i = 1}^{d} {\frac{{u_{Di} - l_{Di} }}{\delta }} = \frac{{d_{D} }}{{\delta^{d} }} $$
(12)

where \( d_D \) is a constant and δ > 0. Therefore, according to (8) and (12), there is:

$$ \frac{{\partial p_{e1} }}{\partial \delta } > 0,\,\frac{{\partial p_{e2} }}{\partial \delta } < 0 $$
(13)
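The discretized search-space size of Eq. (12) can be sketched as follows; the per-dimension bounds and precision are illustrative assumptions.

```python
import math

# Sketch of Eq. (12): |S| = prod_i (u_Di - l_Di) / delta = d_D / delta**d.

def space_size(bounds, delta: float) -> float:
    size = 1.0
    for lo, hi in bounds:
        size *= (hi - lo) / delta
    return size

# Three dimensions, each on [0, 10], at precision 0.1: (10 / 0.1)**3 points.
assert math.isclose(space_size([(0, 10)] * 3, 0.1), 10**6)
```

Since \( d_D \) is fixed, halving δ multiplies |S| by \( 2^d \), which is why a finer precision lowers \( p_{e1} \) while raising \( p_{e2} \).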

Therefore, from the viewpoint of the precision δ, \( p_{e1} \) is approximately inversely proportional to \( p_{e2} \), and they can be seen as conflicting objectives. For a given algorithm, the relationship between \( p_{e1} \) and \( p_{e2} \) is illustrated in Fig. 4, in which the intersection point “A” of \( p_{e1} \) and \( p_{e2} \) is the tradeoff. At point “B”, \( p_{e1} \) is increased while \( p_{e2} \) is decreased; similarly, at point “C”, \( p_{e2} \) is increased while \( p_{e1} \) is decreased.

Fig. 4. Relationship between \( p_{e1} \) and \( p_{e2} \) for a given algorithm.

The inverse relationship between \( p_{e1} \) and \( p_{e2} \) can also be seen from the aspect of the complexity metric \( p_C \). It is easy to see that:

$$ \frac{{\partial p_{C} }}{{\partial M_{C} }} < 0,\,\frac{{\partial p_{C} }}{{\partial T_{C} }} < 0 $$
(14)

Consequently, there is:

$$ \frac{{\partial p_{e1} }}{{\partial M_{C} }} < 0,\,\frac{{\partial p_{e1} }}{{\partial T_{C} }} < 0,\,\frac{{\partial p_{e2} }}{{\partial M_{C} }} > 0,\,\frac{{\partial p_{e2} }}{{\partial T_{C} }} > 0 $$
(15)

Based on the above analysis, the logical relationships among the parameters and the algorithm's performance are given in Fig. 5.

Fig. 5. Logical relationships among the parameters and algorithm performance.

Algorithm variants are always proposed in order to improve performance. For example, in Fig. 6, suppose the original performance tradeoff is at the marked point. The tradeoff may then be improved to points “A”, “B” or “C”, of which “A” is the best tradeoff, having both high \( p_{e1} \) and high \( p_{e2} \).

Fig. 6. Relationships among algorithms' efficiency and effectiveness.

3.3 Definition of Performance

Based on the above analysis, a more sophisticated performance indicator is proposed as:

$$ {\text{p}} = \left( {p_{e1} ,\,p_{e2} } \right) $$
(16)

Based on this indicator, the definition of dominance relationship on performance can be given.

Definition 1 (Dominance Relationship on Performance).

For two performance vectors \( {\text{p}}_{ 1} = \left( {p_{e1,1} ,\,p_{e2,1} } \right) \) and \( {\text{p}}_{2} = \left( {p_{e1,2} ,\,p_{e2,2} } \right) \), the preference orders on the set of performance vectors can be defined as:

  • \( {\text{p}}_{ 1} \succ {\text{p}}_{2} \left( {{\text{p}}_{ 1} \,{\text{dominates p}}_{ 2} } \right) \), if p 1 is not worse than p 2 in any element, and is better in at least one of the elements;

  • \( {\text{p}}_{1} \succeq {\text{p}}_{2} \left( {{\text{p}}_{1} \,{\text{weakly dominates p}}_{2} } \right) \), if p 1 is not worse than p 2 in any element;

  • \( {\text{p}}_{ 1} = {\text{p}}_{2} \left( {{\text{p}}_{ 1} \,{\text{equals}}\,{\text{p}}_{2} } \right) \), if p 1 = p 2;

  • \( {\text{p}}_{ 1} \left\| {{\text{p}}_{2} } \right. \)(p1 and p2 are incomparable to each other), if neither p 1 weakly dominates p 2 nor p 2 weakly dominates p 1.

The remaining relationships are defined accordingly, i.e., \( {\text{p}}_{1} \succ {\text{p}}_{2} \) is equivalent to \( {\text{p}}_{2} \prec {\text{p}}_{1} \), etc.
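Definition 1 is a Pareto-style comparison; it can be sketched as below (function names are illustrative, not from the paper).

```python
# Sketch of Definition 1: dominance between two performance vectors
# p = (p_e1, p_e2).

def weakly_dominates(p1, p2) -> bool:
    """p1 is not worse than p2 in any element."""
    return all(a >= b for a, b in zip(p1, p2))

def dominates(p1, p2) -> bool:
    """Not worse in any element, strictly better in at least one."""
    return weakly_dominates(p1, p2) and any(a > b for a, b in zip(p1, p2))

def incomparable(p1, p2) -> bool:
    """Neither vector weakly dominates the other."""
    return not weakly_dominates(p1, p2) and not weakly_dominates(p2, p1)

assert dominates((0.6, 0.8), (0.5, 0.8))
assert incomparable((0.6, 0.4), (0.4, 0.6))
assert not dominates((0.5, 0.5), (0.5, 0.5))  # equal vectors do not dominate
```

The same predicates apply unchanged to Definitions 2 and 3, since both compare the underlying performance vectors.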

4 Application of Performance Measurement

Performance p is determined by an algorithm A and the optimization problem pl. From the definition of \( p_{e1} \), its variables are |P| and T, both related to the algorithm A. Similarly, from the definition of \( p_{e2} \), the variables δ and \( d_{b} \) are related to pl and A. Therefore, to distinguish performances, p(pl, A) can be used to state explicitly that the performance is for the concrete problem pl and the concrete algorithm A.

4.1 Algorithm Comparison

For the same problem pl with the same space size |S|, all EAs can be compared by their performance p. The algorithm domination relationship is defined as follows.

Definition 2 (Algorithms Domination Relationship on Certain Problem).

For two EAs \( A_1 \) and \( A_2 \), suppose their performances are \( {\text{p}}_{1} \left( {pl, A_{1} } \right) \) and \( {\text{p}}_{2} \left( {pl, A_{2} } \right) \), respectively. The performance orders can be defined as:

  • \( A_{1} \succ_{pl} A_{2} \left( {A_{1} \,{\text{dominates}}\,{\text{A}}_{ 2} } \right) \), if p 1 is not worse than p 2 in any element, and is better in at least one of the elements;

  • \( A_{1} \succeq_{pl} A_{2} \left( {A_{1} \,{\text{weakly dominates}}\,{\text{A}}_{2} } \right) \), if p 1 is not worse than p 2 in any element;

  • \( A_{1} =_{pl} \,A_{2} \,\left( {A_{1} \,{\text{equals}}\,{\text{A}}_{ 2} } \right) \), if p 1 = p 2;

  • \( A_{1} \left\| {_{pl} A_{2} } \right. \)(A 1 and A 2 are incomparable to each other), if neither p 1 weakly dominates p 2 nor p 2 weakly dominates p 1.

The remaining relationships are defined accordingly, i.e., \( A_{1} \succ_{pl} A_{2} \) is equivalent to \( A_{2} \prec_{pl} A_{1} \), etc.

Suppose that all problems compose a set PL = {pl 1, pl 2, …}. Based on this set, the theoretical case for algorithm comparison can be defined as \( A_{1} \succ_{PL} A_{2} \), and so on. For the same problem with the same precision δ, if three EAs \( A_1 \), \( A_2 \) and \( A_3 \) obtain the same optima and their efficiencies correspond to points C, B and A in Fig. 2, respectively, then there is \( {\text{p}}_{1} \left( {A_{1} ,\,pl} \right) \succ {\text{p}}_{2} \left( {A_{2} ,\,pl} \right) \succ {\text{p}}_{3} \left( {A_{3} ,\,pl} \right) \) and \( A_{1} \succ_{pl} A_{2} \succ_{pl} A_{3} \).

4.2 Problem Hardness Measurement

For two different problems \( pl_1 \), \( pl_2 \) with the same space size |S|, if the same algorithm performs differently on them, with performances \( {\text{p}}_{1} \left( {pl_{1} ,A} \right) \) and \( {\text{p}}_{2} \left( {pl_{2} ,A} \right) \), then problem hardness can be defined as follows.

Definition 3 (Hardness Domination Relationship of Problems on Certain Algorithm).

For a concrete EA A, label the performances for two problems \( pl_1 \) and \( pl_2 \) as \( {\text{p}}_{1} \left( {pl_{1} ,A} \right) \) and \( {\text{p}}_{2} \left( {pl_{2} ,A} \right) \), respectively. The preference orders can be defined as:

  • \( pl_{1} \prec_{A} pl_{2} \left( {pl_{1} \,{\text{dominates }}pl_{2} } \right) \), if p 1 is not worse than p 2 in any element, and is better in at least one of the elements;

  • \( pl_{1} \preceq_{A} pl_{2} \left( {pl_{1} \,{\text{weakly dominates }}pl_{2} } \right) \), if p 1 is not worse than p 2 in any element;

  • \( pl_{1} =_{A} pl_{2} \left( {pl_{1} {\text{ equals }}pl_{2} } \right) \), if p 1 = p 2;

  • \( pl_{1} ||_{A} pl_{2} (pl_{1} \,{\text{and }}pl_{2} \) are incomparable to each other), if neither p 1 weakly dominates p 2 nor p 2 weakly dominates p 1.

The remaining relationships are defined accordingly, i.e., \( pl_{1} \succ_{A} pl_{2} \) is equivalent to \( pl_{2} \prec_{A} pl_{1} \), etc.

Suppose that all algorithms compose a set Al = {A 1, A 2, …}. The theoretical case for problem comparison can be defined as \( pl_{1} \succ_{Al} pl_{2} \), and so on.

Generally speaking, there are two kinds of measurements of problem difficulty: prior estimation and post calculation. The above definition is based on post calculation.

4.3 Comparison Design

Efficiency and effectiveness are metrics for comparing algorithms (and their parameter values) and problems. These two metrics involve at least three elements: the algorithm, the values of its parameters, and the optimized problem. Based on the comparison of these three elements, experiments can be carried out to validate the relationship between efficiency and effectiveness. The comparison of these elements is shown in Table 1, in which the last column gives what can suitably be compared in each case; 0 means the element is the same, and 1 means it differs.

Table 1. Comparison of elements
  • In case No. 1, the three elements (algorithms, parameter values and optimized problems) are all the same; therefore, it is unnecessary to compare performance. This case is, however, commonly used to compute the mean and variance of an algorithm's performance.

  • In case No. 2, the algorithms and their parameter values are the same and only the optimized problems differ. Therefore, the hardness of the optimized problems can be compared via the algorithm's performance.

  • Similarly, case No. 3 can be used to compare the influence of parameter values on algorithm performance, and case No. 4 can be used to compare performance on different problems with different parameter values.

  • In case No. 5, where the compared algorithms have the same parameters, performance can be compared across different algorithms with the same parameter values on the same optimized problems. Generally speaking, however, since parameters are part of an algorithm, different algorithms may have parameters with different meanings, making them incomparable; therefore the value in the last column is performance/none. Similarly, in case No. 6 performance can be compared.

  • In cases No. 7 and 8, both the algorithms and the optimized problems differ; therefore, both elements are incomparable.

Based on the analysis in Table 1, experiments to compare performance correspond to cases No. 3, 5 and 6, and experiments to compare hardness correspond to case No. 2.
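The case analysis above can be sketched as a small decision function. The mapping of the 0/1 flags to case numbers and the string labels are assumptions reconstructed from the bullet list; only the case logic follows the text.

```python
# Sketch of the comparison design in Table 1. Each element (algorithm,
# parameter values, problem) is flagged 0 (same) or 1 (different).

def comparison_target(alg_diff: int, par_diff: int, prob_diff: int) -> str:
    if alg_diff and prob_diff:
        return "none"  # cases 7-8: both elements incomparable
    if alg_diff:
        # cases 5-6: different algorithms on the same problem
        return "performance/none" if not par_diff else "performance"
    if prob_diff:
        # cases 2 and 4: same algorithm on different problems
        return "hardness" if not par_diff else "performance"
    # cases 1 and 3: same algorithm on the same problem
    return "mean/variance" if not par_diff else "performance"

assert comparison_target(0, 0, 1) == "hardness"  # case No. 2
assert comparison_target(1, 0, 1) == "none"      # cases No. 7-8
```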

5 Conclusion

Efficiency and effectiveness, two important metrics for the evaluation of evolutionary algorithms (EAs), were studied in this paper. For efficiency metrics, the population size, the number of termination generations, the space complexity and the time complexity, together with their relationships, were studied. We conclude that the product of the population size and the number of generations should be less than the search-space size, and that the product of the time complexity and the space complexity should also be bounded by a constant. We then studied the relationship between efficiency and effectiveness. Based on these two metrics, not only can EAs be compared, but problem hardness can also be measured. The results reveal important insights into EAs and problem hardness.