1 Introduction

As representative technologies of the third information revolution, Internet of Things (IoT) [1], big data, cloud computing, and edge computing [2,3,4] have gradually become an indispensable part of our life through their coordinated development [5,6,7]. However, the coordinated development of the three technologies leads to an unprecedented increase in the number and scale of Internet services, which greatly increases the demand for high-traffic network communication and flexible network construction. However, it is clear that the traditional network architecture (that is, every adjustment needs to rebuild the substrate network structure) cannot meet the demand, which leads to the problem of Internet rigidity. In this regard, great attention has shifted to network virtualization as a core technology to solve the problem of Internet rigidity [8, 9]. The logical networks may transcend substrate infrastructure maintained, and has the advantage of fast configuration, high resource utilization and high isolation capabilities.

The key stage of network virtualization is to map the virtual network (VN) to the substrate network, that is, Virtual Network Embedding (VNE). The VNE problem has been proven to be an NP-hard problem [10]. Therefore, much work has focused on the research of heuristic algorithms. However, unlike other problems, the components of the solution vector of the VNE problem affect each other, and the order in which different components are solved will affect the solution space of the remaining components. That is, if one of the virtual nodes is mapped to a substrate node first, the other virtual nodes cannot use this substrate node. Therefore, we need to disturb the current solution from time to time in order to get better results, which requires the algorithm to have higher randomness. In addition, the discrete nature of VNE problems may make meta-heuristic algorithms based on direction vectors (such as flower pollination algorithm, differential optimization algorithm, particle swarm algorithm, etc.) invalid. Therefore, the genetic algorithm (GA) based on random search has inherent advantages in solving discrete VNE problems, and has certain optimization value.

Previous work mainly considered the design of algorithm framework, such as: the heuristic algorithm is combined with tabu search algorithm or simulated annealing algorithm to avoid falling into local optimal solution [11, 12], or the mutation operator of GA is added to other heuristic algorithms to increase population diversity. However, the details of the algorithm steps usually retain the traditional design. For example, the crossover probability in the GA is set in a static way, and the mutation gene in the mutation is selected in a random way. This makes the algorithm’s running time shorter and the code easier to implement, but the static method is too dependent on experience and cannot flexibly adapt to multiple environments. In addition, when using the Shortest Path algorithm (SP) to estimate the cost of link mapping, the shortest path may not be able to meet the bandwidth resource constraints of the virtual link due to insufficient substrate network resources. However, compared with the traditional network environment, the Internet of Things with a large number of high-demand physical equipment (such as disaster relief, medical, life support equipment) has higher requirements for network stability and algorithm reliability. Therefore, inappropriate fitness estimation methods will result in mapping schemes whose fitness and quality do not match, which will cause a greater impact on the physical world in the IoT environment. In order to solve these problems, we proposed a hybrid GA called LB-HGA based on the traditional GA model.

The main contributions and our main ideas are summarized as follows:

  1. 1.

    In view of the three cases of: both parents’ fitness is above average, both parents’ fitness is below average, one is better than the mean and the other is worse than the mean, a crossover method based on fitness is proposed. The advantage of this method is that it can not only maintain some randomness, but also effectively the probability of obtaining valid offspring.

  2. 2.

    A mutation gene selection strategy based on pheromone content is proposed. Therein, the pheromone is derived from the ant colony algorithm and is used in this strategy to represent the value of substrate nodes. This strategy can increase or decrease the mutation probability of genes according to their performance. The advantage of this strategy is that it can effectively protect the better offspring obtained by cross operation and improve the probability of the worse offspring being optimized by mutation.

  3. 3.

    A link mapping strategy considering link load balancing and link resource constraints. This strategy can calculate the shortest path that conforms to different resource constraints, which can make the link cost estimation more accurate in the fitness calculation.

The reminder of this paper is organized as follows. Section 2 reviews the existing methods for VNE. Section 3 introduces the network model and problem statement. Section 4 introduces the three core strategies used in LB-HGA method. In Section 5, we describe our proposed method LB-HGA in detail. The performance of our method and other methods is evaluated in Section 6. Section 7 concludes this paper.

2 Related work

A classification strategy [13] based on algorithm logic divides existing VNE methods into optimal algorithm and heuristic algorithm in which the heuristic algorithms can be further divided into traditional heuristic algorithm and meta-heuristic algorithm. Whereas the solution obtained by the optimal algorithm is closer to the optimal solution, these are characterized by high computational time which renders unsuitable for practical delay sensitive scenarios. On the other hand, heuristic algorithms often cannot guarantee an optimal solution but have an appealing characteristic of low time complexity. Therefore, the two approaches present a trade-off between solution Optimality and execution time.

2.1 Optimal algorithms

A typical optimization algorithm is proposed in [14] in which the authors proposed a VNE algorithm based on subgraph isomorphism detection. This method has a good mapping effect for large. In the same year, the authors of [15] for the first time applied a mixed integer linear programming(MIP) model to solve the VNE problem and proposed D-ViNE and RViNE algorithms based on LP relaxation to tame the time complexity of the MIP algorithm. However, this work has less coordination between the two mapping phases (link mapping and node mapping). In order to make up this defect, the authors of [16] proposed a progressive greedy VNE algorithm (PG-VNE), which is shown to result into better coordination between the two phases. In addition, with the development of IoT and other technologies to improve the demand for network service quality, the authors of [17] proposed a dynamic mapping algorithm based on QoS driver to further meet the demands of customized QOS. In the following year, the authors of [18] further considers the perception of energy consumption, avoiding the single consideration of mapping revenue. In recent studies, the authors of [19] proposed a candidate set based mapping algorithm considering delay and geographical location constraints, which is significantly less time complexity than the existing heuristic algorithms. In view of the lack of multi-attribute decision making in the single-domain mapping problem, the authors of [20] proposed a new node ordering method, which comprehensively considered five topology attributes and global network resources, and showed good performance.

Mathematically speaking, any optimization method involves finding the extremum under certain constraints. But in the case of a larger problem which is the case in most scenarios, solving the optimal solution tends to consume large amounts of computing resources. For this reason, the optimal method in the large-scale network environment is not widely used. Therefore, the study of heuristic algorithm which gives a feasible solution at an acceptable cost is important.

2.2 Heuristic algorithms

In the classical algorithm [11], a greedy algorithm based on node matching is used for node mapping, and k-shortest path is used for link mapping. In addition, the authors of [21] proposed a unified enhanced VN embedding algorithm (VNE-UEPSO) based on particle swarm optimization (PSO). However, the algorithm has higher randomness and slower convergence speed. In order to overcome this commonly occurring shortcoming, the authors of [22] proposed a PSO optimization scheme based on probabilistic particle reconstruction. The algorithm sacrifices some computation time, but the result is better than the traditional PSO algorithm. In addition to the PSO algorithm, GA has also attracted wide attention because of its excellent performance. The authors of [23] proposed a VNE strategy (CB-GA) based on the simple node sorting method and GA. The authors of [24] proposed a GA model based on new chromosomes to solve the multi-domain VNE problem. However, both of these algorithms rely on probability for random selection, crossover and variation, so it is difficult to guarantee that an excellent enough solution can be found within a limited number of iterations. In order to make up for these shortcomings, in recent studies, the authors of [25] proposed a virtual network mapping strategy based on cellular automata genetic mechanism. The algorithm introduced cellular automata model of network nodes, effectively guides the crossover stage, ensures the diversity of population, and avoids premature convergence. However, since the mutation operation of this algorithm has random variation, the unguided random variation may cause the better individuals that were selected to mutate into the worse ones. Moreover, the algorithm does not clearly consider the load balancing of nodes and links, so there is still some room for optimization.

Based on the above analysis, it can be seen that as far as genetic algorithms are concerned, there is still some room for optimization in the current research.

3 Network model and problem statement

3.1 Substrate network and virtual network model

Figure 1 shows the mapping process consisting of four layers, and the tags on the picture apply to the full-text picture. The substrate network is abstracted as an undirected graph Gs = {Ns,Ls}, where Ns represents the set of substrate nodes and Ls represents the set of substrate links. Each substrate node has functional or non-functional attributes, including the CPU capacity CPU(ns), and the unit price UP(ns) of CPU(ns). Each substrate link also has a set of attributes, including the bandwidth BW(ls) and the unit price UP(ls) of BW(ls). We define the set of substrate paths as Ps. And a substrate path set from substrate node i to substrate node j is represented by Ps(i,j). Similarly, a VN can also be abstracted as a weighted undirected graph Gv = {Nv,Lv}, and in each Virtual Network Request (VNR), Nv represents the set of virtual nodes and Lv represents the set of virtual links. Each virtual node nvNv has a requirement for CPU, that can be defined as CPU(nv). And each virtual link lvLv has a requirement for bandwidth, that can be defined as BW(lv).

Fig. 1
figure 1

The mapping process of the virtual network to the substrate network

3.2 Virtual network embedding problem description

The process can be modeled as a mapping M: Gv{Nv,Lv} → Gs{Ns,Ps}. The VNR mapping process consists of two steps: (i) virtual node mapping; (ii) virtual link mapping;. In the node mapping phase, each virtual node nvNv chooses a substrate node that conforms to the constraint condition as the mapping target. Different virtual nodes in the same VNR cannot be mapped repeatedly to the same substrate node. In the link mapping phase, each virtual link lvLv in the VN is mapped to an substrate path Ps(lv).

3.3 Objectives and evaluation index

Since the cost of mapping nodes is certain, some studies omit it in the objective function and only retain the cost of bandwidths. However, since we consider that different domains in the multi-domain substrate network have the different unit prices of CPU, so our objective function will consider the cost of CPU. Model it as an integer programming model and shown below:

$$ \begin{array}{@{}rcl@{}} OBJ(VN) &= &min\sum\limits_{n_{v}\in N^{v}}CPU(n_{v})\times UP(n_{s})\\ &&+\sum\limits_{l_{v}\in L^{v}}BW(l_{v})\times AUP(P_{s}), \end{array} $$
(1)
$$ AUP(P_{s})={\sum}_{l_{s}\in P_{s}}UP(l_{s}), $$
(2)

where AUP(Ps) represents the aggregate unit price of path Ps.

In addition, the mapping needs to meet the constraints of VNR. In this model, it can be formulated as:

$$ \begin{cases} BW(l_{v})\leq BW(l_{s}), \forall l_{s}\in Ps(l_{v}),\\ CPU(n_{v})\leq CPU(n_{s}), n_{v}\leftrightarrow n_{s}, \end{cases} $$
(3)

where \(\leftrightarrow \) represents the two ends of the arrow map to each other.

We use 5 evaluation indexes to measure the performance of VNE algorithms. Including the load balancing of substrate links, the ratio of revenue to cost, the VN request acceptance ratio, the mapping average quotation, and the running time of algorithms. Therein, the running time of algorithms includes the average running time and the total running time. In addition, and we use the mapping average earnings to assist the illustration.

We use the variance of bandwidths’ consumption to measure the link load balancing, and it can be formulated as follows:

$$ \sigma^{2}=\frac{{\sum}_{l_{s}\in L^{s}}(BC(l_{s})-\mu)}{N}, $$
(4)

where BC(ls) represents the consumption of bandwidths of the substrate link ls, it can be formulated as total BW(ls) - residual BW(ls). μ represents the population mean of BC(ls), and N is the number of links in the substrate network.

The revenue of mapping a VN at time t can be defined as the resources for all virtual nodes and virtual links requested by the VN, and it can be formulated as follows:

$$ R(G^{v},t)=\sum\limits_{n_{v}\in N^{v}}CPU(n_{v})+\sum\limits_{l_{v}\in L^{v}}BW(l_{v}). $$
(5)

The cost of mapping a VN at time t can be defined as the total amount of substrate network resources that allocated to the VN, and it can be formulated as follows:

$$ C(G^{v},t)=\sum\limits_{n_{v}\in N^{v}}CPU(n_{v})+\sum\limits_{l_{v}\in L^{v}}BW(l_{v})Hops(P_{s}(l_{v})), $$
(6)

where Hops(Ps(lv)) represents the number of hops of the substrate path Ps(lv) that the virtual link lv eventually mapped to.

Based on the above model, the revenue to cost ratio over a period of time t ∈ (0,k) can be formulated as follows:

$$ \frac{R}{C}=\frac{{\sum}_{t=0}^{k}R(G^{v})}{{\sum}_{t=0}^{k}C(G^{v})}. $$
(7)

The VN request acceptance ratio over a period of time t ∈ (0,k) can be defined as follows:

$$ acceptance ratio=\frac{{\sum}_{t=0}^{k}VNR_{accept}}{{\sum}_{t=0}^{k}VNR_{refuse}}, $$
(8)

where V NRaccept represents the number of VNRs that were accepted and successfully mapped, and V NRrefuse represents the number of rejections.

The mapping quotation is defined as the price the user has to pay to map a VN, it’s the same as Eq. 1. The average quotation is the average price of mapping VNRs over a period of time t ∈ (0,k), and it can be formulated as follows:

$$ average quotation=\frac{{\sum}_{t=0}^{k}OBJ(VN)}{{\sum}_{t=0}^{k}VNR_{accept}}. $$
(9)

The total running time is the total time that each algorithm runs in a simulation experiment, and the time is measured in milliseconds. In addition, the average running time can be formulated as follows:

$$ average time=\frac{total time}{{\sum}_{t=0}^{k}VNR_{accept}}, $$
(10)

4 Strategy model and innovation motivations

In this section, we introduce the core strategies used in LB-HGA algorithm in detail. We will analyze the problems existing in traditional algorithms, give the motivations of optimization, and give the required mathematical expression. In addition, these strategies will be used in the next section as part of the algorithm model.

4.1 Dynamic crossover probability

The crossover probability in traditional GA models is mostly fixed, such as literatures [24, 26,27,28,29]. This makes the algorithms computational complexity small and the code implementation simple. But it will make the parents with different performance have the same crossover probability. However, the upside potential of different individuals is different (which is usually related to the fitness of the individuals). We believe that different crossover probabilities should be calculated for different quality parents in order to improve the possibility of obtaining excellent offspring.

As illustrated in Figs. 234, and 5, on the left is an example of a VNR, and on the right is a solution for mapping this VNR. Taking virtual node C as an example, the better choices (BCs) in each plan are marked blue. Therein, BCs mean the alternative mapable substrate nodes that the virtual nodes can choose to make the fitness lower. As can be seen from Figs. 3 and 4, plan 3 with the highest fitness has 6 BCs, while plan 2 with the lowest fitness has 1 BCs. Thus, it can be seen that the plan with better performance has smaller ascending space than the plan with poorer performance. In addition, although BCs can more accurately reflect the upside potential, calculating the number of BCs for each parents will make the calculation too much. In order to balance the running time and performance, we designed the following crossover probability function based on fitness.

  1. 1.

    min{F(x1),F(x2)} \(\geq \bar {X}\):

    $$ P(x_{1},x_{2})=\frac{\lambda_{1}\times(min\{F(x_{1}),F(x_{2})\}-\bar{X})}{max\{F(x_{1}),...,F(x_{n})\}-\bar{X}}. $$
    (11)
    $$ \bar{X}=\frac{F(x_{1})+F(x_{2})+,...,+F(x_{n})}{n}, $$
    (12)

    where F(xi) represents the fitness of the individual xi, and λ1 intervenes in the crossover probability with the default value of 1 and the adjustment range of (0,2].

  2. 2.

    max{F(x1),F(x2)} \(\leq \bar {X}\):

    $$ P(x_{1},x_{2})=\frac{\lambda_{2}\times(1-(\bar{X}-max\{F(x_{1}),F(x_{2})\}))}{\bar{X}-min\{F(x_{1}),...,F(x_{n})\}}, $$
    (13)

    where the default value and range of λ2 are the same as λ1. And the λ2 is recommended to set λ2 to the default value or slightly smaller than 1.

  3. 3.

    min{F(x1),F(x2)} \(< \bar {X}\) and max{F(x1),F(x2)} \(> \bar {X}\):

    $$ S_{max}=\frac{max\{F(x_{1}),F(x_{2})\}-\bar{X}}{max\{F(x_{1}),...,F(x_{n})\}-\bar{X}}, $$
    (14)
    $$ S_{min}=\frac{\bar{X}-min\{F(x_{1}),F(x_{2})\}}{\bar{X}-min\{F(x_{1}),...,F(x_{n})\}}, $$
    (15)
    $$ P(x_{1},x_{2})=\left\{ \begin{array}{rcl} \lambda_{1}\times S_{max} & & {S_{max} > S_{min},}\\ \lambda_{2}\times (1-S_{min})& & {S_{max} \leq S_{min}.} \end{array} \right. $$
    (16)

In the third case, the fitness of the parents is better or worse than the overall average fitness of the population, respectively. Therefore, further analysis is needed to identify individuals in parents who deserve more attention. Smax represents the importance of the individual with high fitness. Smin represents the importance of the individual with low fitness. Function (15) means that the crossover probability will consider the more important individual and multiply the corresponding intervention weight according to the tendency to support or oppose crossover.

Fig. 2
figure 2

The diagram of a VNR and mapping plan 1 with fitness of 53

Fig. 3
figure 3

The diagram of a VNR and mapping plan 2 with fitness of 43

Fig. 4
figure 4

The diagram of a VNR and mapping plan 3 with fitness of 67

Fig. 5
figure 5

The diagram of a VNR and mapping plan 4 with fitness of 51

4.2 Link load balancing strategy

The static weights that does not take the load balancing into account will cause the resources of the substrate links with less weighted decrease too fast. And when the substrate network resources are relatively small, the SP algorithms that do not take resource constraints into account may not be able to obtain the mapping scheme of links conforming to the constraints. This will make the estimation of individuals’ fitness in the node mapping stage inaccurate, as shown in Fig. 6.

Fig. 6
figure 6

The diagram of a multidomain substrate network with the initial weights

Figure 6 shows a substrate network with three physical domains and a VNR. In addition, the virtual link ls(b,c) in the VN is mapped to the substrate path Ps(E,C). When we set the weight as UP(ls), the Ps(E,C){EDBC} is the shortest path when resources are abundant, and the Ps(E,C){EFHGC} is the shortest path when the resources of Ps(E,C){EDBC} are scarce. The link aggregation unit price difference between the two is 10, the difference is large. If load balancing is not considered, the substrate network resources will uneven occupancy in the later stage of mapping, and some paths will get blocked, which will lead to the increase of response time and the increase of mapping cost. However, if load balancing is considered, the VNRs later can also get a better mapping scheme.

A simple way to consider load balancing is to adjust the weight of the substrate link according to the bandwidth occupancy of the substrate link. It can be formulated as:

$$ W(l_{s}) = \left\{ \begin{array}{rcl} UP(l_{s})(1 + \lambda\!\times\! \mathit{extra} \mathit{weight}) & & {U(l_{s}) \!>\! \bar{U},}\\ UP(l_{s})& & {U(l_{s}) \!\leq\! \bar{U}.} \end{array} \right. $$
(17)
$$ extra weight=\frac{U(l_{s})-\bar{U}}{max\{\forall l_{s}\in L_{s}\vert U(l_{s})\}-\bar{U}}, $$
(18)
$$ \bar{U}=\frac{{\sum}_{l_{s}\in L_{s}}U(l_{s})}{n}, $$
(19)
$$ U(l_{s})=\sum\limits_{l_{v}\in M(l_{s})}BW(l_{v}), $$
(20)

where the range of λ is (0,2], \(\bar {U}\) represents the average used bandwidth of n substrate links in substrate network, U(ls) represents the total amount of bandwidths used in a substrate link ls, and M(ls) represents a collection of mapped virtual links on a substrate link ls. Equation 16 means that when the used bandwidth U(ls) of a substrate link is larger than the average used bandwidth \(\bar {U}\) of substrate network, the weight will increase with the increase of U(ls). When U(ls) is less than \(\bar {U}\), then use UP(ls) as link weight. By adding intervention weight λ, the manager can adjust the importance of load balance according to the demand, and make the algorithm more flexible.

Some bandwidth resources in the substrate network as shown in Fig. 6 are randomly consumed to form the substrate network as shown in Fig. 7.

Fig. 7
figure 7

The diagram of a multidomain substrate network that consumes a portion of bandwidth resources with the weight that considering the load balance

The intervention weight λ was set to 0.8, and the weight of all links in the substrate network was adjusted. After adjustment, the weight with changes was marked as red. The Ps(E,C){EDBC} is the shortest path before weight adjustment and the Ps(E,C){EFDBC} is the shortest path after weight adjustment. It can be seen that after weight adjustment, the mapping can bypass the links with high consumption of bandwidth resources.

In the stage of GA, single source shortest path is suitable for the algorithms with both paths and nodes in individuals. Since the BW(lv) required by each virtual link lvLv is not the same, the shortest path needs to be calculated for different links. The multi-source shortest path is suitable for the algorithms that only includes nodes in individuals. Because the multi-source SP algorithms is only used to estimate the cost of mapping of virtual links when calculating fitness, it is not necessary to consider the exact resource constraints. Moreover, after the node mapping stage, the mapping scheme of virtual links needs to be obtained by using an single source SP algorithm. At this time, the precise resource constraints need to be considered. In addition, when solving the single source shortest path, the bandwidth resources required by each virtual link can be taken as the constraint. By setting the weight of the bottom link with insufficient resources to be the highest, it can be prevented from being selected into the mapping scheme, thereby preventing mapping failure. When solving the multi-source shortest path, only the minimum resource constraints needs to be satisfied. And the minimum resource constraints is equal to the BW(lv) of the virtual link lv that requires the least bandwidth resources in the unmapped VN.

4.3 Gene selection strategy

We consider a gene selection strategy to introduce the concept of pheromones in Ant Colony Algorithm (ACA) into GA to guide the selection of mutation nodes. The introduction of ACA can be obtained from [30], and there are some examples of genetic algorithms being combined with ant colony algorithms in literatures [31,32,33]. In one iteration, individuals with lower fitness will release more pheromones, and individuals with higher fitness will release fewer pheromones. In the mutation stage, the nodes with lower pheromones will be more likely to selected for mutation. Introducing the positive feedback mechanism into the genetic algorithms will increase the interactivity of the population and reasonably guide the selection of mutation nodes.

In addition, we provide a pheromones initialization strategy for the initial population, and it can be abstracted as the following function:

$$ \tau_{ns}(t)={\sum}_{k=1}^{num(X)}\varDelta(1)\tau_{ns}^{k}, n_{s}\in N^{s}, $$
(21)
$$ \varDelta(1)\tau_{ns}^{k}=\left\{ \begin{array}{rcl} \frac{max\{F(x_{i}),x_{i}\in X\}-F(x_{k})}{num({N_{k}^{s}})}& & {n_{s}\in x_{k},}\\ 0& & {n_{s}\not\in x_{k},} \end{array} \right. $$
(22)

where τns(t) represents the pheromones quantity of the substrate node ns when the number of iterations is t, num(X) represents the number of individuals in the population X, \(num({N_{k}^{s}})\) represents the number of substrate nodes of the individual xk, and \(\varDelta (1)\tau _{ns}^{k}\) represents the pheromones released by the individual xk on the substrate node ns.

The pheromone update strategy of the crossover stage can be abstracted as the following function:

$$ \tau_{ns}(t+1) = (1 - \rho)\tau_{ns}(t)+\sum\limits_{k=1}^{num(X)}\varDelta(1)\tau_{ns}^{k}, n_{s}\in N^{s}, $$
(23)

where ρ represents the pheromones dissipation factor. In addition, Eq. 19 indicates that after reducing pheromones in a certain proportion, all the new individuals generated by crossover in one iteration will leave pheromones in the substrate nodes of individuals according to their fitness. Moreover, since the goal of our algorithm is to minimize fitness, we take the difference between the fitness of each individual in the population and the highest fitness in the population as the reference for pheromone updates to reflect the goal.

During the mutation state, the pheromone update rules for each node in the individual xi be selected for mutation can be abstracted as the following function:

$$ \tau_{ns}(t+1) = \left\{ \begin{aligned} \tau_{ns}(t)-\varDelta(2)\tau_{ns}^{i}& & {F(x_{i})^{before}\!>\!F(x_{i})^{after},}\\ \tau_{ns}(t)& & {F(x_{i})^{before} = F(x_{i})^{after},}\\ \tau_{ns}(t)+\varDelta(2)\tau_{ns}^{i}& & {F(x_{i})^{before}\!<\!F(x_{i})^{after},} \end{aligned} \right. $$
(24)

where nsmutantgeneset, mutantgeneset is defined as a set of genes selected for mutation in xi, F(xi)before represents the fitness of the xi before mutation, and F(xi)after represents the fitness of xi after mutation.

$$ \tau_{ns}(t+1) = \left\{ \begin{aligned} \tau_{ns}(t) + \varDelta(2)\tau_{ns}^{i}& & {F(x_{i})^{before}>F(x_{i})^{after},}\\ \tau_{ns}(t)& & {F(x_{i})^{before}=F(x_{i})^{after},}\\ \tau_{ns}(t) - \varDelta(2)\tau_{ns}^{i}& & {F(x_{i})^{before}<F(x_{i})^{after},} \end{aligned} \right. $$
(25)

where nsgoalnodeset, the goalnodeset is defined as a set of goal nodes that selected by mutantgeneset, and it can also be called post-mutation nodes.

The Δ(2)τns of the mutation stage is different from the Δ(1)τns of the crossover stage, and it can be formulated as:

$$ \varDelta(2)\tau_{ns}^{i}=\frac{|F(x_{i})^{after}-F(x_{i})^{before}|}{num(mutant gene set)}, $$
(26)

where \(\varDelta (2)\tau _{ns}^{i}\) represents the pheromones released by the xi on the substrate node ns, and num(mutant geneset) represents the number of genes in mutantgeneset.

According to the proportion of pheromones amount of each node in the mutation stage to the total pheromones amount of all substrate nodes in the individual, a certain number of different mutation genes were obtained by roulette algorithm, and these genes were used to form mutantgeneset for mutation. Where, the proportion of pheromones can be formulated as follows:

$$ pheromone proportion=\frac{\tau_{ns}(t)}{{\sum}_{n_{s}\in X_{i}}\tau_{ns}(t)}. $$
(27)

In addition, because all the substrate nodes of the individual must be released pheromones in the crossover stage, so τns(t) must be greater than 0.

5 Heuristic algorithm design

Based on the dynamic crossover probability, the load balancing and the resource constraints strategy, and the gene selection strategy, a hybrid GA for VNE problem solving strategy LB-HGA is proposed.

5.1 Node mapping algorithm

We use the optimized GA to complete the mapping of nodes. In this model, we take the real number encoding method and define the individuals as \(X_{i}=\{{X_{i}^{1}},{X_{i}^{2}},...{X_{i}^{j}}...{X_{i}^{n}}\}\), where Xi represents the individual numbered i in the population. In addition, n is the number of virtual nodes in the virtual network, \({x_{i}^{j}}\) represents the substrate node corresponding to the virtual node numbered j, and the gene belongs to the individual Xi. And we use Eq. 1 as the fitness function F(xi).

We modified the iterative steps based on the framework of the traditional GA algorithm. Therein, the elite selection strategy was adopted to retain half of the individuals with lower fitness. For cross process, select a pair of individuals at random and decide whether to generate offspring through the dynamically calculated crossover probability. If crossover is determined, several pairs of alleles are randomly selected and exchanged. In addition, for each newly generated offspring, mutation is determined according to a certain probability. Moreover, a strategy named cataclysm is used to jump out of the local optimal solution. It occurs when the maximum number of iterations × 0.6 consecutive iterations do not update the optimal solution. Only the first third of the individuals with the lowest fitness were retained, and then the initialized individuals were generated to complete the population, so that the number of individuals in the population was maintained at X.

The detailed steps of node mapping algorithm are illustrated in Algorithm 1.

figure g

5.2 Link mapping algorithm

The detailed steps of link mapping algorithm are illustrated in Algorithm 2.

figure h

6 Performance evaluation

In this section, we describe the setup of the simulation environment, including the parameters of the substrate network and algorithm, and give the experimental results. We used the five evaluation criteria defined earlier to measure the performance of our method against the others. In addition, we also describe the mapping process and parameter setting of other algorithms.

6.1 Environment settings

The experiment was run on a PC with Intel Core i5 2.90GHz CPU and 8 GB memory. The substrate network topology and virtual network request topology are generated by the GT-ITM [34] tool. The substrate network includes a total of 4 domains, and each domain includes 30 substrate nodes. Therein, the CPU capacity of the substrate nodes ranges from [100,300], the bandwidth of the links within the domain ranges from [1000,3000], and the bandwidth of the inter-domain links ranges from [3000,6000]. The unit price of the bandwidth and the unit price of the CPU are both in the range of [1,10]. In addition, the value range of the number of virtual nodes in a VN is [5,10], and the value range of the CPU capacity required by the virtual node and the bandwidth resource required by the virtual link are both [1,10]. The above variables all obey uniform distribution. In addition, the number of VNRs follows a Poisson distribution with an average of 10 within 100 time units. The simulation time is 2200 time units, and the life of the VN is 1000 time units.

6.2 Algorithm parameters

We compared the designed algorithm with the other three existing heuristic VNE problem solving methods. Table 1 shows the comparison and introduction of the mapping process of the other three algorithms, and Table 2 shows the parameter settings of the all four algorithms.

Table 1 Introduction of three algorithms compared with LB-HGA
Table 2 Parameter settings of four algorithms

6.3 Evaluation results

In this section, we analyze the performance of the four algorithms according to five evaluation indexes, and give the experimental results and the causes of the results.

Figure 8 uses the standard deviation of resource allocation of the substrate network link as the measurement method of link load balancing. As can be seen from the figure, LB-HGA algorithm performs best. This is because although the four algorithms all use the shortest path algorithm to map the virtual link, the LB-HGA algorithm considers the link load balancing.

Fig. 8
figure 8

The diagram of load balancing of the substrate link

Figure 9 uses revenue cost ratio to compare the resource allocation efficiency of the algorithm. As can be seen from the figure, LB-HGA algorithm performs best. This is because LB-HGA algorithm will obtain the best solution based on fitness, which takes into account price and resource consumption, so the benefit-cost ratio performs well.

Fig. 9
figure 9

The diagram of revenue cost ratio

As can be seen from Fig. 10, LB-HGA algorithm performs best in the acceptance rate of virtual network requests. This is because LB-HGA has added the preliminary evaluation of the substrate link resources into the shortest path algorithm, so that the algorithm can bypass the substrate link with insufficient resources, which can avoid most mapping failures. However, the other three algorithms did not clearly consider resource constraints in the link mapping stage, nor did they have a good re-mapping method, so the acceptance rate was poor.

Fig. 10
figure 10

The diagram of the VN request acceptance ratio

As can be seen from Fig. 11, in the early stage when resources are relatively sufficient, the mapping revenue of LB-HGA algorithm is stable, and in the later stage, the revenue will slight decline due to insufficient resources. However, even at an early stage with sufficient resources, the revenue of the other three algorithms is reduced by mapping failures. This can reflect the good performance of LB-HGA algorithm from the side.

Fig. 11
figure 11

The diagram of mapping average earnings

Figure 12 uses the product of the resource unit price and the required resource as a measure of the mapping scheme quotation. As can be seen in the figure, the performance of LB-HGA algorithm is second only to IVERM algorithm. This is because because LB-HGA algorithm increased the consideration of load balancing, so the quotation was slightly higher than the IVERM algorithm that gave priority to single domain mapping. However, our algorithm is more stable, which means that our algorithm can get better results with less leeway within the same number of iterations.

Fig. 12
figure 12

The diagram of mapping average quotation.

As can be seen from Fig. 13, the total running time of IVERM, T-GA, and LB-HGA algorithms are all low and not significantly different. This shows that even if LB-HGA algorithm adds a variety of strategies to ensure the performance of the algorithm, the running time does not increase significantly.

Fig. 13
figure 13

The diagram of the total running time

Figure 14 shows the average running time of the four algorithms mapping a virtual network. It can be seen that the running time of LB-HGA algorithm is slightly higher than that of IVERM and T-GA algorithm. This is because the LB-HGA algorithm will re-mapping when the link map fails to improve the VN request acceptance ratio, but this also leads to an increase in the running time. But we use inexact resource constraints to replace precise resource constraints in algorithm iteration, which has reduced the running time as much as possible, making it not much different from other algorithms.

Fig. 14
figure 14

The diagram of the average total running time

7 Conclusion

Heuristic algorithms are suitable for solving NP-hard problems, so they are widely used to solve VNE problems. However, in solving the VNE problem, there are some unresolved problems in the existing work. For example, VNE method based on genetic algorithms usually uses the traditional design method with large randomness, which usually leads to the instability of the quality of the algorithms’ results. It is a problem worthy of attention in the Internet of Things environment that requires high network stability and algorithm reliability. In addition, the traditional algorithm’s dependence on experience reduces its usefulness, and its low flexibility makes it unable to adapt to increasingly complex network environments. In this paper, the operational optimization of the genetic algorithm is discussed. As a result, the calculation method of crossover probability in three cases is given, as well as the gene scoring strategy for selecting mutated genes. The purpose is to accelerate the convergence speed and make the algorithm more flexible to adapt to different simulation environments. In addition, taking into account different link mapping methods, we analyze the resource constraints and the use of the shortest path algorithm, and we design a link mapping strategy enforcing load balancing. In addition, this strategy improves the accuracy of fitness estimation while improving the acceptance rate by avoiding links with insufficient resources. Simulation results show that our algorithm performs best in link load balance, mapping revenue-cost ratio and VNR acceptance rate, and performs well in mapping average quotation and algorithm running time. In addition, compared with other algorithms, LB-HGA algorithm is significantly more stable and can perform well even in the later stage of the experiment.

In the future work, we will consider better neural network design approaches and hybrid strategies for multiple intelligent algorithms, and we will consider information security in our algorithm. In addition, we intend to study machine learning based algorithms [35, 36] to address the issues of computer networks.