Ants are capable of finding the shortest path between a food source and their colony using a simple pheromone-laying mechanism. ACO is a metaheuristic optimization approach inspired by this foraging behavior of ants. This chapter is dedicated to ACO.

11.1 Introduction

Eusociality has evolved independently in the hymenopteran insects (ants and bees) and in the isopteran insects (termites). These two orders of social insects have almost identical social structures: populous colonies consisting of sterile workers, often differentiated into castes, that are the offspring of one or a few reproductively competent individuals. This type of social structure resembles a superorganism, in which the colony has many attributes of an organism, including physiological and structural differentiation and coordinated, goal-directed action.

Many ant species exhibit distinctive foraging strategies. Two examples among ponerine ants are the army-ant-style foraging of the genus Leptogenys and the partitioned space search of Pachycondyla apicalis.

Termite swarms are organized through a complex language of tactile and chemical signals between individual members. These signals drive the process of recruitment in response to transient perturbations of the environment. A termite can either experience a perturbation directly or be informed of it by other termites. The structure of the Macrotermes mound, as well as its construction, is elucidated in [22]. Swarm cognition in these termites takes the form of extended cognition, whereby the swarm's cognitive abilities arise both from interactions among agents within the swarm and from the interaction of the swarm with the environment, mediated by the mound's dynamic architecture.

Ants are capable of finding the shortest path between a food source and the colony (nest) due to a simple pheromone-laying mechanism. Inspired by the foraging behavior of ants, ACO is a metaheuristic approach for solving discrete or continuous optimization problems [1, 2, 4–6]. Unlike in EAs, PSO, and multiagent systems, where agents do not communicate with one another through the environment, agents in an ant-colony system communicate indirectly via pheromone. The optimization is the result of the collective work of all the ants in the colony.

Ants use their pheromone trails as a medium for communicating information. All the ants secrete pheromone and contribute to pheromone reinforcement, while old trails vanish due to evaporation. Pheromone builds up on the traversed links between nodes, and an ant selects a link probabilistically based on the intensity of the pheromone. Ant-Q [3, 8] merges the ant-colony system with reinforcement learning, such as Q-learning, to update the amount of pheromone on the succeeding link. Ants in the ant-colony system use only one kind of pheromone for their communication, whereas natural ants also use haptic information for communication and possibly learn the environment with their micro-brains.

In ACO, simulated ants walk around a graph representing the problem to solve. ACO has an advantage over SA and GA when the graph changes dynamically. ACO has been extended to continuous domains without any major conceptual change to its structure, and has been applied to continuous and mixed discrete–continuous problems [18, 19].

11.2 Ant-Colony Optimization

ACO (http://www.aco-metaheuristic.org/) can be applied to discrete COPs whose solutions can be expressed in terms of feasible paths on a graph. In every iteration, artificial ants construct solutions randomly, guided by the pheromone information left by former ants that found good solutions. Among all feasible paths, ACO can locate the one with minimum cost. An ACO algorithm includes initialization, construction of the ants' solutions, optional local search, pheromone updating, and evaluation of the termination criterion.

Ant system [5] was initially designed for solving the classical TSP. The ant system uses the terminology of EAs. Several generations (iterations) of artificial ants search for good solutions. Every ant of a generation builds up a complete solution, step by step, going through several decisions by choosing the nodes on a graph according to a probabilistic state transition rule, called the random-proportional rule. When building its solution, each ant collects information based on the problem characteristics and its own performance. The information collected by the ants during the search process is stored in pheromone trails \(\tau \) associated with all the edges. The ants cooperate in finding the solution by exchanging information via the pheromone trails. Edges can also have an associated heuristic value to represent a priori information about the problem instance definition or runtime information provided by a source different from the ants. Once all ants have completed their tours at the end of each generation, the algorithm updates the pheromone trails. Different ACO algorithms arise from different pheromone update rules.

The probability for ant k at node i moving to node j at generation t is defined by [5]

$$\begin{aligned} P^k_{i,j}(t) = \frac{ \tau _{i,j}(t) d_{i,j}^{-\beta }}{\sum _{u\in \mathcal {J}^k_i} \tau _{i,u} d_{i,u}^{-\beta } }, \quad j \in \mathcal {J}^k_i, \end{aligned}$$
(11.1)

where \(\tau _{i,j}\) is the intensity of the pheromone on edge \(i \rightarrow j\), \(d_{i,j}\) is the distance between nodes i and j, \(\mathcal {J}^k_i\) is the set of nodes that remain to be visited by ant k positioned at node i to make the solution feasible, and \(\beta >0\). A tabu list is used to save the nodes already visited during each generation. When a tour is completed, the tabu list is used to compute the ant’s current solution.
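As a concrete illustration, the random-proportional rule (11.1) can be sketched in Python. The function name, the toy distance matrix, and the parameter values below are made-up for illustration, not taken from the text.

```python
import numpy as np

def transition_probs(tau, dist, i, unvisited, beta):
    """Random-proportional rule (11.1): probabilities for an ant at node i
    of moving to each remaining node j, from pheromone tau and distances dist."""
    cand = np.array(sorted(unvisited))
    w = tau[i, cand] * dist[i, cand] ** (-beta)
    return cand, w / w.sum()

# toy 4-node instance (made-up values)
tau = np.ones((4, 4))
dist = np.array([[0., 1., 2., 4.],
                 [1., 0., 1., 2.],
                 [2., 1., 0., 1.],
                 [4., 2., 1., 0.]])
nodes, p = transition_probs(tau, dist, 0, {1, 2, 3}, beta=1.0)
# with uniform pheromone and beta = 1, p is proportional to 1/d:
# p = [4/7, 2/7, 1/7] for nodes 1, 2, 3
```

With uniform pheromone the rule reduces to a greedy bias toward short edges; as pheromone accumulates, the first factor increasingly dominates the choice.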

Once all the ants have built their tours, the pheromone is updated on all edges \(i\rightarrow j\) according to a global pheromone updating rule

$$\begin{aligned} \tau _{i,j}(t+1) = (1- \rho ) \tau _{i,j}(t) + \sum _{k=1}^{N_P} \tau ^k_{i,j}(t), \end{aligned}$$
(11.2)

where \(\tau ^k_{i,j}\) is the intensity of the pheromone on edge \(i \rightarrow j\) laid by ant k, taking \(\frac{1}{L_k}\) if ant k passes edge \(i \rightarrow j\) and 0 otherwise, \(\rho \in (0,1)\) is a pheromone decay parameter, \(L_k\) is the length of the tour performed by ant k, and \(N_P\) is the number of ants. Consequently, a shorter tour gets a higher reinforcement. Each edge has a long-term memory to store the pheromone intensity. In ACO, pheromone evaporation provides an effective strategy to avoid rapid convergence to local optima and to favor the exploration of new areas of the search space.

Finally, a pheromone renewal is again implemented by

$$\begin{aligned} \tau _{i,j}(t+1) \leftarrow \max \{ \tau _{\min }, \tau _{i,j} (t+1) \} \qquad \forall (i,j). \end{aligned}$$
(11.3)
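The global update (11.2) and the renewal (11.3) can be sketched together as follows; the function and its arguments are illustrative, assuming a symmetric TSP where each ant's tour is a closed node sequence.

```python
import numpy as np

def update_pheromone(tau, tours, lengths, rho, tau_min):
    """Global pheromone update (11.2) followed by the renewal (11.3):
    all trails evaporate by the factor (1 - rho); each ant k deposits 1/L_k
    on every edge of its closed tour; trails are then kept above tau_min.
    Assumes a symmetric problem, so both directions of an edge are updated."""
    tau = (1.0 - rho) * tau
    for tour, L in zip(tours, lengths):
        for a, b in zip(tour, tour[1:] + tour[:1]):  # edges of the closed tour
            tau[a, b] += 1.0 / L
            tau[b, a] += 1.0 / L
    return np.maximum(tau, tau_min)

# one ant with tour 0 -> 1 -> 2 -> 0 of length 3
tau = update_pheromone(np.ones((3, 3)), [[0, 1, 2]], [3.0],
                       rho=0.5, tau_min=1e-3)
# every entry evaporates to 0.5; edges on the tour gain 1/3
```

The deposit 1/L_k makes shorter tours reinforce their edges more strongly, which is exactly the bias stated after (11.2).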

Ant-colony system [4] improves on ant system [5]. It applies a pseudorandom-proportional state transition rule. The global pheromone updating rule is applied only to edges that belong to the best ant tour, while in ant system, the pheromone update is performed at a global level by every ant. Ant-colony system also applies a local pheromone updating rule during the construction of a solution, which is performed by every ant every time node j is added to the path being built.

Max–min ant system [20] improves ant system by introducing explicit maximum and minimum trail strengths on the arcs to alleviate the problem of early stagnation. In both max–min ant system and ant-colony system, only the best ant updates the trails in each iteration. The two algorithms differ mainly in how premature stagnation of the search is prevented.

A convergence proof to the global optimum, applicable to the class of ACO algorithms that constrain all pheromone values to be no smaller than a given positive lower bound, is given in [21]. This lower bound prevents the probability of generating any solution from becoming zero. The proof applies directly to ant-colony system [4] and max–min ant system [20].

In [14], the dynamics of ACO algorithms are analyzed for certain types of permutation problems using a deterministic pheromone update model that assumes an average expected behavior of the algorithms. In [16], a runtime analysis of a simple ACO algorithm is presented. By deriving lower bounds on the tails of sums of independent Poisson trials, the effect of the evaporation factor is almost completely determined, and a transition from exponential to polynomial runtime is proved. In [11], an analysis of ACO convergence time is made based on an absorbing Markov chain model, and the relationship between convergence time and pheromone rate is established.

11.2.1 Basic ACO Algorithm

An NP-hard COP can be denoted by \((S, \Omega , f)\), where S is the discrete solution space, \(\Omega \) is the constraint set, \(f: S \rightarrow R^+\) is the objective function, and \(R^+\) is the positive real domain. The output is the best solution \({\varvec{s}}_{best}\).

ACO has been widely used to tackle COPs [2, 6]. In ACO, artificial ants randomly walk on a graph \(G = (V, E, \mathbf {W}_L, \mathbf {W}_T)\), where V is the set of vertices, E is the set of edges, and \(\mathbf {W}_L\) and \(\mathbf {W}_T\) are, respectively, the length and weight matrices of the edges. Besides the initialization step, ACO is a loop of the ant’s solution construction, evaluation of the solutions, optional local search, and the pheromone update, until the termination condition is satisfied. The basic ACO algorithm is given by Algorithm 11.1 [6].

Algorithm 11.1 Basic ACO algorithm.

In Algorithm 11.1, \(\mathbf {T} = [ \tau _{i, j} ]\) is the pheromone matrix and \(S_s (t)\) is the set of solutions obtained by ants. Step 2 initializes the pheromone matrix, \(\tau _{i, j} (0) = \tau _0 \ge \tau _{\min } > 0\), \(i, j = 1, \ldots , n\), where n is the number of nodes (size of the problem).

In Step 4(b)i, each ant first starts at a randomly selected vertex i and then chooses the next vertex j according to \(P_{i,j}\) until a solution \({\varvec{s}}\) contains all the nodes: \( {\varvec{s}} = {\varvec{x}}^n\), where \({\varvec{x}}^n = \{s_1, s_2, \ldots , s_n \}\), \(s_i\) is the node visited by the ant at step i, and \(i=1, \ldots , n\).
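Step 4(b)i can be sketched as follows; the function name and the toy distance matrix are illustrative, and each step samples the next node from the random-proportional rule (11.1).

```python
import numpy as np

rng = np.random.default_rng(0)

def construct_solution(tau, dist, beta):
    """One ant's solution construction (Step 4(b)i): start at a random
    vertex, then repeatedly sample the next vertex from the
    random-proportional rule (11.1) until all n nodes are visited."""
    n = len(dist)
    start = int(rng.integers(n))
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:            # the tabu list is the complement of `unvisited`
        i = tour[-1]
        cand = np.array(sorted(unvisited))
        w = tau[i, cand] * dist[i, cand] ** (-beta)
        j = int(rng.choice(cand, p=w / w.sum()))
        tour.append(j)
        unvisited.remove(j)
    return tour

# toy symmetric instance (made-up distances)
dist = np.array([[0., 2., 9., 10.],
                 [2., 0., 6., 4.],
                 [9., 6., 0., 3.],
                 [10., 4., 3., 0.]])
tour = construct_solution(np.ones((4, 4)), dist, beta=5.0)
# `tour` visits each of the 4 nodes exactly once
```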

Example 11.1: Consider the TSP for the Berlin52 benchmark in TSPLIB. Berlin52 provides the coordinates of 52 locations in Berlin, Germany. The length of the optimal tour is 7542 when using Euclidean distances. In this example, we implement max–min ant system. The parameters are selected as \(\beta =5\), \(\rho =0.7\). We set the population size to 40 and the number of iterations to 1000. The best result obtained is 7544.4. For a random run, the best tour found is illustrated in Figure 11.1, and the evolution of the tour length is illustrated in Figure 11.2.

Figure 11.1 The best TSP solution by ACO.

Figure 11.2 The TSP evolution by ACO.
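A compact max–min ant system in the spirit of Example 11.1 can be sketched as follows (\(\beta =5\), \(\rho =0.7\) as in the example). This is an illustrative sketch, not the implementation behind the Berlin52 results: only the globally best ant deposits pheromone, trails are clamped to \([\tau _{\min }, \tau _{\max }]\), and the small random instance, population size, and iteration count are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def mmas_tsp(dist, n_ants, n_iter, beta=5.0, rho=0.7,
             tau_max=1.0, tau_min=1e-4):
    """Minimal max-min ant system: every ant builds a tour with the
    random-proportional rule, only the globally best tour deposits
    pheromone, and trails are clamped to [tau_min, tau_max]."""
    n = len(dist)
    tau = np.full((n, n), tau_max)
    best_tour, best_len = None, np.inf
    for _ in range(n_iter):
        for _ in range(n_ants):
            start = int(rng.integers(n))
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                i = tour[-1]
                cand = np.array(sorted(unvisited))
                w = tau[i, cand] * dist[i, cand] ** (-beta)
                j = int(rng.choice(cand, p=w / w.sum()))
                tour.append(j)
                unvisited.remove(j)
            length = sum(dist[a, b]
                         for a, b in zip(tour, tour[1:] + tour[:1]))
            if length < best_len:
                best_tour, best_len = tour, length
        tau *= 1.0 - rho                     # evaporation
        for a, b in zip(best_tour, best_tour[1:] + best_tour[:1]):
            tau[a, b] += 1.0 / best_len      # best-ant deposit
            tau[b, a] += 1.0 / best_len
        tau = np.clip(tau, tau_min, tau_max)
    return best_tour, best_len

# small random instance; Berlin52 itself would be loaded from TSPLIB
pts = rng.random((8, 2))
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
tour, length = mmas_tsp(dist, n_ants=10, n_iter=50)
```

Restricting the deposit to the best tour concentrates the search, while the clamp to \(\tau _{\min }\) keeps every edge selectable, which is how max–min ant system counters early stagnation.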

11.2.2 ACO for Continuous Optimization

ACO was originally introduced to solve discrete (combinatorial) optimization problems. In order to expand ACO for continuous optimization, an intuitive idea is to change the discrete distributed pheromone on the edge into a continuous distributed probabilistic distribution function on the solution landscape.

API [15] simulates the foraging behavior of Pachycondyla apicalis ants, which use visual landmarks but not pheromones to memorize the positions and search the neighborhood of the hunting sites.

Continuous ACO [1, 23] generally hybridizes with other algorithms for maintaining diversity. Pheromones are placed on the points in the search space. Each point is a complete solution, indicating a region for the ants to perform local neighborhood search. Continuous interacting ant-colony algorithm [7] uses both the pheromone information and the ants’ direct communications to accelerate the diffusion of information. Continuous orthogonal ant-colony algorithm [10] adopts an orthogonal design method and a global pheromone modulation strategy to enhance the search accuracy and efficiency.

By analyzing the relationship between the position distribution and the food source in the process of ant-colony foraging, a distribution model of ant-colony foraging is proposed in [13], based on which a continuous-domain optimization algorithm is implemented.

Traditional ACO is extended for solving both continuous and mixed discrete–continuous optimization problems in [18]. ACOR [19] is an implementation of continuous ACO. In ACOR, an archive of the k best solutions, each with n variables, is maintained and used to derive normal probability density functions, from which the ants generate m new solutions. The m newly generated solutions then replace the worst solutions in the archive. In ACOR, the ants construct new solutions incrementally, variable by variable: an ant generates one variable value at each step, just as it chooses one edge at each step in TSP, so for a problem with n variables an ant needs n steps to generate a solution, just as it needs n steps to generate a Hamiltonian cycle in TSP. ACOR is quite similar to CMA and EDA. Similar realizations of this type are reported in [17].
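The ACOR sampling step can be sketched as follows; the parameter names q and xi, their default values, and the archive contents are illustrative, following the rank-based selection and per-variable Gaussian sampling described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def acor_sample(archive, q=0.1, xi=0.85):
    """One ACOR solution-construction step, sketched: pick a guiding
    solution from the archive by rank-based Gaussian weights, then sample
    each of the n variables from a normal density centered at the guide,
    with spread proportional to the mean distance to the other archive
    members. `archive` has shape (k, n) and is sorted best-first."""
    k, _ = archive.shape
    ranks = np.arange(k)                          # 0 = best solution
    w = np.exp(-ranks ** 2 / (2.0 * (q * k) ** 2))
    guide = rng.choice(k, p=w / w.sum())
    sigma = xi * np.abs(archive - archive[guide]).sum(axis=0) / (k - 1)
    return rng.normal(archive[guide], sigma)      # one value per variable

# archive of k = 5 two-variable solutions (made-up values, best first)
archive = np.array([[0.0, 0.1],
                    [0.4, -0.2],
                    [0.9, 0.3],
                    [1.5, -0.7],
                    [2.0, 1.0]])
x = acor_sample(archive)
# `x` is one new candidate solution with the same number of variables
```

Smaller q concentrates the choice on the best-ranked solutions, and xi plays a role analogous to pheromone evaporation: larger values keep the per-variable spread wide for longer.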

SamACO [9] extends ACO to solving continuous optimization problems by focusing on continuous variable sampling as a key to transforming ACO from discrete optimization to continuous optimization. SamACO consists of three major steps, namely, the generation of candidate variable values for selection, the ants’ solution construction, and the pheromone update process. The distinct characteristics of SamACO are the cooperation of a sampling method for discretizing the continuous search space and an efficient incremental solution construction method based on the sampled values.

ACOMV [12] extends ACOR to tackle mixed-variable optimization problems. The decision variables of an optimization problem can be explicitly declared as continuous, ordinal, or categorical, which allows the algorithm to treat them adequately. ACOMV includes three solution generation mechanisms: a continuous optimization mechanism (ACOR), a continuous relaxation mechanism (ACOMV-o) for ordinal variables, and a categorical optimization mechanism (ACOMV-c) for categorical variables.

Problems

11.1 :

Consider an ant-colony system with four cities. Suppose that the kth ant is in city 1 and

$$ P_{11}^k = 0, \quad P_{12}^k = 1/4, \quad P_{13}^k=1/4, \quad P_{14}^k=1/2. $$

What is the probability of the kth ant proceeding to each of the four cities?

11.2 :

TSP consists of finding a Hamiltonian circuit of minimum cost on an edge-weighted graph \(G=(N,E)\), where N is the set of nodes and E is the set of edges. Let \(x_{ij} ({\varvec{s}})\) be a binary variable taking 1 if edge \({<}{i,j}{>}\) is included in the tour, and 0 otherwise. Let \(c_{ij}\) be the cost associated with edge \({<}{i,j}{>}\). The goal is to find a tour that minimizes the function

$$ f ({\varvec{s}}) = \sum _{i\in N} \sum _{j\in N} c_{ij} x_{ij}({\varvec{s}}). $$

Set the algorithmic parameters of ACO for TSP. [Hint: \(\tau _{ij} = 1/c_{ij}\)].

11.3 :

In the quadratic assignment problem, n facilities and n locations are given, together with two \(n\times n\) matrices \(\mathbf {A}=[a_{ij}]\) and \(\mathbf {B}=[b_{uv}]\), where \(a_{ij}\) is the distance from location i to location j, and \(b_{uv}\) is the flow from facility u to facility v. A solution \({\varvec{s}}\) is an assignment of each facility to a location. Let \(x_i ({\varvec{s}})\) denote the facility assigned to location i. The goal is to find an assignment that minimizes the function

$$ f({\varvec{s}}) = \sum _{i=1}^n \sum _{j=1}^n a_{ij} b_{x_i ({\varvec{s}}) x_j ({\varvec{s}})}. $$

Set the algorithmic parameters of ACO for this problem. [Hint: \(\beta =0\); or \(\tau _{ij}=1/\sum _{l=1}^n a_{il}\)].

11.4 :

Implement ACO\(_R\) on the Rastrigin function given in the Appendix.