The Ant System Algorithm

The Ant System Formulation

Marco Dorigo introduced the Ant System (AS) in his 1992 PhD thesis and published the full formulation with Vittorio Maniezzo and Alberto Colorni in IEEE Transactions on Systems, Man, and Cybernetics B in 1996. The algorithm operates on a problem represented as a construction graph: a weighted graph where edges correspond to solution components and a complete tour (or path) through the graph represents a candidate solution.

The Traveling Salesman Problem (TSP) serves as the canonical benchmark. The construction graph has nodes representing cities and edges representing pairwise connections. Each edge (i, j) carries two values: a pheromone level tau_ij (initially set to a small constant tau_0) and a heuristic desirability eta_ij (typically 1/d_ij, the inverse of the distance between cities i and j).

At each construction step, an ant at node i selects the next node j from the set of unvisited nodes. The selection is probabilistic:

p_ij = [tau_ij^alpha * eta_ij^beta] / sum_k [tau_ik^alpha * eta_ik^beta]

where the sum runs over all unvisited nodes k, alpha controls the influence of pheromone (higher alpha means ants follow trails more strongly), and beta controls the influence of heuristic information (higher beta means ants favor short edges more strongly). Each ant maintains a tabu list of visited nodes to prevent revisiting.
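The selection rule can be sketched in Python as a roulette-wheel draw over the unvisited nodes. This is a minimal illustration, not Dorigo's implementation: tau and eta are assumed to be dense matrices indexed by city, and the function name is ours.

```python
import random

def select_next(current, unvisited, tau, eta, alpha=1.0, beta=2.0):
    """AS selection rule: choose j with probability proportional to
    tau[current][j]**alpha * eta[current][j]**beta over unvisited j."""
    weights = [tau[current][j] ** alpha * eta[current][j] ** beta
               for j in unvisited]
    r = random.random() * sum(weights)
    for j, w in zip(unvisited, weights):
        r -= w
        if r <= 0:
            return j
    return unvisited[-1]  # guard against floating-point rounding
```

The tabu list corresponds to the complement of `unvisited`: a node is removed from `unvisited` as soon as an ant selects it.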

An ant constructs a complete tour by visiting every node exactly once, then returning to its start. After all ants (typically m ants per iteration) complete their tours, the tour lengths are evaluated and pheromone is updated.

The parameters alpha and beta together control the exploration-exploitation tradeoff. When alpha is large relative to beta, ants rely heavily on pheromone trails — exploitation of accumulated experience. When beta is large relative to alpha, ants rely on the heuristic — greedy construction based on local edge quality. Empirically, Dorigo found that alpha = 1 and beta between 2 and 5 worked well for TSP, giving the heuristic moderate priority while still allowing pheromone to guide search.

The number of ants m is typically set equal to the number of nodes, ensuring that each node serves as a start point for at least one ant per iteration. The initial pheromone level tau_0 is usually set to the inverse of the number of nodes times the nearest-neighbor heuristic tour length, providing a baseline that is neither too high (which would swamp the heuristic signal) nor too low (which would make early pheromone deposits disproportionately influential).
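The tau_0 initialization described above can be sketched as follows; the nearest-neighbor helper and function names are illustrative, and the distance matrix is assumed symmetric.

```python
def nearest_neighbor_length(dist, start=0):
    """Length of a greedy nearest-neighbor tour, used to scale tau_0."""
    n = len(dist)
    unvisited = set(range(n)) - {start}
    tour_len, current = 0.0, start
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[current][j])
        tour_len += dist[current][nxt]
        unvisited.remove(nxt)
        current = nxt
    return tour_len + dist[current][start]  # close the tour

def init_pheromone(dist):
    """tau_0 = 1 / (n * L_nn), the baseline described in the text."""
    n = len(dist)
    tau0 = 1.0 / (n * nearest_neighbor_length(dist))
    return [[tau0] * n for _ in range(n)], tau0
```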

Pheromone Update Rules

After all ants complete their tours, pheromone is updated in two steps.

Evaporation. All pheromone values are reduced by a factor (1 - rho):

tau_ij <- (1 - rho) * tau_ij for all edges (i, j)

where rho is the evaporation rate (0 < rho < 1). Higher rho means faster evaporation — trails fade quickly, promoting exploration. Lower rho means slower evaporation — trails persist, promoting exploitation. Typical values range from 0.1 to 0.5.

Deposit. Each ant k deposits pheromone on every edge in its tour:

tau_ij <- tau_ij + delta_tau_ij^k for all edges (i, j) in ant k’s tour

where delta_tau_ij^k = Q / L_k, L_k is the length of ant k's tour, and Q is a constant. Better tours (shorter L_k) deposit more pheromone per edge. This quality-proportional deposit is the mechanism that concentrates pheromone on edges belonging to good solutions.
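The two-step update can be sketched in one function; this is a minimal illustration for the symmetric TSP, where each tour is a list of city indices and deposits are mirrored across (i, j) and (j, i).

```python
def update_pheromone(tau, tours, lengths, rho=0.5, Q=1.0):
    """AS global update: evaporate every edge by (1 - rho), then let
    each ant deposit Q / L_k on every edge of its tour."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)
    for tour, L in zip(tours, lengths):
        deposit = Q / L
        for a, b in zip(tour, tour[1:] + tour[:1]):  # includes closing edge
            tau[a][b] += deposit
            tau[b][a] += deposit
```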

In the original Ant System, all ants deposit pheromone. The Ant Colony System (ACS) modified this to allow only the iteration-best ant (or the global-best ant) to deposit, sharpening the exploitation of the best-known solution. The MAX-MIN Ant System (MMAS) imposed upper and lower bounds on pheromone values, preventing both premature convergence (where all pheromone concentrates on one tour) and stagnation (where pheromone values become negligibly small).

The evaporation-then-deposit sequence is critical. Evaporation first ensures that edges not part of good solutions gradually lose pheromone, making room for new information. Deposit second ensures that the current iteration’s best solutions reinforce their constituent edges. The order matters: reversing it (deposit, then evaporate) would reduce the impact of the current iteration’s deposits, slowing convergence.

Variants: ACS, MMAS, and Rank-Based AS

The original Ant System was competitive but not state-of-the-art on larger TSP instances. Three major variants addressed its limitations.

Ant Colony System (ACS). Dorigo and Gambardella (1997) introduced two modifications. First, a local pheromone update: as an ant traverses edge (i, j), it immediately reduces the pheromone on that edge — tau_ij <- (1 - xi) * tau_ij + xi * tau_0. This local decay makes recently used edges less attractive, encouraging subsequent ants to explore different paths during the same iteration. It is the mechanism that maintains diversity within an iteration, preventing all ants from following the same trail. Second, the global pheromone update is performed only by the global-best ant (the best solution found across all iterations so far), concentrating reinforcement on the highest-quality solution components. ACS also uses a pseudo-random proportional selection rule: with probability q_0, the ant deterministically chooses the edge with the highest tau_ij^alpha * eta_ij^beta; with probability (1 - q_0), it uses the standard probabilistic rule. This creates an explicit exploration-exploitation switch.
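The two ACS selection and local-update mechanisms can be sketched as follows (a minimal illustration; function names are ours, and tau0 is the initial pheromone level from the text):

```python
import random

def acs_select(current, unvisited, tau, eta, alpha=1.0, beta=2.0, q0=0.9):
    """ACS pseudo-random proportional rule: with probability q0, greedily
    take the best-scoring edge; otherwise fall back to the AS rule."""
    score = lambda j: tau[current][j] ** alpha * eta[current][j] ** beta
    if random.random() < q0:
        return max(unvisited, key=score)            # exploitation
    weights = [score(j) for j in unvisited]
    return random.choices(unvisited, weights=weights)[0]  # exploration

def acs_local_update(tau, i, j, tau0, xi=0.1):
    """Local pheromone decay applied as an ant traverses edge (i, j),
    nudging the edge back toward tau0 to discourage immediate reuse."""
    tau[i][j] = (1.0 - xi) * tau[i][j] + xi * tau0
```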

MAX-MIN Ant System (MMAS). Stutzle and Hoos (2000) bounded pheromone values between tau_min and tau_max. Only the iteration-best or global-best ant deposits pheromone. The bounds prevent two pathological behaviors: the upper bound tau_max keeps pheromone from piling up on a single tour until every ant follows it (premature convergence), and the lower bound tau_min > 0 keeps every edge's selection probability positive, preserving a floor of exploration. When the system stagnates — detected by monitoring solution quality over recent iterations — the pheromone values are reinitialized to tau_max, restarting exploration. MMAS consistently outperformed ACS on many benchmark instances and became the most widely used ACO variant.

Rank-Based Ant System (AS_rank). Bullnheimer, Hartl, and Strauss (1999) weighted the pheromone deposit by each ant’s rank in the current iteration. The best ant deposits the most; the second-best deposits less; ants ranked below a cutoff do not deposit at all. This intermediate approach between “all ants deposit” (original AS) and “only the best deposits” (ACS, MMAS) provides a graded reinforcement signal that is less susceptible to noise than single-ant deposit but more selective than population-wide deposit.
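The rank-weighted deposit can be sketched as a variant of the AS update. This sketch uses the (w - r) rank weighting from Bullnheimer et al. but, for brevity, omits their additional elitist deposit by the global-best ant; w and the function name are our choices.

```python
def rank_based_update(tau, tours, lengths, rho=0.5, w=6, Q=1.0):
    """AS_rank deposit: only the (w - 1) best-ranked ants deposit,
    weighted by (w - r) for rank r; lower ranks deposit nothing."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)                 # evaporation first
    ranked = sorted(zip(lengths, tours))[: w - 1]    # shortest tours first
    for r, (L, tour) in enumerate(ranked, start=1):
        deposit = (w - r) * Q / L
        for a, b in zip(tour, tour[1:] + tour[:1]):
            tau[a][b] += deposit
            tau[b][a] += deposit
```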

Comparative performance on benchmark TSP instances (from the TSPLIB library) shows MMAS and ACS as generally the strongest ACO variants, with MMAS having a slight edge on larger instances. All ACO variants are competitive with other metaheuristics (simulated annealing, genetic algorithms) on moderate-sized TSP instances (up to several hundred cities) but fall behind specialized TSP solvers (Lin-Kernighan-Helsgaun) that incorporate problem-specific local search. In practice, the best results come from hybridizing ACO with local search: ants construct tours using the pheromone-based rule, then each tour is improved by local search (2-opt or 3-opt) before pheromone deposit. The hybrid pairs ACO's global exploration with local search's refinement.

Theoretical Properties

The theoretical analysis of ACO lags behind its empirical success, but several results provide formal grounding.

Asymptotic convergence. Stutzle and Dorigo (2002) proved that ACO algorithms with appropriate pheromone bounds (as in MMAS) converge to the optimal solution with probability 1 as the number of iterations approaches infinity. The proof relies on two properties: (1) every possible solution has a nonzero construction probability in every iteration (ensured by the pheromone bounds — tau_min > 0 guarantees every edge has positive selection probability), and (2) the optimal solution, once found, receives maximal reinforcement and is therefore increasingly likely to be reconstructed. The convergence is asymptotic — it says nothing about how many iterations are needed.

Time complexity. The per-iteration cost of ACO is O(m * n^2) for the TSP, where m is the number of ants and n is the number of cities (each ant makes n probabilistic selections, each requiring a scan of up to n remaining cities). The number of iterations required for convergence is not bounded polynomially in the worst case — consistent with the NP-hardness of the TSP. ACO does not solve NP-hard problems in polynomial time; it finds good solutions quickly.

Runtime analysis for simple problems. Neumann and Witt (2010) and Doerr et al. (2011) provided runtime analyses of simplified ACO variants on simple combinatorial problems (OneMax, LeadingOnes, shortest-path problems). These analyses prove polynomial expected runtime for specific problem classes, establishing that ACO is efficient on problems with appropriate structure — but the results do not extend to general NP-hard instances.

Relationship to other metaheuristics. ACO is formally related to other population-based search methods. Zlochin et al. (2004) showed that ACO can be interpreted as a model-based search algorithm: the pheromone matrix defines a probability distribution over solutions, and the pheromone update is a form of incremental distribution estimation. This perspective connects ACO to estimation of distribution algorithms (EDAs) and cross-entropy methods. The distinguishing feature is the construction mechanism — ants build solutions sequentially, edge by edge, rather than sampling complete solutions from a distribution.

The gap between theory and practice is significant. Convergence proofs guarantee that ACO will eventually find the optimal solution, but they do not predict solution quality after a finite number of iterations — which is all that matters in practice. The practical success of ACO relies on the interplay between the pheromone-based learning (which concentrates search near good solutions) and the stochastic construction (which maintains diversity). This interplay is well understood empirically but resists tight theoretical characterization.


Further Reading