
An ant colony optimization algorithm for phylogenetic estimation under the minimum evolution principle



Distance matrix methods constitute a major family of phylogenetic estimation methods, and the minimum evolution (ME) principle (aiming at recovering the phylogeny with shortest length) is one of the most commonly used optimality criteria for estimating phylogenetic trees. The major difficulty for its application is that the number of possible phylogenies grows exponentially with the number of taxa analyzed, and the minimum evolution principle is known to belong to the NP-hard class of problems.


In this paper, we introduce an Ant Colony Optimization (ACO) algorithm to estimate phylogenies under the minimum evolution principle. ACO is an optimization technique inspired by the foraging behavior of real ant colonies. This behavior is exploited in artificial ant colonies to search for approximate solutions to discrete optimization problems.


We show that the ACO algorithm is potentially competitive in comparison with state-of-the-art algorithms for the minimum evolution principle. This is the first application of an ACO algorithm to the phylogenetic estimation problem.


The Minimum Evolution (ME) principle is a commonly used principle to estimate phylogenetic trees of a set Γ of n species (taxa) given an n × n symmetric matrix D = {d_ij} of evolutionary distances. First introduced by Kidd and Sgaramella-Zonta [1] and subsequently reinterpreted by Rzhetsky and Nei [2, 3], the ME principle aims at finding a phylogeny characterized by the minimal sum of branch lengths, under the auxiliary criteria that branches have a positive length and that the pair-wise distances on the tree are not smaller than the directly observed pair-wise differences. Its biological justification is based on the fact that, when unbiased estimates of the true distances are available, the correct phylogenetic tree has an expected length shorter than that of any other possible tree [2, 3] compatible with the distances in D. More formally, the ME principle can be expressed in terms of the following optimization problem:

Problem 1. Minimum Evolution under Least Square (LS)

min_(X, v) ||v||_1
s.t. f(D, X, v) = 0
     X ∈ 𝒳

where || · ||_1 is the ℓ1 vector norm; v is a vector of the 2n − 3 edge lengths; X is an n(n − 1)/2 × (2n − 3) topological matrix [4] encoding a phylogenetic tree as an unrooted binary tree with the n taxa in Γ as terminal vertices (leaves); 𝒳 is the set of all topological matrices; finally, f(·, ·, ·) defines the level of compatibility between the distances in D and the distances induced by the phylogenetic tree edges. Any optimal solution (X*, v*) of problem (1) defines a phylogenetic tree satisfying the minimum evolution principle. A topological matrix X is an Edge-Path incidence matrix of a Tree (EPT) (see [5], and additional files 1 and 2) that encodes a tree as follows: a generic entry x_ij,k is set to 1 if the edge k belongs to the path from leaf i to leaf j, and to 0 otherwise. In the rest of the paper we refer to problem (1) as the ME problem.

The distance matrix D of problem (1) is estimated from the dataset, e.g., according to any of the methods described in [6–12]. The condition f(D, X, v) = 0 typically imposes that, for any given EPT matrix X, v minimizes the (weighted) sum of the squares of the differences between the distances in D and the corresponding distances induced by the phylogenetic tree edges [6, 13]. In particular, under unweighted least-squares (also called Ordinary Least-Squares (OLS)) [2]:

v = X⁺D^Δ     (1)

where X⁺ is the Moore-Penrose pseudoinverse of X, and D^Δ is a vector whose components are obtained by taking, row by row, the entries of the strictly upper triangular part of D.
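For concreteness, the OLS computation v = X⁺D^Δ can be sketched in a few lines of Python with `numpy`. The 4-taxon quartet, its hand-built EPT matrix, and the edge lengths below are our own toy example, not data from the paper:

```python
import numpy as np

# EPT matrix for the quartet ((1,2),(3,4)): rows are the n(n-1)/2 = 6 leaf
# pairs (1,2),(1,3),(1,4),(2,3),(2,4),(3,4); columns are the 2n-3 = 5 edges
# (1,u), (2,u), (u,v), (3,v), (4,v).
X = np.array([
    [1, 1, 0, 0, 0],   # path 1-2
    [1, 0, 1, 1, 0],   # path 1-3
    [1, 0, 1, 0, 1],   # path 1-4
    [0, 1, 1, 1, 0],   # path 2-3
    [0, 1, 1, 0, 1],   # path 2-4
    [0, 0, 0, 1, 1],   # path 3-4
], dtype=float)

v_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical edge lengths
d_delta = X @ v_true                          # induced pairwise distances D^Delta

v_ols = np.linalg.pinv(X) @ d_delta           # v = X^+ D^Delta
tree_length = np.abs(v_ols).sum()             # ||v||_1, the ME objective
print(v_ols, tree_length)
```

Since X has full column rank here, the pseudoinverse recovers the generating edge lengths exactly.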

Other authors [14, 15] have suggested the use of a Weighted Least-Squares (WLS) function:

v = (XᵀWX)⁻¹XᵀWD^Δ     (2)

where W is a strictly positive definite diagonal matrix whose entries w_ij represent weights associated with leaves i and j. Finally, Hasegawa et al. [16] introduced a Generalized Least-Squares (GLS) function in which v is computed using:

v = (XᵀC⁻¹X)⁻¹XᵀC⁻¹D^Δ     (3)

where C is a strictly positive definite symmetric matrix representing the covariance matrix of D. To avoid the occurrence of negative branch lengths [14, 17], problem (1) can be modified as follows:

Problem 2. Minimum Evolution under Linear Programming (LP)

min_(X, v) ||v||_1
s.t. Xv ≥ D^Δ
     v ∈ ℝ₊^(2n−3)
     X ∈ 𝒳

Unfortunately, both problems (1) and (2) are NP-hard [18]. In this context, let us observe that, given Γ, the cardinality of 𝒳 is:

|𝒳| = (2n − 5)!! = 1 × 3 × 5 × ⋯ × (2n − 5)
where n!! is the double factorial of n. Hence, the number of topological matrices grows exponentially with the number of leaves ([6], p. 25, and see additional files 1 and 2).
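This growth is easy to verify numerically; a minimal Python sketch (the function name is ours):

```python
def num_unrooted_binary_trees(n):
    """Number of distinct unrooted binary topologies on n >= 3 labeled
    leaves: (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5)."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count

for n in (4, 5, 10, 12):
    print(n, num_unrooted_binary_trees(n))
```

Already for 12 taxa there are over 6.5 × 10⁸ topologies, which is why exhaustive searches become impractical around that size.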

Problem (1) has received considerable attention from the scientific community, and both exact and approximate algorithms have been developed to solve it. Exact algorithms are typically based on an exhaustive approach (i.e., enumerating all possible trees X). As an example, PAUP* 4.0 [19] allows exhaustive search for datasets containing up to 12 taxa. A number of heuristics have also been developed in the last 20 years. E.g., Rzhetsky and Nei [2, 3] (i) start from a Neighbor-Joining (NJ) tree [20, 21], (ii) apply a local search generating topologies within a given topological distance (see [2]) from the NJ tree, and (iii) return the best topology found. Kumar [22] further improved the approach as follows: starting from a topology, a leaf l is selected at each step and all possible placements of l on the topology are tested. Although the neighborhood size in Kumar's approach is larger than in Rzhetsky and Nei's algorithm, it examines a number of topologies that is at most (n − 1)!/2, i.e., an exponential function of the number of leaves n, and generates solutions in a shorter computing time. Finally, Bryant and Waddell [4] implemented algorithmic optimisations, and Desper and Gascuel [23] introduced a greedy search, both of which improved the speed and accuracy of the search.

Here, we introduce the Ant Colony Optimization (ACO) algorithm for estimating phylogenies under the minimum evolution principle, and show that ACO has the potential to compete with other widely-used methods. ACO (see [24, 25] for an introduction, and [26, 27] for recent reviews) is a widely-used metaheuristic approach for solving hard combinatorial optimization problems. ACO is inspired by the pheromone trail laying and following behavior of real ants. ACO implements indirect communication among simple agents, called (artificial) ants. Communication is mediated by (artificial) pheromone trails implemented as a probabilistic model to which the ants adapt during the algorithm's execution to reflect their search experience. ACO has proven a successful technique for numerous NP-hard combinatorial optimization problems (see [28]), although no application to the ME phylogeny problem is currently known. Our specific implementation of the ACO algorithm exploits a stochastic version of the Neighbor-Joining (NJ) algorithm [20, 21] to explore tree space.

Results and Discussion

Iterative addition

Given a set Γ of taxa, let us define a partial tree as an m-leaf tree whose leaves are taxa of a subset Γ′ ⊆ Γ, with m = |Γ′|. Moreover, given a partial tree with node set V and edge set E, let us say that we add/insert a leaf i (not yet in Γ′) on the edge (r, s) ∈ E (i.e., the edge joining the nodes r, s ∈ V), and generate a new partial tree with node set V̂ = V ∪ {i, t} and edge set Ê = E ∪ {(r, t), (t, s), (t, i)} \ {(r, s)}. In other words, we add a leaf i on an edge, divide that edge with a new node t, and join the leaf i to t. All algorithms described here build complete phylogenetic trees by iteratively adding one leaf at a time on the edges of a partial tree.
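The insertion operation can be sketched directly from this definition (a minimal Python sketch; representing undirected edges as frozensets, and the labels, are our own illustrative choices):

```python
def insert_leaf(V, E, leaf, edge, t):
    """Insert `leaf` on `edge` = (r, s): remove (r, s), subdivide it with the
    new internal node `t`, and attach the leaf to `t`."""
    r, s = edge
    V_new = V | {leaf, t}
    E_new = (E - {frozenset((r, s))}) | {frozenset((r, t)),
                                         frozenset((t, s)),
                                         frozenset((t, leaf))}
    return V_new, E_new

# 3-leaf star: internal node 'u' joined to leaves a, b, c
V = {'a', 'b', 'c', 'u'}
E = {frozenset(p) for p in [('a', 'u'), ('b', 'u'), ('c', 'u')]}
V2, E2 = insert_leaf(V, E, 'd', ('a', 'u'), 't')
print(len(V2), len(E2))  # 6 nodes, 2n - 3 = 5 edges for n = 4 leaves
```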

Primal bound

To generate a first upper (primal) bound [5] for the ME problem, we adapted the Sequential Addition (SA) greedy algorithm [6]. The Sequential Addition algorithm is less prone than NJ to generating a systematic local optimum at the end of the search (i.e., starting from "too good" a primal bound may lead to inefficient results [29]).

The pseudo-code of our version of the Sequential Addition algorithm is presented in Figure 1. In the initialization step, we arbitrarily choose a subset Γ′ ⊂ Γ of m < n leaves, and we generate an initial m-leaf partial tree, i.e., an optimal solution of problem (1) when only these m leaves are considered. At each iteration, we tentatively insert the next leaf i at all possible positions on the partial tree, and choose the solution that minimizes tree length (breaking possible ties randomly), hence generating a new partial tree and a new set Γ′ = Γ′ ∪ {i}. We iterate the procedure until a tree with n leaves is obtained. Finally, fixing the topology matrix X̂, we determine the optimal edge weights by imposing f(D, X̂, v) = 0, and return the length of the tree, i.e., an upper bound on the optimal solution of the ME problem.
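The greedy principle (though not the authors' optimized implementation) can be illustrated end-to-end: the sketch below inserts each leaf on the edge minimizing the OLS tree length, evaluated through the Moore-Penrose pseudoinverse of the EPT matrix. The helper names, internal-node labels 'u0', 'u1', …, and the additive toy distance matrix are ours; ties are broken by edge iteration order rather than randomly; `numpy` is assumed.

```python
import numpy as np

def _path_edges(adj, i, j):
    """Edges on the unique i-j path of a tree (adjacency dict, DFS)."""
    parent, stack = {i: None}, [i]
    while stack:
        u = stack.pop()
        if u == j:
            break
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)
    path, u = [], j
    while parent[u] is not None:
        path.append(frozenset((u, parent[u])))
        u = parent[u]
    return path

def ols_tree_length(leaves, edges, D):
    """Sum of OLS edge lengths: build the EPT matrix X, fit v = X^+ D^Delta."""
    adj = {}
    for e in edges:
        u, w = tuple(e)
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    elist = sorted(edges, key=sorted)
    idx = {e: k for k, e in enumerate(elist)}
    pairs = [(a, b) for ai, a in enumerate(leaves) for b in leaves[ai + 1:]]
    X = np.zeros((len(pairs), len(elist)))
    for r, (a, b) in enumerate(pairs):
        for e in _path_edges(adj, a, b):
            X[r, idx[e]] = 1.0
    d = np.array([float(D[a][b]) for a, b in pairs])
    return (np.linalg.pinv(X) @ d).sum()

def sequential_addition(D, taxa):
    """Greedy SA sketch (assumes >= 4 taxa whose labels avoid 'u0', 'u1', ...):
    start from the 3-leaf star, insert each remaining leaf on the best edge."""
    a, b, c, *rest = taxa
    edges = {frozenset((a, 'u0')), frozenset((b, 'u0')), frozenset((c, 'u0'))}
    placed, best = [a, b, c], None
    for t, leaf in enumerate(rest):
        best = None
        for e in list(edges):
            r, s = tuple(e)
            node = 'u%d' % (t + 1)
            cand = (edges - {e}) | {frozenset((r, node)), frozenset((node, s)),
                                    frozenset((node, leaf))}
            length = ols_tree_length(placed + [leaf], cand, D)
            if best is None or length < best[0]:
                best = (length, cand)
        edges = best[1]
        placed.append(leaf)
    return edges, best[0]

# additive toy distances generated from a 5-leaf tree of total length 11
raw = {'ab': 3, 'ac': 5, 'ad': 4, 'ae': 5, 'bc': 6,
       'bd': 5, 'be': 6, 'cd': 5, 'ce': 6, 'de': 3}
D = {x: {} for x in 'abcde'}
for k, v in raw.items():
    D[k[0]][k[1]] = D[k[1]][k[0]] = v
tree, length = sequential_addition(D, list('abcde'))
print(len(tree), round(length, 6))  # 7 edges; recovers the generating length 11.0
```

On an exactly additive matrix, the OLS length of the true topology equals the generating tree length, so the greedy search recovers it here.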

Figure 1

Principle of the ACO-ME algorithm. The core iteration includes three main steps: (i) the pheromone update phase during which artificial ants walk on a graph with all possible connections among the n taxa and (n - 2) internal nodes, and lay a trail of volatile pheromone on the branches of the starting tree; (ii) the stochastic construction phase during which new trees are built using both the heuristic information of the pairwise distances and the stochastic process guided by the newly-updated pheromone trail matrix (ants follow a given edge with a probability which is a function of the amount of pheromone on that edge); and (iii) the 2-OPT local search phase that corresponds to a local search using taxon swapping. The curved arrow indicates the stochastic jump of an ant from one edge to another. See text for details.

Unfortunately, the computational complexity of our heuristic is O((2m − 5)!! + n(n − m)²). At each iteration, given a partial tree, i.e., a k-leaf phylogenetic tree of the leaves in Γ′ ⊂ Γ with k = |Γ′|, and a leaf i not in Γ′, the procedure generates all the different (k + 1)-leaf partial trees that can be obtained by adding the leaf i on each edge of the current partial tree.

The ant colony optimization algorithm

The specific ACO algorithm for the minimum evolution problem (hereafter ACO-ME) that we introduce here (cf. pseudo-code in Figure 2) is a hybrid between the Max-Min Ant System (MMAS) [30, 31] and the Approximate Nondeterministic Tree Search (ANTS) [32]. Both methods are modifications of the original Ant System approach [33].

Figure 2

High-level pseudo-code for the Sequential Addition heuristic.

The core of the ACO-ME algorithm is the iteration phase, where the ants generate a set 𝒯 of trees. Then, starting from the trees in 𝒯, a local search is performed until a locally optimal tree is found and compared with the current-best tree. If the stopping conditions are met the procedure ends, otherwise the iteration phase is repeated.

Each ant builds a phylogenetic tree by iteratively adding one leaf at a time to a partial tree. Following a relation-learning model [34], the choices performed by an ant about (i) which leaf to insert, and (ii) where to add it on the partial tree, are based on a set of parameters {τ_ij} called pheromone trails. The values of the pheromone trail parameters {τ_ij} represent the stochastic desirability that a leaf i shares a direct common ancestor with a vertex j of a partial tree. The ants generate a new set 𝒯 of trees, and the pheromone trail parameters are updated at the end of the main iteration phase.

Let us now consider the algorithm in more detail. It uses two identical data structures: s* and s_k. The former stores the current-best complete reconstruction (solution) known, whereas the latter stores the best complete reconstruction obtained by the ants during iteration k. The algorithm also uses a variable n_a, i.e., the number of artificial ants. How to set the value of n_a is discussed in the Parameter settings section. In the initialization phase, s* is first set to the reconstruction obtained by the Sequential Addition algorithm and s_k is set to null; then the pheromone trail parameters are updated. We implemented the MMAS [30, 31] method of pheromone update, where τ_min ≤ τ_ij ≤ τ_max. Here, we set τ_min and τ_max to 0.0001 and 0.9999, respectively [35]. In the initialization phase, the pheromone trail parameters {τ_ij} are set to 0.5, i.e., all positions for leaf insertion have the same desirability.

Before describing the iteration phase, let us introduce some definitions. Let 𝒢_k be a partial tree with k leaves, V(𝒢_k) the set of vertices of 𝒢_k, and Γ_𝒢k the set of leaves of 𝒢_k. Let us also use the recursive distance definition of [23, 36]: if A and B are two non-intersecting subtrees of a tree 𝒢, then the average distance between A and B is:

Δ_A|B = (1 / (|A| |B|)) ∑_{i ∈ A, j ∈ B} d_ij.
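This average distance is straightforward to compute from its definition (a minimal Python sketch; the toy distance values are ours):

```python
def avg_subtree_distance(A, B, d):
    """Delta_{A|B} = (1 / (|A| |B|)) * sum over i in A, j in B of d[i][j]."""
    return sum(d[i][j] for i in A for j in B) / (len(A) * len(B))

# toy symmetric distances among four leaves
d = {'a': {'b': 2, 'c': 4, 'd': 6},
     'b': {'a': 2, 'c': 4, 'd': 6},
     'c': {'a': 4, 'b': 4, 'd': 2},
     'd': {'a': 6, 'b': 6, 'c': 2}}
print(avg_subtree_distance({'a', 'b'}, {'c', 'd'}, d))  # (4+6+4+6)/4 = 5.0
```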

In the iteration phase, each artificial ant r generates a complete phylogenetic tree using the ConstructCompleteReconstruction(r) procedure, as illustrated in Figure 3: ant r randomly selects four leaves from the set Γ and builds a partial tree 𝒢_k, k = 4; then, ant r (i) chooses, among the leaves not yet inserted in the partial topology, the leaf i defining the smallest distance d_ij, j ∈ Γ_𝒢k, and (ii) computes the probability that i has a common ancestor with the vertex j ∈ V(𝒢_k) using the formula suggested by ANTS [32]:

Figure 3

High-level pseudo-code for the ACO algorithm.

p_ij = [α τ_ij + (1 − α) η_ij] / ∑_{q ∈ Γ \ Γ_𝒢k} [α τ_qj + (1 − α) η_qj]

where η_ij represents the "heuristic desirability" that leaf i shares a common ancestor with a vertex j of V(𝒢_k) (whereas τ_ij represents the corresponding "stochastic desirability"). Finally, α ∈ [0, 1] allows the relative weighting of heuristic and stochastic desirabilities. The heuristic desirability η_ij is computed as:

η_ij = (Δ_ij − u_i − u_j)⁻¹

where u_i = ∑_{j ∉ Γ_𝒢k} Δ_ij / |Γ_𝒢k|, i.e., the sum of the distances from i to the leaves not yet inserted in the partial tree divided by the number of leaves inserted in the partial tree.

Note that η_ij, Δ_ij, and u_i correspond to the quantities used in the Neighbor-Joining algorithm [20, 21] (see also [23]). Hence, the computation of the vector p_i = {p_ij}, for all i ∈ Γ, can be interpreted as a stochastic application of the Neighbor-Joining algorithm. A possible problem (not yet observed in practice in our analyses) is that η_ij can take negative values. Finally, ant r randomly chooses a vertex j on the basis of the probabilities p_i, and the leaf i is added to the tree.
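The selection rule can be sketched as follows, with one simplification: we normalize over the candidate vertices j, since that is the quantity the ant actually draws from; the τ and η values below are made up for illustration:

```python
import random

def insertion_probabilities(tau_i, eta_i, alpha=0.7):
    """Mix pheromone (tau) and NJ-style heuristic (eta) desirabilities:
    score_j = alpha * tau_ij + (1 - alpha) * eta_ij, then normalize."""
    scores = [alpha * t + (1 - alpha) * e for t, e in zip(tau_i, eta_i)]
    total = sum(scores)
    return [s / total for s in scores]

tau = [0.5, 0.5, 0.9, 0.1]   # stochastic desirability (pheromone trails)
eta = [0.2, 0.8, 0.4, 0.4]   # heuristic desirability (NJ quantities)
p = insertion_probabilities(tau, eta)
vertex = random.choices(range(4), weights=p)[0]  # roulette-wheel vertex choice
print([round(x, 3) for x in p], vertex)
```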

At the end of the construction phase, a set 𝒯 of trees is obtained and a 2-OPT local search (with best-improvement and without candidate list [37, 38]) is iteratively performed on each tree: two randomly-chosen leaves are swapped and the tree length is evaluated. Swap i is performed on the new tree if swap i − 1 generated an improvement, otherwise it is performed on the old tree. To reduce the 2-OPT computational overhead, we perform no more than 10 swaps on each tree in 𝒯. If the best tree generated by the 2-OPT local search is shorter than the tree in s*, both s* and s_k are updated, otherwise only s_k is updated.
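The accept-if-improving swap loop can be sketched generically (a toy objective stands in for the OLS tree length; all names are ours):

```python
import random

def two_opt_leaf_swaps(leaves, length_fn, max_swaps=10, seed=0):
    """Try up to `max_swaps` random leaf swaps, keeping a swap only when it
    shortens the tree (the best tree so far is the base for the next swap)."""
    rng = random.Random(seed)
    best = list(leaves)
    best_len = length_fn(best)
    for _ in range(max_swaps):
        i, j = rng.sample(range(len(best)), 2)
        cand = list(best)
        cand[i], cand[j] = cand[j], cand[i]
        cand_len = length_fn(cand)
        if cand_len < best_len:   # accept improving swaps only
            best, best_len = cand, cand_len
    return best, best_len

# toy objective: number of leaves out of alphabetical position
length_fn = lambda order: sum(x != y for x, y in zip(order, sorted(order)))
best, best_len = two_opt_leaf_swaps(['c', 'a', 'd', 'b'], length_fn)
print(best, best_len)  # never worse than the starting score of 4
```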

The pheromone update completes the iteration phase: each entry τ_ij is updated according to:

τ_ij ← (1 − ρ)τ_ij + ε_ij


ε_ij = { κρ / l_𝒢|Γ|,  if w_i is adjacent to w_j in s^best;
         0,            otherwise.

where κ and ρ (the pheromone evaporation rate) are two tuning constants, s^best is either s* or s_k (see below), and l_𝒢|Γ| is the length of s^best. When applying this update, if τ_ij is greater than τ_max or smaller than τ_min, then its value is set to τ_max or τ_min, respectively. We set ρ to 0.1, κ such that κρ ∈ [10⁻², 10⁻¹], and α to 0.7. Fine-tuning of these parameters might have a significant impact on search efficiency, but such a systematic analysis is beyond the scope of a proof-of-concept for the use of ACO-ME. Finally, if the objective function does not decrease after 30 iterations, ACO-ME chooses s_k instead of s* as s^best for the pheromone update; if the objective function does not decrease after 30 additional iterations, then all {τ_ij} are reset to 0.5 and s* is used for the pheromone update.
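Putting the evaporation, deposit, and MMAS clamping together (a minimal Python sketch; the dictionary keying and toy values are ours):

```python
def update_pheromones(tau, best_pairs, best_len, rho=0.1, kappa=0.5,
                      tau_min=0.0001, tau_max=0.9999):
    """tau_ij <- (1 - rho) * tau_ij + eps_ij, with eps_ij = kappa * rho / L
    on pairs adjacent in the best tree, then clamp into [tau_min, tau_max]."""
    out = {}
    for pair, t in tau.items():
        deposit = kappa * rho / best_len if pair in best_pairs else 0.0
        out[pair] = min(tau_max, max(tau_min, (1 - rho) * t + deposit))
    return out

tau = {('i', 'j'): 0.5, ('i', 'k'): 0.00005, ('i', 'l'): 1.5}
tau = update_pheromones(tau, best_pairs={('i', 'j')}, best_len=10.0)
print(tau)  # ('i','j') reinforced; the others clamped to the MMAS bounds
```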

Parameter settings

We evaluated the performance of the ACO-ME algorithm under different values of the parameter κ (0.1, 0.5, and 1) and different numbers of ants (1 to 10). For each of the 30 possible combinations of these parameter values, we ran ACO-ME for 1000 iterations. As suggested elsewhere (see [29]), we did not consider colony sizes larger than 10.

Relative performances are measured using a normalized index as in [39–41]:

I_j^k = (x_j^(k) − x_j^min) / (x_j^max − x_j^min)

where x_j^(k) is the best solution found under parameter value k on dataset j, whereas x_j^min and x_j^max are respectively the best and worst solutions found on instance j. By definition, performance index values lie in the interval [0, 1]. The optimal parameter value exhibits the smallest relative performance index (see box-and-whisker plots in Figures 4, 5, and 6). Figures 4, 5, and 6 indicate that, for small, medium, and large datasets, the optimal combinations of number of ants/κ are 7/1, 10/0.5, and 8/0.5, respectively. However, differences in performance are not spectacular among the different combinations of parameter values (except that performance is generally very low when a single ant is used).
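The index itself is a one-liner (a sketch; the variable names and example values are ours):

```python
def performance_index(x, x_min, x_max):
    """Normalized ranking (x - x_min) / (x_max - x_min), in [0, 1]:
    0 marks the best solution found on the instance, 1 the worst."""
    return (x - x_min) / (x_max - x_min)

# e.g., a tree length of 420 when the best/worst runs gave 400 and 500
print(performance_index(420.0, 400.0, 500.0))  # 0.2
```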

Figure 4

Normalized ranking of the ACO algorithm performances with small datasets (20 taxa) and κ = 0.1 (a), κ = 0.5 (b), and κ = 1 (c) versus colony size n_a.

Figure 5

Normalized ranking of the ACO algorithm performances with medium datasets (50 taxa) and κ = 0.1 (a), κ = 0.5 (b), and κ = 1 (c) versus colony size n a .

Figure 6

Normalized ranking of the ACO algorithm performances with large datasets (100 taxa) and κ = 0.1 (a), κ = 0.5 (b), and κ = 1 (c) versus colony size n a .

Experimental evaluation

We first used a set of distance matrices generated from real datasets: the dataset "551314.nex", which includes 55 RBCL sequences of 1314 nucleotides each, and the dataset "Zilla500.nex", which includes 500 RBCL sequences of 1428 nucleotides each. These datasets are available at [42]. Note that the sequences in these datasets were aligned using ClustalX [43], and columns including gaps were excluded before computing pairwise distances. Second, we generated (i) 10 artificial instances of 20 taxa (also called small instances); (ii) 10 artificial instances of 50 taxa (also called medium instances); and (iii) 10 artificial instances of 100 taxa (also called large instances). Each artificial instance was generated by random sampling of taxa and partial character reshuffling of the Zilla500.nex dataset. More explicitly, after random selection of the 20, 50, or 100 taxa, we randomly reshuffled characters among taxa for 50 percent of the aligned columns. As the reshuffling makes the dataset prone to yield undefined pairwise distances [44], we simply used the absolute number of differences between sequence pairs to generate the distance matrix. Edge lengths were computed using the standard OLS because WLS and GLS can potentially lead to inconsistent results [45]. All numerical experiments were performed on an Apple Power Mac G5 workstation (64-bit, dual-processor, dual-core) with 8 GB of RAM, running OS X. The ACO-ME source code is written in C/C++ and compiled using the IBM XL C/C++ compiler version 6.0. We compared the quality (total length) of the trees generated by the ACO-ME algorithm to that of the trees obtained using a classical hill-climbing algorithm (implemented in PAUP* 4.0 [19]) after a fixed run time of 1 minute. The starting tree was generated using the Neighbor-Joining algorithm [20, 21], and the TBR branch-swapping operator [6] was used for exploring the solution space. PAUP* 4.0 was used with and without the "Steepest Descent" (SD) option.
When SD is activated, all possible TBR rearrangements are tried and the one producing the largest decrease in tree length is selected, inducing a computational overhead similar to that of the 2-OPT local search implemented in our ACO-ME algorithm. Each algorithm was run 30 times on each of the two real datasets. Figures 7a and 7b show that ACO-ME performance is intermediate between hill-climbing with SD and hill-climbing without SD, and that the relative performance of ACO-ME, in comparison to hill-climbing, increases with larger datasets. Note that, contrary to our simple implementation of ACO-ME, the implementation of ME in PAUP* 4.0 [19] incorporates procedures [4, 23] that greatly speed up the OLS computation (reaching a complexity of O(n²)). We trust that implementing these procedures, in combination with further tuning of the ACO parameters (number of ants, relative weights of the heuristic information, stochastic pheromone parameters, etc.), would lead to better performance of the ACO-ME algorithm. Figures 8a and 8b indicate that the relative performances described above are stable through time, especially for large datasets: at any time during the run, ACO-ME performs similarly to "hill-climbing without SD" and better than "hill-climbing with SD".
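The difference between the two hill-climbing regimes can be sketched abstractly. This is an illustrative sketch only, not PAUP*'s implementation: the TBR neighborhood is abstracted as a vector of candidate tree lengths, and both function names are ours.

```cpp
#include <cstddef>
#include <vector>

// Steepest-descent step: evaluate every neighbor and keep the one with the
// largest decrease in tree length. Returns the index of the best improving
// neighbor, or -1 if no neighbor improves on the current tree.
int steepest_descent_step(const std::vector<double>& neighbor_lengths,
                          double current_length) {
    int best = -1;
    double best_length = current_length;
    for (std::size_t i = 0; i < neighbor_lengths.size(); ++i) {
        if (neighbor_lengths[i] < best_length) {
            best_length = neighbor_lengths[i];
            best = (int)i;
        }
    }
    return best;
}

// First-improvement step (hill-climbing without SD): accept the first
// neighbor that shortens the tree, avoiding a full neighborhood scan and
// thus the computational overhead described above.
int first_improvement_step(const std::vector<double>& neighbor_lengths,
                           double current_length) {
    for (std::size_t i = 0; i < neighbor_lengths.size(); ++i)
        if (neighbor_lengths[i] < current_length) return (int)i;
    return -1;
}
```

The first-improvement variant typically performs many cheap steps per unit time, while steepest descent performs fewer but individually better steps, which matches the time-versus-score trade-off visible in Figure 8.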

Figure 7

Comparison of performances between ACO-ME and hill-climbing (with and without Steepest Descent, SD) after a fixed run time of 1 minute on datasets of 55 (a) and 500 (b) taxa. A paired Wilcoxon test indicates that ACO-ME performances are significantly better (p-value = 3.92e-2 for the 55-taxa dataset, and p-value = 6.821e-4 for the 500-taxa dataset) than those of hill-climbing with SD, but significantly worse (p-value = 4.71e-3 for the 55-taxa dataset, and p-value = 4.53e-4 for the 500-taxa dataset) than those of hill-climbing without SD.

Figure 8

Comparison of score vs. running time for hill-climbing with steepest descent (line labeled "1"), hill-climbing without steepest descent (line labeled "2"), and ACO-ME (line labeled "3") on datasets of 55 (a) and 500 (b) taxa.


We introduce here an Ant Colony Optimization (ACO) algorithm for the phylogeny estimation problem under the minimum evolution principle and demonstrate the feasibility of this approach. Although much improvement in performance can probably be obtained through (i) modification of the local search phase, (ii) tuning of the ACO parameters (number of ants, relative weights of the heuristic information, stochastic pheromone parameters, etc.), and (iii) implementation of speed-up procedures and optimization of the code, the current implementation of our algorithm already demonstrates that the ant colony metaphor can efficiently solve instances of the phylogeny inference problem.


  1. Kidd KK, Sgaramella-Zonta LA: Phylogenetic analysis: concepts and methods. American Journal of Human Genetics. 1971, 23: 235-252.

  2. Rzhetsky A, Nei M: Statistical properties of the ordinary least-squares, generalized least-squares, and minimum evolution methods of phylogenetic inference. Journal of Molecular Evolution. 1992, 35: 367-375. 10.1007/BF00161174.

  3. Rzhetsky A, Nei M: Theoretical foundations of the minimum evolution method of phylogenetic inference. Molecular Biology and Evolution. 1993, 10: 1073-1095.

  4. Bryant D, Waddell P: Rapid evaluation of least-squares and minimum evolution criteria on phylogenetic trees. Molecular Biology and Evolution. 1998, 15 (10): 1346-1359.

  5. Nemhauser GL, Wolsey LA: Integer and Combinatorial Optimization. 1999, Wiley-Interscience, New York, NY, USA.

  6. Felsenstein J: Inferring Phylogenies. 2004, Sinauer Associates, Sunderland, MA, USA.

  7. Hasegawa M, Kishino H, Yano T: Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution. 1981, 17: 368-376. 10.1007/BF01734359.

  8. Jukes TH, Cantor C: Evolution of protein molecules. Mammalian Protein Metabolism. Edited by: Munro HN. 1969, Academic Press, New York, 21-123.

  9. Kimura M: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution. 1980, 16: 111-120. 10.1007/BF01731581.

  10. Lanave C, Preparata G, Saccone C, Serio G: A new method for calculating evolutionary substitution rates. Journal of Molecular Evolution. 1984, 20: 86-93. 10.1007/BF02101990.

  11. Rodriguez F, Oliver JL, Marin A, Medina JR: The general stochastic model of nucleotide substitution. Journal of Theoretical Biology. 1990, 142: 485-501.

  12. Waddell PJ, Steel MA: General time reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites. Molecular Phylogenetics and Evolution. 1997, 8: 398-414. 10.1006/mpev.1997.0452.

  13. Cavalli-Sforza LL, Edwards AWF: Phylogenetic analysis: models and estimation procedures. American Journal of Human Genetics. 1967, 19: 233-257.

  14. Beyer WA, Stein M, Smith T, Ulam S: A molecular sequence metric and evolutionary trees. Mathematical Biosciences. 1974, 19: 9-25. 10.1016/0025-5564(74)90028-5.

  15. Fitch WM, Margoliash E: Construction of phylogenetic trees. Science. 1967, 155: 279-284. 10.1126/science.155.3760.279.

  16. Hasegawa M, Kishino H, Yano T: Dating the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution. 1985, 22: 160-174. 10.1007/BF02101694.

  17. Waterman MS, Smith TF, Singh M, Beyer WA: Additive evolutionary trees. Journal of Theoretical Biology. 1977, 64: 199-213. 10.1016/0022-5193(77)90351-4.

  18. Day WHE: Computational complexity of inferring phylogenies from dissimilarity matrices. Bulletin of Mathematical Biology. 1987, 49: 461-467.

  19. Swofford DL: PAUP* version 4.0. 1997, Sinauer, Sunderland, MA.

  20. Saitou N, Nei M: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution. 1987, 4: 406-425.

  21. Studier JA, Keppler KJ: A note on the neighbor-joining algorithm of Saitou and Nei. Molecular Biology and Evolution. 1988, 5: 729-731.

  22. Kumar S: A stepwise algorithm for finding minimum evolution evolutionary trees. Molecular Biology and Evolution. 1996, 13: 584-593.

  23. Desper R, Gascuel O: Fast and accurate phylogeny reconstruction algorithms based on the minimum evolution principle. Journal of Computational Biology. 2002, 9 (5): 687-705. 10.1089/106652702761034136.

  24. Dorigo M, Caro GD: The ant colony optimization meta-heuristic. New Ideas in Optimization. 1999, McGraw-Hill, 11-32.

  25. Zlochin M, Birattari M, Dorigo M: Model-based search for combinatorial optimization: a critical review. Annals of Operations Research. 2004, 131: 373-395.

  26. Blum C: Ant colony optimization: introduction and recent trends. Physics of Life Reviews. 2005, 2: 353-373. 10.1016/j.plrev.2005.10.001.

  27. Dorigo M, Birattari M, Stützle T: Ant colony optimization – artificial ants as a computational intelligence technique. IEEE Computational Intelligence Magazine. 2006, 1: 28-33.

  28. Dorigo M, Stützle T: Ant Colony Optimization. 2004, MIT Press, Cambridge, MA.

  29. Stützle T, Hoos HH: Stochastic Local Search: Foundations and Applications. 2004, Morgan Kaufmann, Elsevier.

  30. Glover F, Kochenberger GA: Handbook of Metaheuristics. 2003, Kluwer Academic Publishers, Boston, MA.

  31. Stützle T, Hoos HH: MAX-MIN Ant System. Future Generation Computer Systems. 2000, 16: 889-914. 10.1016/S0167-739X(00)00043-1.

  32. Maniezzo V: Exact and approximate nondeterministic tree-search procedures for the quadratic assignment problem. INFORMS Journal on Computing. 1999, 11: 358-369.

  33. Dorigo M, Maniezzo V, Colorni A: Ant System: optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics – Part B. 1996, 26: 29-41. 10.1109/3477.484436.

  34. Blum C, Roli A: Metaheuristics in combinatorial optimization: overview and conceptual comparison. ACM Computing Surveys. 2003, 35 (3): 268-308. 10.1145/937503.937505.

  35. Blum C, Dorigo M: The hyper-cube framework for ant colony optimization. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics. 2004, 34 (2): 1161-1172. 10.1109/TSMCB.2003.821450.

  36. Gascuel O: Mathematics of Evolution and Phylogeny. 2005, Oxford University Press, New York, NY, USA.

  37. Bentley JL: Fast algorithms for geometric traveling salesman problems. ORSA Journal on Computing. 1992, 4 (4): 387-411.

  38. Martin O, Otto SW, Felten EW: Large-step Markov chains for the traveling salesman problem. Complex Systems. 1991, 5 (3): 299-326.

  39. Bianchi L, Birattari M, Chiarandini M, Manfrin M, Mastrolilli M, Paquete L, Rossi-Doria O, Schiavinotto T: Hybrid metaheuristics for the vehicle routing problem with stochastic demands. Journal of Mathematical Modelling and Algorithms. 2006, 5: 91-110. 10.1007/s10852-005-9033-y.

  40. Birattari M, Dorigo M: How to assess and report the performance of a stochastic algorithm on a benchmark problem: mean or best result on a number of runs? Optimization Letters. 2006, to appear.

  41. Birattari M, Zlochin M, Dorigo M: Towards a theory of practice in metaheuristics design: a machine learning perspective. RAIRO – Theoretical Informatics and Applications. 2006, 40: 353-369. 10.1051/ita:2006009.

  42. []

  43. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research. 1997, 25: 4876-4882. 10.1093/nar/25.24.4876.

  44. Catanzaro D, Pesenti R, Milinkovitch M: A non-linear optimization procedure to estimate distances and instantaneous substitution rate matrices under the GTR model. Bioinformatics. 2006, 22 (6):

  45. Gascuel O, Bryant D, Denis F: Strengths and limitations of the minimum evolution principle. Systematic Biology. 2001, 50: 621-627. 10.1080/106351501753328767.



Daniele Catanzaro is a Research Fellow at the Belgian National Fund for Scientific Research (FNRS). This work was supported by the "Communauté Française de Belgique" (ARC 11649/20022770) and the "Région Wallonne". We thank C. Korostensky and Mike Steel for helpful discussions, as well as J. L. Deneubourg, L. Keller, and two anonymous reviewers for constructive and helpful comments on a previous version of this manuscript.

Author information



Corresponding author

Correspondence to Michel C Milinkovitch.

Additional information

Authors' contributions

Daniele Catanzaro, Raffaele Pesenti, and Michel C. Milinkovitch conceived the study and wrote the manuscript; Daniele Catanzaro performed the numerical analyses. All authors read and approved the final manuscript.

Electronic supplementary material


Additional File 1: An ant colony optimization algorithm for phylogenetic estimation under the minimum evolution principle – supplementary material. The supplementary file includes discussions on the structure of the EPT matrices as well as how we generate and enumerate topologies in ACO-ME. (TEX 22 KB)


Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Cite this article

Catanzaro, D., Pesenti, R. & Milinkovitch, M.C. An ant colony optimization algorithm for phylogenetic estimation under the minimum evolution principle. BMC Evol Biol 7, 228 (2007).
