The impact of historical contingency on evolution was analyzed using a two-step experimental evolution design (described in the Methods section and Fig. 2). Briefly, populations initiated from a single ancestor were propagated in four different environments for 1000 generations [22]. At the end of phase I, population samples as well as randomly isolated clones (one for each population) were propagated for 1000 additional generations in a final environment (phase II). Growth rate and fitness assays in the final environment were performed on population samples at the end of phases I and II, on clones isolated at the end of phase I, and on the population samples derived from these clones at the end of phase II. The populations founded from isolated clones were used mainly for the genomic part of our study. The phenotypic results obtained from the two types of populations were similar (see below).
Historical contingency and phenotypic evolution
Divergence of the populations in the four historical environments has been previously reported [22]. Here, we measured the maximum growth rate and fitness of the populations (see Methods) in the new environment at both the start and end of phase II (Fig. 3) to investigate how adaptation in the historical environments affected both the phenotype and the evolutionary trajectories in the new environment. Maximum growth rate and fitness are not independent phenotypic traits. Indeed, fitness often increases owing to growth rate improvement, although this is not systematic. For example, fitness, but not growth rate, may improve if the lag phase is shortened. Conversely, growth rate may improve without a dramatic fitness increase if the duration of growth at maximum growth rate is reduced. Analyzing both fitness and maximum growth rate therefore gives complementary data on how populations perform in and adapt to the final environment.
After phase I (i.e., at the start of phase II), the phenotypic traits of the populations differed in the new environment for both population samples and isolated clones (Fig. 3a,b; Additional file 1: Table S1). Divergence occurred between the populations that evolved in different historical environments (between historical environment divergence, Additional file 1: Table S1, Historical environment effect), but also between the replicate populations that evolved in identical historical environments (within historical environment divergence, Additional file 1: Table S1, Random population effect). The major effect is due to the between historical environment divergence (Additional file 1: Table S1, η²) and illustrates that adaptation in different initial environments affected performance in the new environment. This is especially true for the populations and clones that initially evolved in Gly. These samples tend to display both higher fitness and higher growth rate in the final environment than the populations and clones that initially evolved in the three other environments (Fig. 3a,b). To a lesser extent, within historical environment divergence shows that differences in the evolutionary trajectories of replicate populations that evolved in identical historical environments also affected phenotypic performance in the new environment.
After phase II, the maximum growth rate and the fitness of the populations had increased, indicating adaptation to the new environment (Fig. 3c,d). This adaptation was strongly dependent upon the historical environment (between historical environment divergence, Additional file 1: Table S2, Historical environment effect). This is illustrated by the major fitness improvement of the populations that initially evolved in Ace compared to the populations that initially evolved in Glc and Glu (Fig. 3c). One may wonder whether such environment-specific historical contingency is caused by initial differences in growth rate and fitness. Indeed, both growth rate and fitness improvement rates have been shown to depend on initial levels [23–25]. However, in our experiment, this cannot be the only explanation for the detected contingency, because populations displaying similar fitness levels at the end of phase I (for example those that initially evolved in Ace, Gly, and Glc) have very different fitness levels at the end of phase II (higher fitness for those that initially evolved in Ace). Adaptation during phase II was also influenced by the differences between replicate populations (within historical environment divergence, Additional file 1: Table S2, Random population effect). These results show that historical evolution affected the ability to adapt to a new environment.
We then wanted to better visualize whether the populations that initially evolved in different environments converged or diverged when adapting to the same final environment. For all pairwise combinations of populations and for both fitness and maximum growth rate, we calculated the Difference between the end and start of phase II of the Absolute Phenotypic Difference between the populations (DAPD, see Methods, Fig. 4). Depending on the historical environments, patterns of convergence and divergence (Fig. 4c,d) were identified during phase II. The fitness of the populations that evolved in Ace during phase I diverged from that of the populations that evolved in Glc and Glu (Fig. 4a,c), owing to a major fitness improvement during phase II of the populations that initially evolved in Ace (Fig. 3a,c). Also, the maximum growth rate of the populations that initially evolved in Ace and Glu increased during phase II (Fig. 3b,d), leading to convergence with the populations that initially evolved in Gly (Fig. 4b,d).
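As an illustration, the DAPD can be sketched as follows, assuming the reading suggested by its name: for each pair of populations, the absolute phenotypic difference at the end of phase II minus the absolute phenotypic difference at the start of phase II. Under this assumption, a positive value indicates divergence and a negative value indicates convergence. The exact definition is the one given in Methods; the function name, population labels and trait values below are hypothetical and are not data from this study.

```python
from itertools import combinations


def dapd(start, end):
    """Return {(pop_a, pop_b): DAPD} for every unordered pair of populations.

    start, end: dicts mapping a population label to a trait value
    (fitness or maximum growth rate) at the start and end of phase II.
    Assumed definition: |end_a - end_b| - |start_a - start_b|.
    """
    if set(start) != set(end):
        raise ValueError("start and end must describe the same populations")
    result = {}
    for a, b in combinations(sorted(start), 2):
        apd_start = abs(start[a] - start[b])  # difference before phase II
        apd_end = abs(end[a] - end[b])        # difference after phase II
        result[(a, b)] = apd_end - apd_start
    return result


# Hypothetical trait values, for illustration only.
start_values = {"Ace_1": 1.00, "Glc_1": 1.02, "Gly_1": 1.10}
end_values = {"Ace_1": 1.30, "Glc_1": 1.12, "Gly_1": 1.32}

for pair, value in dapd(start_values, end_values).items():
    trend = "diverged" if value > 0 else "converged" if value < 0 else "unchanged"
    print(pair, round(value, 3), trend)
```

Taking the difference of absolute differences, rather than the difference of signed differences, means that the sign of the result does not depend on the order in which the two populations are listed.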
Historical contingency and genetic evolution
In addition, we investigated the impact of historical contingency on the genomic changes that occurred during the two phases. Sequencing the genomes of one evolved clone sampled from each of the populations at the end of phase I (Fig. 5a) allowed us to detect 53 mutations [22]. Here, we sequenced the genomes of one evolved clone sampled from each population at the end of phase II (Fig. 5b, Additional file 1: Tables S3 and S4). We detected 31 new mutations (30 by genome sequencing and one by PCR, see below) including 23 point mutations, 20 of which are in coding regions (two producing stop codons) and three in intergenic regions. We also characterized six small deletions including four in coding regions and two in intergenic regions, one 3-bp insertion and one IS186 insertion in a coding region. Genetic parallelism was detected as seven genes (flu, rpoA, glpR, glpG, glpK, spoT and nadR) carried mutations in clones isolated from independent populations. As genome sequencing may have missed DNA rearrangements, we investigated by PCR amplification and Sanger sequencing whether these seven genes were affected by larger insertions or deletions in the clones where no point mutations were initially identified. We found one additional mutation, an IS186 insertion in nadR in the clone from population Glu_2 (Additional file 1: Tables S3 and S4).
We investigated whether and how the genetic modifications detected after phase II were contingent on the historical environments. After phase I, 53 mutations were identified, with two to seven mutations per clone [22]. After phase II, we found 31 mutations, with a lower number of mutations (zero to four) per clone (two-sided t-test, p = 0.002, Additional file 1: Tables S3 and S4). During phase I, seven genes (glpK, glpR, nadR, spoT, argR, rho and lldR) and one operon (mreBC) were repeatedly affected by mutations, i.e. in more than one clone (Fig. 5a). Moreover, the genes mutated in parallel differed from one environment to another (8 × 4 contingency table Fisher exact test, p = 0.0008). This pattern holds even when considering only the four genes (nadR, spoT, argR and rho) that were mutated in more than one historical environment (4 × 4 Fisher exact test, p = 0.039). By contrast, during phase II, four genes (flu, glpK, glpR and nadR) and one operon (flhCD) were repeatedly mutated in more than one clone but without historical contingency (Fig. 5b; 5 × 4 Fisher exact test, p = 0.728). We observed the same trend when considering all mutations that occurred during both evolution phases (Fig. 5c; 11 × 4 Fisher exact test, p = 0.108; 9 × 4 Fisher exact test, p = 0.562 when excluding the two environment-specific loci, namely mreBC and lldR). Three additional lines of evidence supported the same conclusion. First, mutations in glp genes were identified after phase I specifically in the clones isolated from the glycerol-containing historical environment (Fig. 5a) and after phase II in various clones irrespective of their historical environments (Fig. 5b), suggesting specific adaptation to glycerol-containing environments. Second, mutations in nadR were identified after phase I in clones from the glycerol- and glucose-containing environments (Fig. 5a, blue and purple respectively), and after phase II in clones from populations that historically evolved in the two other environments, containing acetate and gluconate (Fig. 5b, green and pink respectively), suggesting general adaptation. Third, after phase II, one gene (flu) was affected by mutations in several clones irrespective of their historical environments, which indicates no strong historical contingency. These clones showed no other shared mutated gene during either phase I or II. We found no evidence for contingency linked to the historical environments on genes mutated during phase II. Likewise, the mutations detected during phase II were not contingent upon the mutations that accumulated during phase I (10 × 5 Fisher exact test, p = 0.169, considering only those mutations that occurred in parallel during either phase).
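The r × c Fisher exact tests reported above ask whether the distribution of mutated genes (rows) is independent of the historical environment (columns). The following minimal sketch shows the logic of such a test using a Monte Carlo approximation rather than a full enumeration: observations are reshuffled among columns, which keeps both margins fixed, and the p-value is estimated as the share of reshuffled tables that are no more probable than the observed one. This is not necessarily the procedure or software used in this study; the function names and the example table are hypothetical.

```python
import math
import random


def log_table_weight(table):
    """Log of a quantity proportional to P(table | margins): -sum(log n_ij!)."""
    return -sum(math.lgamma(n + 1) for row in table for n in row)


def monte_carlo_fisher(table, n_iter=20000, seed=0):
    """Estimate the r x c Fisher exact p-value by random permutation."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row, column) label pair per observation.
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            rows.extend([i] * n)
            cols.extend([j] * n)
    observed = log_table_weight(table)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(cols)  # permuting column labels preserves both margins
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            sim[i][j] += 1
        # Count tables at least as unlikely as the observed one.
        if log_table_weight(sim) <= observed + 1e-9:
            hits += 1
    return (hits + 1) / (n_iter + 1)  # add-one correction avoids p = 0


# Hypothetical counts of mutated clones: rows are genes, columns are
# historical environments. These are not the data of this study.
example = [
    [3, 0, 0, 0],
    [0, 2, 1, 0],
    [1, 1, 1, 1],
]
print(monte_carlo_fisher(example))
```

The estimate carries sampling error that shrinks as `n_iter` grows, so p-values close to a decision threshold are better confirmed with an exact enumeration in a dedicated statistics package.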