To model how the GP, EP and PP maps interact, it is necessary to represent developmental systems in a way that allow for genetic, environmental and parental inputs. We expand on three different and widely used GRN-based models of development, each one entailing a different complexity and biological realism (Fig. 1A–C). To better interpret our results, we succinctly introduce here the architecture of the models we use (see “Methods” section for further details). From the simpler to the more complex, these models are:
Basic GRN model
This model (based on [29]) represents a simple gene regulatory network (GRN; Fig. 1A). It consists of Ng transcription factors that have continuous, positive concentrations (vector G = (g1,…,gNg); gi ≥ 0 ∀ i), and regulate the expression of each other by binding to cis-regulatory sequences on gene promoters. The regulatory interactions of this GRN are encoded in the Ng x Ng matrix B, whose elements Bik represent the effect of gene k on the transcription of gene i. Positive elements (Bik > 0) represent activation and negative elements (Bik < 0) represent inhibition. A binary (0 or 1) matrix M (Ng x Ng) encodes the GRN topology, so that the interaction Bik is only active if Mik = 1. The initial state of the vector G (represented by the vector G0) accounts for the initial state at the beginning of development, which is supposed to be parentally determined. In addition, the expression of each gene j can be potentially modulated by an environmental factor Ej, which can either upregulate (0 ≤ Ej ≤ 1) or downregulate it (−1 ≤ Ei ≤ 0). The environmental effects of all these Ne environmental factors (Ne = Ng), are contained within the vector E.
Developmental dynamics are attained by changes in gene concentration over a number of developmental iterations (tdev), and the phenotype is recorded as the steady-state expression levels of two arbitrarily chosen genes in tdev (Fig. 1A). Only viable (temporally stable) phenotypes are considered: from 1 to Ng, the normalized G (represented here as G*), must remain the same within a threshold of 10–2 over an interval of tdev/10 developmental time units (|G*0.9·tdev-G*tdev|≤ 10–2). The gene–gene interactions within the GRN follow a non-linear, saturating Michaelis–Menten dynamics (a special type of Hill function), so that the concentration of the gene i changes over developmental time according to the following differential equation:
$$\frac{{\partial g_{i} }}{\partial t} = \frac{{R\left( {h_{i} } \right)}}{{K_{M} + R\left( {h_{i} } \right)}} - \mu g_{i} + \xi$$
(1)
where
$$h_{i} = \sum\nolimits_{j = 1}^{Ng} {M_{ij} B_{ij} g_{j} + E_{j} }$$
(2)
and R (hi) is the Ramp function (R(x) = x, ∀ x ≥ 0 and 0 otherwise) which prevents negative concentrations in gene products resulting from inhibiting genetic or environmental interactions. KM is the Michaelis–Menten coefficient. Without loss of generality, we set KM = 1 (other choices of KM or specific Hill functions are known not to affect the results, see [30]). The environmental term Ej is embedded within the gene-ordered summation of Eq. (2) following the associative property of the sum, and because Ne ≤ Ng always. Notice that, while environmental factors can effectively inhibit gene expression (if Ej < 0), they cannot turn a genetic concentration into a negative value (because of the Ramp function R). All genes and gene products (but not environmental factors) are degraded with a decay term μ = 0.1. In order to avoid unstable solutions, a certain amount of (Gaussian) noise is introduced in the system through the term ξ, randomly drawn from a Normal distribution ~ N(0, 10–2).
GRN + multilinear model
This mixed model (based on [21]), can be viewed as a multi-linear model of phenotypic determination [20] that is added to a basic GRN-model [29] (Fig. 1B). The key difference with the previous model lies in how each phenotypic trait is generated. Rather than being the expression level of one element of the GRN, each trait Ti, i = (1, 2) receives a contribution from each transcription factor according to a linear coefficient:
$$T_{i} = \sum\nolimits_{j = 1}^{Ng} {} Z_{ij} g_{j}$$
(3)
where the factor Zij represents the contribution of the jth gene to the ith trait (−1 < Zij < 1). Note that the Z matrix encoding the linear coefficients is separated from the matrix B encoding the GRN itself. In this paper, the evolutionary implications of the correlations between maps are reported on the basis of this model.
Lattice model
This reaction–diffusion model (based on [30, 31]) represents a simple developmental model that implements multicellular phenotypes in an explicitly spatial context (Fig. 1C). The model describes on a one-dimensional row of Nc non-motile cells (Nc = 16 in our case), whose developmental dynamics is determined by a GRN (as described in the basic model) that is identical for all cells. Interaction between the different cells is achieved through cell–cell signalling involving extracellular diffusion of morphogens (Ng/3 of the GRN elements are considered to be diffusible morphogens). Each of these morphogens has a specific diffusion rate Di (0 < Di < 1) and follows Fick’s second law. Zero-flux boundary conditions are used. Thus, the concentration of gene i over developmental time now is calculated as:
$$\frac{{\partial g_{ij} }}{\partial t} = \frac{{R\left( {h_{ij} } \right)}}{{K_{M} + R\left( {h_{ij} } \right)}} - \mu g_{ij} + \xi + D_{i} \nabla^{2} g_{ij}$$
(4)
In most works that use this model, the phenotype is conceptualised as the expression pattern of one of the constituent genes along the row of cells (e.g. [30, 31]). Here, for the sake of comparability with the other models used, we set T1 and T2 as the average concentration of gene 1 in the first and last two cells of the organism (T1 = (g1,1 + g1,2)/2, and T2 = (g1,Nc-1 + g1,Nc)/2).
While these models differ in complexity, all three feature a GRN at their core, which is defined by three types of variables—namely, regulatory connections, initial gene expression and exogenous inputs (see “Methods” and Fig. 1). Together, these variables define the GRN dynamics which, when implemented and iterated in the models, result in measurable phenotypes. Furthermore, these three types of variables have some correspondence with the three types of variation we address (genetic, parental and environmental). For example, parental effects generally apply modification only in early stages, and thus can be allied to the initial gene expression values of the GRN. This provides an intuitive way to link each of the constituent elements of the GRN to a different source of phenotypic variation: (i) changes in gene–gene interaction strengths in the GRN can be conceptualised as an effect of genetic variation; (ii) changes in the environmentally sensitive elements in the GRN (sensor nodes and diffusion rates) as an effect of environmental variation; and (iii) changes in the initial concentration of each transcription factor in the GRN as an inherited initial state (Fig. 1D–F).
However, it is also true that, for example, an environmental input might change a regulatory sensitivity and a genetic mutation might change an initial gene expression level, rendering the correspondence non-univocal. This loose correspondence should be kept in mind in interpreting our results, but we will show that the type of model variable itself entails a marginal explanatory power when compared with other factors (i.e. the timescale of the change in these variables, see next sections). In addition, even when the correspondence is not strictly one-to-one, our approach exhausts the ways in which a specific GRN can vary (these variations do not alter the GRN topology, which here is assumed to evolve much slower than the inputs [4, 8]).
While more complex models (e.g. cell-based models including morphogenesis) could implement environmental or parental inputs in the developmental dynamics in more ways than we consider here (e.g. by changing the bio-physical properties of cells and tissues), the size of their parameter spaces and their associate computational costs would render our approach unfeasible. Notwithstanding this limitation of our work, comparing the results for these three different models allows us to assess how the robustness of our results escalates with model complexity (see Additional file 1: Fig. S2).
With the described settings, all the models used in this paper produce a single, 2-trait (2-dimensional) phenotype for each combination of inputs. Thus, a set of phenotypes (i.e. a phenotype distribution) can be generated by introducing variation in those inputs. These phenotype distributions represented in a 2D morphospace are considered maps: those resulting from variation in the genetic inputs are GP maps, whereas those resulting from environmental perturbations or perturbations in the initial conditions are considered as EP and PP maps, respectively. While for some authors (e.g. [31, 32]) any map exhibiting phenotypic variation in response to genetic mutations has the property of being evolvable; we consider a GP map to exhibit more or less genetic evolvability to the extent that its phenotypic distributions are aligned with the adaptive demands (in this sense, evolvability is a joint property of variation + selective environment, not a property of the map alone; [9, 10]). If a similar alignment is found in the EP map or the PP map, they are said to exhibit adaptive plasticity and adaptive parental effects, respectively.
Considered together, the three maps encompass the whole set of phenotypes that a given developmental mechanism can generate from the sum of all perturbations. This “total” variation, which can be also represented in a 2D trait space, is referred to as a general phenotype distribution, or GPD (i.e., the GP, EP and PP maps are all contained within this GPD, Additional file 1: Fig. S1).
GP, EP and PP mappings are correlated in randomly generated GRNs
We first explore the inter-dependence between GP, EP, and PP maps in a large (n > 106) ensemble of randomly generated GRNs (encompassing different GRN connectivities, topologies and number of genes, see “Methods”). We then separately introduced random variation (10 input values 0 < x < 1) in the genetic, environmental and parental inputs (i.e., connection weights, expression levels, and initial conditions, respectively) of each of these GRNs, and compared the resulting phenotypic distributions, that is, the resulting GP, EP and PP maps.
We estimated the similarity between these three maps by testing whether or not variation in the genetic, environmental or parental inputs produce similar covariation between the two traits (Fig. 1D–F), using the linear slopes in the phenotypic morphospace as basic descriptors of the different maps. Since comparisons were performed using the (linear, unordered) map slopes, and not the whole phenoypic distributions of each map, we used a simple (Pearson) product moment correlation to evaluate whether or not the slope of a map was associated with a similar slope in the other maps. Pairwise comparisons between the slopes caused by variation in genetic, environmental, or parental inputs were all significantly positive (Pearson r ≥ 0.3; Fig. 2A). This demonstrates that GP, EP and PP mappings are not independent in random GRNs. Note that this positive correlation between maps does not imply that the map-specific slopes themselves are positive; only that their slopes, which can be positive or negative, are similar (a comparison between maps that does not consider the direction of the slope since genetic evolvability is concerned with the sensitivity to random mutations, not their direction; [9]). GRNs showing zero or negative correlation between different mappings also exist, but they are less frequent (Fig. 2B). Importantly, this interdependence across different mappings is robust to more detailed measures of map-to-map similarity, such as Euclidean distances (ED, Additional file 1: Fig. S3). Such ED-based correlations, however, decrease as the difference between the map slopes become too large (presumably as an effect of map complexity itself, since complex maps are more dissimilar than simpler ones, see next section, Fig. 2C, Additional file 1: Fig. S3).
To eliminate the possibility that these observed correlations were caused by similarities in the input values, rather than in the structure of the GRNs, we gradually randomized the input parametric values while recording the correlations between maps (Additional file 1: Fig. S4). This procedure revealed that the correlations do not depend on particular choices of the input parameters. In contrast, correlations were extremely sensitive to parametric changes in the GRN topology, suggesting that the observed similarity between maps is caused by the structure of the GRN connections (i.e., ‘developmental mechanism’ sensu [30,31,32]) rather than the structure of the input perturbations. While a relationship between specific GRN topologies and the strength of the map-to-map correlations is expected (as it occurs for individual maps [30, 31]), further studies would be required to establish this relationship in more detail.
Another alternative explanation for our results would be that the observed correlations are driven by the (close-to-zero) linear slopes associated with very complex (i.e. a “zigzag”-like) maps. However, Fig. 1A shows that even the central region of the correlational space where most (> 90%) maps are contained shows a clear diagonal structure densely populated with non-zero (1 <|Sx|< 2) slopes. Furthermore, if this would be the case, one would expect Euclidean distances between maps to be generally very large (i.e. because most of them would be very complex and highly dissimilar maps having both close-to-zero slopes). This possibility is rejected by our observations (Additional file 1: Fig S3B), which show that the majority of map-to-map Euclidean distances occurs at quite low values (EDA,B≈2), and is mostly constituted by relatively simple, sub-linear maps).
Finally, our simulations reveal that map-to-map correlations are affected in non-trivial ways by certain GRN features, such as GRN size, connectivity, the number of iterations in the GRN dynamics or the model architecture (Additional file 1: Fig. S2). In general, correlations decrease as connectivity and GRN size increase (presumably because large networks offer more opportunities for modularity, which in turn may enable a developmental de-coupling between different traits).
The complexity of GP, EP and PP maps are correlated in randomly generated GRNs
We also investigated whether or not the GP, EP and PP maps exhibit similar complexity in random GRNs. We defined map complexity as the degree of non-linearity in phenotypic response to inputs. This captures the intuition that a linear slope is less complex than a U-shaped response, which is itself simpler than a W-shaped response. Comparing the map complexities between the 106 random GRNs reveals that map complexities are, on average, positively correlated (Pearson r ≥ 0.56; Fig. 2C and Additional file 1: Fig. S2). In other words, if a map (e.g., GP) is simple, the other maps (EP and PP) will be simple too, and they will exhibit very similar slopes. In contrast, if a map is complex, other maps too are likely to be complex, and their slopes will be less similar (Fig. 2C and Additional file 1: Fig. S2).
To ensure that these observed correlations between map complexities are not a general property of pairs of input–output maps, we analysed a large ensemble of random mathematical functions (polynomials of known degree ≤ 4) using the same tools that we used for calculating map complexity. This analysis verified that the correlations do not arise between pairs of randomly selected functions unless they belong to the same complexity class (polynomial degree) (Additional file 1: Fig. S5).
How a network topology creates similarity between map slopes and complexities can be better understood by looking at the whole set of developmentally attainable phenotypes (general phenotypic distribution: GPD), which can be revealed by means of massive and unspecific parametric perturbations (see “Methods” and Additional file 1: Fig. S1). This procedure shows that each generative network creates a distinctive GPD with a highly anisotropic and discontinuous structure. This structure increases the likelihood that individual maps will have similar slopes simply because many phenotypic directions of change are either very unlikely or developmentally impossible (Additional file 1: Figs. S1, S2, S4 and S5).
Positive map-to-map correlations in both slopes and map complexities were found in all three considered models of phenotypic determination (Additional file 1: Fig. S2). However, the correlation coefficients are higher and more variable for complex models involving more than pure-GRN dynamics (Figs. 1B–C and Additional file 1: Fig. S2).
Evolving only one of the GP, EP or PP maps changes the phenotypic biases across the other maps
After exploring the map-to-map correlations in random GRNs, we next wanted to address whether or not adaptive changes within one map (i.e., changes in the covariation between traits) are able to induce similar changes to the other maps. To do so, we performed three sets of selection simulations using the developmental model of intermediate complexity (GRN + multilinear). This model was chosen because it is the one showing more stable map-to-map correlations under a wider range of assumptions and GRN properties (see Additional file 1: Fig. S2). In each simulation, we allowed only one of the three different maps (henceforth the “selected map”) to evolve in response to selection. We refer to the other maps as the “non-selected” maps (see “Methods” for details).
Although an individual may experience many environmental inputs during its lifetime, it has only one genotype and, generally, one parental input (here initial condition). This means that, in one generation, natural selection can act on a distribution of environmentally induced phenotypes, but only on a single phenotype produced by genetic variation (that is why the evolution of GP and PP maps would ordinarily require lineage selection over many generations). Our main point in this paper depends, indeed, on the fact that this difference in the selective timescale makes selection for phenotypic plasticity likely to be a strong driver of genetic evolvability, but not vice versa (see next sections). But, before we get to that, it is necessary to first examine how evolving one map influences other maps, and to examine this properly it is necessary to be able to apply an equally effective, fine-grained selection on each of the selected maps. While introducing similar rates of change in the genetic, environmental and parental inputs is biologically unrealistic, it enables us to examine the evolutionary interdependence of the maps that arise because of their developmental linkage, while removing the differences in their capacity to be selected (next experiments will address more biologically grounded cases).
To adaptively evolve the “selected map”, an initial heterogeneous population of p = 64 individuals is composed by randomly picking each individual (GRN) from the initial random ensemble. According to our previous experiments, each of these individuals exhibits certain “by default” correlation between its maps, yet the average slope of the maps at population level show no particular direction (Fig. 3A). At each evolutionary time step (i.e. within a generation time and for each individual in the population), we introduce variation only in the input associated with the selected map (i.e., genetic, environmental or parental inputs). To make comparisons possible, only one element (e.g. one gene) is varied at a time for each type of input.
In response to the variation on one type of input, each individual develops a set of phenotypes that is compared to an arbitrary (linear) target map to determine the individual’s fitness. In turn, the individual’s fitness determines the likelihood of that individual to contribute to the next generation. Thus, the entire phenotype distribution produced by the selected map is accessible to natural selection (i.e., fine-grained selection). In contrast, the inputs of the non-selected maps were kept fixed (no variation) during simulations, so that these maps remain effectively “invisible” to natural selection (Fig. 3A). Notice that this selection criterion does not select for particular phenotypes, but for particular biases (i.e. trait correlations) in the maps themselves. This procedure allows us to accelerate the weak selection on variation that is found on natural populations, where it is performed indirectly through individual-level selection of phenotypes [9, 10, 33, 34].
In each generation, a number of random point mutations are introduced in the GRN parameters and in the multi-linear coefficients (as in [21], see “Methods”) of the new individuals. Such changes in the developmental architecture may change the way in which organisms respond to the focal inputs, thus creating new selectable variation in the slope of the selected map, and allowing adaptive change in the long-term (< 1000 generations). Once each evolutionary simulation reached a steady state, we assessed if there were any changes in the non-selected maps. We did this by: (1) introducing variation to each of the non-selected maps (one by one), and (2) collapsing the variation for the selected map to a single input value x ~ U(0,1). In the “collapsed” maps, we deliberately avoid setting x = 0 to ensure that our results are due to the lack of variation in the inputs and not to the absence the input itself.
That is, if the selectable phenotypic distribution had been originated exclusively through variation (0 < x < 1) in the parameters of type “A” (e.g. genetic), and keeping the all parameters of B (e.g. environmental) and C (e.g. parental) types fixed; now all “A”-type parameters are kept fixed and parametric variation is introduced, alternatively, in the “B”-type and “C”-type parameters to quantify the newly arising phenotypic distributions (see “Methods”). This experimental setup guarantees that any observed changes in non-selected maps can be attributed to indirect effects of selection on the selected map.
The results revealed that evolving any one map modifies the other maps as well, introducing in them the same adaptive phenotypic biases as observed in the selected map (Pearson r > 0.3, Fig. 3A). This holds true for every map combination (Fig. 3B) and across the entire range of parameters we tested (Additional file 1: Fig. S2). However, the phenotype biases in the non-selected maps are not as strong (|r|< 0.5 in some cases) as in the map under selection (r = 0.99), and exhibit substantial temporal variation, with non-selected maps lagging behind the selected map (Fig. 4). The results in Fig. 3 illustrate the outcome of selecting for a linear map with a slope S = 1 in the two-trait morphospace, but simulations with S = −1 or with changing selective pressures yielded similar results (Figs. 4, Additional file 1: Fig. S6).
The correlated evolution of non-selected maps also implies that the ability of a map to adapt may be influenced by past selective events on the other maps. Indeed, the adaptive evolution of any selected map takes longer if (any of) the maps had evolved before to match a different target (since evolution has to “undo” the already evolved biases before evolving new ones; Additional file 1: Fig. S7).
Maps evolve faster under fine-grained selection than under coarse-grained selection
In the previous experiments, each evolving population was allowed to sample a wide range of genetic, parental or environmental inputs in each generation, and selection therefore acted on a wide range of phenotypic outputs. In other words, we assumed a very fine-grained selection. This allowed us to see how adaptation in each individual map would affect the other maps in the hypothetical case where selection is able to effect change in the selected map easily. In natural populations, such fine-grained selection cannot be assumed, so in this section we examine the effects of relaxing this simplifying assumption. To that end, the previous results under idealized, very fine-grained selective scenarios are taken as a “null hypothesis”, and compared against more coarse-grained regimes.
Several studies show that adaptive plasticity readily evolves when selection is fine-grained [35,36,37], although it is not essential [33]. Whether or not a similar effect occurs for GP and PP mappings is unknown. To address this, we explored the ability of every map to adapt to a target map under different levels of selective grain, ranging from very fine-grained selection (where individuals can experience several inputs within their lifetime) to coarse-grained cases in which there is just one input per generation and this input only shifts every n generations.
As Fig. 5 shows, all maps are in principle equally responsive to strong selection, yet all of them evolve more efficiently under fine-grained selection than under coarse-grained selection. Furthermore, the ability to adapt to the target map escalates sharply around a grain value of 1 (Figs. 5 and Additional file 1: Fig. S9). Under the metrics adopted here (see “Methods”), this is the value where single individuals experience on average more than one input per generation. This implies that it is much easier to evolve a map efficiently, and thus to affect the other maps, if there is within-lifetime variation in the inputs to that map. This disproportionate effect of the most fine-grained screened map on adaptive evolution is observed even when all the three maps are simultaneously selected (Additional file 1: Fig. S8). When maps are not simultaneously selected, but the map under selection is different from the map that has been under selection in the recent past (e.g., due to a change in ecological demands), the current evolution of the former will be influenced by the past selective pressures on the later (Additional file 1: Fig. S7). That would make possible, for instance, that past selection for plasticity has an effect on current genetic evolvability.
Besides selective grain, the ability of GRNs to evolve a target map also depends on the complexity of the target map (i.e., how non-linear is the phenotypic response to input variation). Simple (i.e., linear) maps can easily evolve with moderate fine-grained selection (≈2 inputs per lifetime) whilst evolving more complex (i.e., quadratic or cubic) map requires an increasing number of inputs per lifetime (Additional file 1: Fig. S9).
While universal differences in complexity between maps are hard to conceive of (examples of simple and complex responses have been reported for GP, EP and PP maps), there is a clear, widespread difference in the selective grain of the three maps. This arises from the fact that, in most organisms, individuals can experience different environmental inputs during their lifetime but are limited to a single genotype and a single set of parentally inherited initial conditions. As a result, the EP map selection would be most fine-grained, and hence the one most intensely sculpted by natural selection. Because of this asymmetry, the EP map can exercise a stronger influence on the other maps than vice versa (Fig. 5 and Additional file 1: Fig. S8). In other words, while every map can theoretically be the leader of adaptive evolution, the logic by which natural selection operates in real-world organisms makes the EP map a prominent driver of evolutionary dynamics. The generality and the evolutionary consequences of this are further discussed in the next section.