 Research article
 Open Access
Quasispecies in population of compositional assemblies
BMC Evolutionary Biology volume 14, Article number: 265 (2014)
Abstract
Background
The quasispecies model refers to information carriers that undergo self-replication with errors. A quasispecies is a steady-state population of biopolymer sequence variants generated by mutations from a master sequence. A quasispecies error threshold is a minimal replication accuracy below which the population structure breaks down. Theory and experimentation of this model often refer to biopolymers, e.g. RNA molecules or viral genomes, while its prebiotic context is often associated with an RNA world scenario. Here, we study the possibility that compositional entities which code for compositional information, intrinsically different from biopolymers coding for sequential information, could show quasispecies dynamics.
Results
We employed a chemistry-based model, graded autocatalysis replication domain (GARD), which simulates the network dynamics within compositional molecular assemblies. In GARD, a compotype represents a population of similar assemblies that constitute a quasi-stationary state in compositional space. A compotype's center-of-mass is found to be analogous to a master sequence for a sequential quasispecies. Using single-cycle GARD dynamics, we measured the quasispecies transition matrix (Q) for the probabilities of transition from one center-of-mass Euclidean distance to another. Similarly, the quasispecies’ growth rate vector (A) was obtained. This allowed computing a steady-state distribution of distances to the center of mass, as derived from the quasispecies equation. In parallel, a steady-state distribution was obtained via the GARD equation kinetics. Rewardingly, a significant correlation was observed between the distributions obtained by these two methods. This was only seen for distances to the compotype center-of-mass, and not to randomly selected compositions. A similar correspondence was found when comparing the quasispecies time-dependent dynamics towards steady state. Further, changing the error rate by modifying the basal assembly joining rate of GARD kinetics was found to display an error catastrophe, similar to the standard quasispecies model. Additional augmentation of compositional mutations leads to the complete disappearance of the master-like composition.
Conclusions
Our results show that compositional assemblies, as simulated by the GARD formalism, portray significant attributes of quasispecies dynamics. This expands the applicability of the quasispecies model beyond sequence-based entities, and potentially enhances the validity of GARD as a model for prebiotic evolution.
Background
The quasispecies model
The quasispecies theory describes the replication of asexual replicators at high error rate, and was first proposed to describe error-prone replication of primitive information-carrying macromolecules at the origin of life [1],[2]. A quasispecies is often viewed as a steady-state population of variant biopolymer sequences, generated by mutations from a master sequence [2]-[4]. This replication with mutation can lead to a population with a different dominant sequence than the original one, even if the original had the highest replication rate, i.e. highest fitness. As such, the quasispecies model is an example of how selection and evolution can arise from simple kinetic underpinnings [4]. Selection acts on the population as a whole rather than on the individual members [5].
While the theory is general to replication, a widely used application of quasispecies is in describing RNA viruses, which have low replication fidelity with measured high mutation rates [6]-[10], though the model’s validity for some RNA viruses has been a topic of dispute [7],[9],[11]-[13]. Other biological applications of quasispecies are to the multiple laboratory instances of the Chinese hamster ovary cell line [14] and to catalytic RNA molecules [15],[16].
Using the quasispecies equation [2], it is possible to quantify an error threshold which relates the amount of information a replicating system can store to its single-digit error probability [4],[8],[17]. The error threshold is defined as the minimum accuracy of replication which is required in order to preserve the information of the selected state of the system, and below which the population structure breaks down. When the genotype-phenotype map involves redundancy (i.e. more than one genotype gives rise to the fittest phenotype), the error threshold can be formulated in terms of phenotypes, and the population can sustain a lower degree of replication accuracy [18],[19]. As RNA viruses replicate with relatively high mutation rates [10], they are susceptible to conditions which increase their mutation rates and push them beyond the error catastrophe [20]-[22], a process parallel to extinction by the direct induction of deleterious mutations [23],[24]. The error catastrophe path not only supports the quasispecies nature of RNA viruses, but is also an example of a relation between modeling and experiments.
Sequential versus compositional information
Biological systems have two types of information. The first is the well-established sequence-based information, as manifested in biopolymers such as DNA, RNA and proteins. The second information type is compositional information, which plays a parallel central role in biological systems [25]-[29],[54]. Composition is formally defined as a vector V whose elements are the counts or concentrations of molecular types. As an example, the identity of a living cell can be defined, to an extent, by the counts of all its RNA types (transcriptome) and protein types (proteome) [30]-[35]. Compositional information is intrinsically different from sequence-based information, and the total number of different possible compositions, for a given alphabet size of N_{G} and a total count of N_{max} molecules in V, is \binom{N_{G}+N_{max}-1}{N_{max}}, while the total number of different sequences of a string of length N_{max} is N_{G}^{N_{max}}.
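The two counting formulas above can be compared directly. The following is a minimal sketch, not part of the original analysis; the function names are illustrative assumptions.

```python
from math import comb


def count_compositions(n_types: int, n_molecules: int) -> int:
    """Number of distinct compositions (multisets) of n_molecules
    molecules drawn from an alphabet of n_types molecule types."""
    return comb(n_types + n_molecules - 1, n_molecules)


def count_sequences(n_types: int, length: int) -> int:
    """Number of distinct sequences of the given length over n_types letters."""
    return n_types ** length


# Tiny case that can be checked by hand: N_G = 2, N_max = 3.
# Compositions: A3, A2B, AB2, B3 (4 in total); sequences: 2^3 = 8.
print(count_compositions(2, 3), count_sequences(2, 3))

# Values used in this work (N_G = 100, N_max = 100): both numbers are huge,
# but the compositional space is far smaller than the sequence space.
print(count_compositions(100, 100) < count_sequences(100, 100))
```

Because many sequences share one composition, the compositional space is always the smaller of the two for strings longer than one letter over an alphabet of at least two types.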
There are significant differences between sequential and compositional entities. For one, biopolymer sequence information is digitally encodable but compositional information is not, which may be viewed as a key difference between chemistry and biology. In the realm of polymeric entities, a point mutation is the replacement of a monomer type in a particular sequence position, necessitating the breaking and formation of covalent bonds. For compositional entities, a point mutation is the “random access” exchange of a molecule of a given type with a molecule of another type. Further, for sequences, the probability of a mutation at a specific location only depends on sequence length and not on the specific sequence. In contrast, for compositions such probability depends both on the size and the actual composition of the entity. For a composition A_{m}B_{n}, with N_{G} = 2 and N_{max} = m + n, the probability of a mutational transition to A_{m+1}B_{n−1} is m/(m + n). Finally, for sequences, replication entails the template-based synthesis of a polymeric strand. In clear difference, a compositional entity undergoes replication/reproduction via composition-preserving growth, facilitated by a network of “many-to-many” molecular interactions, followed by fission [36]. In some formal respect, this is true also for present-day living cells. For example, a crucial step prior to cell division is the biosynthetic doubling of the compositional counts of the proteins that characterize a given cell type. But such similarity cannot be taken very far, since present-day cells divide by a highly complex and completely genetically controlled mechanism. Compositional entities have been invoked in models for early evolution [28],[37]-[40].
The present manuscript attempts to show that compositional replicators, as described above, behave as quasispecies (Figure 1). As a model of compositional replication/reproduction, we employ the graded autocatalysis replication domain (GARD), a chemistry-based formalism that simulates network dynamics within amphiphile-containing compositional assemblies [36]-[38]. The GARD model quantitatively describes dynamics of out-of-equilibrium homeostatic growth, mediated by a network of mutual rate enhancement parameters with occasional assembly fission [36]. Molecules join assemblies in a probabilistic fashion which is biased by a network of mutual rate enhancement parameters as dictated by assembly composition, and occasional fission occurs such that an assembly is kept out of equilibrium [36]. GARD provides a detailed microscopic description of the walk in compositional space between the points representing molecular assemblies in a replication-like process. This is different from the quasispecies model, in which a microscopic view of replication is typically not provided.
GARD’s quasi-stationary states in compositional space are composomes, and their species-like clusters are compotypes [41]. The latter may serve as targets for selection [42], and exhibit ecology-like constant population dynamics [43].
As GARD assemblies store information in the form of non-random molecular compositions, and transfer this information to fission-generated progeny, they could be considered as an alternative to the RNA world scenario for the origin of life [44]-[50]. To obtain a more complete picture of this proposed analogy, we asked whether GARD compositional assemblies may behave similarly to sequence-based quasispecies, despite the differences between the realms of sequence and composition. We show that the cloud of compositional variants within a compotype obeys the quasispecies model and that it exhibits an error catastrophe similar to the classical quasispecies.
Results
A compotype is qualitatively similar to a quasispecies
The GARD model depicts the dynamic behavior of a population of compositional assemblies. It portrays a “cloud” of compositional states, with dynamical interconversions (compositional mutations). Depending on the values of the rate enhancement parameters in the β network (Equation 2), this may lead to cases with one or more compotypes (Figures 2 and 3). There are qualitative points of similarity between such compositional entities and quasispecies of sequence-based entities such as RNA molecules or viruses: both cases embody an ensemble of informational entities displaying a relatively high degree of mutual differences. Despite the similarities in the dynamics, the quasispecies and GARD equations are not identical. If each assembly-joining reaction is a Poisson process, GARD turns into a Markov chain (see Additional file 1: Supporting Data). The corresponding steady state of frequencies of different compositional assemblies is then governed by linear equations, in contrast to the nonlinear quasispecies equations. Those linear equations, however, require the complete set of all possible assemblies with all possible sizes (from 0 to N_{max}), which is unattainable due to the huge dimensionality of the system. It is the central empirical observation of this paper that the use of nonlinear but rather simple quasispecies equations reproduces the statistics of GARD. While the complex reasons for this fact constitute a separate study, currently underway, here a numerical analysis is performed indicating that GARD is well described by the quasispecies equations. The focus in the present study is on cases which exhibit only a single compotype each (N_{C} = 1), whereby compositional entities are disposed around a single center of mass (Figure 3), analogous to the master sequence in sequence-based quasispecies.
Compotypes are quantitatively similar to sequence-based quasispecies
A central aim of the present paper is to provide evidence for quantitative similarities between the compositional assemblies and quasispecies in sequence space. For this, a total of almost 600 cases of GARD population steady states, each with N_{C} = 1, were analyzed. A simplifying principle was utilized, in which groups of compositional assemblies with similar Euclidean distance to the compotype’s center of mass are lumped into “shells” (Figure 3). This is in analogy to certain quasispecies analyses, in which sequences with a similar Hamming distance to the master sequence are lumped together [51]. This allowed deriving the compositional assemblies’ parameters for the quasispecies equation (Equation 1), and comparing the results from GARD’s simulations with those predicted from the quasispecies formalism. Due to the high dimensionality of the system (N_{G} = 100) the difference in volume between neighboring shells is enormous, which is why the results give the occupancy rather than the concentration of assemblies in each shell.
As a first step, a single growth process (SGP) is defined, which serves as a common “generator” for both the quasispecies and the GARD formalisms (Figure 1). An SGP entails the growth via molecular accretion of a compositional assembly from size N_{max}/2 to N_{max} (Methods). For GARD simulations, this serves as an “atom” of the computational procedures that portray multiple growth and fission cycles in numerous assemblies in a reactor under constant population conditions [43]. For the quasispecies formalism, SGPs allow measuring the elements of Equation 1: the growth rates collected in the vector A and the transition probabilities collected in the matrix Q. Growth rates are obtained by a route analogous to the calculation of replication times in GARD populations ([43] and Methods). Transition probabilities from initial to final positions in compositional space are computed using SGPs (Methods). In other words, assemblies in the same distance shell are grouped together and the relevant properties (i.e. Q and A) of each shell are averaged over the assemblies contained in this shell. Fitness is defined as the rate of faithful replication (Q_{ii}A_{i}).
Once A and Q are populated, it is straightforward to employ the quasispecies formalism in order to compute the steady-state distribution of fractional occupancy of assemblies within the different distance shells. In parallel, the same distribution is computed based on the full-fledged GARD model, essentially a long series of single growth processes followed by fission events [43]. Rewardingly, the distributions obtained by both methods portrayed a high degree of similarity (Figures 4 and 5). Such results support the notion of inherent resemblance between the presently analyzed compositional entities and the classical constituents of quasispecies, namely sequence-based entities.
Importantly, such a good agreement between the distributions is obtained only when A and Q are measured with respect to the compotype (which is an attractor in the compositional space, see Additional file 1: Supporting Data), whereas comparing the distributions with respect to a random assembly or even the eigenvector of β results in a meager agreement (p-values 6.38 × 10^{−7} and 4.75 × 10^{−8}, respectively; see Additional file 1: Supporting Data).
Similar time-dependent dynamics for GARD and for quasispecies
We next asked whether the similarity of dynamic behavior transcends the steady-state distributions. For that, the time-dependent evolution of the fractional occupancy distribution obtained from GARD was compared with that obtained from the quasispecies equation. In both cases the computation started from the same initial conditions, and the system was allowed to propagate towards steady state. The time development as predicted from the GARD equations showed appreciable similarity to that predicted by the quasispecies equation (Figures 6 and 7). This lends further support to the mutual resemblance of the two models.
Compositional error threshold
We asked whether compositional entities, as described by GARD, may manifest an error threshold, in resemblance to sequence-based entities in the quasispecies model. For this, a quantitative analog of the global mutation rate was sought. A change in such a parameter should show a graded diversification of the compositional vectors away from the compotype’s center of mass, eventually leading to the dismantling of the compotype structure. We found that one of the basic rate constants of the GARD model, k_{f}, the basal molecular joining rate (Equation 2), is a suitable proxy. Decreasing k_{f} results in an overall diminution of assembly growth, leading to a predominance of the backward (assembly-exit) reactions governed by k_{b}. This results in an enhanced probability of amphiphile misincorporation, and hence increased compositional mutations. Indeed, as k_{f} diminishes by a factor of 10, the assembly population typically strays away and the assembly fraction residing within the compotype boundaries goes to 0 (Figure 8).
When k_{f} was gradually diminished, a behavior reminiscent of the classical error catastrophe in sequence-based quasispecies [51] was observed (Figure 9). With decreasing k_{f}, the occupancy of increasingly distant compositional shells was enhanced and then diminished. Beyond a specific range of k_{f} values there was a relatively sharp decline of the compotype occupancy, similar to the sharp decline of the consensus sequence in sequence-based quasispecies. The specific shape of this response to decreasing k_{f} depends on the fitness landscape in each simulation (which is an emergent property of β).
Discussion
The present work aimed at showing that compositional replicators may behave as quasispecies. For this, the graded autocatalysis replication domain (GARD) model, which simulates the kinetics of amphiphile-containing compositional assemblies, was employed. GARD was originally developed in an attempt to bridge between the “genetic-first” and the “metabolism-first” scenarios for the origin of life [37]. The genetic (or replicator) first scenario, also known as the “information first” scenario, assumes that a molecule identical or very similar to present-day RNA played the role of the self-perpetuating biopolymer [44],[46],[52],[53]. The “free-floating” or surface-adsorbed mixture of such molecules is assumed to have later evolved both a metabolic network and an encompassing container. The metabolism-first scenario suggests that the very first life precursors are likely to have been relatively elaborate molecular networks of simple molecules [38],[45],[48],[54],[55]. The GARD model is basically about small molecules, resembling those typically considered as metabolites, which when accreting into molecular assemblies portray a dynamic behavior resembling that of replicators. When doing so, GARD assemblies utilize an unorthodox form of information transfer, namely, the propagation of compositional information.
An error threshold is a hallmark of quasispecies dynamics. In the case of sequence-based quasispecies, one of the parameters that influences this threshold is polymer length, whereby longer polymers show higher error threshold susceptibility [4],[56]. In our analyses a more facile approach to the error threshold is observed when diminishing k_{f}, the basal rate of monomer joining into a molecular assembly. It may be asked whether, as a parallelism, the GARD error threshold could be related to an assembly size parameter. Previously, it was shown that for a given N_{G}, diminishing the maximal assembly size (N_{max}) results in higher compotype diversity [42]. This may be interpreted as occurring via compositional mutations as described [57]. Thus, an enhanced mutability via reduced N_{max} is suggested as a good candidate proxy to increasing polymer length in the context of an error catastrophe. Future detailed analyses could provide support to this notion.
Conclusions
In conclusion, molecular assemblies that hold compositional information rather than sequence-based information are shown here to comply with a quasispecies description. Because the transmission of compositional information has been proposed to be important in early evolution, these results further underline the importance of the quasispecies model in studying prebiotic evolution. Further, because present-day cells are, in many ways, compositional entities, such results may also have implications for the understanding of populations of present-day cells.
Methods
The Eigen-Schuster quasispecies equations
The quasispecies formalism describes a population of self-replicating genotypes (Equation 1) [2]-[4]. Due to replication errors, a genotype produces not only offspring of its own kind, but might also produce offspring of other genotypes. This is represented by the transition matrix (Q) which denotes the probability with which a certain genotype will produce an offspring of another genotype. Thus, the growth of a particular genotype is governed not only by its own replication rate, but also by the replication rate of the other genotypes. The quasispecies equation is written as:
\frac{d{x}_{i}}{dt}=\sum_{j}{Q}_{ij}{A}_{j}{x}_{j}-\bar{E}(t){x}_{i} \quad (1)
Where for a genotype i, x_{i} is its time-dependent concentration, A_{i} is its replication rate (as it reflects its fitness [3]) and Q_{ij} is the probability of genotype j mutating into i (with Q_{ii} being the probability of self-replication). Ē(t) = ∑x_{i}A_{i} is termed “average excess rate” and serves to keep the total population size constant (∑x_{i} = 1 at all time points). A steady-state solution to this equation is obtained as the eigenvector with the largest eigenvalue of the matrix W = Q⋅diag(A) (where diag(A) is a matrix whose values along the diagonal are the values of the A vector, and zero otherwise), in accordance with the Perron-Frobenius theorem [3],[58]. This eigenvector holds the fractional occupancy of phenotypes at steady state, which are the quasispecies.
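As an illustration of the steady-state computation just described, the dominant eigenvector of W = Q⋅diag(A) can be approximated by power iteration. This is a minimal sketch using only the Python standard library; the function name, the toy Q and A values and the convergence settings are assumptions made for illustration and are not taken from the GARD analysis.

```python
def steady_state(Q, A, iterations=10_000, tol=1e-12):
    """Approximate the dominant eigenvector of W = Q * diag(A) by power
    iteration, normalised so that the fractions sum to 1.

    Q[i][j] is the probability that a parent of type (or shell) j yields
    progeny of type i; A[j] is the growth rate of type j.
    """
    n = len(A)
    x = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        # One application of W: y_i = sum_j Q_ij * A_j * x_j
        y = [sum(Q[i][j] * A[j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [v / total for v in y]  # keep sum(x) = 1, as in the constraint above
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical two-type example: type 0 replicates faster and more faithfully.
Q_toy = [[0.9, 0.2],
         [0.1, 0.8]]  # each column sums to 1
A_toy = [2.0, 1.0]
x_star = steady_state(Q_toy, A_toy)
print(x_star)
```

Power iteration is used here only to keep the sketch dependency-free; any eigen-solver that returns the eigenvector of the largest eigenvalue would serve the same purpose.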
The GARD model
GARD is a kinetic model which describes the growth and fission of a molecular assembly (Figure 10 is a scheme of the model), typically assumed to consist of a repertoire of N_{G} amphiphilic molecule types (environmental repertoire) [36]. Molecules from a buffered environment form and join an assembly and molecules within it can leave. Once the number of molecules in an assembly reaches a predefined maximal size (N_{max}), a random fission action is applied to produce two progeny of the same size (N_{max}/2) which can grow again and again in growth-fission cycles. The dynamics are described by a set of ordinary differential equations:
\frac{d{n}_{i}}{dt}=\left({k}_{f}{\rho }_{i}N-{k}_{b}{n}_{i}\right)\left(1+\frac{1}{N}\sum_{j=1}^{{N}_{G}}{\beta }_{ij}{n}_{j}\right) \quad (2)
Where n_{i} is the current count of molecule type i in an assembly (i = 1..N_{G}), k_{f} and k_{b} are the basal forward and backward rate constants (assembly joining and leaving, respectively), ρ_{i} is the buffered environmental concentration and N is the current assembly size (N = ∑n_{i}). β_{ij} is the rate enhancement exerted by an assembly molecule of type j on an incoming or outgoing molecule of type i. β can be represented as an N_{G} × N_{G} matrix or as a network with N_{G} nodes and N_{G}^{2} edges [42], where different β instances represent different environmental chemistries. Typically, GARD is run in a single-lineage mode, where at each split event only one progeny (picked at random) is followed and the other one is discarded [36].
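The growth and fission steps described above can be sketched as follows. This is an illustrative sketch only, assuming the rate form dn_i/dt = (k_f ρ_i N − k_b n_i)(1 + (1/N) ∑_j β_ij n_j) for Equation 2; the function names, the parameter values and the uniform random fission rule are assumptions, and the code is not the GARD10 implementation.

```python
import random


def gard_rates(n, beta, rho, k_f, k_b):
    """Net rate of change of each molecule count n[i] under the assumed
    GARD rate form: (k_f*rho_i*N - k_b*n_i) * (1 + (1/N) * sum_j beta_ij*n_j)."""
    N = sum(n)
    rates = []
    for i in range(len(n)):
        basal = k_f * rho[i] * N - k_b * n[i]
        enhancement = 1.0 + sum(beta[i][j] * n[j] for j in range(len(n))) / N
        rates.append(basal * enhancement)
    return rates


def fission(n, rng):
    """Random fission (an assumed rule): pick half of the molecules uniformly
    at random and return the composition of that progeny."""
    molecules = [i for i, count in enumerate(n) for _ in range(count)]
    chosen = rng.sample(molecules, len(molecules) // 2)
    child = [0] * len(n)
    for i in chosen:
        child[i] += 1
    return child


# Hypothetical two-type assembly with no mutual catalysis (beta = 0).
no_catalysis = [[0.0, 0.0], [0.0, 0.0]]
print(gard_rates([2, 2], no_catalysis, [0.5, 0.5], k_f=1.0, k_b=0.1))
print(fission([3, 1], random.Random(0)))
```

With β set to zero the rates reduce to the basal joining and leaving terms, which gives a simple sanity check before a full β network is introduced.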
A composome is defined as a set of subsequently faithfully replicating assemblies (a term originally derived from compositional genome), where a faithfully replicating assembly is defined as an assembly which is highly similar to its predecessor and successor, when GARD is run in single-lineage mode [36]. Similar composomes are grouped into a compotype, using the K-means clustering algorithm [41]. A compotype is represented by a compositional vector constituting the center of mass of all its member assemblies.
When β is represented in the matrix form, it is a positive matrix, as each of its β_{ij} values is sampled from a lognormal distribution [59]. According to the Perron-Frobenius theorem, such a matrix has a unique largest real eigenvalue with a corresponding all-positive real eigenvector [58]. The eigenvector was treated as a compositional assembly and marked V_{β}.
Single growth process
An SGP is a complete single cycle, leading from an assembly at size N_{max} to a following assembly at N_{max} (Figures 1 and 10). It is performed as follows: a parent assembly is picked at size N_{max}; the parent then undergoes fission to produce a progeny at size N_{max}/2 (see comment below); this progeny is then grown to size N_{max} according to the GARD equations (Equation 2) and the SGP is complete. An SGP tracks only one of the progeny, and tracking the other progeny is considered an additional SGP.
GARD simulations
The GARD10 MATLAB code was used for all simulations [42]. Different simulations were run using identical parameters but with different β networks, generated by the MATLAB pseudorandom number generator with different random seeds. When addressing GARD’s population dynamics (populationGARD), the dataset was obtained from [43]. In populationGARD, each simulation represents a chemostat which is initially seeded with 1,000 random compositions. Assemblies are allowed to simultaneously grow based on their idiosyncratic kinetic parameters, while the total size of the population is maintained constant, based on a Moran process [60]. This was done for a total of 50,000 SGPs and the sampling of the GARD distance distribution was done by collecting the states of the chemostat along the population steady state (t = 4.9–5.0 × 10^{4} with time intervals of 0.1 × 10^{4}; see for example Figure 1 in [43]). Depending on the values of the rate enhancement parameters in β, different simulations exhibit one or more compotypes [42]. The focus in the present study is on 572 cases which portray only a single compotype each (N_{C} = 1).
Sampling the compositional space and constructing Q and A
The large size of the compositional space, particularly given the values used in this work, N_{G} = 100 and N_{max} = 100, makes direct calculation of the Q matrix computationally impossible. Therefore, the N_{G}-dimensional molecular space was divided into shells of constant thickness, centered on the compotype center of mass, and assemblies were grouped according to their Euclidean distance from the center of mass (Equation 3). This is in contrast to a previous study [61], where the Q matrix and A vector were directly calculated by using substantially different N_{G} and N_{max} values than those typically employed in GARD, which enabled direct enumeration of the small number of possible compositions.
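The Euclidean distance between two assemblies is calculated as:
D\left({V}^{1},{V}^{2}\right)=\sqrt{\sum_{i=1}^{{N}_{G}}{\left({n}_{i}^{1}-{n}_{i}^{2}\right)}^{2}} \quad (3)
Where n_{i}^{1} is the count of the i’th molecular type in assembly V^{1}, and n_{i}^{2} is the corresponding count in assembly V^{2}. The maximum possible distance between any two assemblies is N_{max}√2.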
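A short sketch of the distance calculation, and of the stated maximum N_{max}√2, is given below. The function name and the example vectors are illustrative assumptions.

```python
from math import isclose, sqrt


def euclidean_distance(v1, v2):
    """Euclidean distance between two compositional vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


# The maximum is reached by two assemblies of size N_max that each contain a
# single, different molecule type: sqrt(N_max^2 + N_max^2) = N_max * sqrt(2).
n_max = 100
only_a = [n_max, 0]
only_b = [0, n_max]
print(isclose(euclidean_distance(only_a, only_b), n_max * sqrt(2)))
```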
Assemblies in the same distance shell were grouped together and the relevant properties (i.e. Q and A) of each shell were averaged over the assemblies contained in this shell. Q_{ij} is then the average probability that a parent at distance shell j will give rise to a progeny at shell i after a single SGP, and A_{i} is the average growth rate of progenies at shell i.
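The shell-averaging step can be sketched as a simple tally over SGP outcomes. This is a hedged illustration: the record format (parent shell, progeny shell, growth rate), the function name and the pooling of all SGPs per shell (rather than first averaging per parent assembly) are assumptions, not the procedure of the original code.

```python
def build_q_and_a(samples, n_shells):
    """Estimate Q and A from SGP records.

    Each record is (parent_shell, progeny_shell, growth_rate).
    Q[i][j]: fraction of SGPs starting in shell j that end in shell i.
    A[i]:    mean growth rate of progeny that ended in shell i.
    """
    counts = [[0] * n_shells for _ in range(n_shells)]
    rate_sum = [0.0] * n_shells
    rate_count = [0] * n_shells
    for parent_shell, progeny_shell, growth_rate in samples:
        counts[progeny_shell][parent_shell] += 1
        rate_sum[progeny_shell] += growth_rate
        rate_count[progeny_shell] += 1

    Q = [[0.0] * n_shells for _ in range(n_shells)]
    for j in range(n_shells):
        column_total = sum(counts[i][j] for i in range(n_shells))
        if column_total:  # leave unsampled parent shells as all-zero columns
            for i in range(n_shells):
                Q[i][j] = counts[i][j] / column_total

    A = [rate_sum[i] / rate_count[i] if rate_count[i] else 0.0
         for i in range(n_shells)]
    return Q, A


# Hypothetical records for two shells.
records = [(0, 0, 2.0), (0, 0, 2.0), (0, 1, 1.0), (1, 1, 1.0), (1, 0, 3.0)]
Q_est, A_est = build_q_and_a(records, n_shells=2)
print(Q_est, A_est)
```

Each sampled column of Q sums to 1, so the estimate can be passed directly to an eigenvector computation for W = Q⋅diag(A).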
For each simulation, the compositional space was sampled by performing 600,000 SGPs based on 30,000 parent assemblies, as detailed:
10,000 parent assemblies were generated at random, each by randomly picking a molecular type and adding a random count of this type until the assembly size reaches N_{max}. Another 10,000 parents were generated by conducting 10,000 random walk step pairs starting from the compotype center of mass, where in each step a molecule is randomly removed from the assembly and a random one is added to it. Another 10,000 parents were generated by random walk starting from V_{β}. Then, for each parent, 20 SGPs were performed, each beginning with the parent assembly. Examples of Q and A are given in Additional file 1: Supporting Data.
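The random-walk generation of parent assemblies can be sketched as below. This is a minimal illustration under stated assumptions: the starting vector holds integer counts, the removed molecule is chosen uniformly among the molecules present, the added molecule type is chosen uniformly among all types, and the function name is hypothetical.

```python
import random


def random_walk_parent(start, n_steps, rng):
    """Perform n_steps remove/add pairs starting from an integer composition.
    Each pair keeps the assembly size constant."""
    counts = list(start)
    n_types = len(counts)
    for _ in range(n_steps):
        # Remove one molecule, chosen with probability proportional to its count.
        removed = rng.choices(range(n_types), weights=counts)[0]
        counts[removed] -= 1
        # Add one molecule of a uniformly chosen type (assumed rule).
        added = rng.randrange(n_types)
        counts[added] += 1
    return counts


# Hypothetical start point with three molecule types and 10 molecules.
start_point = [6, 3, 1]
parent = random_walk_parent(start_point, n_steps=50, rng=random.Random(1))
print(parent, sum(parent))
```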
Additional file
Abbreviations
GARD: Graded autocatalysis replication domain
SGP: Single growth process
References
Eigen M: Selforganization of matter and evolution of biological macromolecules. Naturwissenschaften. 1971, 58 (10): 465. 10.1007/BF00623322.
Eigen M, Schuster P: Hypercycle - Principle of Natural Self-Organization. A. Emergence of Hypercycle. Naturwissenschaften. 1977, 64 (11): 541–565. 10.1007/BF00450633.
Eigen M, McCaskill J, Schuster P: Molecular quasispecies. J Phys Chem-Us. 1988, 92 (24): 6881–6891. 10.1021/j100335a010.
Biebricher CK, Eigen M: What is a quasispecies?. Curr Top Microbiol Immunol. 2006, 299: 1–31.
Stich M, Briones C, Manrubia SC: Collective properties of evolving molecular quasispecies. BMC Evol Biol. 2007, 7.
Holland JJ, De La Torre JC, Steinhauer DA: RNA virus populations as quasispecies. Curr Top Microbiol Immunol. 1992, 176: 1–20.
Domingo E: Quasispecies theory in virology. J Virol. 2002, 76 (1): 463–465. 10.1128/JVI.76.1.463-465.2002.
Wilke CO: Quasispecies theory in the context of population genetics. BMC Evol Biol 2005, 5:44.
Lauring AS, Andino R: Quasispecies Theory and the Behavior of RNA Viruses. PLoS pathogens 2010, 6(7):e1001005.
Sanjuan R, Nebot MR, Chirico N, Mansky LM, Belshaw R: Viral mutation rates. J Virol. 2010, 84 (19): 9733–9748. 10.1128/JVI.00694-10.
Jenkins GM, Worobey M, Woelk CH, Holmes EC: Evidence for the non-quasispecies evolution of RNA viruses. Mol Biol Evol. 2001, 18 (6): 987–994. 10.1093/oxfordjournals.molbev.a003900.
Ruiz-Jarabo CM, Arias A, Molina-Paris C, Briones C, Baranowski E, Escarmis C, Domingo E: Duration and fitness dependence of quasispecies memory. J Mol Biol. 2002, 315 (3): 285–296. 10.1006/jmbi.2001.5232.
Holmes EC, Moya A: Is the quasispecies concept relevant to RNA viruses?. J Virol. 2002, 76 (1): 460–465. 10.1128/JVI.76.1.460-462.2002.
Wurm FM: CHO quasispecies - implications for manufacturing processes. Processes. 2013, 1 (3): 296–311. 10.3390/pr1030296.
Arenas CD, Lehman N: Quasispecies-like behavior observed in catalytic RNA populations evolving in a test tube. BMC Evol Biol. 2010, 10 (1): 80. 10.1186/1471-2148-10-80.
Kun A, Santos M, Szathmary E: Real ribozymes suggest a relaxed error threshold. Nat Genet. 2005, 37 (9): 1008–1011. 10.1038/ng1621.
Swetina J, Schuster P: Model Studies on RNA Replication. 2. Self-Replication with Errors - a Model for Polynucleotide Replication. Biophys Chem. 1982, 16 (4): 329–345. 10.1016/0301-4622(82)87037-3.
Takeuchi N, Poorthuis PH, Hogeweg P: Phenotypic error threshold; additivity and epistasis in RNA evolution. BMC Evol Biol. 2005, 5: 9. 10.1186/1471-2148-5-9.
Huynen MA, Stadler PF, Fontana W: Smoothness within ruggedness: the role of neutrality in adaptation. Proc Natl Acad Sci U S A. 1996, 93 (1): 397–401. 10.1073/pnas.93.1.397.
Sierra S, Davila M, Lowenstein PR, Domingo E: Response of foot-and-mouth disease virus to increased mutagenesis: influence of viral load and fitness in loss of infectivity. J Virol. 2000, 74 (18): 8316–8323. 10.1128/JVI.74.18.8316-8323.2000.
Crotty S, Cameron CE, Andino R: RNA virus error catastrophe: direct molecular test by using ribavirin. Proc Natl Acad Sci U S A. 2001, 98 (12): 6895–6900. 10.1073/pnas.111085598.
Summers J, Litwin S: Examining the theory of error catastrophe. J Virol. 2006, 80 (1): 20–26. 10.1128/JVI.80.1.20-26.2006.
Ojosnegros S, Perales C, Mas A, Domingo E: Quasispecies as a matter of fact: viruses and beyond. Virus Res. 2011, 162 (1–2): 203–215. 10.1016/j.virusres.2011.09.018.
Perales C, Martin V, Domingo E: Lethal mutagenesis of viruses. Curr Opin Virol. 2011, 1 (5): 419–422. 10.1016/j.coviro.2011.09.001.
Orgel LE: Evolution of the genetic apparatus: a review. Cold Spring Harb Symp Quant Biol. 1987, 52: 9–16. 10.1101/SQB.1987.052.01.004.
Higgs ES: What is good ecological restoration?. Conserv Biol. 1997, 11 (2): 338–348. 10.1046/j.1523-1739.1997.95311.x.
Kono N, Arakawa K, Tomita M: Validation of Bacterial Replication Termination Models Using Simulation of Genomic Mutations. PLoS ONE. 2012, 7 (4): e34526. 10.1371/journal.pone.0034526.
Root-Bernstein R: A modular hierarchy-based theory of the chemical origins of life based on molecular complementarity. Accounts Chem Res. 2012, 45 (12): 2169–2177. 10.1021/ar200209k.
Gonzalez AG: Use and misuse of supervised pattern recognition methods for interpreting compositional data. J Chromatogr A. 2007, 1158 (1–2): 215–225. 10.1016/j.chroma.2007.02.091.
Pertea M: The human transcriptome: an unfinished story. Genes. 2012, 3 (3): 344–360. 10.3390/genes3030344.
Lubeck E, Cai L: Single-cell systems biology by super-resolution imaging and combinatorial labeling. Nat Methods. 2012, 9 (7): 743–U159. 10.1038/nmeth.2069.
Wills QF, Livak KJ, Tipping AJ, Enver T, Goldson AJ, Sexton DW, Holmes C: Single-cell gene expression analysis reveals genetic associations masked in whole-tissue experiments. Nat Biotechnol. 2013, 31 (8): 748. 10.1038/nbt.2642.
Wilhelm M, Schlegl J, Hahne H, Moghaddas Gholami A, Lieberenz M, Savitski MM, Ziegler E, Butzmann L, Gessulat S, Marx H, Mathieson T, Lemeer S, Schnatbaum K, Reimer U, Wenschuh H, Mollenhauer M, Slotta-Huspenina J, Boese JH, Bantscheff M, Gerstmair A, Faerber F, Kuster B: Mass-spectrometry-based draft of the human proteome. Nature. 2014, 509 (7502): 582–587. 10.1038/nature13319.
Nesvizhskii AI: A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. J Proteomics. 2010, 73 (11): 2092–2123. 10.1016/j.jprot.2010.08.009.
Mann M, Kulak NA, Nagaraj N, Cox J: The coming age of complete, accurate, and ubiquitous proteomes. Mol Cell. 2013, 49 (4): 583–590. 10.1016/j.molcel.2013.01.029.
Segre D, Ben-Eli D, Lancet D: Compositional genomes: prebiotic information transfer in mutually catalytic noncovalent assemblies. Proc Natl Acad Sci U S A. 2000, 97 (8): 4112–4117. 10.1073/pnas.97.8.4112.
Segre D, Lancet D: Composing life. Embo Rep. 2000, 1 (3): 217–222. 10.1093/emboreports/kvd063.
Segre D, Ben-Eli D, Deamer DW, Lancet D: The lipid world. Origins Life Evol B. 2001, 31 (1–2): 119–145. 10.1023/A:1006746807104.
Hunding A, Kepes F, Lancet D, Minsky A, Norris V, Raine D, Sriram K, Root-Bernstein R: Compositional complementarity and prebiotic ecology in the origin of life. Bioessays. 2006, 28 (4): 399–412. 10.1002/bies.20389.
Norris V, Hunding A, Kepes F, Lancet D, Minsky A, Raine D, Root-Bernstein R, Sriram K: The first units of life were not simple cells. Ori Life Evol Biosph. 2007, 37 (4–5): 429–432. 10.1007/s11084-007-9088-z.
Shenhav B, Oz A, Lancet D: Coevolution of compositional protocells and their environment. Philos T R Soc B. 2007, 362 (1486): 18131819. 10.1098/rstb.2007.2073.
Markovitch O, Lancet D: Excess mutual catalysis is required for effective evolvability. Artif Life. 2012, 18 (3): 243266. 10.1162/artl_a_00064.
Markovitch O, Lancet D: Multispecies population dynamics of prebiotic compositional assemblies. J Theor Biol. 2014, 357: 2634. 10.1016/j.jtbi.2014.05.005.
Gilbert W: Origin of Life  the RNA World. Nature. 1986, 319 (6055): 618618. 10.1038/319618a0.
Dyson F: Origins of Life. 1999, Cambridge University, Cambridge, 2
Joyce GF: The antiquity of RNAbased evolution. Nature. 2002, 418 (6894): 214221. 10.1038/418214a.
Orgel LE: Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol Biol. 2004, 39 (2): 99123. 10.1080/10409230490460765.
Shapiro R: Small molecule interactions were central to the origin of life. Q Rev Biol. 2006, 81 (2): 105125. 10.1086/506024.
Bernhardt HS: The RNA world hypothesis: the worst theory of the early evolution of life (except for all the others). Biol Direct. 2012, 7: 2310.1186/17456150723.
Takeuchi N, Hogeweg P: Evolutionary dynamics of RNAlike replicator systems: a bioinformatic approach to the origin of life. Phys Life Rev. 2012, 9 (3): 219263. 10.1016/j.plrev.2012.06.001.
Eigen M: Error catastrophe and antiviral strategy. Proc Natl Acad Sci U S A. 2002, 99 (21): 1337413376. 10.1073/pnas.212514799.
Gesteland FR, Cech RT, Atkins FJ: The RNA World. 1999, Cold Spring, Cold Spring Harbor Laboratory
Hanczyc MM, Fujikawa SM, Szostak JW: Experimental models of primitive cellular compartments: encapsulation, growth, and division. Science. 2003, 302 (5645): 618622. 10.1126/science.1089904.
Anet FA: The place of metabolism in the origin of life. Curr Opin Chem Biol. 2004, 8 (6): 654659. 10.1016/j.cbpa.2004.10.005.
Luisi PL, Walde P, Oberholzer T: Lipid vesicles as possible intermediates in the origin of life. Curr Opin Colloid Interface Sci. 1999, 4 (1): 3339. 10.1016/S13590294(99)000126.
Takeuchi N, Hogeweg P: Errorthreshold exists in fitness landscapes with lethal mutants. BMC Evol Biol. 2007, 7: 1510.1186/14712148715. author reply 15
Inger A, Solomon A, Shenhav B, Olender T, Lancet D: Mutations and lethality in simulated prebiotic networks. J Mol Evol. 2009, 69 (5): 568578. 10.1007/s002390099281y.
Kuppers BO: Molecular theory of evolution: outline of a physicochemical theory of the origin of life. 1983, SpringerVerlag, Berlin, Germany
Segre D, Shenhav B, Kafri R, Lancet D: The molecular roots of compositional inheritance. J Theor Biol. 2001, 213 (3): 481491. 10.1006/jtbi.2001.2440.
Moran PAP: Random processes in genetics. Math Proc Camb Philos Soc. 1958, 54 (01): 6071. 10.1017/S0305004100033193.
Vasas V, Szathmáry E, Santos M: Lack of evolvability in selfsustaining autocatalytic networks constraints metabolismfirst scenarios for the origin of life. Proc Natl Acad Sci U S A. 2010, 107 (4): 14701475. 10.1073/pnas.0912628107.
Acknowledgements
We thank Moran Gershoni and Simon Fishilevich for pointing to relevant literature. This work was supported by the Minerva Center for Life Under Extreme Planetary Conditions, the J & R Foundation and the Crown Human Genome Center.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
OM conceived, designed and supervised research. RG performed research. IF performed mathematical analysis. DL, OM, IF and RG wrote the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
12862_2014_265_MOESM1_ESM.docx
Additional file 1: Supporting Data. Additional text, mathematical analysis and figures supporting the results of this article. (DOCX 269 KB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Gross, R., Fouxon, I., Lancet, D. et al. Quasispecies in population of compositional assemblies. BMC Evol Biol 14, 265 (2014). https://doi.org/10.1186/s12862-014-0265-1
DOI: https://doi.org/10.1186/s12862-014-0265-1
Keywords
 Origin of life
 Composomes
 Lipid world
 GARD
 Quasispecies
 Error threshold
 Compositional information
 Composition
 Sequence