Neutral genomic signatures of host-parasite coevolution

Background: Coevolution is a selective process of reciprocal adaptation in hosts and parasites or in mutualistic symbionts. Classic population genetics theory predicts the signatures of selection at the interacting loci of both species, but not the neutral genome-wide polymorphism patterns. To bridge this gap, we build an eco-evolutionary model in which neutral genomic changes over time are driven by a single selected locus in hosts and parasites via a simple biallelic gene-for-gene or matching-allele interaction. This coevolutionary process may lead to cyclic changes in the sizes of the interacting populations.

Results: We investigate if and when these changes can be observed in the site frequency spectrum of neutral polymorphisms from host and parasite full genome data. We show that changes of the host population size are too smooth to be observable in its polymorphism pattern over the course of time. Conversely, the parasite population may undergo a series of strong bottlenecks occurring on a slower relative time scale, which may lead to observable changes in a time series sample. We also extend our results to cases with 1) several parasites per host, which accelerates relative time, and 2) multiple parasite generations per host generation, which slows down rescaled time.

Conclusions: Our results show that time series sampling of host and parasite populations with full genome data is crucial to understand if and how coevolution occurs. This model therefore provides a framework to interpret and draw inference from genome-wide polymorphism data of interacting species.

Assuming that $s_{ij} = s_i$ and $\delta_{ij} = \delta_i$ (i.e. both parameters being independent of the parasite genotype) and setting $dN_W/dt$ to zero, it follows in particular that $s_i = \delta_i = 0$ requires $d_i = b_i (1 - c_{H_i})$ to have a constant population size in the host for arbitrary choices of $H_i$ and $I_{ij}$.
SI 2. Basic reproduction ratios

From (2), we obtain an expression that, assuming that $d_i = d$ and $\delta_{ij} = \delta_j$ (i.e. both parameters being independent of the host genotype), simplifies to an equation for $dP_j/dt$. The reproduction ratios of the parasite genotypes are denoted $R_{0,j}$. If all $R_{0,j} < 1$, the parasite genotypes are eliminated, since they kill more hosts than they newly infect. When one of these ratios is greater than one, the corresponding parasite genotype is maintained in the population.
SI 3. Fixed points of the dynamical system

We calculated the fixed points by setting (1) and (2) to zero. Note that the solutions were first obtained in terms of the numbers of healthy and infected hosts before being added up to obtain the fixed points of hosts and parasites assuming one parasite per host. The results for the two-allele case are summarized below. For the MA and iGFG model the results for more than two alleles correspond to those of two alleles, whereas results for the iMA and the GFG model can only be obtained for special cases when $A = 3$, but the solutions are not enlightening.
For the MA model there is a fixed point at which all alleles may have nonzero frequencies. Besides the trivial solution of all alleles having frequency zero, the remaining solutions are given by $(W_1^*, 0, P_1^*, 0)$ and $(0, W_2^*, 0, P_2^*)$.

For $c_{H_1} = c_{H_2} = c_{P_1} = c_{P_2} = 0$, we also obtain a solution in which $\hat{I}_{22}$ denotes the equilibrium solution of the infected host genotype. With the additional assumption of equivalent rates and costs among both host and parasite genotypes, a further solution is obtained in which $\hat{H}_2$ denotes the equilibrium solution of the second healthy host genotype.

The relative site frequencies over time $r_{n,i}(t)$ are obtained as $r_{n,i}(t) = f_{n,i}(t) / \sum_{k=1}^{n-1} f_{n,k}(t)$, where the denominator gives the absolute number of segregating sites, $S_n(t)$. For the average number of pairwise differences, we have $\Pi_n(t) = \binom{n}{2}^{-1} \sum_{k=1}^{n-1} k (n-k) f_{n,k}(t)$.
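The summary statistics defined above can be computed directly from an absolute site frequency spectrum at a single time point. The following is a minimal sketch, not the authors' implementation: the function name `sfs_summaries` and the list-based input format (entry `k-1` holds $f_{n,k}$) are assumptions made for illustration.

```python
from math import comb


def sfs_summaries(f):
    """Return (r, S_n, Pi_n) for an absolute site frequency spectrum.

    f is a sequence of length n-1, where f[k-1] is the absolute number
    of sites at which the derived allele occurs in k of the n sampled
    sequences (hypothetical input convention, k = 1, ..., n-1).
    """
    n = len(f) + 1  # sample size implied by the spectrum length
    s_n = sum(f)    # absolute number of segregating sites, S_n
    if s_n == 0:
        raise ValueError("spectrum has no segregating sites")
    # Relative site frequencies r_{n,i} = f_{n,i} / S_n
    r = [x / s_n for x in f]
    # Average number of pairwise differences:
    # Pi_n = (1 / C(n, 2)) * sum_k k (n - k) f_{n,k}
    pi_n = sum(k * (n - k) * f[k - 1] for k in range(1, n)) / comb(n, 2)
    return r, s_n, pi_n


if __name__ == "__main__":
    r, s_n, pi_n = sfs_summaries([3, 2, 1])  # toy spectrum for n = 4
    print(r, s_n, pi_n)
```

As a sanity check on the formula, a spectrum proportional to $1/k$ (the standard neutral expectation under constant population size, $f_{n,k} = \theta/k$) gives $\Pi_n = \theta$, since $\sum_{k=1}^{n-1} (n-k) = \binom{n}{2}$.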

These implementations can be computationally demanding, since the inverse of the relative population size function $\rho(t)$ has to be integrated numerically with high precision before being applied to an exponential function that has to be integrated numerically again from the initial time zero up to [...]

[...] towards a stable equilibrium; $I_{12}$ and $I_{21}$ get lost

green: $H_1$ and $H_2$ converge to certain numbers; $I_{12}$ and $I_{21}$ get lost; $I_{11}$ and $I_{22}$ increase towards infinity

[Parametric plot: axes $W_1$ and $P_2$, each ranging from $1 \times 10^6$ to $6 \times 10^6$]

In the left column the possible fates of (healthy and infected) genotypes are summarized and assigned to colors. In the right column corresponding exemplary parametric plots of host and parasite allele frequencies are shown using Equations (1) and (2). The black dot depicts the initial allele frequency, whereas the red dot shows the zero point in the first subfigure and the fixed point in the third and fourth subfigures.

16 1 17 1 16 1 16 1 17 1
17 1 17 1 17 1 17 1 17 1
29 2 28 2 28 2 29 2 29 2
56 7 57 7 57 7 57 7 57 7
103

In the left column the possible fates of (healthy and infected) genotypes are summarized, in addition to or slightly modified from the scenarios already presented in Table SI 5.1.1 by means of the MA model. In the right column exemplary parametric plots of host and parasite allele frequencies are shown (in the first three panels for the dark color). The black dot again depicts the initial allele frequency.