Lack of population structure in nuclear genealogies
The most striking result emerging from this study is a complete lack of monophyly at nuclear genes in Iberian and North African species of Podarcis. Both nuclear genealogies fail to detect obvious population structure that relates to that observed in mitochondrial DNA. In fact, there is a single common pattern emerging from the study of mtDNA, allozymes and both the nuclear genealogies, which is a close relationshp between P. hispanica from Tunisia and Jebel Sirwah.
This situation of polyphyly in nuclear genealogies relative to mitochondrial data is not uncommon (see for example [27–29], amongst many others, for similar results). At first glance, this discordance could suggest that mitochondrial DNA-defined species of Iberian and North African Podarcis do not correspond to true evolutionary entities and that this differentiation is the result of stochastic or deterministic effects acting only on the mitochondrial genome. In fact, it has been demonstrated that quite deep phylogeographic breaks in single gene genealogies may appear in the absence of historical barriers to gene flow simply because of stochastic lineage sorting [30, 31]. It has also been shown that mitochondrial DNA divergence may not be necessarily corroborated by nuclear divergence because of selection acting on mitochondrial DNA [32]. However, there is a remarkable degree of concordance between the units defined based on mitochondrial DNA [12–15] and those observed by morphological analyses [19–21] and multilocus nuclear data [16–18]. It is therefore unlikely that subdivision within Iberian and North African Podarcis is nonexistent. Why, then, do nuclear genealogies apparently not support the partitions observed in mtDNA and, in particular, in allozyme data?
A simple explanation could be a low level of diversity and consequently of resolution in estimates of relationships. However, monophyly of Podarcis mtDNA lineages has been observed even in mitochondrial genes evolving at a very slow rate (such as the control region data set analysed by Pinho et al. [15], for which ~14.5% of the positions are variable (compared to ~22% in 6-Pgdint7 and ~19.7% in β-fibint7); also, if a lack of resolution was responsible for the polyphyly of mtDNA groups, then we would expect that no significant differences would be observed in tree likelihoods with and without enforced species' monophyly. Tests involving topological constraints, however, demonstrated that this is not the case.
Distinguishing between incomplete lineage sorting of ancestral polymorphism and gene flow
Given that lack of informative variation is not a satisfactory explanation for the discrepant results observed between mtDNA and nuclear genealogies, we are left with two possible scenarios, which are not mutually exclusive: a) incomplete lineage sorting of ancestral polymorphism and b) gene flow, particularly male-biased.
In order to investigate whether, despite their polyphyly, species of wall lizards are differentiated with respect to nuclear genealogies, it becomes critical to discriminate between the influences of these two scenarios in shaping the patterns of allele sharing. In order to do so, we estimated levels of gene flow based on nuclear genealogies. Our data strongly suggest that, although gene flow may be important between some species pairs, the polyphyletic pattern mostly derives from the incomplete lineage sorting of ancestral polymorphism. This is a likely scenario because nuclear genes take on average four-times as much time to reach monophyly than mitochondrial DNA does [33]. Therefore, when differentiation is recent, there is a distinct possibility that mitochondrial lineages may not be monophyletic with respect to nuclear genealogies. Three major lines of evidence support our conclusion of a greater effect of incomplete lineage sorting:
Comparison with allozyme variation
If gene flow was the main cause for the non-monophyly of species when considering nuclear gene variation, then allozyme data would also fail to observe any differentiation. This is certainly not the case in Iberian and North African Podarcis. A recent study documented not only that species are in general highly differentiated and monophyletic when considering a population tree, but also that most species can be individualized as discrete clusters by taking into account individual multilocus genotypes [18]. There are nevertheless exceptions with this regard, since some species pairs could not be distinguished, but even if these exceptions resulted solely from introgression, the correspondent pattern in nuclear genealogies would be the sharing of alleles between these particular hybridizing forms and not others, which is not verified.
Relationships between allele age and transpecificity
A second line of evidence suggesting a major role for incomplete lineage sorting relies on the simple observation of patterns of allele relationships, particularly on the β-fibint7 data set. If gene flow was the main cause for the observed species polyphyly, we would expect that both ancestral (alleles that have a central position in the haplotype network) and derived alleles were transpecific or closely related to alleles found in other species. However, we do not detect this pattern; instead, we find that putatively older alleles are more widespread among species than alleles placed as tips, which are likely to have arisen after the species were separated. In order to visualize this trend, we plotted for the β-fibint7 data set the number of mutations separating a given haplotype from haplotype BR25, placed in the centre of the network, against the number of mutations separating that particular haplotype from the nearest haplotype found in another species (Fig. 3). The ancestral nature of haplotype BR25 was confirmed using an alternative method of haplotype network construction [34], which pinpoints the haplotype that is most probably the ancestral within a network. Although Figure 3 is a very rough representation, it clearly illustrates the relationship between allele "ancestrality" and trans-species proximity. Exceptions to this "rule" therefore constitute the most obvious cases of gene flow, as documented by the outlier position of the haplotypes BR62 and BR66 (putatively introgressed from P. hispanica Galera type into P. hispanica sensu stricto). The results are not as clear for 6-Pgdint7, both because a central haplotype cannot be pinpointed with confidence and because this gene shows more segregation between species groups. Nevertheless, there is still a positive correlation between both measures (results not shown).
Non-equilibrium estimates of gene flow between species
Inferences based on coalescent simulations in a divergence with gene flow framework [11] suggest low levels of gene flow among forms, since we estimated virtually zero levels of historical gene flow between all but eight species pairs, of which only three cases were actually found to be significantly different from zero, Moreover, only one of these corresponds to important levels of gene flow since divergence.
Taken together, these results illustrate that the polyphyletic pattern probably arises as a consequence of the incomplete lineage sorting of ancestral polymorphism, and that, even though they are non-monophyletic, Iberian and North African wall lizard species are indeed overall differentiated with respect to nuclear genealogies.
Differences between equilibrium and non-equilibrium estimates of migration rates
This study highlights how different estimates of gene flow provide different pictures of the evolutionary dynamics of a species group. On one hand, according to F
ST
-based estimates of Nm, most species pairs have exchanged genes since they began to diverge; indeed, only a few comparisons suggest the absence of gene flow (Table 2). Based on these estimates, we would be led to think that gene flow is a strong evolutionary force shaping genetic variability between species of Podarcis. On the other hand, gene flow estimates based on an isolation with migration model revealed only a few cases of clear genetic exchange between species. These results need to be interpreted with caution because the isolation with migration model that was fitted to our data does not directly apply to a multi-species scenario. However, as discussed above, these results are inline with inferences based on allozyme data and with the patterns of allele sharing or proximity, suggesting that gene flow, although existent, has probably had a minor role in shaping these lizards' evolution.
These differences between both classes of estimators are not completely unexpected. Wright's [7] island model of migration, on which the relationship between F
ST
and Nm is based, is indeed subject to numerous assumptions, of which several (if not all) are hardly met in most biological systems. One assumption of particular relevance to this study is that of equilibrium between migration and drift, which is violated in populations that have recently become completely isolated. In such cases low F
ST
does not necessarily translate into high levels of gene flow. This is a particular concern when estimating gene flow between species rather than populations, since in these cases effective population size are typically high and migration rates low, which implies that equilibrium conditions will take a very long time to be reached [9].
The pitfalls of gene flow estimates based on F
ST
have been widely debated in the literature [8, 9, 35]. However, only a few studies have actually demonstrated the influence of these limitations in the inference of migration rates in non-equilibrium conditions [36, 37]. The present study therefore provides an additional warning against the misuse of equilibrium-based measures of gene flow on recently-diverged evolutionary groups.
Noteworthy, this warning is valid not only for F
ST
-based estimates but for any method that assumes equilibrium migration, even if other problems inherent to F
ST
are minimized [38].
Evidence for historical gene flow among Podarcis species
Although our results argue in favour of a major role for incomplete lineage sorting in shaping the observed polyphyletic patterns in nuclear gene genealogies, we identified a few species pairs that have probably been exchanging genes.
Very high levels of admixture were detected from P. bocagei to P. hispanica type 1A. Albeit lower, still significant levels were detected from P. hispanica "Galera" type to P. hispanica sensu stricto and from P. hispanica sensu stricto into P. vaucheri. Excluding the latter, these cases do not constitute a surprise since suggestions of hybridization and introgression between these species pairs have been described before. Concerning P. bocagei and P. hispanica type 1A, these were based on individual multilocus genotype clustering [18] and the observation of interspecific matings in nature (R. Ribeiro and M. A. Carretero, pers. comm.). Similarly, gene flow between P. hispanica "Galera" type and P. hispanica sensu stricto had also been reported based on discordance between mitochondrial and morphological variation, with some additional evidence from nuclear data [23].
Our results suggest that gene flow seems to have been more frequent between forms that are sympatric at least in part of their distribution (P. bocagei/P. hispanica type 1A; P. hispanica "Galera"/P. hispanica sensu stricto), which is not totally unexpected. However, we also find strong support for gene flow events between P. vaucheri and P. hispanica sensu stricto. These two species still have poorly studied distribution limits, so it is yet unknown if they establish contact zones or if they are completely allopatric. Nevertheless, in theory present contact should not be a prerequisite for gene exchange; given that species have most likely suffered repeated pulses of range expansions and contractions due to geological or climatic events, presently allopatric pairs of species may have met at some point in their past and then hybridized.
Besides spatial considerations, other results regarding introgression between species are noteworthy. First, in the case of P. bocagei and P. hispanica type 1A, the inferred levels of admixture were surprisingly high, above the values considered as thresholds for the maintenance of genetic differentiation [7]. However, the two species are sympatric and nevertheless maintain their morphological identifiability. It has also been shown that males discriminate conspecific from heterospecific females based on chemical stimuli, which suggests at least some degree of assortative mating [39]. Furthermore, the two species correspond to clearly distinct genetic clusters as shown by almost perfect assignment using individual allozyme multilocus genotype clustering approaches [18]. It therefore seems that the present estimates of gene flow between these two species are slightly inflated; this unexpected outcome could result from our limited sampling and therefore needs further clarification.
Second, we should discuss in more detail the situation of P. hispanica sensu stricto. It has been suggested that the divergent mitochondrial clade that we refer to as P. hispanica sensu stricto is a "ghost" lineage; in other words, this divergent mtDNA clade would be an extant signature of a nowadays extinct species whose mtDNA survived in the nuclear background of other species inhabiting its former area of distribution (namely P. hispanica Galera type and P. hispanica type 3) [23]. Similar scenarios of mtDNA persistence through interspecific capture and nuclear dilution have in fact been shown to occur in several other cases [40, 41], so this hypothesis would not be completely farfetched. However, our results do not agree with this scenario. On one hand, we find that significant amounts of gene flow have indeed occurred between the mtDNA-defined P. hispanica sensu stricto and P. hispanica Galera type. However, these levels do not seem to have been large enough for the two forms to stand out as undifferentiated. The same situation is suggested by analyses of allozyme variation, which report high levels of differentiation and a complete lack of gene flow between nearby interspecific populations of these two taxa [18]. On the other hand, with respect to P. hispanica type 3, although this was unclear in preliminary runs, IM returned very low population migration rates in both directions. This therefore contrasts with analyses of allozyme data, in which these two forms were found to be non-diagnosable [18]. Because data with respect to the nuclear differentiation of mitochondrial DNA lineages from eastern Iberian Peninsula thus remains controversial, this subject clearly needs further assessment.
Third, although migration from P. muralis to P. hispanica type 3 was found to be non significant according to the test proposed by Nielsen and Wakeley [10], there is strong evidence suggesting that gene flow may have occurred between these two species. In fact, an allele separated by a single mutation from those typical of P. muralis alleles, which in turn are distinct by several mutations from the ingroup, was detected in an individual with P. hispanica type 3 mtDNA. For this to correspond to an allele shared by incomplete lineage sorting, it would imply that a single mutation had occurred in either of the lineages since the divergence of these two species (over 10 million years ago [42]), which is unlikely. We therefore believe this to be a false negative for the occurrence of migration, which the conservative significance test is prone to produce [10]. Although the confirmation of this result requires further investigation, this may imply that species of Podarcis take a long time of divergence to acquire complete reproductive isolation.
Nevertheless, this study should not be taken as a definitive assessment of gene flow among Podarcis species. First, because other false negatives may exist among the species pairs in which IM identified non-trivial migration rates. Second, and most importantly, because our limited sample size does not allow pinpointing with complete certainty which species have exchanged genes and which have not. We may infer from this work that historical gene flow among Iberian and North African Podarcis has been low, yet existent; however, a detailed analysis of the patterns of genetic exchange across species will require the detection and thorough characterization of contact zones.
Implications regarding the evolutionary patterns of Iberian and North African Podarcis
A primary observation that can be drawn from this work is that single nuclear genealogies do not reflect a branching pattern related to putative isolation in allopatry that led to the formation of the observed species, mainly because of shared ancestral polymorphism. Hudson and Coyne [43] estimated that, assuming that mutation and drift are the only evolutionary forces operating, reciprocal monophyly should be attained at ~95% of nuclear loci 9N to 12N generations after the population split. It is noteworthy that the minimum separation time between the various lineages has been estimated to be around 5 million years (i.e ~2 million generations), based on mtDNA phylogenetic analyses. Assuming that these estimates are correct, the fact that mtDNA lineages are not reciprocally monophyletic at the nuclear level suggest that the majority, if not all, of extant Iberian and North African Podarcis lineages were able to maintain a relatively high historical effective population size throughout their presumably eventful history, which allowed sorting of mtDNA lineages to monophyly without the corresponding pattern in nuclear genes. We hypothesise that larger effective population sizes in Podarcis might explain the contrasting patterns of subdivision observed at nuclear loci between wall lizards and, for example, the Iberian newt Lissotriton boscai. This species complex is characterized by similarly old or even younger mtDNA splits [44], but in this case the various groups have reached almost complete reciprocal monophyly at the β-fibint7 locus [45].
Pinho et al. [18] suggested that contrasting evolutionary scenarios inferred for the evolution of the genus on the basis of mitochondrial vs. nuclear markers (i.e. step-by-step speciation vs. radiation) could be accommodated taking into account the different effective population sizes that characterize both classes of markers. Indeed, speciation may have occurred at a rate fast enough so that nuclear gene genealogies are uninformative regarding species relationships but mitochondrial genealogies are not. In fact, although multi-species clades in Pinho et al.'s [15] mitochondrial phylogenetic assessment of relationships are supported by high bootstrap values, internal branches are short, which is consistent with a fast speciation rate. Accordingly, our estimates of divergence time between mtDNA-defined lineages on the basis of nuclear markers do not correlate to mitochondrial DNA branching patterns; for example, the largest time since divergence was inferred for the mitochondrial sister pair P. carbonelli/P. hispanica type 2.
An intriguing characteristic of both nuclear gene genealogies isthe detection of a skew towards rare alleles, as suggested by largely negative and significant Tajima's D. Although such a multilocus pattern can usually be interpreted as a signature of demographic growth, it has been shown that the same patterns may also arise as a consequence of fine-scale population structure, which is translated as a bias into sampling schemes such as ours, in which few individuals from multiple populations are analysed [46].
A gradual view of speciation
With the analyses of nuclear genealogies in Podarcis wall lizards from the Iberian Peninsula and North Africa, we have provided compelling evidence for general historical reproductive isolation among distinct lineages despite their non reciprocal monophyly. These results are therefore in accordance with emergent morphological data and previous reports of mitochondrial and nuclear gene variation suggesting that these forms may correspond to different, almost completely reproductively isolated species. We have also provided evidence for the existence of limited to important levels of gene flow among a few species pairs, also in accordance with previously published data. Although the effect that such admixture processes may have in species differentiation remains to be completely acknowledged, preliminary work on the contact zone between P. bocagei and P. carbonelli suggests spatially-restricted genetic exchange and a bimodal hybrid zone concordant with the existence of strong barriers to gene flow [47].
The lack of reciprocal monophyly and the permeability to gene flow implies that wall lizard lineages will fail to be considered good species in the light of most species concepts. Nevertheless, it appears that most forms remain differentiated despite the merging influence of gene flow. Our results in Podarcis wall lizards therefore add to recent evidence stemming from the study of other incipient species groups (such as heliconiine butterflies [48] for example) illustrating that the acquisition of reproductive isolation is a gradual process and contributing to the establishment of a "species with fuzzy borders" paradigm.
Podarcis wall lizards in the Iberian Peninsula and North Africa present a wide variety of distribution patterns, including sympatric, parapatric and allopatric forms and will therefore constitute an excellent model for testing hypotheses related to the origin and maintenance of reproductive isolation in closely related species.