Evidence against the existence of old and new lineages of A. mexicanus.
We discuss below that the presence of two divergent mtDNA haplotypes is not per se a strong support for the existence of two fish lineages. Moreover, the incongruence of the mtDNA phylogeny with phylogenies obtained with several independent nuclear loci definitively invalidates this hypothesis.
A widely accepted scenario in the community working on A. mexicanus cavefish is that some cave populations are ancient, i.e. hundreds of thousands or even millions of years old, and related to extinct surface fish, whereas other cave populations are more recent and related to extant surface fish. We demonstrate below that this hypothesis relies only on the existence of two divergent mtDNA haplogroups that are supposed to reflect the existence of two divergent fish lineages. The hypothesis that cavefish originated from two separate surface fish stocks was first formulated on the basis of a NADH dehydrogenase 2 (ND2) phylogeny of cave and surface fish [39]. On the one hand all surface fish from the Sierra de El Abra belonged to a haplogroup named “lineage A”, as well as two surface fish from Texas and a surface fish from the Coahuila state, in northeastern México. Pachón and Chica cavefish also belonged to this haplogroup A. On the other hand, Curva, Tinaja and Sabinos cavefish, living in caves that are geographically close to each other, belonged to another and well differentiated haplogroup named “lineage B”. The authors concluded that the cavefish belong to an old stock of fish, “the lineage B” that was present at the surface a long time ago, but now extinct and replaced by surface fish with haplotypes belonging to haplogroup A. Noteworthy this hypothesis implies that the mtDNA haplotype A1 found in Pachón and Chica cavefish (a haplotype found in most surface localities) is the result of recent mtDNA introgressions into these caves. The authors of this publication proposed another explanation: whereas it is likely that introgression can occur in Chica cave where surface fish and hybrids have been found, they suggested that Pachón cavefish, that seem much more isolated, have evolved independently and more recently than haplogroup B cavefish, and they are undergoing troglomorphic evolution more rapidly than other cavefish populations [39]. 
These hypotheses were among the most parsimonious that could be formulated at that time with this data set. It is important to note that, even if we accept the ad hoc hypothesis that a surface population has been replaced by another surface population with mtDNAs belonging to a divergent haplogroup, it does not necessarily imply that the age of the cavefish related to the first population is the time of coalescence of the two divergent mtDNA haplogroups. Indeed, two surface populations could have evolved independently for a very long time in two separated Mexican regions, allowing the evolution of two divergent haplogroups, but one population (the extant population) could have replaced the first one (extinct population) only very recently. Moreover, some cave populations could have evolved from the first extinct surface population recently too, but of course before its extinction. Very recently too, other cave populations could have evolved from the extant surface population. In summary, under the hypothesis of replacement of a surface population by another one, the coalescence time of mtDNA can give, at best, the oldest possible age for cavefish descendants of the extinct surface fish. Nevertheless, even if all cave settlements are very recent it does not preclude finding fish carrying divergent mtDNA.
Another study using a partial sequence of the cytochrome b gene confirmed the existence of two divergent mtDNA haplogroups [36]. This result was expected as mitochondrial genes are completely linked and a unique phylogeny is expected for mitochondrial genomes. Moreover, in this study a third haplogroup was identified in Yucatan. Using a more comprehensive sample and the same mtDNA marker [37], up to seven divergent haplogroups were found in Mexico (A to G, haplogroup G for cytb corresponding to haplogroup B with ND2) with a highly structured geographic distribution suggesting past fragmentation and/or a strong isolation by distance. In this study, haplogroup G was still cave specific (Piedras, Sabinos, Tinaja and Curva that are caves close to each other) and haplogroup A was still Northern Gulf coast and cave specific (Fig. 1). However, a more recent analysis [42], expanding further the sample of populations, allowed the identification of surface fish with haplotypes very close to haplogroup G (named Clade II lineage Ie) and haplogroup A (named Clade I Ia) in sympatry in the same water bodies, i.e. Mezquital and Aganaval, in northwestern Mexico. This finding invalidates the hypothesis that haplogroup G evolved in the El Abra region a long time ago, went extinct, and was replaced by haplogroup A. Moreover, haplotypes “G like” were also found in surface fish localities (Rascon and Tamasopo) close to Sierra de El Abra [42].
Haplogroups A and G are highly divergent, supporting a model in which they accumulated mutations in two populations isolated for a long period of time. The presence of both haplogroups in northwestern and northeastern Mexico suggests that these populations mixed recently, at the time of a secondary contact. Dating results discussed below suggest that during the last glaciation, two allopatric populations from north Mexico, one carrying haplotypes belonging to haplogroup A and the other carrying haplotypes belonging to haplogroup G, might have moved south and mixed there. After this glaciation they might have moved north again, now sharing haplotypes belonging to haplogroups A and G (this haplotype mixture is actually observed in the northwestern region, i.e. Mezquital and Aganaval water bodies). In the northeastern region, haplotypes belonging to haplogroup G have up to now been found only in several caves in a restricted geographic area and haplotypes “G like” in surface localities also in a restricted area. Noteworthy, such recent secondary contacts of divergent haplogroups were also observed at several other places in south Mexico [34, 42], suggesting that several populations of Astyanax mexicanus were isolated for a long time in different regions in Mexico and Central America and have been recently undergoing secondary contact.
In summary we think that, considering the mtDNA polymorphism alone, there is no reason to believe that the coalescence time of the mtDNA haplogroups should correspond to the age of the most ancient cavefish populations. On the contrary, taking into account the most recent publications, it suggests a recent admixture of two divergent populations. This admixture should be recent enough to allow the maintenance of both haplogroups at different geographic scale (in north Mexico as a whole and in northwestern and northeastern Mexico independently), as genetic drift should have eliminated, at a small geographic scale, one of them after a long period of time.
Moreover, the existence of two fish lineages implies that the mtDNA phylogeny should be congruent with unlinked nuDNA phylogenies, whereas a recent admixture of two surface populations before fish settlements in caves should lead to random fixation of alleles at different unlinked loci and thus incongruent phylogenies between mtDNA and nuDNA loci as well as between different unlinked nuDNA loci.
Noteworthy, a phylogenetic analysis was performed using a large SNP data set in order to estimate the number of independent cave settlements at the origin of five cavefish populations from three distinct regions [43]. This analysis did not support the two lineages hypothesis but indicated at least four independent origins for these cavefish populations. However, this study could not evaluate if congruent phylogenies are obtained with samples of sequences from different nuclear loci. Moreover, dating the age of the internal nodes of the phylogenetic tree was not possible.
We compared mtDNA and nuDNA phylogenies using published sequences of several nuclear genes. First, we reconstructed a maximum likelihood phylogeny with Rag1. The resolution is so low that it precludes any phylogenetic inference, and even species defined using mtDNA are not supported. The congruence of the phylogenies with mtDNA and mtDNA+Rag1 [42] is the result of the very low quantity of phylogenetic signal in Rag1 compared to mtDNA and it does not support the congruence of mtDNA and Rag1 phylogenies.
Then we examined phylogenies obtained with other nuclear genes (Mc3r, Mc4r, Lepb, Lepr, Pomcb). These phylogenies are based on four sequences, but they are nevertheless highly informative. For each gene we found a unique phylogeny without homoplasy, suggesting no recombination within each locus. Moreover, the incongruence of the phylogenies obtained with these unlinked loci supports their independent evolutionary histories. These results do not support two well defined fish lineages, whereas admixture of gene phylogenies is expected when sampled localities are poorly isolated and/or have been separated for a short period of time.
Partial or complete coding sequences of five other genes (Per1, Per2, Tef1, Cry1a and Cpd photolyase) from three localities (Pachón and Chica caves and a surface locality close to Micos) were also available [52]. These sequences were aligned with the sequence of the Texas surface fish and the most parsimonious tree reconstructed for each gene. Surface fish sequences were always very close confirming the mtDNA evidence that the surface population sampled in Texas is genetically very close to Sierra de El Abra surface fish. These phylogenies are also interesting because they highlight another fact. Whereas for some genes, all the haplotypes are almost identical (very few mutations in Mc1r, Mc4r, Lepb, Pomcb, Per2, Tef1, Cpd photolyase), we can identify two, and only two, divergent haplotypes for Mc3r, Lepr, Per1 and Cry1a. Moreover, the distribution of divergent haplotypes is not the same for different loci (a divergent haplotype of Mc3r is found in Tinaja cave only, a divergent haplotype of Lepr in Molino cave only, divergent haplotypes of both Per1 and Cry1a at the surface only). Taking into account the existence of two divergent mtDNA (“G” haplotype in Tinaja and “A” haplotype in Pachón and surface fish), these phylogenies suggest that two divergent fish lineages with well differentiated genomes mixed and divergent alleles at each locus segregated randomly at the surface and in caves. When there are no divergent alleles at a locus (Mc1r, Mc4r, Lepb, Pomcb, Per2, Tef1, Cpd photolyase), one can suppose that alleles from one ancestral population went extinct. On this basis we came to the conclusion that cavefish could be much more recent than usually thought. In order to make a quantitative analysis of this hypothesis we applied two different approaches to estimate the age of some cave populations using multiple unlinked nuclear loci.
Dating isolation times of cave populations
Dating the age of a recently isolated population that can exchange migrants with the “source” population is a difficult task [58]. If divergence is low and there is shared polymorphism between two populations, it can be the result of regular migration between these populations that diverged a long time ago, or the consequence of a recent divergence of completely isolated populations, or something in between. One can thus estimate how long ago the populations diverged (assuming no gene flow) using phylogenetic methods, or one can estimate the gene flow (assuming that the populations are at mutation/migration/drift equilibrium, i.e. they have been separated for a very long time, migrations occurred regularly and thus the phylogenetic signal has been erased) using population genetic methods. Such methods have been applied to study the evolution of A. mexicanus cavefish [32,33,34,35,36,37,38,39, 42, 43, 59], but neither approach is of much use, and both are often misleading, if the goal is to develop a full picture that includes estimates of recent separation time and gene flow [58]. In such a case it is necessary to consider non-equilibrium models and methods allowing the joint estimation of demographic parameters (population sizes and migration rates) and divergence time [58]. Accordingly, we used IMa2, a widely used program for “isolation with migration” (IM) model analyses [53], to estimate divergence time between surface and cave populations with a dataset of multi-locus microsatellite polymorphism. IMa2 is based on backward simulations of coalescence of samples of alleles. For the Pachón cave population, we estimated the divergence time using an alternative approach based on forward simulations of evolution of SNPs. Analyses of microsatellite polymorphism with IMa2 supported a recent origin of all cave populations. Analyses of SNPs confirmed a recent origin of Pachón cavefish.
Dating with microsatellites
The microsatellite data set was kindly provided by M. Bradic and R. Borowsky [33]. We performed a series of pairwise analyses involving a cave population and a surface population, or two cave populations, using IMa2 in order to estimate the marginal posterior probability density of model parameters (i.e. population sizes, migration rates and divergence time). Whereas the current version of IMa2 can handle more than two populations, the phylogenetic relationships between populations must be known. In the case of Astyanax mexicanus, as shown above, there are no obvious phylogenetic relationships between surface and cave populations. In addition, as the number of parameters increases very fast with the number of populations analyzed, it is a very difficult task to analyze more than two populations. However, it is possible to study a large complex divergence problem that involves multiple closely related populations by analyzing pairs of populations [53, 60].
First we focused on the divergence time of Pachón cave population (O1, according to Bradic et al. nomenclature [33]) and a sample of surface fish from close localities (S3) (Fig. 1) because large samples of alleles had been genotyped for many loci and it may allow more accurate estimations of model parameters than for cave populations for which a limited number of alleles were genotyped. Assuming a mutation rate, estimations of the effective population sizes, migration rates and divergence time can be obtained. In most population genetic studies, including A. mexicanus [33], the mutation rate of microsatellite loci is assumed to be about 5 × 10⁻⁴. This is a quite high mutation rate, but it is very likely as the loci retained for population genetic analyses are the most variable, thus those with the highest mutation rate. The estimation of the effective population size (Ne) was 150 [150–1150] and 10,750 [6650–16,350] for Pachón cave population and surface fish population, respectively. These estimations make sense as it is obvious after several trips in Sierra de El Abra that the census population size (Nc) of surface fish is much higher than the census population size of the cavefish. Nc has been estimated for Pachón cavefish (8502; 95% confidence limits [1279–18,283]) [1] but it has never been estimated for surface fish. On the one hand we expect that Ne is correlated with Nc, but for fish that can potentially lay or fertilize hundreds of eggs or no eggs at all during their life such as A. mexicanus fish, the variance of the number of descendants is probably high (in the range of 10¹ to 10²) and thus Ne might be one to two orders of magnitude smaller than Nc [61, 62]. If Nc is in the order of magnitude of 10⁴, it is expected that Ne is in the order of magnitude of 10² to 10³ as found in the present study with IMa2. Previous studies, using the same data set and the program Migrate [63], found similar Ne for several caves, including Pachón [33, 35].
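The link between census size, variance in offspring number and effective size can be illustrated with a short calculation. The sketch below uses the classical textbook approximation Ne ≈ (4Nc − 2)/(Vk + 2) for a population of constant size, where Vk is the variance in lifetime offspring number. This formula is an assumption brought in for illustration; it is not stated to be the calculation behind the figures quoted above, and the function name is hypothetical.

```python
def effective_size(census_size, offspring_variance):
    """Approximate Ne for a constant-size population.

    Assumption: classical approximation Ne ~ (4*Nc - 2) / (Vk + 2),
    where Vk is the variance in lifetime offspring number.
    With Vk = 2 (Poisson-like reproduction) Ne is close to Nc.
    """
    return (4 * census_size - 2) / (offspring_variance + 2)


# Census estimate for Pachon cavefish quoted in the text [1].
nc_pachon = 8502

# Hypothetical offspring variances spanning the range discussed in the text.
for vk in (2, 10, 100):
    print(f"Vk = {vk:>3}: Ne ~ {effective_size(nc_pachon, vk):.0f}")
```

Under this approximation the reduction of Ne relative to Nc is roughly (Vk + 2)/4, so the size of the reduction depends strongly on the assumed variance.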
The results obtained with IMa2 suggested that the migration rate from the cave to surface is negligible whereas the migration rate from the surface to the cave is low. This is also expected. If fish could easily exit the cave, no evolution of cavefish would have occurred. Moreover, it is difficult to imagine that blind cavefish that found their way to the surface would have a good fitness there. Concerning the migration rate of surface fish into the cave, it is likely that the fitness of surface fish in a cave is low compared to well-adapted cavefish (Luis Espinasa, personal communication). So, even if the migration rate of adult surface fish into the cave is not negligible, the “effective migration rate”, that is the rate of migration of surface fish that actually reproduce in the cave, is probably extremely low. Accordingly, surface fish have never been observed in Pachón cave, but fish that were likely hybrids were reported during a couple of years in the 1980s [64]. This observation, made by only one group of investigators, has never been made by anybody else, before or after.
The IMa2 estimation of divergence time of 5110 years [1302–18,214] suggests that the Pachón population is much younger than usually thought. Indeed, and without any complex computations, a simple glance at the microsatellite data set (Additional file 3) actually supports a recent origin of this cavefish: 1) there is no divergence of the distribution of allele sizes found in the surface fish and Pachón cavefish, 2) the alleles found in the cave are also present at the surface. The difference between these populations is that, at each locus, there are many different allele sizes at the surface but a much smaller number of allele sizes in the cave. This differentiation without divergence can be easily explained by a much higher genetic drift in the small cave population than in the large surface population. Of note, even if we consider that the mutation rate is 10 times lower than the rate taken to make these estimations, the origin of the Pachón cavefish would still be much more recent than usually thought.
Similar results were obtained with all the cave populations studied. They all appeared to have a low Ne and a recent origin, less than 10,000 years ago. Taking into account the large variance of the divergence time estimations, we can conclude that they are most likely all less than 20,000 years old. In addition, taking also into account the uncertainty on the mean mutation rate of the microsatellites, which could be about 5 times lower than assumed, the limit to the estimation of the age of the cave populations could be pushed to about 100,000 years.
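The sensitivity of these dates to the assumed mutation rate is simple arithmetic: for a fixed amount of observed divergence, the time estimate in years is inversely proportional to the mutation rate. A minimal sketch of this rescaling, assuming this proportionality holds exactly (the function name is hypothetical):

```python
def rescale_divergence_time(time_years, fold_lower_mutation_rate):
    """Rescale a divergence time estimated under an assumed mutation rate.

    Assumption: time in years scales inversely with the mutation rate,
    so a rate that is k times lower gives a time that is k times older.
    """
    return time_years * fold_lower_mutation_rate


# Pachon point estimate from the text, with a 10-fold lower mutation rate.
print(rescale_divergence_time(5110, 10))

# Upper limit of 20,000 years, with a 5-fold lower mutation rate.
print(rescale_divergence_time(20_000, 5))
```

The second call reproduces the limit of about 100,000 years given above.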
Dating with SNPs
Even though the recent evolution of A. mexicanus cavefish is well supported by analyses of multiple microsatellite loci with IMa2, this estimation was so at odds with the current opinion of antiquity of most cave populations that we considered necessary to bring additional evidence using a completely different approach and a totally different set of data. A congruent estimation of the model parameters of interest, in particular the divergence time, would greatly strengthen our conclusion. We focused on dating Pachón cave population for which we identified a large sample of SNPs in RNA-seq of pooled embryos of fish maintained in the lab. We performed the dating of the divergence time of this population with a Texas surface population of fish for which SNPs were also identified using the same approach. As discussed above, all data (mtDNA and nuDNA) showed unambiguously that these surface fish are closely related to the surface fish sampled in the Sierra de El Abra region. Despite this evidence and the absence of a high structuration of the genetic diversity of the surface fish in the Sierra de El Abra region, if there is a genetically closer surface population living near the Pachón cave, the straightforward conclusion would be that the result we obtained is an overestimate of the age of the Pachón cave population.
Of note, the Texas surface fish shared about 7% of their SNPs with Pachón cavefish. As shared polymorphism is expected to decrease quickly if at least one population has a low effective population size and the two populations are completely isolated, it suggests that the divergence is recent or the migration rate is high.
In order to make estimations of the model parameters (i.e. population sizes, migration rates and divergence time) that could explain the distribution of SNPs within and between populations, we ran simulations of the evolution of the SNPs forward in time in two populations, allowing migration from the surface to the cave. Running these simulations with different sets of parameters allowed finding simulations for which, after a given number of years of divergence, the distribution of the SNPs within and among simulated populations was very similar to the distribution observed within and among the real populations. The time of divergence was taken as an estimation of the age of the cave population. The use of simulations in order to estimate unknown values of model parameters is common in population genetics, for example in Approximate Bayesian Computation methods [65].
The rationale of this analysis is that when a population is divided into two populations, ancestral polymorphism is shared by the daughter populations for a period of time, even if the populations are totally isolated. As divergence proceeds, each descendant population experiences fixation and loss of alleles at loci that were polymorphic, and this random sorting of alleles is part of the way populations diverge. The divergence of the populations also increases through time because new alleles appear at new and different polymorphic sites in both populations. However, migration allows sharing of the ancestral and new polymorphism and can restrict the divergence of the populations. As already discussed, it is thus challenging to estimate if shared polymorphism is due to a recent split, gene flow or both [66]. However, some observations on SNP distribution within and among the cave and surface populations suggested a recent divergence.
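The sorting of ancestral polymorphism described here can be illustrated with a toy forward simulation. The sketch below is not the simulation program used in this study: it is a minimal Wright-Fisher model without new mutations, with one-way migration from surface to cave, and with hypothetical names and deliberately small population sizes so that it runs quickly with the Python standard library alone.

```python
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) using only the standard library."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_split(n_loci, n_surface, n_cave, generations, m=0.0, seed=1):
    """Drift at independent biallelic loci after a population split.

    n_surface, n_cave: diploid population sizes (hypothetical values).
    m: fraction of the cave gene pool replaced by surface alleles
       each generation (migration from surface to cave only).
    Returns the number of loci in each SNP class at the end.
    """
    rng = random.Random(seed)
    # All loci are polymorphic in the ancestral population at the split.
    ancestral = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    surface, cave = list(ancestral), list(ancestral)
    for _ in range(generations):
        for i in range(n_loci):
            p_cave = (1.0 - m) * cave[i] + m * surface[i]
            surface[i] = binomial(rng, 2 * n_surface, surface[i]) / (2 * n_surface)
            cave[i] = binomial(rng, 2 * n_cave, p_cave) / (2 * n_cave)
    classes = {"shared": 0, "surface_only": 0, "cave_only": 0,
               "fixed_difference": 0, "fixed_same": 0}
    for ps, pc in zip(surface, cave):
        poly_s, poly_c = 0.0 < ps < 1.0, 0.0 < pc < 1.0
        if poly_s and poly_c:
            classes["shared"] += 1
        elif poly_s:
            classes["surface_only"] += 1
        elif poly_c:
            classes["cave_only"] += 1
        elif ps != pc:
            classes["fixed_difference"] += 1
        else:
            classes["fixed_same"] += 1
    return classes


# Large surface population, small cave population, no migration.
print(simulate_split(n_loci=100, n_surface=100, n_cave=10, generations=40))
```

Comparing class counts obtained for different divergence times, population sizes and migration rates with the counts observed in real data is the general idea behind fitting by simulation; a realistic analysis would also need new mutations and much larger populations.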
We found 2.34 times more derived alleles that reached fixation (substitutions) at synonymous polymorphic sites in the cavefish population than in the surface fish population. An excess of non-synonymous and non-coding substitutions was also observed. We searched for an explanation for these observations that could appear at first glance unexpected, in particular for synonymous substitutions, most of which are probably neutral or nearly neutral. It is well known that if two populations have diverged for a long time and if the mutation rate is the same in both populations, the neutral substitution rates should be equal and independent of the population sizes [67]. Nevertheless, a simple explanation, totally compliant with this fundamental result of theoretical population genetics, relies on the fact that when an ancestral population is divided into a large (surface) and a small (cave) population, the probability of fixation of a neutral allele is the same in both populations (it is the allele frequency) but the process of fixation is faster in the small than in the large population. Indeed the mean time to fixation is \( \overline{t}_1(p) = -4N\left(\frac{1-p}{p}\right)\ln\left(1-p\right) \) (where N is the population size and p is the allele frequency) [68]. The straightforward consequence is a transient acceleration of the substitution pace in the small population that is no longer observed after a long period of time [67]. In other words, it is expected that during a short period of time after their separation a small population would have fixed more derived alleles present in the ancestral population than the large population. We thought that this information, together with other information about the distribution of polymorphism within and between populations, could be used for divergence dating.
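The effect of population size on the mean time to fixation can be checked numerically with the formula quoted above. In the sketch below, the population sizes (1250 and 10,000) are illustrative values, and the function name is hypothetical; the result is expressed in generations.

```python
import math


def mean_fixation_time(n, p):
    """Mean time to fixation (in generations) of a neutral allele.

    Implements t1(p) = -4N * ((1 - p) / p) * ln(1 - p) for an allele
    at frequency p (0 < p < 1) in a population of size N, conditional
    on the allele eventually reaching fixation.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return -4.0 * n * ((1.0 - p) / p) * math.log(1.0 - p)


# Illustrative sizes for a small (cave) and a large (surface) population.
for label, size in (("cave", 1250), ("surface", 10_000)):
    t = mean_fixation_time(size, 0.5)
    print(f"{label:>7}: N = {size:>6}, mean fixation time ~ {t:.0f} generations")
```

Because the formula is linear in N, an allele at a given frequency takes on average eight times fewer generations to reach fixation in the smaller of these two populations, which is the transient excess of substitutions discussed above.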
We defined summary statistics describing the polymorphism and the divergence of two populations that could be accurately estimated using pooled RNA-seq [69]. The evolutionary model is identical to the model analyzed with IMa2 and relies on the same parameters (three effective population sizes, two migration rates, and a divergence time). Of note, we set the migration rate from cave to surface to zero as the analyses with IMa2 and several trips to this cave convinced us that the impact of migration of Pachón cavefish on surface fish DNA polymorphism is negligible.
In our simulations, we set the generation time to two and five years for the surface and cave populations, respectively. This surface fish generation time is twice the estimations obtained for other Astyanax species [70] and the cavefish generation time is the value estimated by P. Sadoglu, unpublished but reported as a personal communication [40]. This estimation is based on the hypothesis that cavefish may live and remain fertile for a long time, about 15 years. It is unlikely that these generation times are underestimates and they could actually be overestimates of the true generation times, in particular for Pachón cavefish [56]. As the estimation of the age of the Pachón cavefish population directly depends on these generation times, the divergence times we discuss below are more likely overestimates than underestimates. We took into account migration rate from surface to cave. This migration rate depended on two parameters: the probability of migration/year and the percentage of cavefish that were surface migrants when a migration occurred. This way it is possible to simulate different possibilities such as few migrants that enter the cave very often or many migrants in very rare occasions, and all intermediate cases between these two extremes.
We also took into account that genetic drift occurred during several generations in the lab. In order to estimate the effect of genetic drift in the lab and the accuracy of allele frequency estimations using pooled RNA-seq, we compared the frequencies of two alleles (Pro106Leu) of the MAO gene estimated using a sample of wild-caught Pachón fish and a sample of adult fish maintained in the lab with the frequencies estimated using pooled RNA-seq of lab embryos. Although it has been published that the Pachón population is not polymorphic at this locus [71], we found polymorphism with a larger sample of wild-caught fish and similar frequencies with both adult and embryo lab fish (data not shown), suggesting that genetic drift did not remove from the lab population a polymorphism known to be present in the Pachón population.
Of note whereas the estimation of the divergence time depends on the generation time and the mutation rate with IMa2, our estimation depends only on the generation time. This is due to the fact that the summary statistics we used, i.e. SNPs class frequencies, to compare simulated and real populations, do not depend on the mutation rate. We set a mutation rate allowing a number of SNPs within and between simulated populations large enough to have accurate estimations of the summary statistics.
Without migration, shared polymorphisms were quickly lost and the best fit of the model to the data was obtained when the cavefish population size (1250) was smaller than the surface fish population (10,000) and the age of the cavefish population was 20,600 years. When migration was included, good fit with the data also implied large differences in population sizes, a low migration rate and low numbers of migrants. The very best fit was observed for a surface fish population size of 20,000, a cave population size of 1250, and a cave population age of 54,200 years. Nevertheless, most very good fits were obtained with a divergence time in the range of 20,000 to 30,000 years.
As expected, no fit was found when the surface and cave populations had similar sizes. The best fit was observed when the ancestral population size was similar to the surface population size. It is at odds with the estimation of a much larger ancestral population size estimated with IMa2. We do not have a clear explanation for this discrepancy, but it could be the consequence of the admixture of two divergent populations at the origin of the ancestral population. Such admixture could have increased the number of alleles at each microsatellite locus, and IMa2 thus inferred a larger ancestral population size. For SNPs, admixture of divergent populations results in more polymorphic sites, but the number of alleles at a given locus is not increased as the probability that two parallel mutations occurred independently at the same locus in two populations is extremely low.
Other evidence for a recent origin of the Pachón cavefish
First, in a recent analysis of the expression of 14 crystallin genes in the Pachón cavefish, 4 genes were not expressed or expressed at a very low level, but no stop codon or frameshift could be identified [26]. This result is in accordance with a recent origin of this population, as several loss-of-function mutations should have reached fixation after several hundred thousand years of evolution of genes that would no longer be under selection, as they are not necessary in the dark [47]. Indeed, other fish species that have likely been confined to caves for millions of years have fixed loss-of-function mutations in several opsin and crystallin genes [48,49,50]. We are currently working on dating cavefish populations using the frequency of loss-of-function mutations in genes that are dispensable in the dark.
Second, a recent study has shown that the heat shock protein 90 (HSP90) could restrict the expression of eye-size polymorphism in surface populations [72]. Indeed, HSP90 inhibition allowed the observation of a larger eye-size variation. Moreover, it has been possible to select fish with a reduced-eye phenotype which can be observed even in the presence of HSP90 activity. This result suggests that standing genetic variation in extant surface populations could have played a role in the evolution of eye loss in cavefish. This is also compatible with a recent origin of the cave population.
Non-equilibrium models and cavefish population genetics
A recent origin of the so-called “old” Pachón population can solve a conundrum put forward by previous and the present analyses. First, at the SNP and microsatellite level, the diversity is not that low in Pachón cave when compared with surface populations, i.e. about one third. If the populations are at migration/drift equilibrium, it means that the effective population size of Pachón cavefish is about one third of the surface populations, and this is at odds with the likely large difference in census population sizes [1, 33]. Of course, we can propose ad hoc hypotheses to explain this discrepancy. Cavefish may have a much lower reproductive success variance than surface fish, or surface fish could have larger population size fluctuations through time than cavefish. In such cases, the effective population sizes could be much closer to one another than census population sizes because it is well established that large variance in reproductive success and large population size fluctuations hugely reduce the effective population size [62]. An alternative explanation is that the genetic diversity in the Pachón cave is actually higher than expected at mutation/drift/migration equilibrium. Our results suggest that the effective population size of the surface fish is at least one order of magnitude larger than the effective population size of cavefish, a ratio that is more in accordance with the unknown but certainly very different long term census population sizes. The present study is a striking illustration of how misleading analyses of evolutionary processes can be when they do not take into account that biological systems are not necessarily at equilibrium. The two analyses we described above rely on approaches that do not suppose mutation/migration/drift equilibrium. They allowed the estimation of demographic parameters that are more in line with expectations based on field observations than previous estimations.
There are many more surface fish than cavefish, and the impact of migration of cavefish on surface fish diversity is likely extremely low, and most likely null.
The new time frame we propose for the evolution of the cavefish populations would not allow enough time for the fixation of many de novo mutations and most derived alleles that reached fixation in caves were probably already present in the ancestral population. This was also suggested by a recent population genomic study [73]. This may imply that the cave phenotype evolved mainly by changes in the frequencies of alleles that were rare in the ancestral surface population. In particular, some of these alleles would have been loss-of-function or deleterious mutations that could not reach high frequency in surface populations but they could reach high frequency or fixation quickly in a small cave population where they are neutral or even advantageous.