Impact of model assumptions on demographic inferences: the case study of two sympatric mouse lemurs in northwestern Madagascar

Background Quaternary climate fluctuations have been acknowledged as major drivers of the geographical distribution of the extraordinary biodiversity observed in tropical biomes, including Madagascar. The main existing framework for Pleistocene Malagasy diversification assumes that forest cover was strongly shaped by warmer Interglacials (leading to forest expansion) and by cooler and arid glacials (leading to forest contraction), but predictions derived from this scenario for forest-dwelling animals have rarely been tested with genomic datasets. Results We generated genomic data and applied three complementary demographic approaches (Stairway Plot, PSMC and IICR-simulations) to infer population size and connectivity changes for two forest-dependent primate species (Microcebus murinus and M. ravelobensis) in northwestern Madagascar. The analyses suggested major demographic changes in both species that could be interpreted in two ways, depending on underlying model assumptions (i.e., panmixia or population structure). Under panmixia, the two species exhibited larger population sizes across the Last Glacial Maximum (LGM) and towards the African Humid Period (AHP). This peak was followed by a population decline in M. ravelobensis until the present, while M. murinus may have experienced a second population expansion that was followed by a sharp decline starting 3000 years ago. In contrast, simulations under population structure suggested decreasing population connectivity between the Last Interglacial and the LGM for both species, but increased connectivity during the AHP exclusively for M. murinus. Conclusion Our study shows that closely related species may differ in their responses to climatic events. Assuming that Pleistocene climatic conditions in the lowlands were similar to those in the Malagasy highlands, some demographic dynamics would be better explained by changes in population connectivity than in population size. However, changes in connectivity alone cannot be easily reconciled with a founder effect that was shown for M. murinus during its colonization of the northwestern Madagascar in the late Pleistocene. To decide between the two alternative models, more knowledge about historic forest dynamics in lowland habitats is necessary. Altogether, our study stresses that demographic inferences strongly depend on the underlying model assumptions. Final conclusions should therefore be based on a comparative evaluation of multiple approaches. Supplementary Information The online version contains supplementary material available at 10.1186/s12862-021-01929-z.

heterozygosity (Ho) and inbreeding coefficients (FIS). All genetic summary statistics were calculated with ARLEQUIN [2] after conversion of the Variant Call Format (VCF) file from dataset 2 into the ARLEQUIN format using PGDSpider [3].
The genomic diversity statistics estimated for M. murinus and M. ravelobensis (Table S5) were lower than previous values estimated with microsatellites datasets [4][5][6], but comparable to those reported for other primate species based on a RADseq dataset (e.g., Ho = 0.3475 and 0.3148 for Rhinopithecus roxellana populations) [7], and even higher than for some other taxa (e.g., [8][9][10]). A slightly higher observed heterozygosity than expected under HWE coupled with the non-significant but negative population inbreeding coefficients suggest that populations were large enough to retain genetic diversity [5,11], or that outside individuals were regularly immigrating into the two forest sites.

Text S5 -Impact of the minimum read depth in the PSMC analyses
The PSMC runs considering different minimum read depth options (-d1 to -d12) resulted in similar demographic curves for M. ravelobensis (Ankomakoma; Figure S9). When decreasing the -d option from twelve to one, the genome-wide coverage only decreased from 18.79X to 18.2X, suggesting that the sites considered in the PSMC analyses were of good quality. Moreover, the two M. ravelobensis sequences suggested a similar demographic history for Ankomakoma and Ravelobe, despite the mean genome-wide coverage differences (18.79 and 17.04, respectively).
These results suggest that our PSMC inferences are likely not strongly biased by a mean genome-wide coverage below 18X. This ms command specifies that M. murinus was structured in a n-island model of migration with a constant population size over time and underwent five major changes in population connectivity.

Text S6 -List of the ms commands used for establishing the IICR figures
The demographic parameters were specified by the following command options: -I (number of islands) = 84 -eN t x (set all subpopulations to size x * N0), where x = 1.0 throughout the time -eM t m (set all elements of the migration matrix to m/(npop−1) at time t), where m = 1.22; 4.1; 0.93; 2.53; 86.0; 0.28 (backwards in time) t (specifies the time of a given demographic event in units of 4N0 generations). Considering N0 ~ 138 and a generation time of 2.5 years, the most recent change in population connectivity (T1) took place at 3.695 X 4N0 X generation time ~ 5,099 years. If applying the same logic to the remaining t values, the changes in population connectivity took place approximately 5.1; 13.7; 30.7; 42.7  Similarly to M. murinus, the ms command specifies that M. ravelobensis was structured in a nisland model of migration with a constant population size over the time and underwent five major changes in population connectivity. The demographic parameters were specified by the following

Text S7 -Impact of repeat regions in the demographic modelling
To evaluate the impact of repeat regions in our demographic inferences, we reran the Stairway Plot for the entire M. ravelobensis dataset (n = 55) and the PSMC analyses for the three wholegenome sequences without the repeat regions. None of the new results showed substantial deviations from previous results generated using the full dataset and included in the main text ( Figure 3a,b). The PSMC analyses without the repeat regions ( Figure S10a) exhibited a similar demographic trend for M. murinus as the analyses using the full dataset ( Figure 3a). The PSMC dynamics with and without the repeat regions for M. ravelobensis were also similar, except for the population bottleneck that occurred during the African Humid Period. This event was no longer detected when considering the dataset without repeat regions. Such differences are likely related to the substantial reduction on the number of informative sites after removal of the repeat regions (approximately 52%). Since the Stairway plot results without the repeat regions ( Figure S10b) revealed an identical curve to the one inferred using all the sites (Figure 3b), we conclude that the demographic inferences for our study species were not impacted by the inclusion of repeat regions. Figure S1. Principal Component Analyses plots based on genotype likelihoods in a M. murinus (N = 22) and b M. ravelobensis (N = 56) using the same datasets as for the clustering analyses. The axis labels show the variation explained by the first two principal components (PC1 and PC2). Results suggest weak genetic structure among the two sampling sites for both species. Although the PCA analyses revealed one genetically distinct M. murinus individual (M143), this animal was kept in our analyses, because no differences were observed in the clustering analyses ( Figure 2 and Figure S2), the inbreeding coefficient suggested no departure from Hardy-Weinberg Equilibrium (HWE ; Table S6), and the sample showed a good individual mean coverage depth (Table S2). In contrast, M. ravelobensis individual M73 was excluded from our subsequent analyses because it exhibited an F = -0.476 (Table S6), strongly deviating from HWE, which may impact downstream analyses. Animal illustrations copyright 2013 Stephen D. Nash / IUCN SSC Primate Specialist Group. Used with permission.  Figure S3. Number of clusters inferred by NGSadmix. a Likelihood results and delta K estimation suggest that the best K value for M. murinus is K = 3. b Likelihood results and delta K estimation suggest that the best K value for M. ravelobensis is K = 2. Delta K estimations followed the method of [12] (Ankomakoma). The demographic dynamics of M. ravelobensis using different minimum read depth options (-d1, -d3, -d6, -d9, -d12) resulted in similar demographic curves, suggesting that the sites considered in the PSMC analyses were of good quality. The analyses were performed considering 2.5 years as generation time and 1.2 × 10 -8 as mutation rate. The grey vertical bars identify three well-pronounced climatic events that took place in Africa: LIG (Last Interglacial), LGM (Last Glacial Maximum) and AHP (African Humid Period). Figure S10. Impact of the exclusion of repeat regions in the demographic modelling. a Inferred trajectories for M. murinus and M. ravelobensis after the exclusion of the repeat regions using the PSMC. The results are identical to the ones generated using the full dataset, except for the population bottleneck that occurred during the African Humid Period for the M. ravelobensis. These results suggest that larger datasets are essential to correctly infer recent demographic dynamics with this method. b Inferred trajectory for M. ravelobensis (n = 55) after the exclusion of the repeat regions using the Stairway Plot method. The results are identical to the ones generated using the full dataset. The grey vertical bars identify three well-pronounced climatic events that took place in Africa: LIG (Last Interglacial), LGM (Last Glacial Maximum) and AHP (African Humid Period).  Table S4. Summary of the demographic dynamics reconstructions for M. ravelobensis and M. murinus assuming population panmixia (PSMC and Stairway Plot) or population structure (IICRsimulations). The demographic dynamics that were in line with the available paleoenvironmental and the resulting demographic hypothesis are marked with a green check, while the dynamics that were not in concordance with the respective hypothesis are marked with a not applicable icon. The demographic dynamics of both mouse lemur species were better explained by a mix of population size and connectivity changes. These results suggest that population structure played an important role in the demographic dynamics of M. ravelobensis and M. murinus. NA = Hypothesis is not applicable to this study species. Table S5. Genomic diversity summary statistics for each M. murinus and M. ravelobensis study site using dataset 2. Rav = Ravelobe; Ank = Ankomakoma; n = Sample size (where M = number of males and F = number of females); % Poly = % Polymorphic sites; He = unbiased expected heterozygosity (estimated according to [6]); Ho = observed heterozygosity; FIS = population inbreeding coefficient (n.s.; p > 0.05).  Table S6. Individual inbreeding coefficients (F) estimated for the M. ravelobensis dataset (n = 56). F = 0 reflects random mating (HWE) and F = 1 the absence of heterozygous genotypes or a totally inbred sample [14]. One individual (M73) showed a highly negative F value, suggesting a deviation from HWE and it was therefore removed from our dataset.  Table S7. Individual inbreeding coefficients (F) estimated for the M. murinus dataset (n = 22). F = 0 reflects random mating (HWE) and F = 1 the absence of heterozygous genotypes or a totally inbred sample [14]. No significant deviations from HWE were found in our dataset and all individuals were kept for downstream analyses.  Table S8. List of individuals used for the different genomic analyses for each species and respective sampling locations. All individuals (without close relatives) were considered for the clustering (NGSadmix and PCA) and inbreeding analyses. The individual M73 from M. ravelobensis (Ravelobe) was not considered in the downstream analyses because it showed departure from HWE. Stairway Plot analyses were performed using the entire datasets and with a subset of the samples (n = 7) to control for differences in population size. Finally, the PSMC analyses were performed considering one M. murinus sampled in Ankomakoma and one M. ravelobensis sampled in each study site.  Table S9. Information regarding the number of raw reads obtained from the Illumina sequencing (# raw reads), the number of reads that pass the quality filters (# reads after filtering), the number of reads that were maintained after the read alignment against the reference genome (# reads after filtering), and the final number of reads that were kept for downstream analyses (# reads without PCR duplicates) for the 22 unrelated M. murinus samples (7 Ravelobe Table S10. Information regarding the number of raw reads obtained from the Illumina sequencing (# raw reads), the number of reads that pass the quality filters (# reads after filtering), the number of reads that were maintained after the read alignment against the reference genome (# reads after filtering), and the final number of reads that were kept for downstream analyses (# reads without PCR duplicates) for the 56 unrelated M. ravelobensis samples (31 Ravelobe and 25 Ankomakoma).