Patterns of variation of mutation rates of mitochondrial and nuclear genes of gastropods

Background Although mitochondrial DNA (mtDNA) of many animals tends to mutate at higher rates than nuclear DNA (nuDNA), a recent survey of mutation rates of various animal groups found that the gastropod family Bradybaenidae (suborder Helicina) shows a nearly 40-fold difference in mutation rates of mtDNA (μm) and nuDNA (μn), while other gastropod taxa exhibit only two to five-fold differences. To determine if Bradybaenidae represents an outlier within Gastropoda, I compared estimated values of μm/μn of additional gastropod groups. 
In particular, I reconstructed mtDNA and nuDNA gene trees of 121 datasets that include members of various clades contained within the gastropod subclasses Caenogastropoda, Heterobranchia, Patellogastropoda, and Vetigastropoda and then used total branch length estimates of these gene trees to infer μm/μn. Results Estimated values of μm/μn range from 1.4 to 91.9. 
Datasets that exhibit relatively large values of μm/μn (i.e., > 20), however, show relatively lower estimates of μn (and not elevated μm) in comparison to groups with lower values. These datasets also tend to contain sequences of recently diverged species. 
In addition, datasets with low levels of phylogenetic breadth (i.e., contain members of single genera or families) exhibit higher values of μm/μn than those with high levels (i.e., those that contain representatives of single superfamilies or higher taxonomic ranks). Conclusions Gastropods exhibit considerable variation in estimates of μm/μn. 
Large values of μm/μn that have been calculated for Bradybaenidae and other gastropod taxa may be overestimated due to possible sampling artifacts or processes that depress estimates of total molecular divergence of nuDNA in groups that recently diversified.


Background
Information concerning patterns of variation of mutation rates of nuclear and organellar genomes is important for understanding the factors that influence these rates and how they impact interactions among genomes [1][2][3][4]. For most animal taxa, mutation rates (µ m) of mitochondrial DNA sequences (mtDNA) are higher than mutation rates (µ n) of nuclear DNA sequences (nuDNA) [5][6][7]. Nonetheless, invertebrates tend to show smaller differences between µ m and µ n (~ 2 to 10-fold difference) than vertebrates (~ 10 to 25-fold difference) [3,4,7,8,9,10,11]. A recent comparison of estimates of the ratio µ m /µ n of 122 animal taxa (one sponge, 78 vertebrates, 33 arthropods, and 10 molluscs) found that one mollusc group, members of the gastropod family Bradybaenidae (subclass Heterobranchia, order Stylommatophora, suborder Helicina), shows a nearly 40-fold difference in µ m and µ n, whereas other molluscs, including representatives of six other gastropod groups, exhibited only two to five-fold differences [4]. Is Bradybaenidae an exception within Mollusca or do other gastropods (e.g., other members of the species-rich suborder Helicina) have exceptionally large values of µ m /µ n? Furthermore, what potential factors contribute to the large values of µ m /µ n that Bradybaenidae and possibly other gastropod groups exhibit?
To address the above questions, I estimated and compared µ m /µ n of additional gastropod groups, including members of four of the six subclasses of Gastropoda (Caenogastropoda, Heterobranchia, Patellogastropoda, and Vetigastropoda) and additional groups from the suborder Helicina. I aimed to determine if Bradybaenidae is an outlier within Gastropoda or if other gastropod taxa also exhibit such large values of µ m /µ n. I also evaluated whether variation in µ m /µ n among gastropod taxa reflects relative differences in µ m or µ n among groups exhibiting different values of µ m /µ n. In addition, I sought to determine if large values of µ m /µ n might be overestimated due to possible inclusion of recently diverged species by comparing ratios of µ m /µ n from datasets that include different levels of taxonomic breadth (i.e., those that include members of genera, families, superfamilies, or higher taxonomic categories). I largely followed the approach described in [4] of gathering published mtDNA and nuDNA gene sequences from corresponding sets of individuals/species, reconstructing gene trees, and then inferring the relationship between µ m and µ n by calculating the ratio of the total branch lengths of mtDNA and nuDNA gene trees. I estimated total branch lengths at third codon positions for coding regions (as implemented in [4]) and all positions of intron regions as a proxy for neutral divergence.
I identified 118 sets of PopSets from the same author(s) that included both mtDNA and nuDNA data of protein-coding regions or introns of at least three species. All but four of the mtDNA sequence datasets included COI sequences; the four exceptions contained cytochrome b (cytb) sequences. All but six of the nuDNA sequence datasets included coding regions of histone H3 (H3); the six exceptions contained coding regions of actin, adenine nucleotide translocase (ANT), histone H4 (H4; two datasets), and a megalin-like lipoprotein (mlp) gene as well as sequences of an intron of a gamma glutamyl carboxylase gene.
Five sets of sequences included data from two mtDNA genes and one nuDNA gene (N = 2; COI and cytb in both cases) or from two nuDNA genes and one mtDNA gene (N = 3; H3 and H4 in two cases and H3 and ANT in the other). I also included the two sets of mtDNA and nuDNA data that were previously examined in [4] but not originally uploaded to NCBI as PopSets (i.e., COI and H3 sequences of Bradybaenidae and Aglajidae from [12, 13]). I divided mtDNA and nuDNA sequences of one set of PopSets into three sets of individual alignments that each contained sequences of single superfamilies of the subclass Vetigastropoda (i.e., Fissurelloidea and Lepetelloidea from the order Lepetellida, and Trochoidea from the order Trochida); this was performed to enable comparison of µ m /µ n from these lower taxonomic categories. I also combined two sets of PopSets from the same author(s) given that the different PopSets included sequences from members of the same family. Hence, in total I examined 121 sets of mtDNA and nuDNA data. The datasets contain sequences of members of four gastropod subclasses, including Caenogastropoda (N = 31), Heterobranchia (N = 81), Patellogastropoda (N = 1), and Vetigastropoda (N = 8), as well as a considerable breadth of the superfamilies (N = 34) contained within these clades (Additional file 1: Table S1). While most datasets contained representatives of single genera (N = 54), others included members of single families (N = 34), superfamilies (N = 18), or higher level ranks (N = 15).

Sequence analyses
I reconstructed mtDNA and nuDNA gene trees using maximum likelihood approaches and then calculated total branch lengths (TBL) of these trees at third positions of codons of coding regions or all positions of intron sequences with MEGAX v.10.1.8 [14]. I then calculated the ratio of these values (i.e., µ m /µ n ) for PopSet pairs that included the same species (Additional file 1: Table S1).
Estimates of µ m /µ n of individual datasets ranged from a minimum of 1.4 (Vetigastropoda; Lepetellida; Fissurelloidea) to a maximum of 91.9 (Heterobranchia; Tectipleura; Helicina; Clausilioidea; Clausiliidae), with most taxa showing mean ratios that are less than 20 and median ratios that are less than ten (Fig. 1, Additional file 1: Table S1). Gene trees of nuDNA sequences of two PopSets exhibited a TBL of zero (PopSet UID 1735180796, genus Viviparus, family Viviparidae, order Architaenioglossa; PopSet UID 1125137272, genus Acroloxus, family Acroloxidae, superorder Hygrophila); these two datasets were excluded from analyses that use ratios of µ m /µ n or log-transformed values of TBL of nuDNA, given that these values are undefined when TBL is zero.
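The ratio calculation and the exclusion of datasets with a nuDNA TBL of zero can be illustrated with a minimal Python sketch. The function names are hypothetical, and averaging over all mtDNA by nuDNA gene pairs is an assumption about how datasets with more than one gene region might be handled; the source states only that average values were used for such datasets.

```python
from statistics import mean
from typing import Optional, Sequence


def tbl_ratio(tbl_mt: float, tbl_nu: float) -> Optional[float]:
    """Return TBL(mtDNA) / TBL(nuDNA), or None when the nuDNA TBL is zero."""
    if tbl_nu == 0:
        return None  # ratio is undefined, so the dataset would be excluded
    return tbl_mt / tbl_nu


def dataset_ratio(mt_tbls: Sequence[float], nu_tbls: Sequence[float]) -> Optional[float]:
    """Average the ratio over every mtDNA x nuDNA gene pair (an assumption)."""
    ratios = [tbl_ratio(m, n) for m in mt_tbls for n in nu_tbls]
    if any(r is None for r in ratios):
        return None
    return mean(ratios)
```

Returning `None` rather than infinity keeps undefined ratios out of any later log transformation.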
Results from an ANOVA that compared µ m /µ n values of datasets from members of higher level taxonomic groups (i.e., superorders, orders, and suborders) revealed significant differences in µ m /µ n among these groups (P = 0.00137). Based on results from a Tukey test, mean values of µ m /µ n are significantly different between Helicina (20.1) and Sacoglossa (3.3) and between Cerithioidea (18.6) and Sacoglossa (Fig. 1). In addition, several groups include datasets that exhibit values of µ m /µ n that represent outliers in boxplots (Fig. 1). These include datasets from Architaenioglossa, Littorinimorpha, and Neogastropoda in Caenogastropoda; Nudibranchia, Helicina, and Sacoglossa in Heterobranchia; and Lepetellida in Vetigastropoda (Fig. 1, Table 2).
The number of species (N) included in mtDNA and nuDNA gene trees and TBL of these trees exhibit strong positive associations (Fig. 2). Relationships between N and TBL of mtDNA gene trees are not significantly different for datasets with relatively low values of µ m /µ n (i.e., < 20) and those with relatively high values (i.e., > 20) (P = 0.643) (Fig. 2a, b). On the other hand, relationships between N and TBL are significantly different for nuDNA data from low ratio and high ratio datasets (P < 2.2 × 10^-16), with TBL of high ratio datasets exhibiting a much lower rate of increase with increasing N in comparison to low ratio datasets (Fig. 2c, d).
Datasets that include representatives of genera and families exhibit significantly different values of µ m /µ n based on ANOVA and Tukey tests (P = 0.000055). In particular, while the mean µ m /µ n of datasets that contained members of genera and families were 17.9 (SD = 19.3, N = 52) and 12.7 (SD = 11.4, N = 34), respectively, mean values of µ m /µ n of datasets that included members of superfamilies (5.5, SD = 4.2, N = 19) and higher taxonomic categories (4.2, SD = 4.2, N = 15) were lower (Fig. 3). Also, although all four categories exhibit outliers in boxplots, only outliers from genera and families exhibited values of µ m /µ n that were greater than 20 (Fig. 3).

Discussion
A previous survey that examined relationships of µ of mtDNA and nuDNA of 122 animal taxa found that the gastropod family Bradybaenidae (subclass Heterobranchia, order Stylommatophora, suborder Helicina) exhibits a nearly 40-fold difference in µ m and µ n [4]. Bradybaenidae appeared to be an outlier among molluscs and most other invertebrates, which exhibit much smaller differences between µ m and µ n (i.e., generally less than ten-fold differences). Results from my analysis of 121 mtDNA and nuDNA datasets of various gastropod clades show that Bradybaenidae is not an outlier within Gastropoda. Indeed, groups of species from most of the subclasses examined exhibited values of µ m /µ n that are relatively large (i.e., > 20). Moreover, several groups have values of µ m /µ n that exceed the value previously estimated for Bradybaenidae (Fig. 1). Based on analysis of patterns of divergence of mtDNA and nuDNA, differences in µ m /µ n among gastropod taxa appear to reflect differences in µ n and not µ m (Fig. 2). Nonetheless, given that datasets with exceptionally large values of µ m /µ n include low levels of taxonomic breadth, the relatively low values of µ n that were estimated for these datasets may arise because they contain recently diverged taxa that exhibit very few if any fixed differences at nuDNA.
Datasets from 11 groups are identified as outliers given that they exhibit values of µ m /µ n that are considerably larger (N = 10) or smaller (N = 1) than values from related groups (Fig. 1, Table 2). The species contained in the ten datasets that have relatively large values of µ m /µ n exclusively represent members of single genera that have radiated recently [15][16][17][18][19][20][21][22][23]. For example, the Conus dataset that represents an outlier (µ m /µ n = 56.7) includes 44 members of the Cape Verde species flock, a group of species that may have radiated explosively during the past few million years [15,24,25]. An additional dataset that contains members of the superfamily Conoidea, including six Conus species (but not any from the Cape Verde species flock), represented an outlier because its value of µ m /µ n (5.9) was less than those of related taxa (Fig. 1). Excluding all but these six Conus species yields a value of 9.0 for µ m /µ n. Although this value is larger than the value exhibited by members of the entire superfamily, it is still much less than µ m /µ n of the Cape Verde Conus dataset. Moreover, another dataset that includes sequences of three other Conus species (and none from the Cape Verde species flock) also has a relatively small value of µ m /µ n (11.2). Hence, while some Conus species exhibit a relatively modest value of µ m /µ n, Cape Verde species that underwent a recent radiation show an exceptionally large one.
Datasets with relatively large values of µ m /µ n show smaller values of µ n (i.e., TBL of nuDNA gene trees) relative to the number of species included in the tree compared to datasets with relatively small values of µ m /µ n (Fig. 2). These results suggest that differences in µ m /µ n among groups reflect differences in µ n. Given that (i) all of the datasets that exhibit depressed values of µ n include low levels of taxonomic breadth (i.e., only include members of single genera or families) (Fig. 2) and (ii) datasets with little breadth exhibit larger values of µ m /µ n than those with high levels of taxonomic breadth (i.e., include members of superfamilies and higher taxonomic categories) (Fig. 3), µ n appears to be associated with the taxonomic breadth of species included in datasets. Recently diverged taxa may be more likely to exhibit elevated values of µ m /µ n because they show only very few if any fixed differences at nuDNA. Otherwise, some other processes or sampling artifacts may be responsible for depressing estimates of µ n in recently radiated taxa. The Bradybaenidae data contain sequences of many recently diverged species of several genera that do not show reciprocal monophyly in molecular phylogenies [12]. Furthermore, the largest value of µ m /µ n that was reported in [4] (79.2) is for a group of recently diverged amphibian species (genus Bufo) [26]. While it was hypothesized that the extreme value estimated for this group may be due to sampling error related to the small sample size of the datasets examined [4], the value instead may have been overestimated owing to the recent divergence of the species included in the dataset.
Table 2 Datasets that exhibit outlier values of µ m /µ n (Fig. 1). The lowest taxon that encompasses the majority of the species included in PopSets is indicated; families of genera are also indicated. ANT: adenine nucleotide translocase; cytb: cytochrome b; COI: cytochrome oxidase subunit I; mlp: megalin-like lipoprotein. a: average µ m /µ n based on two nuDNA gene regions.
Fig. 2 Relationships between number of species and total branch lengths of gene trees. a mtDNA datasets with µ m /µ n < 20, R^2 = 0.934, P < 2.2 × 10^-16; b mtDNA datasets with µ m /µ n > 20, R^2 = 0.778, P < 7.8 × 10^-8; c nuDNA datasets with µ m /µ n < 20, R^2 = 0.705, P < 2.2 × 10^-16; d nuDNA datasets with µ m /µ n > 20, R^2 = 0.652, P < 5.8 × 10^-6
Duda Jr. BMC Ecol Evo (2021) 21:13
Although most of the datasets examined included coding regions of COI for the mitochondrial gene and H3 for the nuclear gene, four of the outlier datasets included sequences of coding regions of the mitochondrial gene cytb and sequences of coding regions of three additional nuclear genes (a megalin-like lipoprotein gene, an adenine nucleotide translocase gene, and H4). Although it will be important to perform broader surveys of genes and gene regions (e.g., introns and intergenic regions) to further validate this pattern, it is not limited to the same mitochondrial and nuclear gene pairs and hence appears to be reasonably robust to gene sampling.

Conclusions
Members of Gastropoda appear to show considerable variation in µ m and µ n , but overall tend to exhibit lower values of µ m /µ n than vertebrates [4]. Nonetheless, some of the values reported herein may reflect overestimates of µ m /µ n due to the inclusion of species that show low levels of total molecular divergence at nuDNA possibly due to their recent divergence. Although comparing TBL of mtDNA and nuDNA gene trees is an effective means for determining relationships among µ m and µ n , the approach may give overestimates of µ m /µ n when datasets include a number of recently diverged species. Nonetheless, for groups in which fossil calibrations are not available, estimating µ m /µ n could be useful for identifying clades that have radiated recently.

Datasets
To estimate µ m /µ n, I gathered mtDNA and nuDNA sequence data that were uploaded to GenBank as 'PopSets' or collections of sequence data (as opposed to individual sequence submissions). I utilized this strategy in an effort to ensure that different authors' views on the identity of species did not affect estimates of µ m /µ n. I searched the NCBI PopSet database (https://www.ncbi.nlm.nih.gov/popset) using the term "Gastropoda [Organism]" in the search field (accessed on 18-May-2020). I downloaded search results as an XML file and parsed the data to extract various information such as PopSet title, author(s), and unique identifier; publication info; taxa represented in the PopSet; and gene name, gene source (i.e., mtDNA or nuDNA), and number of sequences. I then sorted the resultant data by gene source and PopSet author(s) to identify PopSets from the same author(s) that included both mtDNA and nuDNA sequences from at least three gastropod species. I selected PopSets that exclusively included intron or coding regions (but not both) and for which sequence data were available for both mtDNA and nuDNA. The final list of prospective PopSets included all but two of the gastropod datasets that were examined in [4]; these latter datasets included sequences of cytochrome oxidase subunit I (COI), a mtDNA gene, and histone H3 (H3), a nuDNA gene of Bradybaenidae [12] and Aglajidae [13], a member of the order Cephalaspidea.
I downloaded fasta files of PopSets or individual sequences (i.e., for the two datasets that were examined in [4] but not uploaded as PopSets) from 'PopSet' or 'Nucleotide' databases at NCBI (https://www.ncbi.nlm.nih.gov/). I aligned each set of sequences using MUSCLE [27] in Seqotron v1.0.1 [28]. I evaluated sequence datasets by eye in Seqotron to ensure that alignments were robust. This included adjustments of out-of-frame insertions so that they occurred in the proper reading frame and elimination of ends of sequences that appeared to be misaligned (possibly due to base call errors) because they contained insertions that affected reading frames (i.e., did not occur in multiples of three). I then compared species and individuals present in the corresponding alignments of mtDNA and nuDNA sequence data and removed species and individuals from one alignment if they were not present in the other. I also eliminated all but one representative of each species in alignments and retained sequences of individuals that were represented in both datasets and/or that were the most complete; in cases where sequences from more than one individual satisfied these criteria, I retained the individual that was listed first.
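The reconciliation of the two alignments can be sketched in Python as follows. This is a simplified illustration rather than the procedure actually used: it assumes each record is a hypothetical (species, sequence) pair, measures completeness as the count of unambiguous bases, and does not track whether the same individual is represented in both alignments.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (species name, aligned sequence); hypothetical layout


def completeness(seq: str) -> int:
    """Count unambiguous bases; gaps, N and ? do not contribute."""
    return sum(1 for base in seq.upper() if base in "ACGT")


def one_per_species(records: List[Record]) -> Dict[str, str]:
    """Keep the most complete sequence per species, first listed on ties."""
    best: Dict[str, str] = {}
    for species, seq in records:
        # Strict '>' means an equally complete later record never replaces
        # the record that was listed first.
        if species not in best or completeness(seq) > completeness(best[species]):
            best[species] = seq
    return best


def reconcile(mt: List[Record], nu: List[Record]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Restrict both alignments to the species present in each of them."""
    mt_best, nu_best = one_per_species(mt), one_per_species(nu)
    shared = sorted(set(mt_best) & set(nu_best))
    return ({s: mt_best[s] for s in shared}, {s: nu_best[s] for s in shared})
```

After this step both gene trees would be built from the same set of species, which is what allows their total branch lengths to be compared.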
I utilized taxonomy information presented in the PopSet or GenBank files to specify the genera, families, superfamilies and higher level taxonomic categories of species included in datasets. I then reconciled this information with the hierarchical classification of gastropods presented in MolluscaBase [29].

Sequence analyses
Total branch lengths (TBL) of mtDNA and nuDNA gene trees at putative neutral sites can be used to estimate relative differences in µ m and µ n based on calculation of the ratio µ m /µ n [4]. As described in [4], total molecular divergence (i.e., TBL of gene trees) at neutral sites is a function of neutral mutation rates and divergence times [30,31]. Given that divergence times of species represented in mtDNA and nuDNA gene trees should be the same, the ratio of the TBL of these trees (at neutral sites) provides an estimate of µ m /µ n [4]. I used estimates of divergence (i.e., TBL) at third codon positions for coding regions (as implemented in [4]) and all positions of intron regions as a proxy for neutral divergence.
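The argument can be written out as a short derivation. As a sketch, assume that branch i of the shared species tree spans a time t_i, that expected neutral divergence along a branch is the product of the neutral mutation rate and t_i, and that the mtDNA and nuDNA gene trees share the same branches; the symbols t_i, TBL_m and TBL_n are introduced here for illustration only.

```latex
\begin{align}
\mathrm{TBL}_m &\approx \sum_i \mu_m t_i = \mu_m \sum_i t_i, \\
\mathrm{TBL}_n &\approx \sum_i \mu_n t_i = \mu_n \sum_i t_i, \\
\frac{\mathrm{TBL}_m}{\mathrm{TBL}_n} &\approx \frac{\mu_m \sum_i t_i}{\mu_n \sum_i t_i} = \frac{\mu_m}{\mu_n}.
\end{align}
```

The shared term cancels only if both trees cover the same divergence times, and the ratio is undefined when the nuDNA TBL is zero.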
I used MEGAX v.10.1.8 [14] to construct gene trees and estimate branch lengths of individual datasets. I constructed individual phylograms for each locus to limit the effect of having discordant gene trees that could result in overestimates of TBL due to incomplete lineage sorting (see [4]). I specified the genetic code and examined alignments to set the appropriate codon start position. I reconstructed gene trees using the General Time Reversible model with maximum likelihood. I eliminated sites that were not defined in 80% of sequences; otherwise, all other positions were utilized in tree building. I examined gene trees in MEGA to ensure that the phylogenies did not contain any long branches that could be due to any alignment errors. I then used maximum likelihood to estimate the TBL of gene trees at third positions of codons of coding regions or all positions of intron sequences.
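The site selection described above (third codon positions, and removal of sites that are not defined in 80% of sequences) can be illustrated with a small Python sketch. It is not a reimplementation of MEGA: the function names are hypothetical, "defined" is assumed to mean an unambiguous A, C, G or T, and the cutoff is assumed to be inclusive.

```python
from typing import List


def third_codon_positions(alignment: List[str], codon_start: int = 0) -> List[str]:
    """Keep every third column; codon_start is the 0-based index of the
    first base of the first complete codon."""
    return [seq[codon_start + 2::3] for seq in alignment]


def filter_by_coverage(alignment: List[str], min_coverage: float = 0.8) -> List[str]:
    """Drop columns in which fewer than min_coverage of sequences are defined.

    All sequences are assumed to have the same aligned length.
    """
    if not alignment:
        return []
    n_seqs = len(alignment)
    keep = []
    for col in range(len(alignment[0])):
        defined = sum(1 for seq in alignment if seq[col].upper() in "ACGT")
        if defined / n_seqs >= min_coverage:
            keep.append(col)
    return ["".join(seq[c] for c in keep) for seq in alignment]
```

For coding regions the two functions would be applied in sequence; for introns only the coverage filter would apply, since all positions are used.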
I performed all statistical tests in R [32]. I compared values of µ m /µ n for sequence datasets that included species from the following higher level taxonomic groups: the orders Architaenioglossa, Littorinimorpha, and Neogastropoda, and the superfamilies Abyssochrysoidea and Cerithioidea from the subclass Caenogastropoda; the orders Nudibranchia, Pleurobranchida, Aplysiida, Cephalaspidea, Ellobiida, and Runcinida, suborders Achatinina and Helicina, and superorder Hygrophila of the subclass Heterobranchia; the subclass Patellogastropoda; and the orders Lepetellida and Trochida from the subclass Vetigastropoda. I used average values of µ m /µ n for datasets that included more than one mtDNA or nuDNA gene region. I compared µ m /µ n (using log-transformed values) among these groups with ANOVA and used a Tukey test to identify groups with significant differences in µ m /µ n . I utilized boxplots to visualize patterns of variation of µ m /µ n among and within gastropod taxa and identify outlier datasets.
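The analyses themselves were run in R; the following Python sketch only illustrates the core of a one-way ANOVA on log-transformed ratios. It computes the F statistic and degrees of freedom, and omits the P value and the Tukey test. The function name and the use of the natural logarithm are assumptions; the F statistic does not depend on the base of the logarithm.

```python
import math
from typing import Dict, List, Tuple


def one_way_anova_f(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for log-transformed group values."""
    logged = {name: [math.log(v) for v in vals] for name, vals in groups.items()}
    all_vals = [v for vals in logged.values() for v in vals]
    grand_mean = sum(all_vals) / len(all_vals)

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of values around their own group mean
    for vals in logged.values():
        group_mean = sum(vals) / len(vals)
        ss_between += len(vals) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in vals)

    df_between = len(logged) - 1
    df_within = len(all_vals) - len(logged)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

Ratios must be positive for the log transformation, which is why datasets with a nuDNA TBL of zero cannot enter this comparison.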
To evaluate whether differences in µ m /µ n ratios reflect relative increases in µ m or decreases in µ n , I compared TBL of mtDNA and nuDNA gene trees to the number of species included in these trees. Measures of TBL should increase proportionally to the number of species examined [33]. I specifically compared TBL of mtDNA and nuDNA gene trees among datasets that exhibited different relative values of µ m /µ n (i.e., less than and greater than 20) and determined levels of significance with an ANOVA based on comparison of log-transformed values of TBL that were standardized to the number of species included in the tree.
If values of µ m /µ n are overestimated because they include recently diverged species, µ m /µ n values that are calculated from datasets that include little taxonomic breadth (e.g., those including representatives of genera and families) will be greater than values from datasets that include more taxonomic breadth (e.g., those including members of superfamilies, suborders, etc.). To determine if this is the case, I compared estimates of µ m /µ n among datasets that include different levels of taxonomic breadth. While some PopSets only included members of single genera, others included various members of families, superfamilies, suborders, orders, and subclasses. I utilized an ANOVA to compare log-transformed values of µ m /µ n among datasets representing genera, families, superfamilies and combined higher taxonomic categories; I used a Tukey test to determine which samples exhibit significantly different values. I also utilized boxplots to visualize patterns of variation among datasets that included different levels of taxonomic breadth.