Material
We studied museum specimens of P. sulcatus, originating from a total of 71 populations from throughout the IWP, accumulated from several independent collecting trips of MG during the 1990th to early 2000 years, now stored in the Museum für Naturkunde, Berlin (ZMB, Germany), supplemented by material from several other museum collections, including ZMB, Universität Rostock (CWR, Germany), Senckenberg Naturmuseum, Frankfurt (SMF, Germany), Australian Museum, Sydney (AMS) and the Natural History Museum, London (NHMUK, United Kingdom). These samples were collected between 1921 and 2013, though most samples (42) were collected before the year 2000 (see Table S1 for a detailed list of all used samples and sample locations).
(a) Analysis of brood pouch content for developmental mode evaluation
We studied brood pouch contents of gravid females, to identify reproductive modes within P. sulcatus populations. As all previous studies on the matter [15,16,17,18,19,20] never identified a co-occurrence of reproductive modes within the same population, we interpret finding only larval stages within brood pouches of a given population to represent the veliger releasing mode, whereas we regard populations containing any juvenile stages within brood pouches as indication of direct development. However, this interpretation is not unambiguous as the brood pouch contents only provide a snapshot of the developmental stages present within a population at the time of sampling. Larval stages could still develop into juveniles within the brood pouch or be released at an earlier stage. To account for this ambiguity we combined results from singular populations with other populations from the same area, to infer the reproductive biology in a given region. As our samples were collected between 1921 and 2013 it is highly unlikely that for any given region all samples were collected during the same season, reducing the possibility of a seasonal sampling bias.
In total, 365 adult specimens were dissected using a Leica M125 stereo microscope (Leica Microsystems GmbH, Wetzlar, Germany). If present, ontogenetic stages were extracted and counted according to predefined size classes: early larva, late larva, juveniles < 0.5 mm, juveniles 0.5–1 mm, juveniles 1–1.5 mm; see Fig. 2 b–e. We here define larvae as individuals with only a protoconch and juveniles as individuals with a teleoconch. This transition is easily spotted in P. sulcatus as the transition from teleoconch to protoconch is marked by a deep sinusigeral notch [20]. Following Nation [28], larval and juvenile stages were dried for subsequent imaging with a LEO 1525 GEMINI scanning electron microscope (SEM, LEO Electron Microscopy Inc., Thornwood, NY, USA). Dried larval and juvenile stages were mounted on SEM object stubs (Agar Scientific Ltd., Stansted, UK) and coated with platinum using a Polaron SC7640 sputter coater (Quorum Technologies Ltd., Ashford, UK).
DNA extraction & amplification
DNA was extracted from foot muscle tissue using mollusc specific protocols [24, 25]. Wherever possible specimens with offspring of known developmental mode in their brood pouches were preferentially used. We performed extractions on 2–3 specimens per studied population, amounting to a total 163 genetically sampled specimens. We added sequences of a confamilial species to our dataset as outgroup: Supplanaxis abbreviata (Pease, 1865) (see Table S1 for a comprehensive list of used samples). Partial sequences of the mitochondrial cytochrome c oxidase subunit I (COI) gene, with primers LCO1490 (5′-GGT CAA CAA ATC ATA AAG ATA TTG G-3′ [29]) and HCO2198var (5′-TAW ACT TCT GGG TGK CCA AAR AAA T-3′ [30]), and 16S rRNA (16S) gene, with primers 16SF (5′-CCG CAC TAG TGA TAG CTA GTT TC-3′ [31]) and H3059var (5′-CCG GTY TGA ACT CAG ATC ATG T-3′ [31]) were amplified by polymerase chain reaction (PCR). Amplifications were performed in 20 μl volumes containing 2 μl 10x DreamTaq Green Buffer, 0.1 μl DreamTaq DNA polymerase (both Thermo Fisher Scientific, Waltham, MA, USA), 0.4 μl dNTP mix 10 mM each (VWR chemicals, VWR International GmbH, Darmstadt, Germany), 1 μl of each primer (Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany), 13.5 μl ddH2O and 2 μl DNA template under the following reaction conditions: initial denaturation at 94 °C for 3 min, 35 PCR cycles (94 °C for 30 s, 50 °C for 45 s, 72 °C for 1 min), final extension at 72 °C for 10 min. PCR products (10 μl) were cleaned enzymatically by adding 2 μl FastAP Thermosensitive Alkaline Phosphatase (1 U/μl) and 1 μl Exonuclease I (20 U/μl) (both Thermo Fisher Scientific) followed by an incubation step at 37 °C for 15 min and inactivation at 85 °C for 15 min. All amplified products were sequenced at Macrogen Europe (Amsterdam, The Netherlands). See Table S2 for GenBank accession numbers for all obtained sequences.
Phylogenetic analysis
DNA sequences were edited and assembled using GENEIOUS R 9.1.3 (Biomatters Ltd., Auckland, New Zealand). Primer sequences were removed resulting in COI sequences of ~ 658 bp and 16S sequences of ~ 810 bp. COI sequences were aligned using MUSCLE [32] as implemented in GENEIOUS under default settings. For the 16S alignment MAFFT 7 [33], using the Q-INS-I algorithm, the 1PAM/κ = 2 option for the scoring matrix for nucleotide sequences and otherwise default settings, was used. We used PartitionFinder 2.1.1 [34] to select the appropriate partitions and evolutionary models. Four partitions were assumed initially (1st, 2nd and 3rd codon positions of COI and 16S). The results of the PartitionFinder analysis using the Bayesian information criterion suggested a single partition and the HKY + G model, which was used for the subsequent Bayesian inference (BI) and maximum likelihood (ML).
We performed a BI using MrBayes version 3.2.6 [35] running Metropolis-coupled Monte Carlo Markov chain (MC3) searches with four chains in two separate runs for 50,000,000 generations with trees sampled every 1000 generations under default heating. Potential scale reduction factors close to 1 and estimated effective sample sizes above 200 from the MrBayes output were used as diagnostics to ensure that the MC3 searches had reached stationarity and convergence. The first 5,000,000 generations of each run were discarded as burn-in. We performed heuristic ML analyses in GARLI 2.0 [36] using the best-fit model as suggested by PartitionFinder. Support values were computed by bootstrapping with 1000 replications. Using PAUP* 4.0b10 [37], we conducted heuristic maximum parsimony (MP) searches with unordered characters, 100 random sequence addition replicates, the tree bisection and re-connection (TBR) branch-swapping, and gaps treated as missing data. Internal branch support was assessed in PAUP* by bootstrapping with 1000 replications, using full heuristic searches with 10 random addition sequence replicates, TBR branch swapping, and one tree held at each step during stepwise addition. Posterior probabilities from the BI analysis and bootstrap support (BS) values from the ML and MP analyses were mapped onto the BI 50% majority-rule consensus tree with SumTrees version 3.3.1 (part of the DendroPy 3.8.0 package [38]). BS ≥ 70% from the ML and MP analyses and posterior probabilities (PP) ≥ 0.95 were interpreted as positive support for a node.
Molecular species delimitation models
To test for the presence of potential cryptic lineages we used the Automated Barcode Gap Discovery (ABGD method) [39], via its online application (http://wwwabi.snv.jussieu.fr/public/abgd/abgdweb.html). For the COI gene we used the Kimura (K80) TS/TV 2.0 distance (K2P). We entered previously established [22] minimum (Pmin = 0.0031) and maximum (Pmax = 0.0411) COI K2P values for marine caenogastropod snails (Littorinidae) and otherwise default settings.
We also used the general mixed Yule-coalescent (GMYC) approach in its Bayesian implementation (bGMYC) [40] for DNA sequence-based species delimitation. We constructed ultrametric trees based on COI data with BEAST 2.4.1 [41]. The chain was run for 11,000,000 generations, with a sample frequency of 1000. The first 1000,000 of the generations were discarded as burn-in. The GTR + I + G model was applied; otherwise default settings were used. Tracer 1.7.1 [42] was used to check that all effective sample sizes were above 200. GMYC and bGMYC analyses were conducted with the Split [43] and bGMYC R packages [44], respectively. The single-threshold as well as the multiple-threshold analyses were both conducted using the maximum clade credibility tree from the BEAST analysis constructed with TreeAnnotator 2.1.2 (part of the BEAST software suite) setting the posterior probability limit to 0. The bGMYC analysis was based on 100 trees drawn equidistantly from the post burn-in generations obtained from the BEAST analysis. For each of the 100 trees, the Markov-chain Monte Carlo sampler was run for 100,000 generations, discarding the first 90,000 generations as burn-in and sampling every 100 generations.
Ancestral range estimation
An ancestral range estimation was conducted based on 1000 randomly selected post-burn-in trees from the BI analysis of the dataset accounting for statistical uncertainty and the 50% majority-rule consensus tree from the BI analysis using the statistical dispersal-vicariance analysis (S-Diva) method [45] implemented in RASP 4.0 beta [46]. We constructed a matrix in which each individual was assigned to one of five geographic regions (roughly following [14]): A) Southwest Africa, Madagascar and Mauritius; B) the Red Sea; C) the Arabian Sea (including the Gulf of Yemen); D) the Indo-West Pacific (including Indonesia, Malaysia and Thailand) and; E) Australia and New Caledonia. For our analysis, we allowed transitions from B only to A and C, as well as from E only to D. All other areas were assumed as directly connected. The analysis was run allowing a maximum of three areas at each node and otherwise default settings.
Distribution maps and figure assembly
The species’ distribution was reconstructed on a dot by dot basis based on ethanol stored specimens from the aforementioned collections, with maps based on the open access Natural Earth map (free vector and raster map data @ naturalearthdata.com). Figures and maps were assembled using Adobe Photoshop CS2 version 9.0 for Windows and Adobe Illustrator CS2 version 12.0 for Windows (both Adobe Systems, San Jose, CA, USA).