
Post-glacial phylogeography and evolution of a wide-ranging highly-exploited keystone forest tree, eastern white pine (Pinus strobus) in North America: single refugium, multiple routes



Knowledge of the historical distribution and postglacial phylogeography and evolution of a species is important to better understand its current distribution and population structure and potential fate in the future, especially under climate change conditions, and conservation of its genetic resources. We have addressed this issue in a wide-ranging and heavily exploited keystone forest tree species of eastern North America, eastern white pine (Pinus strobus). We examined the range-wide population genetic structure, tested various hypothetical population history and evolutionary scenarios and inferred the location of glacial refugium and post-glacial recolonization routes. Our hypothesis was that eastern white pine survived in a single glacial refugium and expanded through multiple post-glacial recolonization routes.


We studied the range-wide genetic diversity and population structure of 33 eastern white pine populations using 12 nuclear and 3 chloroplast microsatellite DNA markers. We used an Approximate Bayesian Computation approach to test various evolutionary scenarios. We observed high levels of genetic diversity, and significant genetic differentiation (FST = 0.104) and population structure among eastern white pine populations across its range. A south to north trend of declining genetic diversity existed, consistent with repeated founder effects during post-glacial migration northwards. We observed broad consensus from nuclear and chloroplast genetic markers supporting the presence of two main post-glacial recolonization routes that originated from a single southern refugium in the mid-Atlantic plain. One route gave rise to populations at the western margin of the species’ range in Minnesota and western Ontario. The second route gave rise to central-eastern populations, which branched into two subgroups: central and eastern. We observed minimal sharing of chloroplast haplotypes between recolonization routes but there was evidence of admixture between the western and west-central populations.


Our study reveals a single southern refugium, two recolonization routes and three genetically distinguishable lineages in eastern white pine that we suggest be treated as separate Evolutionarily Significant Units. Like many wide-ranging North American species, eastern white pine retains the genetic signatures of post-glacial recolonization and evolution, and its contemporary population genetic structure reflects not just the modern distribution and effects of heavy exploitation but also routes northward from its glacial refugium.


Ecological changes and anthropogenic activities over the past 12,000 years have had a profound effect on the distribution of plant populations, including that of forest trees, in eastern North America [1]. Following the last glacial maximum (LGM, 26,500-19,000 ybp), a time when plant species were forced to inhabit unglaciated areas, climate oscillations and topographic and hydrological barriers have influenced the location of suitable habitats and the migration of plant populations [2]. Superimposed on phylogeographic patterns triggered by postglacial climatic changes is the impact of recent human disturbance and habitat change [3]. Therefore, knowledge of the historical distribution and postglacial phylogeography, evolution and expansion of a species is important for understanding its current distribution and population structure and the historical processes that shaped them, for predicting its potential fate, especially under climate change, and for conserving and managing its genetic resources. However, these aspects are not well understood for eastern North American plant species, especially forest trees.

In eastern North America (unlike Europe, with its east–west mountain ranges), the absence of major geographic barriers to northward dispersal can lead to a presumption that northward post-glacial recolonization simply proceeded uniformly across the longitudinal range of the recolonizing species. Thus the only phylogeographic patterns likely to be introduced during post-glacial recolonization would be south-to-north declines in genetic diversity introduced by repeated founder events [4]. However, applications of molecular genetic markers have shown that this view is far too simple, and that instead the landscape exposed by glacial retreat likely directed recolonization in ways that also introduced longitudinal structure in modern populations [5]. For tree species with ranges that extend from the Atlantic Ocean to the Midwest plains, two geographical features may have disrupted uniform post-glacial northward migrations: the Appalachian Mountains and the Great Lakes. In a few eastern and transcontinental North American conifer tree species, whose post-glacial phylogeography has recently been studied, refugia and post-glacial migration routes were inferred to be separated by the Appalachian Mountains. Three refugia (Beringian, Mississippian and east Appalachian) have been reported for transcontinental white spruce (Picea glauca) [6, 7], three southern and one northern refugia for transcontinental black spruce (Picea mariana) [8], and three (one east and one west of Appalachians and one in northern Canada) for widely-distributed almost-transcontinental jack pine (Pinus banksiana) [9]. For Atlantic white cypress (Chamaecyparis thyoides), Mylecraine et al. [10] argued that refugia and recolonization routes were separated by the Appalachian Mountains.

The objective of this study was to examine the range-wide population genetic structure and phylogeography, and infer post-glacial migration and evolution of eastern white pine (Pinus strobus) in North America in order to understand how historical processes have shaped the species’ current distribution and population genetic structure. We chose eastern white pine because (1) it is a widely distributed, ecologically and economically important keystone species in eastern North America, (2) it is predicted to expand its range northward under the anticipated climate change, (3) there is no information on the range-wide population genetic structure and phylogeography of this species, and its post-glacial migration and evolution are not well understood, and (4) it is of conservation genetic concern because it has been heavily exploited for over 150 years [11]. Fossil-based studies of the historical distribution of eastern white pine, using radiocarbon dating, provided some indication that the species inhabited a glacial refugium in the mid-Atlantic plain of the eastern North American seaboard (Virginia and North Carolina) [12]. After the LGM, northward expansion continued to within 320 km of James Bay, northern Ontario, in part due to warmer-than-modern climates approximately 6000 years ago [13]. In Minnesota, contraction to the current range is estimated to have occurred roughly 1000 years ago [14]. However, due to many inherent limitations, fossil pollen data alone do not provide a clear and detailed picture of the phylogeography and post-glacial migration and evolution of a species. Therefore, it is essential to examine a species’ phylogeography using molecular genetic markers.

Here we have examined the range-wide population genetic structure and phylogeography of eastern white pine, using microsatellite DNA markers from the nuclear and chloroplast genomes, and inferred its post-glacial migration and evolution using these genetic data supplemented with previously published fossil pollen information. We tested the following hypotheses: (i) if the current eastern white pine arose from a southern refugium, there should be a geographical trend of south to north decreasing genetic diversity as a result of repeated founder effects; (ii) current eastern white pine populations descended from a single glacial refugium during the LGM through multiple post-glacial recolonization routes; and (iii) the Appalachian Mountains and Great Lakes separated the post-glacial colonization routes, resulting in longitudinal genetic differentiation of eastern white pine in the north. We take advantage of the fact that the chloroplast DNA is paternally inherited and thus pollen dispersed in conifers, e.g., [15], and the nuclear DNA is biparentally inherited and thus both pollen and seed dispersed. A combination of the nuclear and chloroplast microsatellite markers, as applied in [16], provides stronger inferences than either marker type alone. We demonstrate that eastern white pine populations are highly genetically variable and significantly differentiated across the species’ range. We then infer that eastern white pine survived in a single southern refugium and expanded northward along two major post-glacial migration routes, one of which further branched into two. The genetic signatures of extensive harvesting of eastern white pine over the past century and a half tended to blur the genetic signals of postglacial phylogeography and evolution, which we disentangled.


Study species

Eastern white pine is a long-lived, widely distributed species in eastern North America, ranging from Newfoundland in the east to southeastern Manitoba in the west and to parts of Georgia and South Carolina in the southeast [17]. It is highly ecologically and economically important and a keystone species of temperate white pine ecosystems [17]. Eastern white pine has undergone heavy exploitation over more than 150 years [11] and multiple episodes of post-glacial range expansion and retraction [18]. The range-wide population genetic structure of this species remains poorly understood, and current genetic diversity and population structure estimates are based on studies covering only small geographic areas in the northern parts of the species’ range [19–27]. These studies have indicated that eastern white pine populations have moderate to high levels of genetic diversity and low levels of genetic differentiation. So far, no range-wide study of molecular population genetic diversity and structure has been reported for this species. Eastern white pine is a predominantly outcrossing species [28].

Populations and sampling

Thirty-three natural eastern white pine populations were sampled from throughout the species’ range (Table 1, Fig. 1). Fifty mature eastern white pine trees were sampled randomly from each population. To minimize the chance of sampling siblings, we left a 30 m buffer between the sampled trees. We collected needles from each of the 1650 mature trees sampled from the 33 populations. After collection, each needle sample was stored in a sealed plastic bag, with a 5 g silica desiccant pack, at −20 °C pending DNA extraction.

Table 1 Eastern white pine populations sampled, and their locations, abbreviated names and geographical coordinates
Fig. 1

Location map of 33 eastern white pine populations sampled throughout the species’ range in North America. The species’ range is represented by the shaded region on land. The full names of the populations are provided in Table 1

DNA extraction and microsatellite genotyping

We isolated total genomic DNA from 300 mg of ground needles from each individual using a modified CTAB method [29]. After extraction, the DNA was diluted to a 10 mM working concentration and stored in double distilled water at −20 °C. Twelve nuclear microsatellite markers (RPS1b, RPS2, RPS12, RPS20, RPS25, RPS34b, RPS39, RPS50, RPS6, RPS118b, RPS119 and RPS127) [30] were used to genotype all sampled individuals, and three chloroplast microsatellite markers (pt26081, pt63718, pt71936) [31] were used to genotype a subset of 20 individuals per population. The nuclear microsatellite markers were the same as those used previously by Rajora et al. [22] in eastern white pine from Ontario. The chloroplast microsatellite markers (cpSSRs) were those that provided the clearest and most unambiguous patterns in eastern white pine, selected by screening the cpSSRs developed earlier for Pinus thunbergii [31]. We used a subset of 20 individuals per population for cpSSR genotyping because the chloroplast genome is quite conserved; thus a larger sample size is unlikely to improve the results and inferences significantly. This sample size is consistent with or larger than that generally used in similar studies.

For each microsatellite marker, we modified one of the two primers to accept a fluorescently labelled m13 tail (700 nm or 800 nm wavelength). Polymerase chain reactions (PCR) for the microsatellite amplification were performed in a volume of 10 μL. The reaction mixtures contained 10–15 ng of DNA, 2 μL 5x PCR Buffer (Promega, Madison, WI, USA), 0.5 μL 2 mM MgCl2 (EMD), 1 μL 4 mM dNTPs, 0.1 μL 10 mM untailed primer, 0.1 μL 10 mM fluorescently labelled m13 primers, 0.06 μL 10 mM m13 tailed primer, 0.1 μL 5 U/μL Taq DNA polymerase and 5.09 μL nuclease free water.
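As a quick check of the recipe above, the listed reagent volumes can be summed to see how much of the 10 μL reaction remains for the DNA template. The sketch below does only this arithmetic. The dictionary labels are our own shorthand, and treating the remainder as the template volume is our assumption, not something stated in the protocol.

```python
# Reagent volumes (in microlitres) as listed in the text for one reaction.
reagent_volumes_ul = {
    "5x PCR buffer": 2.0,
    "MgCl2": 0.5,
    "dNTPs": 1.0,
    "untailed primer": 0.1,
    "labelled m13 primer": 0.1,
    "m13-tailed primer": 0.06,
    "Taq DNA polymerase": 0.1,
    "nuclease-free water": 5.09,
}
TOTAL_REACTION_UL = 10.0  # total reaction volume stated in the text


def remaining_volume(volumes, total):
    """Return the volume left over after all listed reagents are added."""
    return round(total - sum(volumes.values()), 2)


# Assumption: whatever is left over is the volume available for template DNA.
template_ul = remaining_volume(reagent_volumes_ul, TOTAL_REACTION_UL)
print(f"Listed reagents: {sum(reagent_volumes_ul.values()):.2f} uL")
print(f"Left for template: {template_ul:.2f} uL")
```

Under that assumption, the 10–15 ng of DNA would be delivered in roughly 1 μL of template.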

We performed PCR amplifications starting with a “touchdown” step from 65 °C to the annealing temperatures according to Echt et al. [30] and Rajora et al. [22], 27 cycles of 94 °C 30 s, annealing temperature 30 s and 72 °C 30 s, and an extension step at 72 °C for 3 min. PCR reactions were performed using 96 well EP Gradient S Master Cyclers (Eppendorf, Germany). To assist with genotyping calibration and to monitor possible PCR errors, we included a positive (previously tested working sample) and a negative (PCR master mix with no DNA) control in each PCR plate.

We used Licor Biosciences IR4300 DNA analyzers (Licor, Lincoln, Nebraska, USA) to visualize microsatellite polymorphisms through a 6 % polyacrylamide gel matrix (National Diagnostic Ureagel-6), suspended in a TBE buffer. To determine microsatellite fragment lengths, five LiCor 50–350 base pair molecular weight size standards were included in each gel. We used Licor Biosciences Saga Generation 2 (v3.3, Lincoln, Nebraska, USA) to score the microsatellite genotype data, which were verified manually. The nuclear microsatellite genotype data were scored as the allelic constitution of individuals at each locus. The chloroplast microsatellite data were first scored as the allelic constitution of individuals at each of the three loci, and then as multilocus haplotypes to account for linkage between the chloroplast regions.
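The haplotype scoring step can be illustrated with a short sketch: the fragment sizes at the three cpSSR loci are joined into one multilocus label per tree, and haplotype diversity is then computed from the label frequencies. The fragment sizes below are invented for illustration, and the formulas H = 1 − Σp² and uH = n/(n − 1) × H are the commonly used definitions that we assume here; the exact estimators implemented in GenAlEx should be checked against its documentation.

```python
from collections import Counter


def multilocus_haplotype(allele_sizes):
    """Join the fragment sizes at the cpSSR loci into one haplotype label."""
    return "-".join(str(size) for size in allele_sizes)


def haplotype_diversity(haplotypes):
    """Return (H, uH), assuming H = 1 - sum(p_i^2) and uH = n/(n-1) * H."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    h = 1.0 - sum((count / n) ** 2 for count in counts.values())
    uh = h * n / (n - 1) if n > 1 else 0.0
    return h, uh


# Hypothetical fragment sizes (bp) at three loci for four trees.
trees = [(110, 92, 145), (110, 92, 145), (111, 92, 145), (110, 93, 146)]
haps = [multilocus_haplotype(tree) for tree in trees]
h, uh = haplotype_diversity(haps)
print(haps, round(h, 3), round(uh, 3))
```

Treating the three loci as a single label reflects the fact that the chloroplast genome is inherited as one non-recombining unit.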

Data analysis

Genetic diversity

Genetic diversity parameters of individual populations for the nuclear and chloroplast microsatellites were determined using GenAlEx 6 [32]. Number of alleles per locus (AN), number of private alleles (AP), and observed and expected heterozygosity (HO and HE) were calculated for the nuclear markers. Number of alleles per locus (AN), Shannon’s Information Index (I; see [33]), haplotype diversity (H), and unbiased haplotype diversity (uH) were calculated for the chloroplast markers. We also estimated the effective number of alleles (AE) per locus, rarefaction-based allelic richness (AR), and inbreeding coefficient (FIS) for the nuclear microsatellites using FSTAT v2.9.3.2 [34]. Departures from Hardy-Weinberg equilibrium were examined. We tested for non-random association of alleles at different nuclear loci using a linkage disequilibrium test in FSTAT v2.9.3.2 [34] and Arlequin v3.5.1.2 [35]. We calculated the correlation between latitude and the genetic diversity indices of populations to test for a declining genetic diversity trend from south to north, an expected signature of founder effects along the recolonization route(s).
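The latitude test described above amounts to a Pearson correlation between population latitude and a diversity index. The following is a minimal sketch using only the standard library; the latitude and allelic richness values are hypothetical, and the conversion of r to a t statistic with n − 2 degrees of freedom is the standard textbook test, which we assume (but the text does not state) was used to obtain p-values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical population latitudes (degrees N) and allelic richness values.
latitudes = [35.5, 38.0, 41.2, 44.0, 46.5, 48.9]
allelic_richness = [9.1, 8.7, 8.0, 7.6, 7.1, 6.8]

r = pearson_r(latitudes, allelic_richness)
t = t_statistic(r, len(latitudes))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(latitudes) - 2}")
```

For example, `t_statistic(-0.6699, 33)` gives a t of about −5.02 on 31 degrees of freedom.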

Population genetic differentiation and structure

Inter-population genetic differentiation for nuclear microsatellites was determined by using F-statistics [36] employing GenAlEx 6 [32], and AMOVA [37] using Arlequin v3.5 [35]. GST and RST/NST among populations were calculated from the chloroplast markers using 1000 permutations in PermutCpSSR v2.0 [38].
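To make the differentiation measure concrete, the sketch below computes a single-locus fixation index as (HT − HS)/HT, where HS is the mean within-population expected heterozygosity and HT is the expected heterozygosity of the pooled allele frequencies. This is a simplified Nei-style estimate with equal population weights and no sample-size correction. It is our illustration only and is not claimed to match the estimators implemented in GenAlEx or Arlequin; the allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) from an allele->frequency dict."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_single_locus(pop_freqs):
    """Simplified F_ST = (H_T - H_S) / H_T with equal population weights."""
    n_pops = len(pop_freqs)
    alleles = set()
    for freqs in pop_freqs:
        alleles.update(freqs)
    # Mean within-population expected heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n_pops
    # Expected heterozygosity of the unweighted mean allele frequencies.
    mean_freqs = {
        a: sum(f.get(a, 0.0) for f in pop_freqs) / n_pops for a in alleles
    }
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one locus in two populations.
populations = [{"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}]
fst = fst_single_locus(populations)
print(round(fst, 4))
```

A value of 0 means the populations share identical allele frequencies; larger values mean more of the total diversity lies among populations.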

The population genetic structure resulting from natural barriers and human activities was examined using two Bayesian model-based clustering approaches. First, STRUCTURE [39] was used to examine the range-wide population structure, based on the 12 nuclear microsatellites, under the assumption that sample locality has no significant role in population structure. STRUCTURE works by grouping individuals into clusters (K) such that Hardy-Weinberg equilibrium is maximized within clusters. By varying K across several runs and inspecting the resulting probabilities for each K value, one can infer the number of groups that best captures the variation present in the data. We performed multiple runs of STRUCTURE to test K values ranging from 1 to 33, with 50 replications, using the admixture model and correlated allele frequencies options [40], with a burn-in length of 100,000 and 100,000 MCMC replications for each run. To facilitate the selection of the best K value, we used STRUCTURE HARVESTER [41], an online application that uses the Evanno et al. [42] technique for assessing and visualizing likelihood values across multiple values of K and detecting the number of genetic groups that best fits the data.
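The ΔK idea behind the Evanno et al. approach can be sketched as follows: for each K, take the absolute second-order difference of the mean log-likelihood across replicate runs and divide it by the standard deviation of the log-likelihoods at that K. The code below is an illustrative reimplementation under that reading, not the STRUCTURE HARVESTER source, and the log-likelihood values are invented.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K from a dict mapping K -> list of ln P(D) per run.

    delta K is only defined where both K - 1 and K + 1 were also tested.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) in means and (k + 1) in means:
            sd = stdev(lnp_by_k[k])
            if sd > 0:
                second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
                result[k] = second_diff / sd
    return result


# Hypothetical ln P(D) values from three replicate runs at each K.
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4400.0, -4410.0, -4390.0],
    4: [-4380.0, -4420.0, -4340.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbours, it cannot be evaluated at the smallest or largest K tested, which is worth keeping in mind when the tested range is 1 to 33.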

Due to the large variation in geographical distances among the locations of the sampled populations, we sought to disentangle any artifactual population structure signal caused by populations in close proximity. We did this by performing a second Bayesian population structure analysis using BAPS v5.3 [43]. Unlike STRUCTURE, BAPS provides the user with an option to integrate spatial coordinates into the prior assumptions [44]. We also employed BAPS to examine the population structure as defined by the chloroplast markers, using the haplotype data. Both the nuclear and chloroplast datasets were analyzed for a maximum of 33 spatial cluster groups with a population mixture option.

Regions where abrupt genetic differentiation exists over relatively small geographic distances can indicate boundaries of population groups and genetic barriers in a species’ range, perhaps where distinct phylogeographic lineages meet. We used Barrier v2.2 [45] to identify genetic barriers and boundaries of population groups, for the nuclear and chloroplast microsatellite data, using both the multi-locus pairwise FST matrix and individual-locus pairwise FST matrices to determine the number of loci that support any inferred barriers.

Phylogeographical analysis

Although we observed varying levels of population differentiation and a significant magnitude of population structure among eastern white pine populations, signals of past phylogeographic patterns were present in nearly all analyses (e.g. regional clustering in low K-value STRUCTURE runs, Barrier analysis). In order to disentangle these patterns from the present population structure, we employed geographic distribution patterns of chloroplast haplotypes, and tested various phylogeographic hypotheses using the Approximate Bayesian Computation (ABC) analysis.

We first examined the composition and geographic distribution of chloroplast haplotypes to infer genetic lineages and post-glacial northward migration of eastern white pine. The geographic distribution of the haplotype data was visualized using PhyloGeoViz [46]. The distribution of these haplotypes across the species’ range was combined with previous information on fossil pollen occurrence [12] to formulate possible recolonization scenarios, including possible routes and divergence times.

We used DIYABC v2.0.3 [47] to test competing hypothetical scenarios regarding phylogeography and population divergence in eastern white pine on a range-wide scale. The hypotheses were constructed primarily to test the order (from south to north) and time of divergence of the population groups, as well as the possibility of population admixture after divergence. For the ABC simulations, we analyzed the nuclear and chloroplast marker data separately. We hypothesized four groups of populations (lineages) based on the signals from STRUCTURE, BAPS and Barrier analyses and geographical distribution of chloroplast haplotypes (see Results). These groups were as follows: Western, Central, Eastern and Southern (Additional file 1: Table S1). First, we compared the competing scenarios of population divergence without admixture and then with population admixture (Additional file 2: Figure S1). The information on the parameters and their prior distributions used in the analysis are provided in Additional file 1: Table S1. Then we compared the best scenarios taken from each of the without and with admixture analyses. We simulated one million data sets for each of the scenarios, and four million data sets for the comparison between the two best scenarios (~ two million each). The population divergence scenarios differed in the order of population divergence and in the number and time of demographic expansion events. The population admixture scenarios were developed based on both the chloroplast haplotype distribution and the best scenario from with and without admixture comparison.

We performed a logistic regression to estimate the posterior probability of each scenario, taking the simulated data sets closest to our real data set between 0.1 % and 1 % [47]. The 95 % credibility intervals for the posterior probabilities were computed through the limiting distribution of the maximum likelihood estimators. Once the most likely scenarios were identified, we used a linear regression analysis to estimate the posterior distributions of parameters under these scenarios. We chose the 1 % of the simulated data sets closest to our real data for the linear regression after applying a logit transformation to the parameter values. In order to evaluate the goodness-of-fit of the estimation procedure, we performed a model checking computation [47] by generating 10,000 pseudo-observed data sets with parameter values drawn from the posterior distribution given the most likely scenario.
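The core idea of retaining the simulated data sets closest to the observed data can be shown with a toy rejection step. The sketch below is deliberately much simpler than DIYABC: the "model" is a binomial draw rather than a coalescent simulation, there is a single summary statistic, and no logistic or linear regression adjustment is applied. All names and values are hypothetical.

```python
import random


def simulate_summary(p, n_trials, rng):
    """Toy simulator: number of successes in n_trials with probability p."""
    return sum(rng.random() < p for _ in range(n_trials))


def abc_rejection(observed, n_trials, n_sims, keep_fraction, seed=1):
    """Keep the parameter values whose simulated summary is closest to observed."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        p = rng.random()  # draw the parameter from a uniform(0, 1) prior
        summary = simulate_summary(p, n_trials, rng)
        sims.append((abs(summary - observed), p))
    sims.sort(key=lambda pair: pair[0])  # smallest distance first
    n_keep = max(1, round(n_sims * keep_fraction))
    return [p for _, p in sims[:n_keep]]


# Retain the closest 1 % of 5000 simulations, as a rough posterior sample.
accepted = abc_rejection(observed=30, n_trials=100, n_sims=5000,
                         keep_fraction=0.01)
posterior_mean = sum(accepted) / len(accepted)
print(len(accepted), round(posterior_mean, 3))
```

In a full analysis, the retained simulations are then used in a regression step to correct for the remaining distance between simulated and observed summaries, which is the role of the logistic and linear regressions described above.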


Range-wide genetic diversity

A total of 340 alleles were observed at 12 nuclear microsatellite loci in 1650 eastern white pine individuals. Twenty alleles were observed at three chloroplast microsatellites in the subset of 660 individuals. The genetic diversity parameters and fixation index estimates for the studied eastern white pine populations based on nuclear microsatellites are in Table 2, whereas genetic diversity parameters based on chloroplast microsatellites are in Table 3. A total of 60 chloroplast haplotypes were observed (Additional file 3: Table S2). Five of these were most common (Table 4).

Table 2 Genetic diversity parameters and fixation index (FIS) for eastern white pine populations based on nuclear microsatellites
Table 3 Genetic diversity parameters of eastern white pine populations based on chloroplast microsatellites
Table 4 Allele composition of the most abundant chloroplast microsatellite haplotypes observed in eastern white pine populations

A geographic trend of decreasing genetic diversity from south to north was observed for both nuclear and chloroplast microsatellite markers (Tables 2 and 3). Populations in the southern portions of the eastern white pine range had higher AN, AR, AE and heterozygosity than northern ones for nuclear markers (Table 2). Generally, the Asheville population from North Carolina showed the highest and the Saint Margarets Bay population from Nova Scotia the lowest nuclear microsatellite genetic diversity. Similar patterns were observed for the chloroplast microsatellite genetic diversity (Table 3). In general, populations in the northeast (New Hampshire-NH, Maine-ME, Massachusetts-MA, Nova Scotia-NS and New Brunswick-NB) had lower levels of average AN, AR, and AE for nuclear markers and lower levels of average chloroplast haplotype diversity than the rest of the populations. The Newfoundland population GL showed somewhat higher levels of genetic diversity than the Nova Scotia and New Brunswick eastern white pine populations. Overall, the Nova Scotia populations had the lowest allelic diversity (Tables 2 and 3). Western populations (western Ontario and Minnesota) had, on average, slightly higher levels of heterozygosity at nuclear microsatellites. All of the genetic diversity parameters for the nuclear markers were negatively correlated with latitude, although the correlation was not significant for HO: AN: r = −0.6699, p = 0.00002; AR: r = −0.6663, p = 0.00002; AE: r = −0.5308, p = 0.00148; HO: r = −0.1005, p = 0.5779; HE: r = −0.3895, p = 0.02506. Likewise, all of the chloroplast microsatellite genetic diversity parameters were negatively correlated with latitude, although none of these correlations was significant at the 0.05 level: AN: r = −0.2411, p = 0.1765; I: r = −0.3318, p = 0.0592; H: r = −0.3207, p = 0.0688; uH: r = −0.3311, p = 0.05981. However, the number of private alleles did not show any such geographical pattern with latitude: r = 0.0458, p = 0.8002.
The inbreeding coefficient (FIS) also showed a decreasing trend from south to north for the nuclear microsatellite markers: r = −0.3738, p = 0.0326.

Inter-population genetic differentiation

The overall mean FST among the populations was 0.104 from the nuclear microsatellite markers, and it was significantly higher than 0. Likewise, the AMOVA results revealed significant among-population variation of 10.38 %, which was consistent with the FST estimates. As expected for a gymnosperm forest tree species, the majority of genetic variation was explained by within-population variation (~90 %). The inter-population genetic differentiation from chloroplast microsatellites was lower than estimates from the nuclear markers (AMOVA = 6 %; GST = 0.035, RST = 0.045, NST = 0.075) but still significantly higher than 0.

Population genetic structure

The two Bayesian analyses of population genetic structure revealed significant levels of genetic structure of eastern white pine populations across its range, and the results were consistent between the two approaches with only slight differences. After performing Evanno et al. [42] adjustments in STRUCTURE HARVESTER [41], we observed a number of high delta K peaks, the most prominent of which was at K = 30 genetic groups (Additional file 4: Figure S2). As such, STRUCTURE revealed 30 genetic groups among 33 eastern white pine populations (Fig. 2). The two populations from each of Nova Scotia (NSDL and NSUM), New Brunswick (NBPM and NBCR), and Minnesota (MNWL and MNBL) grouped together as a pair in a single group. Each of the remaining 27 populations formed its own individual group (Fig. 2). BAPS, with the addition of geographic coordinates, identified 26 genetic groups among 33 populations (Additional file 5: Figure S3). In all cases, populations that were clustered together were in close geographical proximity.

Fig. 2

Summary bar plot of estimated membership coefficient (Q) of the studied individuals from 33 eastern white pine populations from STRUCTURE analysis for (K =) 30 clusters. Each individual is represented by a single vertical line while each colour represents one of the 30 clusters. The full names of the populations are provided in Table 1

Although STRUCTURE identified 30 distinct genetic groups (Fig. 2), when we examined the clustering of samples at lower K values, we observed what might be underlying phylogeographic patterns. For example, at K = 2, western populations (MNBL, MNWL, and ONCL) were clustered into a distinct group from the rest of the populations (Fig. 3a). At K = 3 an additional distinct division was observed between central samples (Pennsylvania-PA, New York-NY, Ontario-ON and Quebec-QC) and eastern samples (NH, ME, MA, NF, NS and NB) (Fig. 3b), resulting in three groups of populations. The fourth group consisted of the two southern populations from Virginia and North Carolina.

Fig. 3

Summary bar plot of estimated membership coefficient (Q) of eastern white pine individuals from 33 populations from STRUCTURE analysis. Each individual is represented by a single vertical line. a K = 2: populations were clustered into a western (red) and a central/eastern (green) group. b K = 3: populations were clustered into a western (red), a central (blue) and an eastern (green) group, representing the three major phylogeographic lineages. The full names of the populations are provided in Table 1

The results from the Barrier analysis were generally consistent with those of the STRUCTURE analysis with K = 3. We identified two barriers among the 33 sampled populations from both the nuclear and chloroplast microsatellite markers that separated the populations into three groups (Additional file 5: Figure S3). The first barrier separated the 3 most western locations (ONCL, MNBL, and MNWL) from all others, and was supported by all 12 nuclear and 3 chloroplast microsatellite loci. The second barrier separated a central and southern group (10 populations) from an eastern group (20 populations) and was supported by 10 nuclear and 3 chloroplast microsatellite loci.

The neighbour joining and maximum likelihood trees based on Nei’s genetic distances [48] or pairwise FST estimates generally supported the STRUCTURE K = 3 and Barrier analysis results for three major groups among the 33 eastern white pine populations.

Phylogeographic patterns

The geographic distribution of the most common chloroplast haplotypes across the sampled range is presented in Fig. 4. The southernmost population NCAV from North Carolina had all five of the most common chloroplast microsatellite haplotypes, whereas the eastern populations had two or three of them. From the haplotype constitution and haplotype sharing among populations, three phylogenetic lineages were apparent: western, central and eastern (Fig. 4). The Green (AG) haplotype was shared between the western and some central populations, indicating some admixture between these apparent lineages.

Fig. 4

Geographic distribution of the most abundant chloroplast haplotypes in eastern white pine populations. Colours correspond to individual haplotype. Yellow, Haplotype S; Red, Haplotype V; Green, Haplotype AG; Black, Haplotype AJ; Blue, Haplotype AP. The allelic composition of the haplotype is provided in Table 4

The patterns of evolutionary divergence and phylogeographic lineages of eastern white pine populations revealed from the hypothesis testing using the ABC analysis were generally consistent between the nuclear and chloroplast markers (Table 5; Fig. 5). The best scenario of evolutionary divergence based on the highest posterior probabilities from the nuclear microsatellites was Sc1 (Table 5, Fig. 5a). This placed the first event (1-t3) of population divergence between the southern group (ST) and the western group (WS), the second split (1-t2) between the ST and the ancestral population of the central (CNT) and eastern (EST) groups, and the final split (1-t1) between the central and eastern groups (Fig. 5a). The best scenario of evolutionary divergence based on the highest posterior probabilities from the chloroplast microsatellites was Sc4 (Table 5; Fig. 5b), which is the same as that observed from the nuclear microsatellite data but with the addition of an admixture event between the western and central groups (Table 5, Fig. 5b). The parameters (effective population size (N), divergence time in terms of the number of generations (t), and mutation rate (Mμ)) estimated for the best evolutionary divergence scenarios are in Table 6, and showed similar patterns for the nuclear and chloroplast microsatellite data.

Table 5 Posterior probabilities for the hypothesized eastern white pine evolutionary scenarios from the ABC analysis
Fig. 5

Most probable ancestral connectivity among four population groups inferred using DIY Approximate Bayesian Computation (ABC) analysis from a nuclear microsatellite, and b chloroplast microsatellite markers. Starting at t0, groups CNT and EST coalesced at t1 forming 2A; 2A coalesced with ST at t2 forming 2B; WS coalesced with 2B at t3. One admixture event was observed in the chloroplast data (AD, dashed lines) between CNT and WS before t1. (ST: southern group. EST: eastern group. CNT: central group. WS: western group. t0-t3: divergence times. AD: admixture events. See Additional file 1: Table S1 for group information)

Table 6 Parameter estimates for the best population divergence scenarios from the Approximate Bayesian Computation (ABC) analysis


Knowledge of the postglacial phylogeography, evolution and expansion of a species is important for understanding its extant population genetic structure, the historical processes that shaped its current distribution, and its potential fate in the future, especially under climate change conditions, as well as for conserving its genetic resources. Here we have examined the range-wide genetic diversity and population structure of a widely distributed and heavily exploited keystone species, eastern white pine, using microsatellite markers of the nuclear and chloroplast genomes, and inferred its postglacial evolution and migration by testing various hypothetical evolutionary scenarios. We have demonstrated that the extant eastern white pine populations have relatively high genetic diversity, with a south to north trend of declining genetic diversity, and are significantly genetically structured across the range. The signals of postglacial phylogeography and evolution were disentangled from the effects of resource extraction over the past century and a half. Our results suggest a single southern refugium, two recolonization routes and three genetically distinguishable lineages in eastern white pine, which we suggest be treated as separate Evolutionarily Significant Units.

Population genetic diversity

Eastern white pine has relatively high nuclear microsatellite DNA genetic diversity over its range. The levels of nuclear microsatellite genetic diversity (allelic diversity and/or heterozygosity) that we observed in the sampled eastern white pine populations were on average higher than those reported for other widely distributed conifer species, including its sister species western white pine, Pinus monticola (AN = 7.5, HE = 0.67) [26], and lodgepole pine, Pinus contorta (AN = 11.8, HO = 0.46, HE = 0.43) [49]. The observed nuclear microsatellite genetic diversity was also higher than that reported earlier for eastern white pine from the Galloway Lake area, Ontario, based on the same microsatellites (AN = 9.4, HO = 0.52, HE = 0.60) [22], and from Hartwick Pines State Park, Michigan (AN: 6.7, 7.3; HO: 0.47, 0.48; HE: 0.46, 0.49) [23], the Menominee Reserve in Wisconsin (HE = 0.49) [24] and a study covering roughly one third of the species’ range in Canada (AN = 6.6, HO = 0.74, HE = 0.80) [26] based on some of the same microsatellites. The higher genetic diversity observed in our study is consistent with the much larger portion of the eastern white pine range covered by our study than by previous studies. Also, our study included southern populations from Virginia and North Carolina that were found to be the most genetically diverse. The genetic diversity of the chloroplast microsatellites was in all cases lower than that of the nuclear markers. This is consistent with the lower mutation rate in chloroplast than in nuclear microsatellites in Pinus [50] and other plants. Chloroplast microsatellites (cpSSR) have previously been used to test the somatic stability of cloned material [51] and spatial genetic structure [52] in eastern white pine. This is the first report of chloroplast microsatellite genetic diversity across the range of eastern white pine.
The haplotype diversity observed in our study is lower than that reported for four eastern white pine populations sampled from the Beaver Island Archipelago in Michigan (H = 0.80) [52]. The difference is likely due to differences in the sample size and the number of cpSSRs used between the two studies. We used 20 individuals per population and three cpSSRs, whereas the average sample size in the Myers et al. [52] study was 78 and they used six cpSSRs, with only one cpSSR in common between the two studies.
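As a minimal sketch of how a haplotype diversity value of this kind can be obtained, the function below computes Nei's unbiased gene diversity from haplotype counts. The text above does not state which estimator was applied, so the choice of estimator is an assumption, and the counts in the example are hypothetical rather than data from the study.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity, H = n/(n-1) * (1 - sum(p_i^2)).

    `counts` maps each haplotype label to the number of sampled
    individuals carrying it.
    """
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical population of 20 individuals (the per-population sample size
# mentioned above) carrying two equally frequent haplotypes
h_two_equal = haplotype_diversity({"hap1": 10, "hap2": 10})

# A population fixed for a single haplotype has zero diversity
h_fixed = haplotype_diversity({"hap1": 20})
```

Because the number of distinguishable haplotypes grows with the number of cpSSR loci scored, estimates based on three and on six loci are not directly comparable, which is the point made in the paragraph above.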

Our study clearly demonstrates the existence of a south to north pattern in genetic diversity levels, with the populations in Virginia and North Carolina having higher levels of genetic diversity than the northern populations. This is consistent with possible repeated founder effects during the post-glacial northward migration of eastern white pine from a southern Pleistocene refugium. The lower genetic diversity in the northern eastern white pine populations may also be due to divergent selection in response to the south to north gradient in climate factors, such as temperature and moisture regimes, and range marginalization [27]. However, none of the microsatellite loci showed any signatures of divergent selection when we tested for outliers with respect to the magnitude of F ST using BayeScan ver. 2.1 [53]. The somewhat higher genetic diversity in the Newfoundland population as compared to the New Brunswick and Nova Scotia populations may be due to its location in the Grand Lake Ecological Reserve, where human impacts have been limited. In contrast, the New Brunswick and Nova Scotia eastern white pine populations have been heavily exploited. The number of private alleles did not show any geographical pattern in our study. Private alleles may arise from population-specific new mutations and severely curtailed inter-population gene flow. Geographic patterns for private alleles would be expected if mutation and gene flow rates were geographically structured among populations within a region: southern, northern, eastern, central, and western. Apparently, this is not the case for the eastern white pine populations studied. However, a separate study is required to validate this assumption.
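The expected effect of repeated founder events on diversity can be sketched with the standard population genetic result that one generation at effective size N reduces expected heterozygosity by a factor of (1 - 1/(2N)). The starting heterozygosity, founder number and number of events below are hypothetical values chosen for illustration, not estimates from this study, and mutation and gene flow are ignored.

```python
def heterozygosity_after_founder_events(h0, founders, n_events):
    """Expected heterozygosity after successive single-generation founder events.

    Assumes each event is one generation at effective size `founders`,
    with no mutation or gene flow restoring diversity in between.
    Returns the trajectory [H_0, H_1, ..., H_n_events].
    """
    if founders < 1:
        raise ValueError("founders must be at least 1")
    loss_factor = 1.0 - 1.0 / (2.0 * founders)
    return [h0 * loss_factor ** k for k in range(n_events + 1)]


# Hypothetical example: starting heterozygosity 0.6, 10 founders per event,
# three successive colonization steps northward
trajectory = heterozygosity_after_founder_events(0.6, 10, 3)
```

Each step in the trajectory is lower than the previous one, which is the qualitative south to north decline expected under serial founder effects.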

Population genetic differentiation and structure

We observed 10.4 % inter-population genetic differentiation based on the F ST and AMOVA analyses, and 26 (BAPS) or 30 (STRUCTURE) groups of populations among the 33 eastern white pine populations. These results clearly suggest that significant population genetic structure and differentiation exist across the range of eastern white pine. The observed levels of genetic differentiation could be considered low when compared to the plant kingdom as a whole, but for conifers the levels are higher than the average of 0.073 (7.3 %) [54]. Significant inter-population genetic differentiation may be due to the reduction in population size and numbers and the fragmentation resulting from heavy exploitation of this species over 150 years [11]. Mortality caused by the invasive white pine blister rust (Cronartium ribicola) may also have reduced the size and number of eastern white pine populations. Encroaching agriculture, grasslands and deciduous forests, and changing precipitation and wind patterns may negatively impact the distances over which seeds and pollen are dispersed between populations. All of the above factors may have reduced the levels of inter-population gene flow and increased inbreeding. However, eastern white pine has strong inbreeding depression [55], and selection against inbreds can occur at a very early stage in conifers [56]. Although eastern white pine is long-lived and has highly dispersed pollen [52], these factors may not be enough to counterbalance the effects of anthropogenic and natural disturbances and sustain a homogenized genetic structure over its range.
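To make the meaning of an F ST value concrete, the sketch below computes a simple heterozygosity-based F ST, (H_T - H_S) / H_T, for one locus from allele frequencies in several populations. This is a simplified illustration with equal population weights and no sample-size correction, so it is not the estimator used in the study; the allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus, 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst_from_frequencies(pop_freqs):
    """Heterozygosity-based F_ST = (H_T - H_S) / H_T for one locus.

    `pop_freqs` is a list of per-population allele frequency lists, all in
    the same allele order. Populations are weighted equally (an assumption).
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Mean within-population expected heterozygosity
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n_pops
    # Expected heterozygosity of the pooled (mean) allele frequencies
    mean_freqs = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation
    return (h_t - h_s) / h_t


# Hypothetical two-allele examples
fst_identical = fst_from_frequencies([[0.5, 0.5], [0.5, 0.5]])
fst_fixed = fst_from_frequencies([[1.0, 0.0], [0.0, 1.0]])
fst_moderate = fst_from_frequencies([[0.7, 0.3], [0.3, 0.7]])
```

Identical populations give 0, populations fixed for different alleles give 1, and the intermediate case gives 0.16, so a range-wide value of about 0.10 indicates that roughly a tenth of the total diversity lies among populations.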

The inter-population genetic differentiation of 10.4 % in our study is higher than that reported earlier for eastern white pine from part of its Canadian range based on microsatellite (F ST = 0.084) [26] and allozyme markers (F ST = 0.061 [21], F ST = 0.019 [19]). This may be the result of the large area covered by our study, which, for the first time, included populations from the western and southern edges of the range. Within the smaller range, in particular the western populations, we observed levels of differentiation (F ST: 0.084; Phi variance: 0.071) similar to those previously reported by Mehes et al. [26]. Chloroplast microsatellite genetic differentiation was lower than that for the nuclear markers. This is likely due to the pollen-mediated paternal inheritance of the chloroplast genome in Pinus [15] and the longer-distance gene dispersal via pollen than via seeds in conifers.

Phylogeography and post-glacial evolution of eastern white pine

The phylogeographic patterns that emerged from the nuclear and chloroplast genetic markers were consistent with each other and broadly consistent with the findings from previous fossil pollen studies [12]. The most parsimonious scenario from our genetic data and ABC model testing is that eastern white pine expanded northward along two routes of post-glacial recolonization from a single southern refugium (Fig. 6), which coincides well with the fossil pollen data [57]. The highest probability scenario from the ABC analysis and the earlier fossil pollen evaluation [12] suggest that this refugium likely existed on the mid-Atlantic plain from coastal Virginia to the southern coast of South Carolina. The Ashville population from North Carolina is the only sampled location from an area that contained eastern white pine pollen from the LGM. This population showed the highest genetic diversity for both nuclear and chloroplast microsatellite markers (Tables 2 and 3), containing all five most abundant chloroplast microsatellite haplotypes (Fig. 4). This is typical of populations from glacial refugia. Thus, it is highly likely that the North Carolina sample location is a remnant of the eastern white pine LGM refugium. From the ABC analysis (Fig. 5) we can infer that much of the species’ range, to the east of the Great Lakes, is the product of a recolonization route that moved along the eastern seaboard (Fig. 6). The evolutionary scenario from our ABC analyses and the fossil pollen findings of Davis [12] and Jacobson [14] suggest that populations to the west of the Great Lakes, particularly in Minnesota and western Ontario, are likely descended from a second recolonization route, which separated approximately 12,000 years ago (divergence time from ABC, Table 6; pollen existence time from Davis [12]), west of the Appalachian Mountains and south of the Great Lakes (Fig. 6).

Fig. 6 Probable eastern white pine post-glacial recolonization routes (arrows) from the glacial refuge (shaded grey area) based on the highest probable Approximate Bayesian Computation (ABC) scenarios observed from nuclear and chloroplast microsatellite data and available fossil pollen information. The Appalachian Mountain Range is shown by ˄. The dashed line indicates the assumed route of colonization of western populations. The contour lines represent approximate colonization time (x1000 ybp) based on fossil pollen and recolonization information from Davis [12]

In either scenario, the main northward recolonization route on the eastern seaboard is the source of the four most abundant cpSSR haplotypes observed in the eastern United States, Ontario, Quebec and the Maritime Provinces (Fig. 4). This route, supported by the nuclear microsatellite Bayesian population structure and the ABC models for both the nuclear and chloroplast markers, remained to the east of the Appalachian Mountains. A single branching point was identified in the vicinity of the southern Hudson River Valley (Fig. 6). The ABC analysis and the results from fossil pollen studies place this divergence at roughly 11,000 years ago (Figs. 5 and 6; Davis [12]). This northward and eastward divergence resulted in discontinuities between the haplotypes of the coastal (NH, ME, MA, NL, NS, NB) and central (NY, ON, QC) populations. Population isolation due to the geographical characteristics of this region, including the lowlands at the mouth of the Hudson River and the continuation of the northern Appalachian Mountains into the New England states, was likely responsible for this divergence. During the glacial recession, the climate in the mountains may have remained inhospitable to forest growth after the northern lowlands became favourable, resulting in migration around these mountains, as inferred from the ABC scenarios. The high mountain altitudes and restricted valley habitats of northern New England may still minimize migration between populations on either side of these mountains. This hypothesis is supported by genetic similarities between populations in eastern Quebec (including Cap Tourment and Temiscouta), which may represent a region with low levels of admixture through northern New Brunswick. The population in Newfoundland shares similar haplotypes with those in the Maritime Provinces and thus is likely descended from migrants from that region.

Under our primary hypothesis assuming one glacial refuge, the major population break observed between the eastern and western populations of eastern white pine was likely the result of two major features of the species’ geographical context. Initially, high-altitude environments in the Appalachian Mountains may have separated the ancestral western migrants from their counterparts to the east of the mountains. Farther north and west, the Great Lakes may have reduced or prevented the dispersal of seeds between populations on the southern and western shores and the populations in Ontario. The cpSSR haplotype and ABC simulation results do not support the possibility that the western (Minnesota and westernmost Ontario) populations are the descendants of populations in Ontario. This inference is supported by the presence of cpSSR haplotypes in the western populations that are not found in any Ontario population (e.g., Haplotype AJ, Fig. 4). Additional support for a second recolonization route comes from previous fossil pollen studies [12]. Between roughly 10,000 and 8000 years ago, eastern white pine inhabited a range south of the Great Lakes (Indiana and Illinois) [12]. Though eastern white pine populations no longer exist in these areas, this is the most likely route of migration into Minnesota and western Ontario.

Although the post-glacial migration and evolution scenarios for eastern white pine were consistent between the chloroplast and nuclear data, the chloroplast data provided additional details, in particular regarding pollen dispersal. We observed shared cpSSR haplotypes between the isolated western populations (Boot Lake and Whale Lake, Minnesota and Crow Lake, Ontario) and the populations in central Ontario (Fig. 4). This was supported by an admixture event between the western and central lineages identified by the ABC scenarios (Fig. 5), possibly from pollen dispersal across a historically expanded range to the north of the Great Lakes [13] or pollen dispersal through fragmented forests in the northern peninsula of Michigan and Wisconsin. The sharing of chloroplast haplotypes (e.g. Haplotype AG, Fig. 4) between western populations and central Ontario populations may also indicate that the west to east prevailing winds have facilitated, or continue to facilitate, pollen dispersal between these regions. The opposite may be the case between the central and coastal eastern populations. Between these regions, pollen dispersal by west to east prevailing winds may be limited by the northern Appalachian Mountains, leading to strong cpSSR haplotype differentiation (Fig. 4), as also supported by the rejection of the admixture simulation model between these regions by the ABC analysis (Table 5, Fig. 5).

The post-glacial phylogeographic patterns and evolutionary history of eastern white pine inferred in our study appear to be unusual compared with those of other widely distributed tree species in North America, in that eastern white pine appears to have had a single glacial refugium and multiple post-glacial recolonization routes. In particular, populations of jack pine (Pinus banksiana), black spruce (Picea mariana), white spruce (Picea glauca), and red maple (Acer rubrum), four species found throughout the northern United States and Canada, have been reported to have descended from multiple glacial refugia [6, 8, 9, 58]. Jack pine has two distinct lineages, separated into populations in the Maritime Provinces of Canada, which originated from a northern refugium, and the rest of the species’ range, which originated from a southern refugium [9]. For white spruce, across a range similar to that of eastern white pine, two lineages descended from two southern refugia, Appalachian and Mississippian, and one northern refugium in Alaska [6, 7]. Red maple populations originated from at least two populations on the eastern seaboard, one near the glacial margin and another more southern [56]. Two southern refugia have been identified for black spruce [8] and three for Chamaecyparis thyoides [10].

Overall, our results support our hypothesis that eastern white pine had a single southern LGM refugium but took different post-glacial recolonization routes separated by the Appalachian Mountains and the Great Lakes, and that the current distribution and population structure reflect the post-glacial migration history of the species.

Human and natural disturbances and phylogeography signals

Recent human and natural disturbances can affect the genetic structure of extant forest tree populations. The resulting genetic patterns can blur the genetic signals of post-glacial phylogeography and evolution. This was the case with the results of the STRUCTURE analysis in our study, which revealed 30 groups (26 from the BAPS analysis) among the 33 eastern white pine populations sampled. Only when we set the K values at 2 and 3, based on the geographic distribution of the cpSSR haplotypes and the Barrier analysis, did the postglacial phylogeographic signals emerge from the STRUCTURE analysis. The ABC simulation analysis confirmed the phylogeographic patterns that emerged. Thus, the cpSSR, ABC and Barrier analyses disentangled the blurring effects of human and natural disturbances from the genetic signals of postglacial evolution and expansion of eastern white pine populations. Hence our study highlights the necessity of disentangling the confounding effects of human and natural disturbances on the contemporary genetic structure from those due to post-glacial phylogeography and evolution.

Evolutionarily significant units and their genetic conservation implications

Our results indicate that eastern white pine populations have significant levels of genetic structure and differentiation across the species’ range. We have inferred three postglacial lineages in eastern white pine originating from a southern glacial refugium: eastern, central and western. Localized conservation and management strategies may be required in at least two and perhaps all three regions. The westernmost populations (Minnesota and western Ontario) represent a distinctive lineage and should be the focus of further study to determine if these populations contain adaptive traits for local conditions. As such, this lineage may represent a single Evolutionarily Significant Unit (ESU) separated from the central and eastern populations. The divergence observed between the central and eastern coastal populations suggests that these lineages represent at least two additional ESUs. According to Ryder [59], who introduced the ESU concept, ESUs are geographically and genetically diverged for both neutral genetic markers and adaptive traits. The three ESUs that we have identified in eastern white pine are geographically distinct and genetically diverged for presumably neutral genetic markers. We have not examined the variation in adaptive traits, which should be examined in the future. Nevertheless, we have examined range-wide variation in SNPs in candidate genes putatively involved in controlling traits for local adaptation.

Genetic resource conservation and climate change

As stated earlier in this paper, eastern white pine has been heavily harvested over the past 150 years [11], and consequently there are concerns about the conservation of its genetic resources. Despite heavy exploitation over its range, and significant but low inter-population genetic differentiation, eastern white pine has maintained relatively high genetic diversity. This is likely due to presumably long-distance gene flow and high inbreeding depression in eastern white pine [55], including selection against inbreds at a very early stage, as also reported for the sympatric conifer white spruce [56]. We have examined genetic diversity in eastern white pine using only a handful of nuclear and chloroplast microsatellite markers, which by no means represent the whole nuclear and chloroplast genomic diversity. However, if the genetic diversity at the studied markers is a random sample of the species’ genetic diversity, the genetic resources of eastern white pine are likely in good shape and could be conserved and sustainably managed in the extant natural populations, provided no further genetic degradation occurs. Therefore, current and future harvesting practices should be genetically sound to maintain healthy genetic resources of this species (see [20, 22]).

Eastern white pine has also gone through multiple episodes of post-glacial range expansion and retraction [18], encountering oscillations in climatic (such as temperature and moisture regimes) and topographical factors over time and space. Despite experiencing all of these events, eastern white pine has maintained genetic diversity, which provides raw material for species, populations and individuals to adapt and evolve, especially under changed climate, environment and disease conditions. This species is expected to migrate northwards under the predicted climate change conditions. Based on its past history of post-glacial migration and evolution, eastern white pine may be able to cope with the anticipated climate change conditions. Its marginal populations, especially at the northern margins of its range, will likely play a major role in its northward range expansion. We have examined genetic diversity at microsatellite markers, which are considered to be selectively neutral. We suggest that the genetic diversity of range-wide as well as marginal populations should be studied at a large number of markers from genes under selection (such as SNPs).


Eastern white pine has a relatively high magnitude of genetic diversity, and significant differentiation and genetic structure across its natural range. Its contemporary population genetic structure shows the signatures of post-glacial migration and evolution as well as the effects of natural and human disturbances. The current distribution of eastern white pine is the result of at least two post-glacial recolonization routes from a single southern glacial refugium. The two regions of greatest genetic differentiation corresponding to post-glacial recolonization routes are: (1) west of the Great Lakes and (2) along the eastern seaboard. However, it cannot be determined from the markers used in our study whether any of the geographic patterns in population genetic structure are adaptive. We have identified three ESUs (western, central and eastern) in eastern white pine, which should be taken into account in conserving and managing the species’ genetic resources. If future work also finds evidence for adaptive differentiation among the identified western, central and eastern coastal genetic lineages, eastern white pine conservation and genetic resource management plans should be made specific to each of these three regions, especially under climate change conditions. In order to better delineate genetic lineages resulting from post-glacial migration, it is necessary to disentangle the confounding genetic signatures of human and natural disturbances on the contemporary genetic structure from those due to post-glacial phylogeography and evolution.

Availability of supporting data

The raw nuclear and chloroplast microsatellite data are provided in Additional file 6: Table S3 and Additional file 7: Table S4. The supporting results and data are provided in the Additional files 1, 2, 3, 4 and 5.


  1. Malcolm JR, Markham A, Neilson RP, Garaci M. Estimated migration rates under scenarios of global climate change. J Biogeogr. 2002;29:835–49.

  2. Critchfield WB. Impact of the Pleistocene on the genetic structure of North American conifers. In: Lanner RM, editor. Proceedings of the eighth North American Forest Biology Workshop, Logan, UT, USA, Utah State University. 1984. p. 70–118.

  3. Wetzel S, Burgess D. Understorey environment and vegetation response after partial cutting and site preparation in Pinus strobus L. stands. For Ecol Manag. 2001;151:43–53.

  4. Cwynar LC, MacDonald GM. Geographical variation of lodgepole pine in relation to population history. Am Nat. 1987;129:463–9.

  5. Soltis DE, Morris AB, McLachlan JS, Manos PS, Soltis PS. Comparative phylogeography of unglaciated eastern North America. Mol Ecol. 2006;15:4261–93.

  6. de Lafontaine G, Turgeon J, Payette S. Phylogeography of white spruce (Picea glauca) in eastern North America reveals contrasting ecological trajectories. J Biogeogr. 2010;37:741–51.

  7. Anderson LL, Hu FS, Paige KN. Phylogenetic history of white spruce during the last glacial maximum: uncovering cryptic refugia. J Hered. 2011;102:207–16.

  8. Jaramillo-Correa JP, Beaulieu J, Bousquet J. Variation in mitochondrial DNA reveals multiple distant glacial refugia in black spruce (Picea mariana), a transcontinental North American conifer. Mol Ecol. 2004;13:2735–47.

  9. Godbout J, Beaulieu J, Bousquet J. Phylogeographic structure of jack pine (Pinus banksiana; Pinaceae) supports the existence of a coastal glacial refugium in north-eastern North America. Am J Bot. 2010;97:1903–12.

  10. Mylecraine KA, Kuser JE, Smouse PE, Zimmermann G. Geographic allozyme variation in Atlantic white-cedar, Chamaecyparis thyoides (Cupressaceae). Can J For Res. 2004;34:2443–54.

  11. Buchert GP. Genetics of white pine and implications for management and conservation. For Chron. 1994;70:427–34.

  12. Davis MB. Holocene vegetational history of the Eastern United States. In: Wright Jr HE, editor. Late-Quaternary environments of the United States, vol. 2, The Holocene. Minneapolis: University of Minnesota Press; 1983.

  13. Terasmae J, Anderson TW. Hypsithermal range extension of white pine (Pinus strobus L.) in Quebec, Canada. Can J Earth Sci. 1970;7:406–13.

  14. Jacobson GL. The palaeoecology of white pine (Pinus strobus) in Minnesota. J Ecol. 1979;67:697–726.

  15. Wagner DB, Furnier GR, Saghai-Maroof MA, Williams SM, Dancik BP, Allard RW. Chloroplast DNA polymorphisms in lodgepole and jack pines and their hybrids. Proc Natl Acad Sci U S A. 1987;84:2097–100.

  16. Nettel A, Dodd RS, Afzal-Rafi Z. Genetic diversity, structure, and demographic change in tanoak, Lithocarpus densiflorus (Fagaceae), the most susceptible species to sudden oak death in California. Am J Bot. 2009;96:2224–33.

  17. Wendel GW, Smith HC. Eastern white pine. In: Burns RM, Honkala BH, technical coordinators. Silvics of North America, vol. 1, Conifers. U.S.D.A. Forest Service Handbook 654. Washington, DC; 1990. p. 476–88.

  18. Ritchie C. Post-glacial vegetation of Canada. Cambridge University Press, First edition. 1987; ISBN 0 521 30868 2.

  19. Beaulieu J, Simon J-P. Genetic structure and variability in Pinus strobus L. in Quebec. Can J For Res. 1994;24:1726–33.

  20. Buchert GP, Rajora OP, Hood JV, Dancik BP. Effects of harvesting on genetic diversity in old-growth eastern white pine in Ontario, Canada. Conserv Biol. 1997;11:747–58.

  21. Rajora OP, DeVerno L, Mosseler A, Innes DJ. Genetic diversity and population structure of disjunct Newfoundland and central Ontario populations of eastern white pine (Pinus strobus). Can J Bot. 1998;76:500–8.

  22. Rajora OP, Rahman MH, Buchert GP, Dancik BP. Microsatellite DNA analysis of genetic effects of harvesting in old-growth eastern white pine (Pinus strobus) in Ontario. Mol Ecol. 2000;9:339–48.

  23. Marquardt PE, Epperson BK. Spatial and population genetic structure of microsatellites in white pine. Mol Ecol. 2004;13:3305–15.

  24. Marquardt PE, Echt CS, Epperson BK, Pubanz DM. Genetic structure, diversity, and inbreeding of eastern white pine under different management conditions. Can J For Res. 2007;37:2652–62.

  25. Williams DA, Wang YQ, Borchetta M, Gaines MS. Genetic diversity and spatial structure of a keystone species in fragmented pine rockland habitat. Biol Conserv. 2007;138:256–68.

  26. Mehes M, Nkongolo KK, Michael P. Assessing genetic diversity and structure of fragmented populations of eastern white pine (Pinus strobus) and western white pine (P. monticola) for conservation management. J Plant Ecol. 2009;2:143–51.

  27. Chhatre VE, Rajora OP. Genetic divergence and signatures of natural selection in marginal populations of a keystone, long-lived conifer, eastern white pine (Pinus strobus) from northern Ontario. PLoS One. 2014. doi:10.1371/journal.pone.0097291.

  28. Rajora OP, Mosseler A, Major JE. Mating system and reproductive fitness traits of eastern white pine (Pinus strobus) in large, central vs. small, isolated, marginal populations. Can J Bot. 2002;80:1173–84.

  29. Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochemical Bulletin. 1987;19:11–5.

  30. Echt CS, Marquardt P, Hseih M. Characterization of microsatellite markers in eastern white pine. Genome. 1996;39:1102–8.

  31. Vendramin G, Lelli L, Rossi P, Morgante M. A set of primer for the amplification of 20 chloroplast microsatellites in Pinaceae. Mol Ecol. 1996;5:585–98.

  32. Peakall R, Smouse PE. Genalex 6: genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol Notes. 2006;6:288–95.

  33. Sherwin W, Jabot F, Rush R, Rossetto M. Measurement of biological information with applications from genes to landscapes. Mol Ecol. 2006;15:2857–69.

  34. Goudet J. FSTAT, a program to estimate and test gene diversities and fixation indices (version 2.9.3). Available from: Heredity. 2001;97:418–26.

  35. Excoffier L, Lischer HLE. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10:564–7.

  36. Weir BS, Cockerham CC. Estimating F-statistic for the analysis of population structure. Evolution. 1984;38:1358–70.

  37. Excoffier L, Smouse PE, Quattro JM. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics. 1992;131:479–91.

  38. Burban C, Petit RJ, Carcreff E, Jactel H. Range-wide variation of the maritime pine bast scale Matsucoccus feytaudi Duc (Homoptera; Matsucoccidae) in relation to the genetic structure of its host. Mol Ecol. 1999;10:1593–602.

  39. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–59.

  40. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164:1567–87.

  41. Earl DA, vonHoldt BM. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour. 2012;4:359–61.

  42. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14:2611–20.

  43. Corander J, Waldmann P, Marttinen P, Sillanpää MJ. BAPS 2: enhanced possibilities for the analysis of genetic population structure. Bioinformatics. 2004;20:2363–9.

  44. Corander J, Sirén J, Arjas E. Bayesian spatial modelling of genetic population structure. Comput Stat. 2008;23:111–29.

  45. Manni F, Guerard E, Heyer E. Geographic patterns of (genetic, morphologic, linguistic) variation: how barriers can be detected by using Monmonier’s algorithm. Hum Biol. 2004;76:173–90.

  46. Tsai YHE. PhyloGeoViz: a web-based program that visualizes genetic data on maps. Mol Ecol Resour. 2011;11:557–61.

  47. Cornuet JM, Pudlo P, Veyssier J, Dehne-Garcia A, Gautier M, Leblois R, Marin JM, Estoup A. DIYABC v2.0: a software to make approximate Bayesian computation inferences about population history using single nucleotide polymorphism, DNA sequence and microsatellite data. Bioinformatics. 2014;30:1187–9.

  48. Nei M. Genetic distance between populations. Am Nat. 1972;106:283–92.

  49. Thomas BR, McDonald SE, Hicks M, Adams DL, Hodgetts RB. Effects of reforestation methods on genetic diversity of lodgepole pine: an assessment using microsatellite and randomly amplified polymorphic DNA markers. Theor Appl Genet. 1999;98:793–801.

  50. Provan J, Soranzo N, Wilson NJ, Goldstein DB, Powell W. A low mutation rate for chloroplast microsatellites. Genetics. 1999;153:943–7.

  51. Cloutier D, Rioux D, Beaulieu J, Schoen DJ. Somatic stability of microsatellite loci in eastern white pine, Pinus strobus L. Heredity. 2003;90:247–52.

  52. Myers ER, Chung MY, Chung MG. Genetic diversity and spatial genetic structure of Pinus strobus (Pinaceae) along an island landscape inferred from allozyme and cpDNA markers. Plant Syst Evol. 2007;264:5–30.

  53. Foll M, Gaggiotti O. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics. 2008;180:977–93.

  54. Hamrick JL, Godt MJ. Effects of life history traits on genetic diversity in plant species. Philos Trans R Soc Lond B. 1996;351:1291–8.

  55. Kriebel HB. Genetics and breeding of five-needle pines in the eastern United States. In: Sniezko et al., eds. Breeding and genetic resources of five-needle pines: growth, adaptability and pest resistance; Proceedings RMRS-P-32, IUFRO Working Party 2.02.15 conference, July 23–27, 2001, Medford, OR, USA. U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station, Fort Collins. 2004;20–27.

  56. O’Connell LM, Mosseler A, Rajora OP. Extensive long-distance pollen dispersal in a fragmented landscape maintains genetic diversity in white spruce. J Hered. 2007;97:640–5.

  57. Webb III T. The appearance and disappearance of major vegetational assemblages: Long-term vegetational dynamics in eastern North America. Vegetatio. 1987;69:177–87.

  58. McLachlan JS, Clark JS, Manos PS. Molecular indicators of tree migration capacity under rapid climate change. Ecology. 2005;86:2088–98.

  59. Ryder OA. Species conservation and systematics: the dilemma of subspecies. Trends Ecol Evol. 1986;1:9–10.


We thank Andrew Baird, Karl Ziemer, Bud Terpstra and Vonda Terpstra for assistance with sample collection. The research results reported here are based on a part of the Ph.D. thesis work conducted by John W.R. (Ian) Zinck under the supervision of Om P. Rajora. The research and Ian Zinck's financial support were funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant RGPIN 170651, the Canada Research Chair Program (CRC950-201869) funds, and StoraEnso Port Hawkesbury funds to the Principal Investigator, Om P. Rajora.

Author information

Corresponding author

Correspondence to Om P. Rajora.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

Both authors have contributed significantly to the work reported in the manuscript. JWRZ was a Ph.D. student under the supervision of OPR, and contributed to the study conception and its experimental design, conducted the field and laboratory work, analyzed the data and prepared the initial drafts of the manuscript. OPR is the principal investigator of the project and contributed to the conception of the study and its experimental design, performed some data analysis, provided overall research guidance and direction and funding, and prepared and revised the manuscript. Both authors have read and approved the final manuscript.

Additional files

Additional file 1: Table S1.

Definition and prior distribution of parameters used in the ABC tests of various eastern white pine phylogeographic divergence scenarios for four groups. (DOCX 15 kb)

Additional file 2: Figure S1.

Competing phylogeographic scenarios of eastern white pine population group divergence and admixture. Sc1, Sc2, Sc3: Scenarios without admixture. Sc4 and Sc5: Scenarios with admixture. ST: southern group. EST: eastern group. CNT: central group. WS: western group. t0-t3: divergence times. AD: admixture events. Information on groups is provided in Additional file 1: Table S1. (DOCX 799 kb)

Additional file 3: Table S2.

Allele composition of all chloroplast microsatellite haplotypes (A-BH) derived from three chloroplast microsatellite markers. (DOCX 16 kb)

Additional file 4: Figure S2.

Summary scatterplot of Delta K values for eastern white pine populations, testing K = 1–33 clusters, calculated from the STRUCTURE results using the method of Evanno et al. [42]. (DOCX 24 kb)

Additional file 5: Figure S3.

Geographic patterns of genetic variation among populations from BAPS analysis. Locations contained within a box represent sampled locations that were clustered into a single population by the Bayesian algorithm implemented in BAPS (Corander et al., 2004 [43]). Genetic barriers, identified by Monmonier’s algorithm, are represented by solid lines labeled A (supported by 12 nuclear microsatellite markers and 3 chloroplast microsatellite markers) and B (supported by 10 nuclear microsatellite markers and 3 chloroplast microsatellite markers). See Fig. 1 and Table 1 for the population names and details. (DOCX 208 kb)

Additional file 6: Table S3.

Nuclear microsatellite genotype data. (PDF 996 kb)

Additional file 7: Table S4.

Chloroplast microsatellite genotype data. (PDF 138 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Zinck, J.W.R., Rajora, O.P. Post-glacial phylogeography and evolution of a wide-ranging highly-exploited keystone forest tree, eastern white pine (Pinus strobus) in North America: single refugium, multiple routes. BMC Evol Biol 16, 56 (2016).


  • Phylogeography
  • Population genetic structure
  • Post-glacial migration
  • Pinus strobus
  • Genetic signatures of heavy exploitation
  • Molecular evolution
  • Evolutionarily significant units