Identification of populations
STRUCTURE was used to determine whether European barley landraces can be divided into populations based on their microsatellite genotypes. In this context, a population is defined as a group of individuals that share a characteristic set of allele frequencies at the loci that are studied [15]. STRUCTURE places individuals in populations in such a way as to minimize within-population deviations from Hardy-Weinberg equilibrium and linkage equilibrium. It therefore assumes that the individuals are fully outcrossing, and modelling studies have suggested that with partially inbreeding taxa the results of STRUCTURE analysis can be spurious [18, 19]. Although cultivated barley has an outcrossing rate of less than 2% [20], the outcomes of previous STRUCTURE analyses of genetic data from barley have been supported by other analyses of the same data and are in agreement with the conclusions of earlier and later work [e.g. [21, 22]]. In these projects it therefore appears that STRUCTURE identified authentic populations despite barley's low outcrossing rate. An explanation might be provided by other simulations which have shown that, over hundreds of generations, the pattern of multilocus marker inheritance in a population of plants displaying 2% outbreeding is indistinguishable from that displayed by a panmictic population [23].
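The tension between a 2% outcrossing rate and the Hardy-Weinberg assumption can be illustrated with a minimal sketch. It uses the standard textbook result for the equilibrium inbreeding coefficient under mixed selfing and outcrossing, F = (1 - t)/(1 + t), where t is the outcrossing rate. This is not part of the STRUCTURE model or of the analysis reported here; the function names, the single biallelic locus and the allele frequency of 0.5 are illustrative assumptions.

```python
def equilibrium_inbreeding(outcrossing_rate):
    """Equilibrium inbreeding coefficient F for a mixed-mating population.

    Textbook approximation: F = (1 - t) / (1 + t), with t the outcrossing rate.
    """
    t = outcrossing_rate
    return (1.0 - t) / (1.0 + t)


def expected_heterozygosity(p, inbreeding=0.0):
    """Expected heterozygote frequency at a biallelic locus: 2pq(1 - F)."""
    return 2.0 * p * (1.0 - p) * (1.0 - inbreeding)


# Illustrative values only: t = 0.02 is the upper bound quoted for barley,
# p = 0.5 is an arbitrary allele frequency chosen for the example.
f_barley = equilibrium_inbreeding(0.02)
h_hwe = expected_heterozygosity(0.5)               # Hardy-Weinberg expectation
h_barley = expected_heterozygosity(0.5, f_barley)  # with inbreeding

print(f"F = {f_barley:.3f}")
print(f"heterozygosity under HWE = {h_hwe:.3f}, with inbreeding = {h_barley:.3f}")
```

Under these assumptions F is about 0.96, so heterozygotes are expected at only about 4% of their Hardy-Weinberg frequency, which is why the agreement between STRUCTURE and other analyses needs the explanation discussed above.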
The results of our STRUCTURE analysis were reproducible for values of K up to 15 and, using standard methods, we concluded that the most likely number of populations was between 8 and 11. The population structure was hierarchical, and from K = 7 to K = 11 the only effect of each incremental increase in K was to subdivide an existing population, with the memberships of the other populations remaining unchanged (Figure 5). This consistent pattern of population assignment indicates that the results of the STRUCTURE analysis were not spurious. To further assess the validity of the analysis, we constructed a neighbour-joining tree for all 651 accessions and marked the positions of those accessions with a proportional membership of ≥0.9 in their primary population at K = 9 (Figure 4). Accessions belonging to a single population clustered together, with the exception of populations 6 and 7, whose members are distributed in two parts of the tree. The tree topology therefore provides independent support for the STRUCTURE results, and also suggests that it is reasonable to use K = 9 as the basis for interpretation of the population structure.
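The membership filter described above (primary population with proportional membership ≥0.9) can be sketched as follows. This is a minimal illustration, assuming the membership coefficients are available as one list per accession; the function name, the accession labels and the values are hypothetical, and K = 3 is used only to keep the example short.

```python
def confident_assignments(q_matrix, threshold=0.9):
    """Map each accession to its primary population if membership >= threshold.

    q_matrix: dict of accession -> list of membership coefficients (one per
    population, summing to about 1). Populations are numbered from 1.
    """
    assigned = {}
    for accession, memberships in q_matrix.items():
        # The primary population is the one with the largest coefficient.
        best = max(range(len(memberships)), key=memberships.__getitem__)
        if memberships[best] >= threshold:
            assigned[accession] = best + 1
    return assigned


# Hypothetical membership coefficients, not taken from the study.
example_q = {
    "acc_A": [0.95, 0.03, 0.02],  # clearly population 1
    "acc_B": [0.50, 0.45, 0.05],  # admixed, excluded by the threshold
    "acc_C": [0.05, 0.04, 0.91],  # clearly population 3
}

result = confident_assignments(example_q)
print(result)  # {'acc_A': 1, 'acc_C': 3}
```

Only accessions passing the threshold would be marked on the tree; admixed accessions such as the hypothetical acc_B are left unmarked.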
Association between phenotype and population structure
For a domesticated plant such as cultivated barley, one possible way in which population structure could arise is as a result of selection for particular phenotypic traits. The phenotypes of greatest agronomic importance in modern farming are ear row number, caryopsis structure (hulled or naked grains), growth habit and flowering time. Wild barley has a two-rowed ear, each spikelet having a fertile central floret flanked by two infertile laterals and, when combined with a long awn, taking on an arrowlike form that is an effective aid to seed dispersal and burial [24]. Many cultivated barleys retain this ancestral head structure but in the derived six-rowed form the two lateral florets are fertile. Six-rowed barley is more often used as an animal or human feed, whereas two-rowed barley is favoured for malting and brewing. Wild and most cultivated barleys have hulled grains where the outer lemma and inner palea adhere to the pericarp epidermis at maturity. This form is favoured by brewers because the hull debris aids wort filtration, whereas the free-threshing 'naked' varieties are preferred when barley is grown for direct human consumption [25]. Wild barley has a winter growth habit, meaning that it requires vernalization - exposure to a prolonged period of cold - in order to promote subsequent flowering. The majority of European landraces lack this requirement and have a spring growth habit, where plants avoid periods of cold weather by completing their growth cycle during a single season, rather than overwintering as plants in a vegetative state [26]. Finally, most wild barleys display a photoperiod response that triggers flowering early in the season, before the conditions become too dry for further vegetative growth. Many cultivated barleys, especially landraces from northern Europe, are daylength nonresponsive, and so continue vegetative growth until flowering later in the summer [8], allowing them to take advantage of the longer growing season in northern Europe.
We compared phenotypic data for ear row number, caryopsis structure, growth habit and flowering time with the population memberships (Table 1). Populations 1-3 display a similar set of phenotypic features, most of these accessions being two-rowed (99, 98, 94% for populations 1-3, respectively), hulled (98, 100, 97%), spring habit (99, 93, 100%), and daylength nonresponsive (100, 95, 100%). Populations 6 and 9, which contain a high proportion of accessions with the facultative growth habit, also show some similarities when other phenotypes are considered, being largely six-rowed (87% for population 6, 94% for population 9), hulled (100, 99%) and daylength responsive (67, 91%). In population 6, however, the majority of the responsive accessions were members of group C (81% of responsive accessions), whereas in population 9 the majority were group B (80%), these two groups having distinct evolutionary histories [8]. No other similarities between the range of phenotypes displayed by different populations were apparent. These comparisons indicate that there is a degree of association between phenotype and population structure, suggesting that selection may have played a role in the origin and/or maintenance of one or more populations.
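A per-population tabulation of the kind summarised in Table 1 can be sketched as below. This is a minimal illustration, assuming one record per accession with its population and phenotype states; the field names and the records are hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict


def phenotype_percentages(records, trait):
    """Percentage of accessions in each population showing each trait state."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["population"]][record[trait]] += 1
    return {
        population: {
            state: 100.0 * n / sum(states.values())
            for state, n in states.items()
        }
        for population, states in counts.items()
    }


# Hypothetical accessions, invented for illustration only.
records = [
    {"population": 1, "row_number": "two"},
    {"population": 1, "row_number": "two"},
    {"population": 1, "row_number": "two"},
    {"population": 1, "row_number": "six"},
    {"population": 6, "row_number": "six"},
    {"population": 6, "row_number": "six"},
]

table = phenotype_percentages(records, "row_number")
print(table)  # {1: {'two': 75.0, 'six': 25.0}, 6: {'six': 100.0}}
```

The same function could be called once per trait (caryopsis structure, growth habit, daylength response) to build the full comparison.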
Association between climatic factors and population structure
The spread of agriculture involved the dispersal of barley well beyond the native range of the wild species into the variety of environments found in Europe. Adaptation to these new conditions is reflected in a north-south clinal distribution of landraces carrying the daylength responsive and nonresponsive genotypes of the photoperiod gene PPD-H1, with nonresponsive forms more common in the cooler northern latitudes [8]. In wild barley, there is a strong correlation between population structure and both temperature and precipitation [27]. It might therefore be anticipated that similar climatic correlations would be discernible in the population structure of cultivated barley.
Analysis of a series of climate variables supported these expectations (Table 2). Between-population variance was significantly higher than within-population variance for both spring and winter barleys. This trend was apparent at all months of the year, but for spring barleys was strongest during the growing season. For winter barleys the seasonal trend was less clear. The results indicate that the accessions in each population are adapted, at least to some extent, to their environment, but do not reveal whether this adaptation was a factor in the origin of individual populations, or merely reflects the more recent evolution of landraces to the environments in which they are being grown.
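The comparison of between-population and within-population variance can be illustrated with a one-way analysis of variance F statistic. This is a minimal sketch under the assumption that a one-way ANOVA is an appropriate stand-in; the text does not specify the exact test used, and the function name and the temperature values are hypothetical.

```python
def one_way_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Sum of squares between groups, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Sum of squares of each value around its own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, means)
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical mean monthly temperatures (degrees C) at collection sites,
# one list per population; invented for illustration only.
temperatures = [
    [4.0, 5.0, 6.0],
    [9.0, 10.0, 11.0],
    [14.0, 15.0, 16.0],
]

f_statistic = one_way_f(temperatures)
print(f_statistic)  # 75.0
```

A large F value indicates that populations differ more from one another in the climate variable than accessions differ within a population; assessing significance would additionally require the F distribution with the corresponding degrees of freedom.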
Origins of the populations
The relationships inferred from the groupings revealed by neighbour-joining (Figure 4), along with the phenotypic and geographic data, enable possible origins for the populations to be deduced.
Populations 1-3 are closely related, forming a distinct group in the neighbour-joining tree, and have near-identical phenotypes, virtually all of their members being two-rowed and hulled with spring growth habit and daylength nonresponsiveness (Table 1). We have previously shown that the nonresponsive phenotype of European barleys originated in the eastern Fertile Crescent and that the first nonresponsive plants probably entered Europe after the initial spread of agriculture [8]. This population of nonresponsive plants would almost certainly have had a distinct genetic makeup compared with the barley already present in Europe, which originated in the western Fertile Crescent. Populations 1-3 are almost exclusively nonresponsive (of the 99 accessions from these three populations that were typed, 97 possessed a nonresponsive haplotype) and could be the descendants of this original population of nonresponsive plants. These three populations possess the wild phenotypes for ear row number and caryopsis structure, but have acquired a spring growth habit, whereas their wild progenitors would have been winter types. The presence of some members of population 7 in the same region of the neighbour-joining tree as populations 1-3 is indicative of past cross-hybridization between these populations, which we discuss below.
Population 4 is also made up entirely of daylength nonresponsive accessions. This population is located some distance from populations 1-3 in the tree topology. Population 4 has a narrow geographical distribution in Switzerland and the Carpathian mountains (Figure 6) and is the only population in which the majority of accessions have naked rather than hulled grains. The apparent lack of a close relationship between population 4 and populations 1-3 might indicate that the former is not directly descended from the latter. Instead, population 4 could have become homogeneous for daylength nonresponsiveness via a founder effect operating on a population that contained a mixture of responsive and nonresponsive types. The tree topology suggests that this progenitor of population 4 might have been related to the modern populations 6 and/or 7.
Population 5 forms a separate cluster in the neighbour-joining tree, but has a mixture of phenotypes, including two- and six-rowed barleys, hulled and naked forms, spring and winter habits and both daylength responsive and nonresponsive types. There is little uniformity to the combination of phenotypes possessed by individual accessions, and the two deeply rooted groups within the population 5 cluster are equally mixed. These features, along with the broad geographical distribution, suggest that this population has not been subject to selection. With a crop such as barley, one way in which a distinct genetic population might arise is by geographical partitioning during or soon after the initial spread of agriculture. Populations might be expected to arise in this way if the process of spread involves two or more trajectories that isolate different parts of the crop so that cross-hybridization between the nascent populations is restricted. The original spread of agriculture into Europe is thought to have followed at least two trajectories, one along a northern route through the Balkans, Hungary and Danube and Rhine valleys, and the other through the Mediterranean basin to Italy and Iberia [28–30]. The lack of evidence for human or environmental selection might therefore indicate that population 5 is a relict of a population that originated from the geographical partitioning that occurred during this initial period of spread along the northern trajectory.
Another candidate as a relict is population 9, as the core area of distribution of this population lies within those regions of Mediterranean Europe where crops are thought to have spread via the southern trajectory. If the spread of cultivation along this trajectory resulted in evolution of a distinct population of barley then that population, at least initially, would have had a geographical distribution very similar to that displayed today by population 9.
Population 9 is predominantly six-rowed, hulled and daylength responsive, with a mixture of winter and spring types. Population 8 has similar phenotypic features to population 9 but contains a greater proportion of landraces with the winter growth habit and is exclusively daylength responsive, whereas population 9 includes some nonresponsive types. The possibility that the two populations might have an evolutionary relationship is supported by the STRUCTURE analysis, the two populations being grouped as one at K = 4, not splitting into separate populations until K = 6 (Figure 5), but the topology of the neighbour-joining tree gives less evidence for a close relationship.
The final two populations, 6 and 7, are grouped as one by STRUCTURE at K ≤ 8, and their accessions are located together in the neighbour-joining tree, albeit in three separate parts of the topology. Their geographical distributions are largely non-overlapping, with population 6 centred on the northern Balkans, Hungary and Romania, and population 7 on northern Europe, Scandinavia and the Baltic States. This suggests that originally they formed a single population spanning most of the eastern half of Europe, subsequently splitting into two, possibly by geographical partitioning. They are largely six-rowed, entirely hulled and predominantly of spring growth habit, but they contain a mixture of daylength responsive and nonresponsive forms. The latter are located almost exclusively within the lower part of the tree shown in Figure 4, alongside populations 1-3. The implication is that cross-hybridization resulted in transfer of the daylength nonresponsive phenotype from populations 1-3 to some members of populations 6 and 7. Daylength nonresponsiveness and spring growth habit can be advantageous for the successful growth of barley in the more northerly regions of Europe. Acquisition of daylength nonresponsiveness by a group of early barley landraces that had already evolved a spring growth habit might therefore have been one of the evolutionary adaptations that enabled cultivation of those plants to be extended further north into the regions now occupied by populations 6 and 7. It might therefore be hypothesized that these populations represent a derived form of barley that evolved during the spread of agriculture into central and northern Europe. We explore these and other archaeological interpretations of the population structure in more detail elsewhere (Jones et al., in preparation).