Source populations
Mating experiments and offspring scoring were conducted in three cohorts. For each cohort, pairs were set up in one year and offspring were scored in the following year(s). Cohort D was formed from field-caught individuals sampled in 2012, and its offspring were raised in 2013. Cohort L was formed from field-caught individuals sampled in 2016, and its offspring were raised in 2017. Cohort N was formed from the same field-caught individuals sampled in 2016, but its offspring were raised in 2018.
Cohort D was founded by parents sampled at a single location in the Swiss Alps in July 2012 (Crans Montana, Valais). Cohorts L and N were founded by parents sampled at two locations in the High Tauern mountain range in the Austrian Alps (Albitzen/Heiligenblut, Carinthia, and Peischlachalm/Kals am Großglockner, East Tyrol) in July/August 2016. Although the Austrian and Swiss sites are separated by about 400 km, the mechanistic underpinnings of the green-brown polymorphism are likely to be similar, even if allele frequencies might differ. Notably, sequence divergence between Alpine and Chinese populations was shown to be low (1.6% mitochondrial divergence [31]), making it unlikely that genetic differentiation within the Alps is high.
In all cases, we sampled second-last and last instar nymphae (nymphal stages 3 and 4) to ensure virginity. Subjects were raised to the imaginal stage in the laboratory. After the final molt, the sexes were separated before they became sexually mature.
General housing conditions
Subjects were maintained in cages of dimensions 22 × 16 × 16 cm³ and had access to ad libitum food (potted bundles of cut grass in vials filled with water and sealed by a cotton plug) and water (water-filled vials with a cotton plug in horizontal position such that the plug was soaked with water). Small pots (diameter 4 cm, height 3 cm) filled with a 1:1 mixture of sand and vermiculite were provided as egg-laying substrate to adult females. Sand pots were searched 1–2 times per week for egg pods, solid structures that typically contain up to 12 eggs. Eggs were left at room temperature for at least four weeks while they were regularly sprayed with water to keep them moist. Overwintering happened in petri dishes lined with moist filter paper at low positive temperatures (approximately +4 °C) in refrigerators. Diapause was ended after 3–8 months (1.5 years for cohort N) by transferring petri dishes to room temperature. Most hatching happened after about 10–14 days.
Morph classification
Color morphs of the club-legged grasshopper can be classified into green, brown, and pied (Fig. 1). Only green individuals show green color, while both brown and pied variants lack any distinct green tones. The green color morph is conspicuous and easily scored from the second or third nymphal stage (out of four nymphal stages in this species). Pied morphs are characterized by a transverse black-and-white pattern on the lateral sides of head and pronotum and usually a marked black frons [16]. Because of the lack of green, we consider pied as a variant of brown and therefore analyze the occurrence of green versus brown/pied (brown sensu lato) as well as the occurrence of pied relative to brown morphs. Pied morphs are most easily identified in last instar nymphae and in adult females. The identification of pied morphs is more difficult in adult males, because they lack the black frons and their lateral pattern is blurred by overall darkening. Pied morphs are also more difficult to score as young nymphae (before the last nymphal stage). Therefore, misscoring is more likely in pied vs. brown than in pied/brown vs. green morphs. Scoring of offspring was done blind to morph mating combination.
Mating design
We set up all pairwise combinations of green and brown individuals. For cohort L we did so in approximately equal numbers for all mating combinations. As pied individuals occur in much lower frequencies than brown and green ones [16], it was not possible to create a full-factorial mating design with sufficient sample size. We therefore created mostly pied × pied mating combinations to compare them to brown × brown families (here and throughout the text we name the color morph of the female first). Mating pairs were kept separately in cages. In cases where no copulation was observed and no egg pods were laid, males were replaced by another male of the same color morph.
Offspring rearing and scoring
Hatchlings were transferred from petri dishes to rearing cages on the day of hatching. Hatchlings from different egg pods were kept in separate cages, while hatchlings from the same egg pod were housed together. First instar hatchlings are pale at hatching, but turn dark within a few hours with no visible morph differences. Second instars (a stage usually reached after about one week) are still mostly dark and sometimes difficult to score. From the third instar onwards (reached about one week after the second instar) it is straightforward to distinguish green from brown individuals, and it is usually possible to recognize pied individuals by their black face mask and characteristic lateral pattern.
In cohorts D and N we scored adult individuals (cohort D) or last instar nymphae (cohort N) before they were used for other purposes. In cohort L we aimed to acquire color morph phenotypes as early as possible. We therefore scored color morph identities every 2–3 days from the third instar onwards, with scoring done by different observers. Each family was scored 3.9 ± 1.9 times (mean ± SD). Since individuals were kept in families and since it is not feasible to mark subjects individually across their different nymphal stages, morph data are resolved to the level of egg pods. Since we scored the number of offspring family-wise, we were able to identify misscoring that led to an increase in one morph category and a simultaneous decrease in another. In a total of 15 cases (3% of the total), there was a misscoring that involved green-to-brown (9 cases) or brown-to-green (6 cases) changes, which affected 9 out of 53 families (17%). In a total of 16 cases (6% of the ones that included brown/pied), there was a misscoring that involved brown-to-pied (12 cases) or pied-to-brown (4 cases) changes, which affected 11 out of 53 families (21%). At least with respect to the green morphs, these changes are more likely to reflect actual miscounting or misscoring than changes in color, since previous data based on carefully following single individuals did not show any green-brown or brown-green switch [29] and later counts in our data usually matched very well with counts prior to the putative misrecording. We resolved ambiguous cases by taking the numbers that were scored most often.
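The rule of taking the most frequently scored number can be illustrated with a minimal Python sketch. It is an illustration only, not the code used in the study. The function name, the example counts, and the tie-breaking rule (keep the most recent of the tied values) are assumptions made here, since the text does not state how ties were handled.

```python
from collections import Counter

def consensus_count(repeated_counts):
    """Return the count that was recorded most often for one morph in one family.

    Assumption (not stated in the text): if several values are tied for
    most frequent, the most recently recorded of the tied values is kept.
    """
    tally = Counter(repeated_counts)
    top = max(tally.values())
    tied = {value for value, n in tally.items() if n == top}
    for value in reversed(repeated_counts):
        if value in tied:
            return value

# Hypothetical repeated scores for one family: on the third scoring
# occasion one green nymph was apparently recorded as brown.
scores = {"green": [5, 5, 4, 5], "brown": [3, 3, 4, 3]}
consensus = {morph: consensus_count(counts) for morph, counts in scores.items()}
print(consensus)  # {'green': 5, 'brown': 3}
```

A simultaneous decrease in one category and increase in another, as in the third scoring occasion of this made-up example, is the signature of misscoring described above.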
Statistical analysis
We analyzed the effect of mating combination using generalized linear mixed models (GLMM; package lme4 version 1.1-20 [40] in R 3.6.0 [41]) with logit link and binomial error distribution. Cohort and cross-type were fitted as fixed effects, and mating pair identity and an observation-level identifier were fitted as random effects, the latter controlling for overdispersion. Likelihood ratio tests were used for hypothesis testing. We used the package RColorBrewer version 1.1-2 [42] for display in one of the figures.
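For orientation, a model of this type can be written out as follows. The notation is introduced here and is not taken from the original analysis. It assumes that observation i records y_i offspring of the focal morph out of n_i scored offspring, and that the likelihood ratio statistic is referred to a chi-squared distribution with degrees of freedom equal to the number of parameters dropped.

```latex
\begin{align}
y_{i} &\sim \mathrm{Binomial}(n_{i},\, p_{i}) \\
\operatorname{logit}(p_{i}) &= \beta_{0} + \beta_{\mathrm{cohort}[i]} + \beta_{\mathrm{cross}[i]}
  + u_{\mathrm{pair}[i]} + e_{i} \\
u_{\mathrm{pair}} &\sim \mathcal{N}\!\left(0, \sigma^{2}_{\mathrm{pair}}\right), \qquad
e_{i} \sim \mathcal{N}\!\left(0, \sigma^{2}_{e}\right) \\
\Lambda &= 2\left(\ell_{\mathrm{full}} - \ell_{\mathrm{reduced}}\right)
  \;\overset{\text{approx.}}{\sim}\; \chi^{2}_{\Delta k}
\end{align}
```

Here e_i is the observation-level random effect that absorbs overdispersion, and Δk is the difference in the number of estimated parameters between the full and the reduced model.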
Simulation
We used a simulation approach to assess whether the observed patterns of inheritance match monogenic inheritance with a dominant green allele. To this end we sampled simulated parental alleles at random from specified allele frequencies. One allele from each parent was then passed on at random to each offspring. The simulation closely followed the sampling design in terms of parental phenotypes as well as sample sizes of parents and offspring. We then assessed each pattern (see results section) in the offspring generation. The simulation was conducted across a range of allele frequencies (0.1 to 0.9 in steps of 0.01) with 1000 replicates per allele frequency. The distribution across all replicated simulations was compared to the empirically observed patterns. We calculated the ratio of simulated to observed values as a measure of agreement, with values of unity indicating a perfect match. Since allele frequencies may differ between sites, the simulation involved only cohorts L and N, which were sampled in Austria and for which accurate field morph frequencies were available.
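The logic of such a simulation can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the study. Parental genotypes are drawn at random from the allele frequency and redrawn until they match the observed parental phenotype, which is one possible way of following the parental phenotypes in the design. The mating design, the observed value, and all function names are hypothetical placeholders. The assessed pattern is simply the overall proportion of green offspring, summarized by the mean across replicates, and the number of replicates is reduced from 1000 to keep the example fast.

```python
import random
from statistics import mean

GREEN, BROWN = "green", "brown"

def sample_parent(phenotype, p_green, rng):
    """Draw two alleles at random; redraw until the genotype fits the phenotype.

    'G' is the green allele (assumed dominant), 'g' the alternative allele.
    """
    while True:
        genotype = tuple("G" if rng.random() < p_green else "g" for _ in range(2))
        if ("G" in genotype) == (phenotype == GREEN):
            return genotype

def simulate_family(mother, father, n_offspring, p_green, rng):
    """Return the number of green offspring in one simulated family."""
    mum = sample_parent(mother, p_green, rng)
    dad = sample_parent(father, p_green, rng)
    # Each offspring inherits one randomly chosen allele from each parent.
    return sum("G" in (rng.choice(mum), rng.choice(dad)) for _ in range(n_offspring))

def proportion_green(design, p_green, rng):
    """Overall proportion of green offspring across all families of a design."""
    n_green = sum(simulate_family(m, f, n, p_green, rng) for m, f, n in design)
    return n_green / sum(n for _, _, n in design)

def run_simulation(design, observed, n_replicates, rng):
    """Ratio of mean simulated to observed value for allele frequencies 0.10 to 0.90."""
    ratios = {}
    for step in range(10, 91):
        p_green = step / 100
        sims = [proportion_green(design, p_green, rng) for _ in range(n_replicates)]
        ratios[p_green] = mean(sims) / observed
    return ratios

# Hypothetical placeholders, NOT data from the study:
# (female morph, male morph, number of offspring scored)
HYPOTHETICAL_DESIGN = [(GREEN, BROWN, 20), (BROWN, GREEN, 15), (GREEN, GREEN, 18)]
HYPOTHETICAL_OBSERVED = 0.7  # made-up observed proportion of green offspring

ratios = run_simulation(HYPOTHETICAL_DESIGN, HYPOTHETICAL_OBSERVED,
                        n_replicates=50, rng=random.Random(1))
best = min(ratios, key=lambda p: abs(ratios[p] - 1))
print(f"allele frequency with ratio closest to 1: {best:.2f}")
```

Under this model brown × brown families can never produce green offspring, because both parents are homozygous for the recessive allele. This gives a simple internal check on the sketch.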