Skip to main content

Positive selection at sites of chemosensory genes is associated with the recent divergence and local ecological adaptation in cactophilic Drosophila



Adaptation to new hosts in phytophagous insects often involves mechanisms of host recognition by genes of sensory pathways. Most often the molecular evolution of sensory genes has been explained in the context of the birth-and-death model. The role of positive selection is less understood, especially associated with host adaptation and specialization. Here we aim to contribute evidence for this latter hypothesis by considering the case of Drosophila mojavensis, a species with an evolutionary history shaped by multiple host shifts in a relatively short time scale, and its generalist sister species, D. arizonae.


We used a phylogenetic and population genetic analysis framework to test for positive selection in a subset of four chemoreceptor genes, one gustatory receptor (Gr) and three odorant receptors (Or), for which their expression has been previously associated with host shifts. We found strong evidence of positive selection at several amino acid sites in all genes investigated, most of which exhibited changes predicted to cause functional effects in these transmembrane proteins. A significant portion of the sites identified as evolving positively were largely found in the cytoplasmic region, although a few were also present in the extracellular domains.


The pattern of substitution observed suggests that some of these changes likely had an effect on signal transduction as well as odorant recognition and protein-protein interactions. These findings support the role of positive selection in shaping the pattern of variation at chemosensory receptors, both during the specialization onto one or a few related hosts, but as well as during the evolution and adaptation of generalist species into utilizing several hosts.


Studying host adaptation is critical to understanding the ecological and evolutionary history of phytophagous insects [1,2,3,4,5]. Host shifts represent a distinct and sometimes novel set of complex challenges since new host plants can differ in nutritional content, chemical composition, microorganisms and defensive quality [6,7,8]. The use of novel resources when shifting to a new host can therefore affect survival of larva and adult stages, fecundity of feeding females as well as behavioral patterns of oviposition and mating, causing ultimately substantial impact on fitness. This has led to the hypothesis that in addition to genes involved in performance relative to a new environment, adaptation to new hosts often involves mechanisms of host preference [8,9,10]. Given their role in odorant and gustatory sensory and ultimately in the discrimination of suitable hosts, chemoreceptors have been the focus of several studies examining the evolutionary mechanisms underlying host adaptation [11,12,13,14,15,16,17].

Chemical stimuli recognition occurs through odorant (Or) and gustatory receptor (Gr) proteins, converting volatile and soluble chemicals from the environment into electrical outputs through nerve impulses [18, 19]. In insects, these proteins have seven trans-membrane domains located on the dendritic membranes of sensory neurons housed in sensilla of receptor organs, with a reverse orientation when compared to mammalian topology (COOH-tail extracellular) [20, 21]. While the number of genes in these families seems to be largely stable across Drosophila evolution, there have been constant gene loss and gain events through the birth-and-death model in insects [22,23,24,25,26,27,28] where divergence arises from the selection of beneficial mutations between paralogous genes [29]. This mode of evolution, as well as the clustered organization of these genes has provided strong support to the hypothesis that these loci play a critical role in ecological shifts and specialization [9, 11,12,13,14,15,16,17, 22, 23]. In particular, the ecological specialization observed within Drosophila has been suggested to have contributed to the higher pseudogenization levels observed in sensory genes [14, 15]. However, the analysis of 12 Drosophila species genomes [30], has shown that genome size or endemism could also predict pseudogenization rates, but the few specialist species compared and the lack of evolutionary data over short time scales has prevented efforts to disentangle these causes [31].

The dramatic heterogeneity in family size and divergence among chemosensory gene paralogs suggest that these gene families have evolved rapidly. However, the pattern of variation and divergence at a majority of genes are more consistent with purifying selection and only a small portion of genes have shown evidence of positive selection, some of which have been linked to specialization [14, 15, 29, 31]. Thus, specialist species such as D. sechellia and D. erecta have shown higher Ka/Ks ratios when compared with their closest generalists species [15]. The majority of amino acid sites showing signatures of positive selection in these families have been found in the cytoplasmic domain, suggesting that selection changes are associated with signal transduction rather than the detection of odorant molecules [32]. To date most comparisons have involved a few highly divergent species; here we examine the case of the cactophilic local specialist D. mojavensis and its generalist sister species D. arizonae to investigate the role of positive selection in four candidate chemoreceptor genes (one Gr and three Or). These species have adapted to develop and feed on necrotic cacti, showing partially overlapping distributions in xeric regions of southwestern United States and northwestern Mexico [33, 34]. The four candidate loci examined here (Or67c, Or83c1, Or83c2 and Gr63a) have shown a pattern of differential expression associated with host shifts suggesting their role in host preference, presumably by detecting volatiles associated with either the presence of nutrients or toxins produced in rotting cacti [35,36,37]. Or67c and Gr63a are known to play a role in detecting the by-products of alcoholic fermentation as well as CO2 [38, 39], while Or83c1 and Or83c2 have been suggested to play role in food detection [38, 40].

Drosophila mojavensis offers a unique case to study host adaptation since its biogeographical history has been shaped by multiple host shifts in a relatively short period of time [41]. Since its divergence from its sister species D. arizonae, the D. mojavensis lineage has expanded through a rapid specialization process (< 0.5 Mya; [42, 43]) producing four host populations adapted to different cactus species (Fig. 1): agria (Stenocereus gummosus) in Baja California; coastal prickly pear (Opuntia littoralis) in Santa Catalina Island; red barrel (Ferocactus cylindraceus) in the Mojave Desert and organpipe (S. thurberi) in the Sonora Desert population [33, 34]. We investigated patterns of molecular evolution in four chemoreceptor genes across the four host populations of D. mojavensis and the generalist D. arizonae to test for positive selection using a multiple approach of gene-wide, polymorphic-based, codon-based and functional tests. Then, we analyzed the distribution of sites showing evidence of positive selection on the structure of these trans-membrane proteins in order to test whether selection has been biased towards specific domains and discussed the implication of these results in terms of the role of positive selection in host preference adaptation.

Fig. 1
figure 1

Map showing geographic distributions and evolutionary relationships of D. mojavensis host populations, the cactus generalist D. arizonae and D. navojoa (Opuntia sp. breeder). D. mojavensis lines were sampled from each population: Baja California (BAJ), Sonora Desert (SON), Mojave Desert (MOJ) and Santa Catalina Island (CAT) (cactus hosts are indicated). D. arizonae lines were sampled from the Sonora Desert population while the sole D. navojoa line was sampled from Jalisco, Mexico. Map was made using public domain vector and raster data from Natural Earth ( and then modified to highlight species distributions


Genetic diversity, neutrality tests and divergence

For each locus examined, between 7 and 15 sequences were generated for each of the four D. mojavensis populations (Sonora Desert, SON; Santa Catalina Island, CAT; Mojave Desert, MOJ and Baja California, BAJ), as well as between 8 and 12 sequences of D. arizonae. After excluding introns, these amplicons had between 1170 bp for Or83c2 to 1479 bp for Gr63a (Table 1). Given the population structure in D. mojavensis (Additional file 1: Table S2) [44, 45], we sampled multiple individuals per population. Here we report average values per population of D. mojavensis to make them comparable with those in D. arizonae (Table 1). Between 12 and 35 polymorphic sites and 6 and 12 different haplotypes were detected across the loci and species examined (Table 1). Diversity estimates in D. arizonae often fall between the estimates for individual population in D. mojavensis (Additional file 1: Table S2). However, most genetic diversity estimates were higher in D. arizonae when compared with average estimates across D. mojavensis populations. Thus, haplotype diversity was higher in D. arizonae for all genes, genetic diversity for three genes and nucleotide diversity for two of genes analyzed (Table 1). Tajima’s D and Fu and Li’s D and F neutrality tests were not significant for most genes (Table 1). Fu’s FS statistic was significantly negative for the majority of genes in both species (Table 1). For one of the loci (Gr63a) and in one D. mojavensis population (MOJ) these tests were significantly positive (Additional file 1: Table S3), indicating a recent bottleneck causing a heterogeneous pattern in the demographic history of this species.

Table 1 Descriptive parameters for genetic diversity, neutrality tests and divergence. Results are shown for each gene in D. mojavensis and D. arizonae. In the case of D. mojavensis, genetic diversity and divergence values correspond to the average across the four populations (See Methods for abbreviations)

The genetic divergence assessed by comparing the number of synonymous (Ks) and nonsynonymous (Ka) substitutions ratio (Ka/Ks) with respect to the D. navojoa outgroup was often larger in D. arizonae for all loci with the exception of Or83c1 (Table 1). The genetic structure as assessed by ΦST indicated the existence of significant population structure in D. mojavensis, with ΦST values ranging between 0.26 and 0.59 (Table 1). Population structure was highly significant for all genes as well as the majority of pairwise comparisons between D. mojavensis populations (Additional file 1: Table S4).

Evidence of positive selection

All genes showed significant signatures of positive selection in at least one of the gene-wide (Table 2), codon-based and polymorphism-based (Table 3) tests performed. Codeml consistently showed all genes evolving under positive selection, ie. LRT favored the M8 selection model (Table 2). The genes Or67c and Or83c2 were also significant in one (PARRIS) and three (PARRIS, BSR, aBSREL) of the additional gene-wide selection tests performed respectively, as well as the McDonald-Kreitman test (Table 2). The BSR test identified selection in Or83c2 for specific branches belonging to the SON and CAT populations of D. mojavensis as well as branches belonging to D. arizonae lineage (Table 2). McDonald-Kreitman test was also performed for pairwise comparisons between host populations of D. mojavensis and D. arizonae, showing signatures of positive selection for SON and BAJ in Or67c and SON, BAJ and CAT in Or83c2 (Table 2 and Additional file 1: Table S5). Only the gene Or83c2 showed signatures of positive selection in both the D. mojavensis and D. arizonae lineage as determined by the McDonald-Kreitman tests and polarizing the fixed sites utilizing the outgroup species D. navojoa (Table 2 and Additional file 1: Table S5). Furthermore, all genes showed evidence of positive selection at individual sites in at least three of the codon-based selection tests performed and all tests provided significant values in at least two genes (Table 3). A combined analysis of these tests led us to identify four sites evolving under positive selection for the gene Gr63a (using the codemlBEB, REL, FUBAR and PRIME tests); nine sites in Or67c (using the codemlBEB, SLAC, FEL, REL, MEME, FUBAR and PRIME tests); eight sites in Or83c1 from (using the codemlBEB, FEL, IFEL, REL, MEME, FUBAR and PRIME tests); and eight sites positively selected in the gene Or83c1 (using the codemlBEB, SLAC, FEL, IFEL, REL, MEME, FUBAR and PRIME tests) (Table 3, Fig. 2). Though we identified several sites evolving under positive selection, we also investigated whether these changes led to changes in amino acid characteristics that might affect protein structure and hence function. Results of the PRIME analysis showed that 38% of the candidate sites across the four loci indicated possible substantial changes to protein structure (Table 3).

Table 2 Gene-wide tests of positive selection and McDonald-Kreitman test. A summary of P-values obtained for selection tests is shown for each gene. Results for McDonald-Kreitman test are shown for the whole gene as well as for each domain (Cytoplasmic, transmembrane or extracellular) (See Methods for abbreviations)
Table 3 Codon-based tests of positive selection. A summary of P-values and posterior probabilities obtained for selection tests are shown for those codons where at least one test resulted with significant value after FDR correction (global α = 0.05). The codon location in domains (Cyto, trans or extr) is indicated for each codon under positive selection. Biochemical properties significantly associated with amino acid sites under selection are indicated for PRIME test (See Methods for abbreviations)
Fig. 2
figure 2

Predicted topology of chemoreceptor proteins showing seven transmembrane domains and a cytoplasmic NH2-tail. Positively selected sites are highlighted in red. Topology was predicted by the TMHMM software

Since the performance of selection tests based on phylogenetic comparisons could be affected by recombination increasing the number of false positively selected sites [46,47,48], we tested for the presence of recombination breakpoints and found evidence for such events in Or83c1 and Or83c2. Therefore, we also performed the test of selection on these two loci within each linkage block. After controlling for recombination, two sites in Or83c1 and four sites in Or83c2 continued to indicate a pattern of positive selection. The remaining sites did not pass the threshold of the Bayes factors (BF > 100) in the REL test, but posterior probabilities remained over 0.8. Suggesting that, although the power of the test is decreased since blocks are analyzed instead of genes, a similar selective history affected the pattern of variation at these sites as well. For this reason and because the phylogenetic relationships among the three species did not change between partitions, suggesting that the assumptions of selection tests are not violated, only the results for the gene as a whole without partitions were further analyzed.

Signatures of selection among gene domains

Given the pattern observed at these loci suggesting positive selection between D. mojavensis and D. arizonae, we further tested whether selection had occurred within specific domains of these transmembrane proteins. Approximately 55% of the sites under positive selection are located in the cytoplasmic domain while only 24 and 21% are in the extracellular and transmembrane domains, respectively (Fig. 2). We assessed the distribution of selected sites across the protein domains of all four loci through a GLM model (Table 4). We found no significant gene effect, but a significant domain effect (Table 4), with the cytoplasmic domain enriched for positively selected sites when compared with the transmembrane and extracellular domains according to multiple comparisons (Fig. 3). We additionally tested for such structural heterogeneity in the level of selection by running the McDonald-Kreitman independently for each domain of each gene. These tests showed significant evidence of positive selection in the cytoplasmic domain for Or83c2 and marginally for Or67c, while no significant effects for the transmembrane and extracellular domains (Table 2). The proportion of sites under positive selection was just marginally significant when accounting for the domain size in the GLM model for Or83c2 and Or67c (Table 4), which points to selection also being influenced by domain size. This is expected since, larger domains have a higher number of sites available for selection to act on.

Table 4 ANOVA of the GLM analysis for the number of sites showing evidence of positive selection and the proportion of sites under positive selection (controlling for the number of amino acids within a domain) (See Methods for abbreviations)
Fig. 3
figure 3

Distribution of the sites showing evidence of positive selection by protein domains in each chemoreceptor gene: Cytoplasmic (Cyto), Extracellular (Extr) and Transmembrane (Trans). The number of sites under positive selection was significantly higher in the cytoplasmic region (Tukey test for multiple comparisons; p < 0.05)


It has been suggested that specialist species not only lose a subset of sensory pathway genes during host specialization, but some of those that are retained undergo periods of positive selection [14, 15]. Here we investigated the role of positive selection in the molecular evolution of four chemoreceptor genes (one Gr and three Or) during the evolution of D. mojavensis, a species that has experienced recent bouts of specialization resulting in a set of four cactus host populations, and its generalist sister D. arizonae. We used multiple approaches to test for positive selection including codon-based, gene-wide, polymorphism and functional analyses. All genes investigated showed evidence of positive selection at codon and gene-wide approaches while two loci indicated positive selection when polymorphism information was analyzed. Although most sites at these genes are likely to be evolving under purifying selection, our approach suggests evidence of positive selection shaping the evolution of several amino acid sites involved in molecular and physiological functions, which allowed us to investigate the distribution of candidate sites under positive selection across the domains of these chemosensory receptor proteins.

The level of genetic diversity was higher in D. arizonae when compared with the average value across the D. mojavensis populations. This pattern is in agreement with previous studies of other nuclear genes in these species [45], which is likely a consequence of a higher overall effective population size in D. arizonae [49, 50]. This pattern of variation is however dissimilar to what has been observed from mitochondrial loci [44], which suggests that recent evolutionary forces are shaping nuclear and mitochondrial genes differently [45]. Likewise, the Fu’s FS was significantly negative at most loci, which is evidence of excess number of alleles [51]. Since we controlled for the detected population structure performing tests per population, this pattern is more likely indicating that both species are undergoing population expansions. Although we did not find such evidence from all neutrality tests, most values were consistently negative and the Fu’s FS statistic is particularly sensitive to recent changes in population sizes [51]. This expansion in both species has been previously suggested to be associated with a shared biogeographical history in some of the D. mojavensis populations (SON and BAJ) and D. arizonae. As suggested by Machado et al. [45], these populations have experienced expansion and contraction events due to Pleistocene glaciation cycles that affected a great part of the biota [52,53,54]. This is consistent with our results per population since the sampled population of D. arizonae is from the Sonoran Desert and only the populations from Sonora and Baja California showed signatures of expansion in D. mojavensis. The sole significantly positive value indicates a deficiency of alleles in the Mojave Desert population of D. mojavensis at a single locus (Gr63a) [51]. Given that demographic changes would be expected to affect the majority of loci, this particular case could possibly be a result of balancing selection. Further evidence is needed to obtain a more complete understanding of what is shaping variation at this locus in this population.

Most of the evidence for the role of positive selection in Or and Gr genes during host specialization comes from the genome-wide analysis in Drosophila species, particularly the case of D. sechellia and D. erecta [14, 15, 31]. McBride and Arguello [15] found higher Ka/Ks ratios in chemoreceptors of these specialist species when compared with their closest generalists, although such ratios could be alternatively explained by relaxed purifying and not necessarily positive selection. Such comparisons have been based on genome-wide studies, where the lack of polymorphic variation and specialization events in short time scales have made the efforts to disentangle these two hypotheses difficult. Gardiner et al. [32] found 20 genes with evidence of functional divergence in a genome-wide study of 12 Drosophila genomes, but only six were considered to be under positive selection. This suggests that the number of genes positively selected in these families is often low in insects, however, as suggested by Gardiner et al. [32], the low power of the phylogenetic-based tests should caution one against making broad generalizations. Despite the low Ka/Ks values of these genes, suggesting that most sites are evolving under purifying selection, we were able to identify a number of candidate codons under selection, some of which might contribute to significant effects on protein structure and function. These changes may be involved in detection of new odor molecules as well as protein-protein interactions during signal transduction. These findings suggest that the low number of sites under positive selection previously reported in sensory genes (as recognized by Gardiner et al. [32]) may be at least in part due to the low power of the particular selection tests previously utilized and the absence population level polymorphism data.

According to the specialization hypothesis [14], beneficial mutation associated with amino acid changes that improve the recognition capacity of smell and taste receptors should be favored in specialists, resulting in a narrowly tuned receptor [11]. This proposed mechanism of evolution at chemosensory loci might also have shaped the variation of chemosensory loci during the specialization process of the four cactus host populations of D. mojavensis [33, 34, 55]. The necroses of each cactus differ in their chemical composition as well as the yeast and bacterial communities on which D. mojavensis feeds [56,57,58]. It has been shown that these populations not only differ in terms of their performance while utilizing different necrotic cacti [59], but also exhibit different behavioral and electrophysiological responses of olfactory organs, which suggests that their peripheral nervous system has been shaped by local ecological differences [59, 60].

Although we provide evidence supporting the hypothesis of positive selection shaping the evolution of amino acid sites involved in molecular functions of sensory genes during the host specialization in D. mojavensis, not all the analyses performed allow for the possibility to detect lineage specific changes. Nevertheless, for those analyses addressing specific lineages, the evidence supports the role of positive selection during the evolution of D. mojavensis. This was evident from the BSR and McDonald-Kreitman tests in Or83c2. While Or67c was not significant for the McDonald-Kreitman test when compared with D. navojoa, both species showed similar fixed differences. In addition, the changes specific to the D. mojavensis lineage are supported by strong genetic differentiation among D. mojavensis populations as estimated by ΦST, which are often larger than what has been previously reported for neutral SSR markers [61], suggesting divergent selection may be increasing ΦST values in coding sequences among D. mojavensis populations.

We also found selective changes exclusive to the D. arizonae lineage. This was evident from the Ka/Ks value using D. navojoa as an outgroup, which was often larger in D. arizonae as well as the McDonald-Kreitman tests for Or83c2 and Or67c (Table 2). These findings in D. arizonae pose an interesting question about the role of positive selection in the evolution of a generalist that has diverged from a specialized species. Cytological and molecular evidence suggests that the D. mojavensis/D. arizonae lineage diverged from the lineage of the Opuntia specialist D. navojoa approximately 4 Mya (reviewed in [41]). Hence D. arizonae has acquired the ability to use columnar cacti, as well as Opuntia, among other hosts. The potential role of positive selection during the acquisition of new hosts in the evolutionary transition between specialist to generalist species may favor mutations associated with amino acid changes involved in the chemical recognition of a wider repertoire of odors, a broadly tuned receptor [11]. Alternatively, the changes observed could be associated with the modulation of odorant binding signals during the early stages of the signal transduction cascade [62, 63].

The majority of the sites under positive selection in the Or and Gr genes map to the cytoplasmic domain of these transmembrane proteins (Tables 23 and 4). In general, this pattern was unexpected since this domain interacts with secondary messengers involved in signal transduction and was therefore expected to be conserved. Interestingly, this pattern seems to be more common than previously thought, since there has been a previous report of the cytoplasmic domain of chemosensory receptors evolving rapidly [32]. We only analyzed one Gr, which prevents us from including gene family as a factor in the GLM model. Despite such limitations of the model, the sole Gr gene analyzed here showed clear differences in the distribution of candidate sites when compared with Or genes. For example, Gr63a had the lowest number of candidate sites and was the only gene with the same number of candidate sites in extracellular and cytoplasmic regions, with no evidence of positive selection in the transmembrane domain. Such differences still need to be confirmed by comparing more Gr genes, but again, this result agrees with those patterns reported elsewhere [32]. As suggested by Gardiner et al. [32], this probably reflects the functional and molecular differences between Or and Gr receptors. Or genes are associated with volatile recognition while Gr recognize soluble ligands, pheromones and particularly for Gr63a, levels of CO2 [11, 20, 21, 38, 64] . In addition, the olfactory receptors requires the interaction with the universal receptor Or83b [11, 20, 21], while Gr63a requires interaction with Gr21a for its functionality [39, 65]. Such functional and molecular differences are likely to lead to distinct regions of constraint as reflected by the observed differences in the distribution of positively selected sites.

The majority of positively selected sites mapped to IL1 and IL2 loops, which would be expected since these loops are not involved in the dimerization with Or83b, the universal co-receptor [32]. Or67c was the only gene showing a candidate site in the IL3 domain. This loop interacts with IL3 of Or83b during dimerization [64], which make it likely to be under constraint. We also found candidate sites in the COOH-tail for Or83c1 and Or67c, which contrasts with the results of Gardiner et al. [32] since they found no candidate sites in this region for the ten Or genes under positive selection among the 12 Drosophila species studied. This terminal region is also important for coupling with Or83b, being conserved even in highly divergent genes [64].

The higher number of positively selected sites in the cytoplasmic domain suggest that selection has been acting primarily on the transduction of signal captured by chemoreceptors and not only its detection. Nevertheless, after controlling for the number of amino acids per domain, the GLM analysis was just marginally significant. Additionally, Or83c1 and Or67c also show divergent selection in the extracellular region (COOH-tail) essential for signal detection [64], suggesting that both signal detection and transduction have been shaped by positive selection during the molecular evolution of these chemoreceptors.


Local ecological adaptation can shape the evolutionary trajectories of myriads of traits and their respective underlying genetic architecture. Chemosensory pathways, especially those involved in host preference in insects, can be shaped by the selective forces associated with changes in host utilization. We have provided evidence supporting the hypothesis of host specialization as an evolutionary scenario that favors positive selection at specific amino acid sites in genes of sensory pathways. Given that selection has been particularly biased towards cytoplasmic domains at amino acid sites involved in structural and physiological functions, these selective forces appear to be largely shaping the post-odorant binding role of these chemosensory receptors, i.e. signal transduction. Our approach highlights the advantage of using a set of recently diverged species and populations locally adapted to distinct ecological conditions in the understanding of the mechanisms of chemosensory evolution. Furthermore, the incorporation of population level polymorphism data which allowed multiple and powerful approaches for detecting positive selection in an ecological context proved to be highly effective.



Isofemale lines from all four D. mojavensis cactus host populations were utilized in this study: Baja California (BAJ), Sonora Desert (SON), Mojave Desert (MOJ) and Santa Catalina Island (CAT) (Fig. 1). Additionally, sequences from a D. arizonae population from the Sonora Desert and one line of D. navojoa (Drosophila Species Stock Center 15081–1374.11), a cactophilic (Opuntia spp) species, from Jalisco, Mexico was utilized for comparisons requiring an outgroup. For both D. mojavensis and D. arizonae approximately 7–15 independent isofemale lines per population/species were sampled. These isofemale lines have been maintained in the laboratory for 60–120 generations and are highly inbred (Fig. 1). All stocks were maintained in 8-dram vials with banana-molasses media [66] in a 25°C incubator on a 14:10 h light:dark cycle.

Molecular procedures and alignments

DNA samples for each species were amplified via PCR using Apex Taq DNA Polymerase (Genesee Scientific) following manufacturer’s recommendations in a 25 μl final volume, involving 1 μl of DNA and 0.2 μM of each primer. Amplifications were performed using a PCR program of 35 cycles involving 94 °C for 30 s followed by the corresponding annealing temperature for each gene for 30 s (Additional file 1: Table S1), 72 °C for 1 min, and then a final extension at 72 °C for 10 min. PCR products were purified using QIAquick PCR Purification columns (Qiagen) following manufacturer’s recommended procedures and sanger sequenced in both directions (Operon Eurofins). Sequences were visually inspected for quality using Geneious [67] and aligned based on their amino acid sequences using ClustalW and manual verification. Haplotype phase was inferred for each sample using the software PHASE [68,69,70].

Genetic diversity, neutrality tests and divergence

The genetic diversity at nucleotide and haplotype levels was estimated for each species through a set of descriptive parameters using the software DNAsp v5 [71], including the average number of sequences (N), total number of sites (NS), number of variable sites (S), number of haplotypes (h), haplotype diversity (Hd), nucleotide diversity (π) and the average number of nucleotide differences (θW). In order to infer any deviation from neutral expectations at the population level, genetic structure or drastic changes in population sizes (recent bottlenecks or population expansion), a set of neutrality tests were performed, including Tajima’s DT [72], D and F tests [73] and Fu’s FS statistic [74] using the software DNAsp. All genes were assessed for recombination events using the GARD test, a genetic algorithm for recombination detection [75, 76] implemented in the package HyPhy v2 [77].

Divergence between D. mojavensis populations and D. arizonae was estimated by the number of synonymous (Ks) and nonsynonymous (Ka) substitutions ratio (Ka/Ks) with respect to the D. navojoa outgroup using the software DNAsp v5 [71]. The genetic structure among D. mojavensis populations was estimated through pairwise ΦST, an analogue version of the Wright’s fixation index FST [78, 79] which is estimated by an Analysis of Molecular Variance (AMOVA), taking into account information on the genetic distances among haplotypes as well as their frequencies following Excoffier et al. [80]. Significance was based on 10,000 permutations in the software Arlequin version 3.5.2 [81].

Phylogenetic inference

Because several of the selection tests we performed require a phylogenetic tree of aligned sequences, each gene was individually analyzed using Bayesian inference in MrBayes [82] after testing for the best evolutionary model inferred in the software jModelTest v0.1.1 [83]. Two independent runs were used from different starting points by a Metropolis-coupled Markov Chain Monte Carlo analysis with four chains, one cold and three incrementally heated (heating parameter = 0.1) for 10 million generations, sampling every 5,000th tree. Both runs were checked for convergence of the Markov chains with the standard deviation of split frequencies being less than 0.001. Parameter estimates were then analyzed in Tracer [84] to ensure that these had reached stable values with adequate mixing and ESSs above 200.

Assessing positive selection

Signatures of positive selection were assessed based on the nonsynonymous/synonymous substitution ratio (dN/dS = ω). When there is no selection, both classes of substitutions are expected to become fixed with the same probability (ω = 1). On the other hand, selection can increase fixation probabilities for nonsynonymous mutations because of selective advantages (Positive selection, ω > 1), or decrease these probabilities due to selective constrains (Purifying / negative selection, ω < 1). A number of methods have been developed for detecting positive selection based on ω ratios at different levels, whole alignment, phylogenetic branches, codons and combinations of those. Therefore, we tested for selection using multiple tests and four approaches: i) gene-wide; ii) polymorphic-based; iii) codon-based; iv) functional implications of candidate sites.

i) Gene-wide selection tests were performed in order to detect general signatures of selection at the gene level in the alignments or trees among sequences of the three species, without making any assumption about foreground branches. First, the software codeml, implemented in the package PAML v4 [85] was used to compare a null model (M7) in which ω is assumed to be beta-distributed among sites and a selection model (M8), where codons are allowed to have an extra category of positively selected sites ω > 1. The significance of these comparisons was assessed using a likelihood-ratio test [86]. We also used the package HyPhy v2 [77] in order perform a set of gene-wide tests for positive selection to complement and provide a comparison with the LRTs in codeml. Thus, we used the PARRIS test (a partitioning approach for robust inference of selection), which makes maximum likelihood inference of positive selection robust to the presence of recombination. The BUSTED test (Branch-site Unrestricted Statistical Test for Episodic Diversification) [87] was used to specifically assesses whether a gene has experienced positive selection on at least one site and one branch given a phylogeny. The BSR (Branch-Site Random effects likelihood test) [88] was used to test for episodic diversifying selection detecting linages with ω > 1. The aBSREL test (adaptive Branch-Site Random Effects Likelihood), the improved version of the commonly-used “branch-site” models, was used to test if positive selection had occurred on a proportion of branches. aBSREL models both site-level and branch-level ω heterogeneity, testing whether a proportion of sites have evolved under positive selection. An FDR correction was performed to account for multiple comparisons across genes for each test.

ii) A polymorphism-based approach was used to test for positive selection comparing the ω ratios within species with those between species using the McDonald-Kreitman test [89]. This test takes advantage of the intraspecific polymorphic variation, since the substitution ω ratios within species are expected to equal those ratios between species under a neutral scenario but differ under selective scenarios. An FDR correction was performed to account for multiple comparisons across genes for each test.

iii) We performed a set of codon-based selection tests for identifying specific sites within genes showing signatures of selection. Bayes Empirical Bayes (BEB) [90] was used to estimate the posterior probability of sites under positive selection following the M8 model from codeml results. Again, these results were compared with codon-based selection tests available in the package HyPhy, including SLAC (Single-Likelihood Ancestor Counting method) [91], FEL (Fixed Effects Likelihood method), IFEL (Internal Fixed Effects Likelihood), REL (Random Effects Likelihood method), MEME (Mixed Effects Model of Evolution) [92] and FUBAR tests (Fast Unconstrained Bayesian Approximation for inferring selection) [93].

iv) Finally, we investigated whether changes at the amino acid level were associated with altered biochemical properties using the PRIME test (PRoperty Informed Model of Evolution) following Conant-Stadler [94] and Atchley categories [95]. The five predefined amino acid properties proposed by Conant et al. [94] are: chemical composition of the side chain (C), residue polarity (RP), volume of the residue side chain (VR), isoelectric point of the side chain (IC), and hydropathy (H). The five composite properties proposed by Atchley et al. [95] are: polarity index (P), secondary structure factor (S), volume (V), refractivity (R), and isoelectric point (I). Although the method performs multiple tests on each site, p-values are reported after Bonferroni correction to control for the number of false positives.

Sites with evidence of positive selection were mapped onto the receptor’s topology predicted with the software TMHMM Server v2 ( and then diagrams of the 2D receptors structure were created using PROTTER [96].

Signatures of selection among gene domains

We used two approaches in order to investigate whether selection is acting on specific domains in transmembrane receptor genes (cytoplasmic, transmembrane or extracellular). First, the number of sites under positive selection was compared among genes and domains through a Generalized linear model using the R package GLM after data normalization using the logarithmic transformation. This analysis was done twice, for raw counts of sites under selection and after accounting for the size of each domain in order to test for the potential effect of domain size on the number of sites showing signatures of selection, since longer domains have higher chance to show sites under positive selection. Additionally, the McDonald-Kreitman test was performed for each domain independently.



adaptive Branch-Site random effects likelihood


Baja California


Branch-Site random effects likelihood test


Branch-site unrestricted statistical test for episodic diversification


Chemical composition of the side chain


Santa Catalina Island

Codeml M8 :

Codeml comparison between M7 and M8 models

Cyto :


D :

Fu and Li’s D test

D. ari :

D. arizonae

D. moj :

D. mojavensis

D. nav :

D. navojoa

D T :

Tajima’s D test

Extr :


F :

Fu and Li’s F test

F S :

Fu’s FS statistic



h :

Number of haplotypes

H d :

Haplotype diversity


Isoelectric point


Isoelectric point of the side chain

Ka/Ks :

Nonsynonymous (Ka)/synonymous (Ks) ratio

MK Cyto :

McDonald-Kreitman test for cytoplasmic domain

MK Extr :

McDonald-Kreitman test for extracellular domain

MK Total :

McDonald-Kreitman test for whole gene

MK Trans :

McDonald-Kreitman test for transmembrane domain


Mojave Desert

N :

Average number of sequences

N S :

Total number of sites


Polarity index


Partitioning approach for robust inference of selection


PRoperty Informed Model of Evolution




Residue polarity

S :

Number of variable sites


Secondary structure factor


Sonora Desert

Trans :





Volume of the residue side chain

θ W ns :

Theta Watterson for nonsynonymous sites

θ W s :

Theta Watterson for synonymous sites

θ W :

Total theta Watterson

π :

Total nucleotide diversity

π ns :

Nucleotide diversity for nonsynonymous sites

π s :

Nucleotide diversity for synonymous sites

Φ ST :

Genetic structure index


  1. Bernays EA, Chapman R. In: Bernays EA, Chapman RF, editors. Host-plant selection by phytophagous insects. New York: Chapman & Hall; 1994.

    Chapter  Google Scholar 

  2. Jaenike J. Host specialization in phytophagous insect. Annu Rev Ecol Syst. 1990;21:243–73.

    Article  Google Scholar 

  3. Bruce TJA, Wadhams LJ, Woodcock CM. Insect host location: a volatile situation. Trends Plant Sci. 2005;10:269–74.

    Article  CAS  Google Scholar 

  4. Matsubayashi KW, Ohshima I, Nosil P. Ecological speciation in phytophagous insects. Entomol Exp Appl. 2010;134:1–27.

    Article  Google Scholar 

  5. Hardy NB, Otto SP. Specialization and generalization in the diversification of phytophagous insects: tests of the musical chairs and oscillation hypotheses. Proc R Soc B Biol Sci. 2014;281:20132960.

    Article  Google Scholar 

  6. Agosta SJ. On ecological fitting, lant-insect associations, herbivore host shifts, and host plant selection. Oikos. 2006;114:556–65.

    Article  Google Scholar 

  7. Forbes AA, Devine SN, Hippee AC, Tvedte ES, Ward AKG, Widmayer HA, et al. Revisiting the particular role of host shifts in initiating insect speciation. Evolution (NY). 2016;71:1126–37.

    Article  Google Scholar 

  8. Thompson JN. Evolutionary ecology of the relationship between oviposition preference and performance of offspring in phytophagons insects. Entomol Exp Appl. 1988;47:3–14.

    Article  Google Scholar 

  9. Vertacnik KL, Linnen CR. Evolutionary genetics of host shifts in herbivorous insects: insights from the age of genomics. Ann NY Acad Sci. 2017;1389:186–212.

    Article  Google Scholar 

  10. Dworkin I, Jones CD. Genetic changes accompanying the evolution of host specialization in Drosophila sechellia. Genetics. 2009;181:721–36.

    Article  CAS  Google Scholar 

  11. Andersson MN, Lofstedt C, Newcomb RD. Insect olfaction and the evolution of receptor tuning. Front Ecol Evol. 2015;3:1–14.

    Google Scholar 

  12. Bohbot JD, Pitts RJ. The narrowing olfactory landscape of insect odorant receptors. Front Ecol Evol. 2015;3:1–10.

    Article  Google Scholar 

  13. Matsuo T. Genes for host-plant selection in Drosophila. J Neurogenet. 2008;22:195–210.

    Article  CAS  Google Scholar 

  14. McBride CS. Rapid evolution of smell and taste receptor genes during host specialization in Drosophila sechellia. Proc Natl Acad Sci U S A. 2007;104:4996–5001.

    Article  CAS  Google Scholar 

  15. McBride CS, Arguello JR. Five Drosophila genomes reveal nonneutral evolution and the signature of host specialization in the chemoreceptor superfamily. Genetics. 2007;177:1395–416.

    Article  CAS  Google Scholar 

  16. Simon JC, D’alençon E, Guy E, Jacquin-Joly E, Jaquiéry J, Nouhaud P, et al. Genomics of adaptation to host-plants in herbivorous insects. Brief Funct Genomics. 2015;14:413–23.

    Article  CAS  Google Scholar 

  17. Smadja CM, Canbäck B, Vitalis R, Gautier M, Ferrari J, Zhou JJ, et al. Large-scale candidate gene scan reveals the role of chemoreceptor genes in host plant specialization and speciation in the pea aphid. Evolution (NY). 2012;66:2723–38.

    Article  Google Scholar 

  18. Amrein H, Thorne N. Gustatory perception and behavior in Drosophila melanogaster. Curr Biol. 2005;15:673–84.

    Article  Google Scholar 

  19. Grabe V, Sachse S. Fundamental principles of the olfactory code. BioSystems. 2018;164:94–101.

    Article  Google Scholar 

  20. Touhara K, Vosshall LB. Sensing odorants and pheromones with chemosensory receptors. Annu Rev Physiol. 2009;71:307–32.

    Article  CAS  Google Scholar 

  21. Vosshall LB, Stocker RF. Molecular architecture of smell and taste in Drosophila. Annu Rev Neurosci. 2007;30:505–33.

    Article  CAS  Google Scholar 

  22. Sánchez-Gracia A, Vieira FG, Rozas J. Molecular evolution of the major chemosensory gene families in insects. Heredity (Edinb). 2009;103:208–16.

    Article  Google Scholar 

  23. Sánchez-Gracia A, Vieira FG, Almeida FC, Rozas J. Comparative genomics of the major chemosensory gene families in arthropods. Encyclopedia of Life Sciences (ELS). Wiley, Ltd: Chichester. 2011.

  24. Benton R. Multigene family evolution: perspectives from insect chemoreceptors. Trends Ecol Evol. 2015;30:590–600.

    Article  Google Scholar 

  25. Guo S, Kim J. Molecular Evolution of Drosophila Odorant Receptor Genes. Mol Biol Evol. 2007;24:1198–207.

    Article  CAS  Google Scholar 

  26. Almeida FC, Sánchez-Gracia A, Campos JL, Rozas J. Family size evolution in drosophila chemosensory gene families: a comparative analysis with a critical appraisal of methods. Genome Biol Evol. 2014;6:1669–82.

    Article  Google Scholar 

  27. Nozawa M, Nei M. Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci U S A. 2007;104:7122–7.

    Article  CAS  Google Scholar 

  28. Vieira FG, Rozas J. Comparative genomics of the odorant-binding and chemosensory protein gene families across the arthropoda: Origin and evolutionary history of the chemosensory system. Genome Biol Evol. 2011;3:476–90.

    Article  CAS  Google Scholar 

  29. Vieira FG, Sánchez-Gracia A, Rozas J. Comparative genomic analysis of the odorant-binding protein family in 12 Drosophila genomes: Purifying selection and birth-and-death evolution. Genome Biol. 2007;8:R235.

    Article  Google Scholar 

  30. Consortium (Drosophila 12 Genomes consortium). Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–18.

    Article  Google Scholar 

  31. Gardiner A, Barker D, Butlin RK, Jordan WC, Ritchie MG. Drosophila chemoreceptor gene evolution: Selection, specialization and genome size. Mol Ecol. 2008;17:1648–57.

    Article  CAS  Google Scholar 

  32. Gardiner A, Butlin RK, Jordan WC, Ritchie MG. Sites of evolutionary divergence differ between olfactory and gustatory receptors of Drosophila. Biol Lett. 2009;5:244–7.

    Article  CAS  Google Scholar 

  33. Heed WB. Ecology and genetics of Sonoran Desert Drosophila. In: Brussard PF, editor. Ecol. Genet. Interface. New York: Springer-Verlag; 1978. p. 109–26.

    Chapter  Google Scholar 

  34. Ruiz A, Heed WB, Wasserman M. Evolution of the mojavensis cluster of cactophilic Drosophila with descriptions of two new species. J Hered. 1990;81:30–42.

    Article  CAS  Google Scholar 

  35. Matzkin LM, Markow TA. Transcriptional differentiation across the four subspecies of Drosophila mojavensis. In: Michalak P, editor. Speciat. Nat. Process. Genet. Biodivers: Nova Science Publishers, Inc; 2013. p. 119–35.

  36. Matzkin LM. Population transcriptomics of cactus host shifts in Drosophila mojavensis. Mol Ecol. 2012;21:2428–39.

    Article  Google Scholar 

  37. Matzkin LM, Watts TD, Bitler BG, Machado CA, Markow TA. Functional genomics of cactus host shifts in Drosophila mojavensis. Mol Ecol. 2006;15:4635–43.

    Article  CAS  Google Scholar 

  38. Mansourian S, Stensmyr MC. The chemical ecology of the fly. Curr Opin Neurobiol. 2015;34:95–102.

    Article  CAS  Google Scholar 

  39. Turner SL, Ray A. Modification of CO2 avoidance behaviour in Drosophila by inhibitory odorants. Nature. 2009;461:277–81.

    Article  CAS  Google Scholar 

  40. Ronderos DS, Lin C-C, Potter CJ, Smith DP. Farnesol-detecting olfactory neurons in Drosophila. J Neurosci. 2014;34:3959–68.

    Article  Google Scholar 

  41. Matzkin LM. Ecological genomics of host shifts in Drosophila mojavensis. In: Landry C, Aubin-Horth N, editors. Adv. Exp. Med. Biol, vol. 781. New York: Springer; 2014. p. 233–47.

    Google Scholar 

  42. Matzkin LM. The molecular basis of host adaptation in cactophilic drosophila: Molecular evolution of a glutathione S-transferase gene (GstD1) in Drosophila mojavensis. Genetics. 2008;178:1073–83.

    Article  CAS  Google Scholar 

  43. Smith G, Lohse K, Etges WJ, Ritchie MG. Model-based comparisons of phylogeographic scenarios resolve the intraspecific divergence of cactophilic Drosophila mojavensis. Mol Ecol. 2012;21:3293–307.

    Article  Google Scholar 

  44. Reed LK, Nyboer M, Markow TA. Evolutionary relationships of Drosophila mojavensis geographic host races and their sister species Drosophila arizonae. Mol Ecol. 2007;16:1007–22.

    Article  CAS  Google Scholar 

  45. Machado CA, Matzkin LM, Reed LK, Markow TA. Multilocus nuclear sequences reveal intra- and interspecific relationships among chromosomally polymorphic species of cactophilic Drosophila. Mol Ecol. 2007;16:3009–24.

    Article  CAS  Google Scholar 

  46. Anisimova M, Nielsen R, Yang Z. Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003;164:1229–36.

    CAS  PubMed  PubMed Central  Google Scholar 

  47. Arenas M, Posada D. Coalescent simulation of intracodon recombination. Genetics. 2010;184:429–37.

    Article  CAS  Google Scholar 

  48. Arenas M, Posada D. The influence of recombination on the estimation of selection from coding sequence alignments. In: Fares MA, editor. Nat. Sel. Methods Appl. Boca Raton: CRC Press/Taylor & Francis; 2014. p. 112–25.

    Google Scholar 

  49. Matzkin LM. Population genetics and geographic variation of Alcohol dehydrogenase (Adh) paralogs and Glucose-6-Phosphate dehydrogenase (G6pd) in Drosophila mojavensis. Mol Biol Evol. 2004;21:276–85.

    Article  CAS  Google Scholar 

  50. Matzkin LM, Eanes WF. Sequence variation of alcohol dehydrogenase (Adh) paralogs in cactophilic Drosophila. Genetics. 2003;163:181–94.

    CAS  PubMed  PubMed Central  Google Scholar 

  51. Ramírez-Soriano A, Ramos-Onsins SE, Rozas J, Calafell F, Navarro A. Statistical power analysis of neutrality tests under demographic expansions, contractions and bottlenecks with recombination. Genetics. 2008;179:555–67.

    Article  Google Scholar 

  52. Nason JD, Hamrick JL, Fleming TH. Historical vicariance and postglacial colonization effects on the evolution of genetic structure in Lophocereus, a Sonoran desert columnar cactus. Evolution (N Y). 2002;56:2214–26.

    Google Scholar 

  53. Dyer RJ, Nason JD. Population Graphs: the graph theoretic shape of genetic structure. Mol Ecol. 2004;13:1713–27.

    Article  Google Scholar 

  54. Smith CI, Farrell BD. Range expansions in the flightless longhorn cactus beetles, Moneilema gigas and Moneilema armatum, in response to Pleistocene climate changes. Mol Ecol. 2005;14:1025–44.

    Article  CAS  Google Scholar 

  55. Fellows DP, Heed WB. Factors affecting host plant selection in Desert- adapted cactiphilic Drosophila. Ecology. 1972;53:850–8.

    Article  Google Scholar 

  56. Kircher HW. Chemical composition of cacti and its relationship to Sonoran Desert Drosophila. In: Barker JSF, Starmer WT, editors. Ecol. Genet. Evol. Cactus-Yeast-Drosophila Syst. New York: Academic Press; 1982. p. 143–58.

    Google Scholar 

  57. Fogleman JC, Starmer WT, Heed WB. Larval selectivity for yeast species by Drosophila mojavensis in natural substrates. Proc Natl Acad Sci. 1981;78:4435–9.

    Article  CAS  Google Scholar 

  58. Fogleman JC, Danielson PB. Chemical interactions in the cactus-microorganism- Drosophila model system of the Sonoran Desert. Am Zool. 2001;41:877–89.

    CAS  Google Scholar 

  59. Date P, Crowley-Gall A, Diefendorf AF, Rollmann SM. Population differences in host plant preference and the importance of yeast and plant substrate to volatile composition. Ecol Evol. 2017;7:3815–25.

    Article  Google Scholar 

  60. Date P, Dweck HKM, Stensmyr MC, Shann J, Hansson BS, Rollmann SM. Divergence in olfactory host plant preference in D. mojavensis in response to cactus host use. PLoS One. 2013;8:1–10.

    Article  Google Scholar 

  61. Ross CL, Markow TA. Microsatellite variation among diverging populations of Drosophila mojavensis. J Evol Biol. 2006;19:1691–700.

    Article  CAS  Google Scholar 

  62. Koshland D, Goldbeter A, Stock J. Amplification and adaptation in regulatory and sensory systems. Science. 1982;217:220–5.

    Article  CAS  Google Scholar 

  63. Mobashir M, Schraven B, Beyer T. Simulated evolution of signal transduction networks. PLoS One. 2012;7:1–10.

    Article  Google Scholar 

  64. Benton R, Sachse S, Michnick SW, Vosshall LB. Atypical membrane topology and heteromeric function of Drosophila odorant receptors in vivo. PLoS Biol. 2006;4:240–57.

    Article  CAS  Google Scholar 

  65. Isono K. Molecular and cellular designs of insect taste receptor system. Front Cell Neurosci. 2010;4:1–16.

    Google Scholar 

  66. Coleman JM, Benowitz KM, Jost AG, Matzkin LM. Behavioural evolution accompanying host shifts in cactophilic Drosophila larvae. Ecol Evol. 2018;

    Article  Google Scholar 

  67. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–9.

    Article  Google Scholar 

  68. Stephens M, Smith NJ, Donnelly P. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001;68:978–89.

    Article  CAS  Google Scholar 

  69. Stephens M, Donnelly P. A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet. 2003;73:1162–9.

    Article  CAS  Google Scholar 

  70. Stephens M, Scheet P. Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am J Hum Genet. 2005;76:449–62.

    Article  CAS  Google Scholar 

  71. Librado P, Rozas J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.

    Article  CAS  Google Scholar 

  72. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–95.

    CAS  PubMed  PubMed Central  Google Scholar 

  73. Fu Y-X, Li W-H. Statistical tests of neutrality of mutations. Genetics. 1993;133:693–709.

    CAS  PubMed  PubMed Central  Google Scholar 

  74. Fu Y-X. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 1997;147:915–25.

    CAS  PubMed  PubMed Central  Google Scholar 

  75. Kosakovsky-Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. GARD: a genetic algorithm for recombination detection. Bioinformatics. 2006;22:3096–8.

    Article  Google Scholar 

  76. Kosakovsky-Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. Automated phylogenetic detection of recombination using a genetic algorithm. Mol Biol Evol. 2006;23:1891–901.

    Article  Google Scholar 

  77. Kosakovsky-Pond SL, Frost SDW, Muse SV. HyPhy: Hypothesis testing using phylogenies. Bioinformatics. 2005;21:676–9.

    Article  Google Scholar 

  78. Wright S. The genetic structure of populations. Ann Eugen. 1951;15:323–54.

    Article  CAS  Google Scholar 

  79. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution (N. Y). 1984;38:1358–70.

    CAS  Google Scholar 

  80. Excoffier L, Smouse PE, Quattro JM. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics. 1992;131:479–91.

    CAS  PubMed  PubMed Central  Google Scholar 

  81. Excoffier L, Lischer HEL. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10:564–7.

    Article  Google Scholar 

  82. Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001;17:754–5.

    Article  CAS  Google Scholar 

  83. Posada D. jModelTest: Phylogenetic model averaging. Mol Biol Evol. 2008;25:1253–6.

    Article  CAS  Google Scholar 

  84. Drummond A, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7:1–8.

    Article  Google Scholar 

  85. Yang Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.

    Article  CAS  Google Scholar 

  86. Zhang J, Nielsen R, Yang Z. Evaluation of an improved Branch-Site Likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–9.

    Article  CAS  Google Scholar 

  87. Murrell B, Weaver S, Smith MD, Wertheim JO, Murrell S, Aylward A, et al. Gene-wide identification of episodic selection. Mol Biol Evol. 2015;32:1365–71.

    Article  CAS  Google Scholar 

  88. Kosakovsky-Pond SL, Murrell B, Fourment M, Frost SDW, Delport W, Scheffler K. A random effects Branch-Site model for detecting episodic diversifying selection. Mol Biol Evol. 2011;28:3033–43.

    Article  CAS  Google Scholar 

  89. McDonald JH, Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991;351:652–4.

    Article  CAS  Google Scholar 

  90. Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–18.

    Article  CAS  Google Scholar 

  91. Kosakovsky-Pond SL, Frost SDW. Not so different after all: A comparison of methods for detecting amino acid sites under selection. Mol Biol Evol. 2005;22:1208–22.

    Article  Google Scholar 

  92. Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Kosakovsky-Pond SL. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012;8:e1002764.

    Article  CAS  Google Scholar 

  93. Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky-Pond SL, et al. FUBAR: A fast, unconstrained bayesian Approximation for inferring selection. Mol Biol Evol. 2013;30:1196–205.

    Article  CAS  Google Scholar 

  94. Conant GC, Wagner GP, Stadler PF. Modeling amino acid substitution patterns in orthologous and paralogous genes. Mol Phylogenet Evol. 2007;42:298–307.

    Article  CAS  Google Scholar 

  95. Atchley WR, Zhao J, Fernandes AD, Druke T. Solving the protein sequence metric problem. Proc Natl Acad Sci. 2005;102:6395–400.

    Article  CAS  Google Scholar 

  96. Omasits U, Ahrens CH, Müller S, Wollscheid B. Protter: Interactive protein feature visualization and integration with experimental proteomic data. Bioinformatics. 2014;30:884–6.

    Article  CAS  Google Scholar 

Download references


We would like to thank Laurel Brandsmeier for the laboratory work she performed on this project. We also would like to thank two anonymous reviewers whose comments improved this manuscript.


This project was initially supported by a National Science Foundation grant DEB-1219387 to LMM and later by a National Science Foundation grant IOS-1557697 to LMM.

Availability of data and materials

All nucleotide data are available for each gene in GenBank (Gr63a: MH750445 - MH750487, Or67c: MH750488 - MH750550, Or83c: MH750551 - MH750612 and Or83c2: MH750613 - MH750680).

Author information

Authors and Affiliations



The project was conceived and designed by LMM. CWA performed the majority of the laboratory work. FD performed most of the statistical analyses. All authors participated in the data analysis and the writing of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Luciano M. Matzkin.

Ethics declarations

Ethics approval and consent to participate

This section is not applicable to the present study.

Consent for publication

This section is not applicable to the present study.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:

Table S1. Primer sequences (Forward and Reverse) for each gene and species. Table S2. Descriptive parameters for haplotype, nucleotide diversity and neutrality tests. Results are shown for each gene in each D. mojavensis population and D. arizonae (See Methods for abbreviations). Table S3. Neutrality tests for each gene in each D. mojavensis population and D. arizonae (See Methods for abbreviations). Table S4. Divergence between D. mojavensis and D. arizonae species and populations within D. mojavensis. Ka/Ks ratios and genetic structure estimated by ΦST are given for pairwise comparisons between species and populations for each gene (See Methods for abbreviations). Table S5. McDonald-Kreitman test for each D. mojavensis and D. arizonae. A summary of P-values is shown for each gene using D. arizonae and D. navojoa as an outgroup (See Methods for abbreviations). (DOCX 50 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Diaz, F., Allan, C.W. & Matzkin, L.M. Positive selection at sites of chemosensory genes is associated with the recent divergence and local ecological adaptation in cactophilic Drosophila. BMC Evol Biol 18, 144 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: