Skip to main content

Divergence and adaptive evolution of the gibberellin oxidase genes in plants



The important phytohormone gibberellins (GAs) play key roles in various developmental processes. GA oxidases (GAoxs) are critical enzymes in GA synthesis pathway, but their classification, evolutionary history and the forces driving the evolution of plant GAox genes remain poorly understood.


This study provides the first large-scale evolutionary analysis of GAox genes in plants by using an extensive whole-genome dataset of 41 species, representing green algae, bryophytes, pteridophyte, and seed plants. We defined eight subfamilies under the GAox family, namely C19-GA2ox, C20-GA2ox, GA20ox,GA3ox, GAox-A, GAox-B, GAox-C and GAox-D. Of these, subfamilies GAox-A, GAox-B, GAox-C and GAox-D are described for the first time. On the basis of phylogenetic analyses and characteristic motifs of GAox genes, we demonstrated a rapid expansion and functional divergence of the GAox genes during the diversification of land plants. We also detected the subfamily-specific motifs and potential sites of some GAox genes, which might have evolved under positive selection.


GAox genes originated very early—before the divergence of bryophytes and the vascular plants and the diversification of GAox genes is associated with the functional divergence and could be driven by positive selection. Our study not only provides information on the classification of GAox genes, but also facilitates the further functional characterization and analysis of GA oxidases.


The gibberellins (GAs), one of important phytohormones, form a large family of diterpene hormones that are involved in various growth and developmental processes in plants [16]. Among hundreds of plant GAs, only a few, such as GA1 and GA4 are bioactive [6]. The function of bioactive GAs generally depends on their concentration in a given tissue, and this is mainly affected by their biosynthesis and/or deactivation [7]. In Arabidopsis, the synthesis pathway of GAs has been elucidated, and most genes encoding enzymes to catalyze GA biosynthesis have been identified [711]. In the GA synthesis pathway, three classes of enzyme (i.e., Terpene synthases (TPSs), P450s and GA oxidases (GAoxs)) are required for the biosynthesis of bioactive GAs from geranylgeranyl diphosphate (GGDP, a common precursor of terpenes in plants), and the pathway can be divided into two main steps: early and late. The early steps are catalyzed by a series of enzymes, i.e., CPS, KS, KO, and KAO, which are all encoded by single genes. The enzymes catalyzing later steps (i.e., GA2 oxidase, GA20 oxidase, and GA3 oxidase) are encoded by small gene families, and these oxidases are all soluble 2-oxoglutarate-dependent dioxygenases (2OGDs) [7, 10, 12, 13]. In comparison with the biosynthesis enzymes in the early steps, those in the later steps are differentially regulated by developmental and environmental cues, and these enzymes play key roles in the regulation of bioactive GA levels. For example, the loss-of-function in GA20 oxidase (GA20ox) and GA3 oxidase (GA3ox) can generate dwarf phenotypes, such as the well-known Green Revolution sd-1 [10, 1417]. Interestingly, the function of GA2 oxidase (GA2ox) is to decrease the levels of active GAs rather than increasing the GA level as do GA20ox and GA3ox [18]. So far, most studies have focused on the functions of GA2ox, GA20ox and GA3ox [10, 12, 1821]. However, little is known about the evolutionary history of the GAox genes [13, 2224].

GAoxs are ubiquitous in vascular plants, but no GAs has ever been isolated and chemically identified from bryophytes [2528]. Those authors hypothesized that GAs may have first appeared in ancient pteridophytes and that the hormonal signaling pathway developed later during the evolution of land plants. However, several putative GA biosynthetic genes have been found in the Physcomitrella patens genome, suggesting the GAs could appear in the earlier lineages of land plants [29, 30]. Although the pathways leading to production of bioactive GAs have been studied in plants [3133], the evolutionary history of the GA synthetic pathway and its catalyzing enzyme genes, especially of GAox genes remains unclear.

The GAox family has been evolutionarily analyzed, but only focusing on the GA20ox, GA3ox and GA2ox subfamilies of a few angiosperm species [13, 23, 24]. Han and Zhu (2011) identified 61 GA oxidase genes from rice (Oryza sativa), Arabidopsis and soybean (Glycine max), and performed the phylogenetic analysis of these GAox genes. They were grouped into four clades (GA20ox, GA3ox, C19-GA2ox, and C20-GA2ox.), but their relationship remains unclear, especially of new putative GA20ox homologs. For example, four new genes were described under the GA20ox subfamily (i.e., OsGA20ox5, OsGA20ox6, OsGA20ox7 and OsGA20ox8), but three of them (OsGA20ox5, OsGA20ox6, and OsGA20ox8) did not reside within the GA20ox subgroup according to their phylogenetic tree. Meanwhile, Han and Zhu (2011) showed that the GA2ox subfamily is not monophyletic [23], which was supported by Giacomelli and her collegues’ study [24]. In addition, there is controversy over the relationship between GA20ox, GA3ox, C19-GA2ox and C20-GA2ox [23, 24]. Giacomelli et al. (2013) found the sister relationship between C19-GA2ox and GA3ox, while the close relationship was found between GA3ox and C20-GA2ox in Han and Zhu’s study. Not only the relationship between GAox genes but the evolutionary history and expansion mechanisms of the GAox genes are also poorly understood.

The GAox family belongs to the 2OGDs superfamily. The catalytic core of GAoxs has a double-stranded β-helix (DSBH) fold containing a HX[DE] dyad (where X could be any amino acid) and a conserved carboxy-terminal histidine which together chelate a single iron atom. Specifically, the DSBH region has seven conserved strands that are common to all these kinds of proteins and is arranged in two sheets in a jelly-roll topology [34]. To date, the function of several GAox genes has been determined but their gene structure not studied systematically [10, 20, 3540]. Of GA oxidases, GA 20ox was first isolated from cotyledons of immature pumpkin seeds by Lange et al. (1994) [41]. Structurally this gene has characteristic motifs that are highly conserved in 2OGDs. Later on, other GAox genes were isolated and characterized in plants, such as Carrizo citrange [42], poplar [43], Phaseolus coccineus and Arabidopsis [19], rice [38, 44, 45], spinach [37, 46], watermelon [39], pea [47, 48]. On the characteristic motifs of different GA oxidases, no consensus has come out. Meanwhile, neither diagnostic sequences nor gene structures have been established for identification of GAox subfamilies. For example, four GAox homologs were found in the Physcomitrella patens genome, but it is hard to conclude what they are and which gene subfamily they belong to [30]. Thus, characterizing the motif specific to every gene subfamily would be of particular value. In brief, the study of gene structures and conserved motifs will not only provide an insight into the classification of GAox genes, but also their evolutionary history and functional diversification.

Almost all aspects of the functional divergence of genes are in some way linked to gene duplications, which occur ubiquitously [4954]. What was the evolutionary force driving the functional divergence of genes or gene families? Generally, gene duplications are thought to be an important precursor to the functional divergence of genes, and selection is the main evolutionary force driving gene function diversification [22, 52, 5559]. We investigated whether the functional divergence of GAox genes was driven by positive selection, in particular for four newly described ones. We first analyzed the distribution of positively selected amino acids in the characteristic motifs of genes. Then, we used the programs FunDi and GroupSim to validate the hypothesis that the conserved domains were associated with divergence (change) in function among GAs, and performed the GroupSim to identify the specificity determining positions (SDPs) [6062].

With the availability of more whole genome sequence data and new genomics tools developed in model plants, the study of GA biosynthetic pathways, GAox genes and their evolution is now feasible. To reach the above objectives, we performed a comprehensive analysis of the GAox gene homologs in 41 species, using whole genome sequencing data, covering the main lineages of green plants from green algae, bryophytes, pteridophytes and gymnosperms to angiosperms. First, a genome-wide search was conducted and GAox homologs identified. Then, we performed phylogenetic analysis to explore the evolutionary history of GAox genes and established the diagnostic features of conserved motifs for every GAox subfamily. Next, the expansion mechanism and evolutionary forces associated with the GAox genes were investigated by analyzing the evolutionary history, structure and functional divergence of these genes. Finally, we compared the GA biosynthesis pathways in bryophytes, pteridophytes and seed plants, to explore the evolution of pathways and their relationships with the adaptation of plants to terrestrial environments.


Sequence acquisition and compilation

In order to investigate the evolutionary history of the GAox family in plants, all possible sequences of GAox homologs need to be sourced from whole genome sequencing data. GAox belongs to the 2OGDs superfamily, which has a very conserved and characteristic 2OGDs motif (Pfam id, PF03171) [7, 34, 63]. Functional GAox genes of Arabidopsis and rice, and four newly found GAox genes (OsGA20ox5, OsGA20ox6, OsGA20ox7, and OsGA20ox8; Han and Zhu, 2011) were used as queries to BLAST against the whole genome sequences of 41 species. The sequences with the active site core of 2OGD, and the motifs such as NYYPXCXXP, LPWKET, LSWSEA were recognized as putative GAox genes. GAox homologs of each species were downloaded from the following databases:; http://congenie. org;;; Then, an extensive screening of databases (dbEST, GenBank) was performed using different BLAST algorithms to obtain all GAox homologs [64]. Finally, we manually checked all the search results to reduce hits with partial conserved-domains and other false positives, using BioEdit (v7.0.5) [65]. Sequences were accepted from BLAST results as long as they shared at least 35 % identity, and had an expected threshold lower than 1e-40 with the conserved domain in the 2OGDs family. Meanwhile, the gene prediction and annotation analysis was considered for the GAox homologs acquisition. Then, 854 GAox homologous were obtained from 41 species; these covered all the main lineages of plants including unicellular green algae, bryophytes, pteridophytes, gymnosperms and angiosperms. Additionally, two ACOs (1-aminocyclopropane-1-carboxylate) of rice were included in phylogenetic analyses considering both ACOs and GA oxidases belong to the 2OGD superfamily.

Multiple sequence alignment and phylogenetic analysis

A protein multiple sequence alignment (MSA) was generated using MUSCLE (Multiple Sequence Comparison by Log-Expectation) software [66] with the default settings. Experimentally characterized GAox genes of Arabidopsis thaliana and Oryza sativa were used as criteria to detect potential problems such as frame shift mutations. Protein-coding DNA sequences (CDS) were aligned based on the protein alignments in the DAMBE with the default parameters, then converted the CDS alignments into PAML format for further analyses [67, 68].

RAxML 7.2.6 (Randomized Axelerated Maximum Likelihood) [69] was used to constructed maximum-likelihood (ML) phylogenetic trees under the JTT amino acid substitution model which was derived from ProtTest 2.4 [70], with an estimated gamma distribution parameter, and optimized starting with a BIONJ tree. Statistical support for the nodes on the maximum-likelihood tree was evaluated by bootstrap analysis with 500 iterations.

Gene structure, motif identification and homology modeling analysis

To study the evolution of the GAox gene structure, the DNA sequences corresponding to each predicted gene and the gene structure of the GAox gene family were all retrieved from

To obtain the characteristic motif of each GAox subfamily, the following steps were taken: 1) the web-based multiple expectation maximization for motif elicitation (MEME) analysis (, [71]) was employed with the following parameters: maximum number of motifs = 15, and optimum motif widths constrained to between 6 and 30 residues. 2) The NCBI-CDD (Conserved Domain Database) ( and the SMART (Simple Modular Architecture Research Tool, databases were used for identification and function explanation of those putative motifs [72]. 3) Multiple sequence alignment of the GAox genes was achieved using MUSCLE software, to find the characteristic motifs of every GAox subfamily. 4) To examine the relationship between the characteristic motif of every GAox subfamily and the putative functional region, the 3D structures of eight OsGAox genes, i.e., OsGA2ox7, OsGA2ox5, OsGA20ox4, OsGA3ox2, OsGA20ox5, OsGA20ox6, OsGA20ox7 and OsGA20ox8, were predicted following the homology modeling using the Swiss-Model ( with the default parameters, and the higher reliability PDBs were selected as templates according to the global quality estimation (GMQE) and qualitative model energy analysis (QMEAN) scores [7375].

Three-dimensional model images with the characteristic motifs and the putative positive selection and functional divergence sites were manipulated and rendered in PyMOL ( [76]. All the sequence logos were generated using the online Weblogo platform (

Testing positive selection and functional divergence

We used the programs PAML 4.8, FunDi and GroupSim to detect sites involved in the functional divergence of GAox sequences and the adaptive evolution of GAox subfamilies. Sites identified as being functionally divergent or positively selective by all or two of these three programs were highlighted in the 3D protein structure of GA oxidases, which could be tested with direct experimental techniques in the future.

In order to reduce computation time and the potential impact of synonymous site saturation due to sequence divergence [77], the positive selection test was conducted on the site and branch-site models with 46 less divergent GAox homologs from six grass species of monocots. Two ACOs of rice were added in phylogenetic analysis of GAox genes. We performed our analysis using alignments inferred by phylogeny-aware multiple sequence alignment program Probabilistic Alignment Kit (PRANK) [78, 79], which has been reported to outperform other aligners in simulations [80]. In addition, by GUIDANCE filtering [81], the PRANK alignment has a higher average GUIDANCE score of 0.70, while the MAFFT, MUSCLE and CLUSTALW alignments is 0.65, 0.63 and 0.66 respectively, though the GUIDANCE cannot tell which alignment is better.

Selective pressure was tested based on the phylogeny of GAox genes by comparing the nonsynonymous/synonymous substitution ratios (ω = d N /d S ) with ω = 1, < 1, and > 1, which indicate neutral evolution, purifying selection, and positive selection, respectively. To test the functional divergence of different GAox subfamilies, the codeml program in the PAML 4.8 package was used to investigate possible selection acting on different gene subfamilies [82]. We selected F3X4 as codon frequency model, as it accounts for the most important feature of the mutation process, the unequal base frequencies, and the correction for transition/transversion bias is obtained by ML estimation of the kappa parameter [8285]. The equilibrium frequencies of codons and nucleotides among subfamilies are measured by the DAMBE. Considering the fact that positive selection often operates episodically on a few amino acid sites in a small number of lineages of a phylogenetic tree [86, 87], the modified branch-site model A was run for each gene in each lineage across the phylogenetic tree of the monocots. The LRT (likelihood ratio test) is a general method for testing assumptions (model parameters) through comparison of two competing hypotheses. We used branch-site model A to construct branch-site test 2, which is also called the branch-site test of positive selection. The null and alternative hypotheses are as follows: Null hypothesis (branch site model A, with ω2 = 1 fixed): model = 2; NSsites = 2; fix_omega = 1; omega = 1. Alternative hypothesis (branch site model A, with ω2 estimated): model = 2; NSsites = 2; fix_omega = 0; omega = 1.5 (or any value > 1). Because of very high sequence divergence, and the likelihood of saturation of synonymous changes, we treated the branch-site test as an exploratory analysis. Additional analyses at the codon level (sites models) and amino acid level (described below) were employed to more formally assess the hypothesis that GAoxs have been subject to functional divergence following the major gene duplication events. Therefore, we analyzed separately each of eight subfamilies (less divergent) under the sites models, only with GAox homologs of angiosperms. Unlike branch-site model, site-specific models allowing ω to vary among sites [55, 88, 89] were used to determine whether particular amino-acid residues within GAox families have been subject to positive selection. We use an LRT comparing M0 (one-ratio) with M3 (discrete) (κ = 3) to test for variable selective pressure among sites, and two LRTs to test for sites evolving by positive selection, comparing: M1a (Nearly Neutral) against M2a (Positive Selection), and M7 (beta) (κ = 10) against M8 (beta & ω) (κ = 10). The PP (posterior probability) for the sites under positive selection was calculated by the Bayes empirical Bayes (BEB) method [90].

Default options were used in the GroupSim and FunDi using QmmRAxML [91]. In order to evaluate the predictions made by FunDi, we analyzed all the comparisons of eight GAox subfamilies, except for the C19-GAox/C20-GA2ox and C20-GA2ox/GAox-C, due to their large datasets (gene homologs and the alignments) and computation cost (Additional file 1: Table S3). The sites of each subfamily motif were highlighted with red if they occurred in more than five comparisons (Fig. 1). Generally, sites with a functional divergence score above 0.5 were considered to be functionally divergent in order to identify the maximum number of possible candidate sites for consideration [60, 62]. In addition, we analyzed the subfamily characteristic motifs, and mapped those sites detected by two or three programs (PAML 4.8, FunDi and GroupSim) on the 3D protein structures (Figs. 1 and 2). We found that these sites are those positively selected sites detected by the branch-site model of PAML 4.8 with PP ≥ 95 % and those under functional divergence predicted by FunDi and GroupSim with PP ≥ 99 %.

Fig. 1

Specific conserved domains of the eight GA oxidase subfamilies and sites predicted to be functionally divergent by at least two of the three programs (PAML 4.8, FunDi, and GroupSim). The overall height of each stack indicates the sequence conservation at that position, whereas the height of symbols within each stack reflects the relative frequency of the corresponding amino acid. Green arrows mean that the sites in these columns are under functional divergence detected by GroupSim with PP ≥ 99 %; the sites highlighted in red are under functional divergence detected by FunDi with PP ≥ 99 %; the sites highlighted with blue star are positive selected sites detected by PAML 4.8 with PP ≥ 95 %

Fig. 2

The structure of the eight OsaGAoxs. The characteristic conserved domain of each gene is highlighted in purple, amino acid residues that in the characteristic conserved domains identified by two and three programs are highlighted in blue and red, respectively. Residues in that bind the active-site Fe (yellow ball) and those that interact with the 5-carboxylate of 2-oxoglutarate are highlighted in yellow


Identification and distribution of GAox genes in plants

Our BLAST search results revealed 854 sequences of GAox homologs from the 41 species with whole genome sequencing data; these represent the main lineages of green plants (Additional file 2: Table S2). In algae, 6 putative GAox homologs were found based on the catalytic core amino acid sequences of 2OGDs in the carboxyl terminus (Fig. 3). As shown in Additional file 2: Table S2, the GAox genes varied across the different species and plant taxonomic groups, and there exist different distribution patterns for GAox subfamilies, especially for GAox-A, GAox-B, GAox-C and GAox-D. For instance, all 37 land plants have GAox gene homologs from the subfamilies C19-GA2ox, C20-GA2ox, GA20ox and GA3ox, while other subfamily genes are not present in one or more taxa, especially in some basal lineages of the land plants. In Physcomitrella patens genome, putative GAox homologs were only found in C19-GA2ox, GA3ox and GAox-C. There were no homologs of C20-GA2ox and GAox-B in a pteridophytes Selaginella moellendorffii. Homologs of C20-GA2ox and GA20ox were not found in a gymnosperm Picea abies. Of angiosperms, monocotyledons have all eight GAox subfamily homologs, while basal angiosperm Amborella trichopoda and some eudicotyledons have no GAox-D homologs in their genome. In addition, not only the distribution pattern but the gene sequence and structure were also varied between GAox genes of algae and land plants (Additional file 3: Data S1 and Additional file 4: Figure S1).

Fig. 3

Active site core of 2-oxoglutarate-dependent dioxygenases. Amino acid residues that bind the active-site Fe and those that interact with the 5-carboxylate of 2-oxoglutarate are highlighted in green and red, respectively. Six GAox homologs from algae are highlighted in blue

Phylogenetic analysis of plant GAox genes

According to the phylogenetic analysis of 854 GAox homologs, the maximum-likelihood (ML) tree showed that all GAox homologs from green algae were clustered into a single clade (Alga clade), while those from land plants formed another main clade (Embryophyta clade), which was further divided into eight subclades with high statistical support, corresponding to the subfamilies C19-GA2ox, C20-GA2ox, GA20ox, GA3ox, GAox-A, GAox-B, GAox-C and GAox-D, respectively (Fig. 4a). Out of those, the latter four subfamilies corresponding to OsGA20ox5, OsGA20ox6, OsGA20ox7 and OsGA20ox8 reported in Han and Zhu’s study (2011) were characterized for the first time. Each of these new subfamilies (i.e., GAox-A, GAox-B, GAox-C and GAox-D) formed a monophyletic group, respectively (Fig. 4a), and they are phylogenetically distant from the GA20ox subfamily. GAox homologs from Physcomitrella patens, Selaginella moellendorffii, Picea abies and Amborella trichopoda resided in the basal lineages of the tree, which is consistent with previous studies on the phylogenetic position of these species. In addition, for angiosperm species, subfamilies C19-GA2ox, C20-GA2ox, and GA20ox exhibited a distinct binary branching pattern in the phylogenetic tree, which suggests that large scale genome duplications may have played a critical role in the evolution of these gene subfamilies.

Fig. 4

Phylogenetic tree and gene structure of the GA oxidase in plants. a The maximum likelihood tree of 41 species’ GA oxidase genes constructed on 854 deduced full-length peptide amino acid sequences and rooted with homologous sequences of algae. b The sketch map of GA oxidases structure: Exons and introns are indicated by blue boxes and yellow lines, respectively. For more details of the GA oxidase structures, see Additional file 4

Characteristic motifs of GAox subfamilies

When we used MEME to identify and examine the conserved motifs of GAoxs, we found a total of 15 conserved motifs (Table 1). Among these 15 motifs, 13 were shared by most GAoxs, and motifs 1 and 2 represent the catalytic core of 2OGDs. Motifs 7, 9, 11 and 12, were found in different GAoxs. For example, motif 7 was absent from all the GA20ox and GA3ox genes, whereas motifs 9 and 11 were not observed in any of the GA2ox genes. Interestingly, motifs 7, 9 and 12 were all localized in N-terminals of the amino acid sequences. The alignment of N-terminal amino acid sequences of all GAox gene homologs revealed eight types of gene sequence (Fig. 1 and Additional file 3: Data S1), which correspond to the C19-GA2ox, C20-GA2ox, GA20ox, GA3ox, GAox-A, GAox-B, GAox-C and GAox-D subfamilies, respectively. Figure 1 shows that there are 16 to 21 amino acids in the characteristic motifs specific to the eight subfamilies of the GAox family, and the motifs of subfamilies GAox-A, GAox-B, GAox-C and GAox-D have more variable amino acid sites compared with the others. The subfamilies C19-GA2ox, C20-GA2ox, GA20ox and GA3ox have been studied previously, while GAox-A, GAox-B, GAox-C and GAox-D are proposed for the first time in this study; they include the newly named genes OsGA2ox5, OsGA20ox5, OsGA20ox6, OsGA20ox7 and OsGA20ox8 in rice, respectively [23].

Table 1 Motif distribution in GA oxidases

Comparative gene structure analysis of 854 GAox gene sequences revealed that gene homologs of every GAox subfamily often have similar exon-intron structures, but the gene structure is more variable in subfamilies C19-GA2ox and C20-GA2ox, which had two kinds of these structures (Fig. 4b and Additional file 4: Figure S1). Most GAox gene subfamilies have three exons, but this is not the case in subfamilies GAox-C and GA3ox. The GA3ox genes have two exons, and the GAox-C genes generally have four exons. However, in monocots of subfamily GAox-C, three exons were also found in some GAox homologs.

Positive selection and functional divergence among GAox subfamilies

The phylogenetic trees were constructed based on the sequences of 46 monocots and two ACOs in order to investigate selection pressure among eight GAox subfamilies and selected amino acid sites in characteristic motifs. The LRTs for M2 vs. M1 (2ΔL = 0, p > 0.05) suggested that the positive selection model (M2) was not significantly better than the nearly neutral model (M1). But models M3 and M8 fit the data significantly better than the null models M0 and M7 (for M3 vs. M0, 2ΔL = 1946.50, p < 0.001; for M8 vs. M7, 2ΔL = 244.99, p < 0.001), they identify several sites with an ω value significantly greater than 1. At the PP >95 % level, 13 amino acid sites were identified under positive selection by M8 (Table 2). As an exploratory analysis, the branch-site test suggested that all eight GAox subfamilies were under positive selection (p < 0.05) (Fig. 5). Bayes Empirical Bayes (BEB) analysis showed that at the PP > 95 %, branch-site model A identified 2/13/29/8/1/12/2/29 sites as being potentially subjected to positive selection on the C19-GA2ox/C20-GA2ox/GA20ox/GA3ox/GAox-A/GAox-B/GAox-C/GAox-D branches, respecitvely.

Table 2 Summary statistics for detecting selection using branch-site and site models of PAML4.8
Fig. 5

ML tree of the GA oxidases used to test positive selection. Numbers on the tree represent bootstrap values

The separate analyses of the eight subfamilies under the site models showed that the selective pressure could vary among amino acid sites in eight subfamilies by comparison of the M0 and M3 for all GAox subfamilies. The M3 detected no sites under positive selection in most of subfamilies, but both M1a against M2a, and M7 against M8 suggested the presence of sites under positive selection in each subfamily, except for the GAox-D. No sites were found in the GAox-D, possibly due to its small sample size (Additional file 1: Table S3).

The programs FunDi and GroupSim identified a number of sites with significant functional divergence scores for all subfamilies [6062]. FunDi method predicted a wider range of site divergences, and the fractions of sites with posterior probability of functional divergence above 0.5 ranges from 0.798 to 0.987 across GAox subfamilies (Table 3). The GroupSim method detected 159 sites in all GA oxidases (Table 4).

Table 3 Fraction of sites predicted to be functionally divergent by FunDi program
Table 4 Number of sites predicted to be functionally divergent by GroupSim program

In order to analyze the characteristics and functional divergence sites of each subfamily motif, we modeled the 3D structures of eight GA oxidases of Oryza sativa based on the higher reliability templates (Additional file 5: Table S1). According to the 3D protein structures, we found: 1) all the characteristic motifs locate in the crevice of GA oxidases; 2) each of the characteristic motifs consists of a loop and β sheet; 3) all the motifs detected have several positively selected or functional divergence sites by all or two programs (PAML 4.8, FunDi and GroupSim), and most of these sites are polar amino acids (Fig. 2). Those features suggested that the characteristic motif is an important component of GA oxidases and plays a significant role in the GA biosynthesis.


Origin and evolution of GAox family

The present study showed that GAox genes, critical genes in regulating GA biosynthesis especially for the later steps of GA biosynthetic pathways, were ubiquitous in plants. Unlike the four previously well-studied subfamilies (i.e., C19-GA2ox, C20-GA2ox, GA20ox and GA3ox), there were fewer homologs in the subfamilies GAox-A, GAox-B, GAox-C and GAox-D ranging from 30 to 63 (Additional file 2: Table S2). Homologs of subfamilies GAox-B and GAox-D genes were not found in the basal angiosperm Amborella trichopoda, but were present in the gymnosperm Picea abies. Our phylogenetic analysis of the GAox genes showed that all six GAox homologs identified from green algae fell into a single clade, while those from land plants were clustered together and then divided into eight subclades (Fig. 4a). Previous studies found no bioactive GAs have been detected and chemically identified in mosses, nor could external GAs induce GA-specific developmental processes in these plants [27, 92, 93]. However, Anterola and Shanle (2008) found 15 putative GA oxidase homologs in Physcomitrella patens. This means that there are GA oxidases, but no bioactive GAs in moss species, which could be due to incompletion of GA synthetic pathway. In this study, we found 18 putative GAox genes in Physcomitrella patens, belonging to three different subfamilies. These results indicate that diversification of GAox genes occurred before the divergence of land plants, then diversified and expanded further in the course of land plant evolution. Considering GAox genes in the main plant lineages, it is reasonable to believe that a rapid gene expansion occurred in the GAox family after the divergence of land plants from the common ancestor of green plants; this may have been related to specialized metabolisms and adaptations to terrestrial environments.

According to our phylogenetic analysis, gene structure and the characteristic motifs of GAox genes, eight subfamilies were well defined under the GAox family of land plants, namely C19-GA2ox, C20-GA2ox, GA20ox, GA3ox, GAox-A, GAox-B, GAox-C and GAox-D. Of these subfamilies, C19-GA2ox, C20-GA2ox, GA20ox and GA3ox have been thoroughly studied previously, while new subfamilies GAox-A, GAox-B, GAox-C and GAox-D have not. According to our phylogenetic analysis, these four subfamilies resided in basal clades in the tree. All this information suggests that these subfamilies of genes were present in the common ancestral species of extant land plants, and these genes may have some unknown functions in land plants. Thus, the catalytic activity and biological function of these genes needs to be elucidated.

Although the structures and functional sites were predicted in our study, the function of GAox-A/B/C/D genes has not been examined yet. Therefore, it is unknown the location of these genes in GA biosynthesis pathway. Combined with the findings of previous studies [11, 40, 92, 94], a proposed model of GAs biosynthesis pathway is presented in Fig. 6. Obviously, only very early steps of this pathway were present in Physcomitrella patens. Biochemically, P450s (cytochrome P450 monooxygenases) provide direct substrates or precursors of substrates for C20-GA2ox, so the presence of C20-GA2ox may be a prerequisite for the presence of the GAox family. As the GAox-A/B/C/D are basal clades of phylogenetic tree (Fig. 4), they may appear subsequently during the evolution of land plants. However, it is currently difficult to give a clear explanation for the evolutionary history of GAox genes due to lack of information of causal relationship between active GAs and GAox-A/B/C/D.

Fig. 6

Comparison of the major gibberellin (GA) biosynthesis pathways in Physcomitrella patens (bryophyte), Selaginella moellendorffii (pteridophyte) and seed plants. The biosynthesis pathways of bryophyte, pteridophyte and seed plants are highlighted in blue, orange and black arrows, respectively. Abbreviations: CPS, ent-copalyl diphosphate synthase; GGDP, geranylgeranyl diphosphate; KS, ent-kaurene synthase; KO, ent-kaurene oxidase; KAO, entkaurenoic; P450s, cytochrome P450 monooxygenases; TPSs, terpene synthases

Despite having similar catalytic cores of 2OGDs (Fig. 3), the GAox homologs of green algae exhibited some differences to those of the nonvascular plant Physcomitrella patens and vascular plants. They were grouped into one clade in the phylogenetic tree, and not classified into any of GAox subfamilies. In addition, there are no characteristic motifs of eight GAox subfamilies. These findings suggest that putative algal GAox genes have little diversification of sequences and functions.

Diversification of gene structure

Previous studies have shown that gene structural diversity is a possible mechanism for the evolution of multigene families [9597]. The GAox family provides an example of exon-intron diversity in the evolution of gene family, although the protein sequences of its catalytic domain are well conserved. Comparing the phylogenetic tree with the exon-intron structure (Fig. 4 and Additional file 4: Figure S1), it is apparent that the most closely related members within subfamilies shared similar gene structure in terms of intron number and exon length, suggesting that the diversification of subfamilies could be under a strong selection and indicative of their functional conservativeness. Most of the GAox subfamilies have three exons, also suggesting that the diversification of the GAox family is under a strong selection. However, two recent diverged GAox subfamilies, GA3ox and GAox-C possess two and four exons, respectively; this supports the hypothesis that duplications of ancestral mosaic genes have been followed by more recent gains and losses of introns [98]. In view of the numbers and locations of introns, GAox genes may have experienced different evolutionary histories since the monocot-dicot split. Unlike the gene structure of dicots, that of monocots is usually simple, without or with one/two shorter introns, suggestive of the fact that these genes probably arose from retrotransposon-based random insertions. This might indicate different evolutionary patterns for these two kinds of plants. Most homologs in the three GAox subfamilies (i.e. C19-GA2ox, C20-GA2ox and GAox-C subfamily) with relatively longer introns may have arisen from unequal crossing-over [23].

To elucidate the motif diversification of GAox subfamilies, putative motifs were predicted using MEME and alignments (Table 1). As expected, most of the closely related members had common motif compositions, and the length and location of GAox gene domains were highly conserved, suggesting similar functions among members of the same GAox subfamily and the same selection acting on them. All proteins of the eight subfamilies have very similar C-terminal structures of their catalytic domains (Fig. 3 and Additional file 3: Data S1), the function of most putative motifs remains unknown as they do not have homologs in the Pfam and SMART databases. Most of the motifs were shared by the majority of GAoxs, but the short amino acid stretches in the N-terminal may characterize each subfamily. Combined with the previous studies of GAox [3539, 45, 46], we conclude that the alignment of amino acid sequence from 691 to 757 sites can be used to identify a putative protein of the GAox family (Fig. 1 and Additional file 3: Data S1). In addition, these motifs can be used to distinguish subfamily C20-GA2ox from C19-GA2ox [20]. According to the results of 3D modeling analysis, all characteristic motifs were located in the crevice of the protein surface (Fig. 2). In addition, the positive selection tests suggest that most of these motifs contain several sights that could have evolved in response to positive selection pressure. To date, no functional examination has been conducted on these conserved motifs, so our results provide evidence that these specific motifs may have some active sites and play critical roles in GA biosynthesis and catabolism.

Based on our findings, the similar gene structures and conserved motifs in each GAox subfamily provide an insight into the evolutionary history of the GAox family and its classification. In addition, the variation in gene structure and the difference in motif compositions among different subfamilies indicate that GAox subfamilies are functionally diversified. Furthermore, the eight conserved motifs are the diagnostic characters for the identification of GAox subfamilies, and they must be important for catalytic functions in the GA biosynthetic pathway.

Function divergence and adaptation of the GAox family

Gene duplication is considered to be a major mechanism for the generation of evolutionary novelty and adaptation [50, 51]. In plants, gene duplication followed by functional divergence is particularly important for the diversification of biochemical metabolites [99102]. Substitutions can change the function of duplicated genes, and may be due to either a relaxation of purifying selection or the action of positive selection [56, 58, 59, 103]. To investigate evidence for positive selection in the GAox subfamilies, we analyzed the eight branches by site models and branch-site models (Fig. 5 and Additional file 1: Table S3). Further analyses on each of the subfamilies under the site models (M0 vs M3; M1a vs M2a; M7 vs M8) detected several positively selected sites for all the subfamilies except for the GAox-D, suggesting that the selective pressure varies among lineages and amino acid sites in GA oxidases. The biochemical context of substitutions that was under positive selection is consistent with a scenario involving the adaptive evolution of GAox genes. Under branch-site models, some selected sites (PP > 95 %) were found to be scattered throughout the primary sequences (data not shown), and they were folded in the large cavity of three-dimensional structures of the analyzed protein. Among these sites, one to eight positively selected sites (PP > 95 %) were located in or near the characteristic motifs, in all branches except C19-GA2ox and GAox-A. The mutations at these sites among the different subfamilies suggested their importance for the enzymatic activity in GA biosynthesis and catabolism. Further studies using site-directed mutagenesis are needed to determine whether these selected sites confer an ability on GAox genes to discriminate between different substrate types.

It should be noted that our branch-site model analyses might suffer from uncertainty because of limitations of the codon models, such as changes in equilibrium nucleotide and codon frequencies among clades with deep divergence (Additional file 6: Table S5) [85, 104, 105]. Therefore, we further performed FunDi and GroupSim to investigate the divergence of the distribution of site-rates. Because these methods are more appropriate for the high sequence divergences, they could provide an important cross-check of the codon-based results and confirmed that the GA oxidases underwent functional divergence. Despite this, the signal for positive selection contributing to the divergence of GA oxidases should be taken with caution considering that the codon and amino acid models can sometimes be negatively affected by model misspecification and alignments quality [83, 86, 106110]. Taken together, our study provides new insights into the functional divergence of the GAox family and implicates the potential roles of positive selection in the evolution of GAox genes, which deserves further investigations.


This study provides the first large-scale evolutionary analysis of GAox genes in plants, using an extensive whole-genome dataset of 41 species representing green algae, bryophytes, pteridophytes and seed plants. Our study not only provides information about the classification of GAox genes, but also facilitates the further functional characterization and analysis of GA oxidases. Our results indicate: 1) Gibberellin oxidase (GAox) originated very early—before the divergence of bryophytes and the vascular plants, and diversified into eight subfamilies in the course of land plant evolution. Of these subfamilies, GAox-A, GAox-B, GAox-C and GAox-D are proposed for the first time; 2) the diversification of GAox genes could be attributed to functional divergence and such divergence is most likely to facilitate the completion of the GA synthesis pathway for plants to adapt to terrestrial environments; 3) each subfamily of GAox genes has its characteristic motif and a signature of positive selection have been detected in most subfamilies.

Availability of supporting data

Additional file 3-Data S1 and Additional file 7-Data S2. Protein sequences used in the present analysis.

Additional file 8-Table S4. GenBank accessions numbers for sequences used in this study.





Ent-copalyl diphosphate synthase




GA oxidase


Geranylgeranyl diphosphate




Ent-kaurene oxidase


Ent-kaurene synthase


Likelihood ratio test


Cytochrome P450 monooxygenases


Terpene synthases


2-oxoglutarate-dependent dioxygenase


  1. 1.

    Jones RL, Varner JE. The bioassay of gibberellins. Planta. 1966;72:155–61.

    CAS  PubMed  Article  Google Scholar 

  2. 2.

    Pharis RP, King RW. Gibberellins and reproductive development in seed plants. Annu Rev Plant Physiol. 1985;36:517–68.

    CAS  Article  Google Scholar 

  3. 3.

    Hooley R. Gibberellins: perception, transduction and responses. Plant Mol Biol. 1994;26:1529–55.

    CAS  PubMed  Article  Google Scholar 

  4. 4.

    Blázquez MA, Green R, Nilsson O, Sussman MR, Weigel D. Gibberellins promote flowering of Arabidopsis by activating the LEAFY promoter. Plant Cell. 1998;10:791–800.

    PubMed Central  PubMed  Article  Google Scholar 

  5. 5.

    MacMillan J. Biosynthesis of the gibberellin plant hormones. Nat Prod Rep. 1997;14:221–43.

    CAS  Article  Google Scholar 

  6. 6.

    MacMillan J. Occurrence of gibberellins in vascular plants, fungi, and bacteria. J Plant Growth Regul. 2001;20:387–442.

    CAS  PubMed  Article  Google Scholar 

  7. 7.

    Hedden P, Phillips AL. Gibberellin metabolism: new insights revealed by the genes. Trends Plant Sci. 2000;5:523–30.

    CAS  PubMed  Article  Google Scholar 

  8. 8.

    Graebe JE. Gibberellin biosynthesis and control. Annu Rev Plant Physiol. 1987;38:419–65.

    CAS  Article  Google Scholar 

  9. 9.

    Olszewski N, Sun TP, Gubler F. Gibberellin signaling: biosynthesis, catabolism, and response pathways. Plant Cell. 2002;14:S61–80.

    PubMed Central  CAS  PubMed  Google Scholar 

  10. 10.

    Sakamoto T, Miura K, Itoh H, Tatsumi T, Ueguchi-Tanaka M, Ishiyama K, et al. An overview of gibberellin metabolism enzyme genes and their related mutants in rice. Plant Physiol. 2004;134:1642–53.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  11. 11.

    Yamaguchi S. Gibberellin metabolism and its regulation. Annu Rev Plant Biol. 2008;59:225–51.

    CAS  PubMed  Article  Google Scholar 

  12. 12.

    Phillips AL, Ward DA, Uknes S, Appleford NEJ, Lange T, Huttly AK, et al. Isolation and expression of three gibberellin 20-oxidase cDNA clones from Arabidopsis. Plant Physiol. 1995;108:1049–57.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  13. 13.

    Kawai Y, Ono E, Mizutani M. Evolution and diversity of the 2-oxoglutarate‐dependent dioxygenase superfamily in plants. Plant J. 2014;78:328–43.

    CAS  PubMed  Article  Google Scholar 

  14. 14.

    Carrera E, Jackson SD, Prat S. Feedback control and diurnal regulation of gibberellin 20-oxidase transcript levels in potato. Plant Physiol. 1999;119:765–74.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  15. 15.

    Itoh H, Tanaka-Ueguchi M, Kawaide H, Chen X, Kamiya Y, Matsuoka M. The gene encoding tobacco gibberellins 3beta-hydroxylase is expressed at the site of GA action during stem elongation and flower organ development. Plant J. 1999;20:15–24.

    CAS  PubMed  Article  Google Scholar 

  16. 16.

    Oikawa T, Koshioka M, Kojima K, Yoshida H, Kawata M. A role of OsGA20ox1, encoding an isoform of gibberellin 20-oxidase, for regulation of plant stature in rice. Plant Mol Biol. 2004;55:687–700.

    CAS  PubMed  Article  Google Scholar 

  17. 17.

    Davidson SE, Swain SM, Reid JB. Regulation of the early GA biosynthesis pathway in pea. Planta. 2005;222:1010–9.

    CAS  PubMed  Article  Google Scholar 

  18. 18.

    Schomburg FM, Bizzell CM, Lee DJ, Zeevaart JA, Amasino RM. Overexpression of a novel class of gibberellin 2-oxidases decreases gibberellin levels and creates dwarf plants. Plant Cell. 2003;15:151–63.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  19. 19.

    Thomas SG, Phillips AL, Hedden P. Molecular cloning and functional expression of gibberellin 2- oxidases, multifunctional enzymes involved in gibberellin deactivation. Proc Natl Acad Sci U S A. 1999;96:4698–703.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  20. 20.

    Lee DJ, Zeevaart JAD. Molecular cloning of GA2-oxidase3 from spinach and its ectopic expression in Nicotianasylvestris. Plant Physiol. 2005;138:243–54.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  21. 21.

    Pimenta Lange MJ, Liebrandt A, Arnold L, Chmielewska SM, Felsberger A, Freier E, et al. Functional characterization of gibberellin oxidases from cucumber, Cucumis sativus L. Phytochemistry. 2013;90:62–9.

    CAS  PubMed  Article  Google Scholar 

  22. 22.

    Yang Y, Zhang F, Ge S. Evolutionary rate patterns of the Gibberellin pathway genes. BMC Evol Biol. 2009;9:206.

    PubMed Central  PubMed  Article  CAS  Google Scholar 

  23. 23.

    Han F, Zhu B. Evolutionary analysis of three gibberellin oxidase genes in rice, Arabidopsis, and soybean. Gene. 2011;473:23–35.

    CAS  PubMed  Article  Google Scholar 

  24. 24.

    Giacomelli L, Rota-Stabelli O, Masuero D, Acheampong AK, Moretto M, Caputi L, et al. Gibberellin metabolism in Vitis vinifera L. during bloom and fruit-set: functional characterization and evolution of grapevine gibberellin oxidases. J Exp Bot. 2013;64:4403–19.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  25. 25.

    Johri MM. Possible origin of hormonal regulation in green plants. Proc Indian Natl Sci Acad. 2004;70:335–65.

    CAS  Google Scholar 

  26. 26.

    Johri MM. Hormonal regulation in green plant lineage families. Physiol Mol Biol Plants. 2008;14:23–38.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  27. 27.

    Yasumura Y, Crumpton-Taylor M, Fuentes S, Harberd NP. Step-by-step acquisition of the gibberellin-DELLA growth-regulatory mechanism during land-plant evolution. Curr Biol. 2007;17:1225–30.

    CAS  PubMed  Article  Google Scholar 

  28. 28.

    Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A, Shapiro H, et al. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science. 2008;319:64–9.

    CAS  PubMed  Article  Google Scholar 

  29. 29.

    Hayashi K, Kawaide H, Notomi M, Sakigi Y, Matsuo A, Nozaki H. Identification and functional analysis of bifunctionalent-kaurene synthase from the moss Physcomitrella patens. FEBS Lett. 2006;580:6175–81.

    CAS  PubMed  Article  Google Scholar 

  30. 30.

    Anterola A, Shanle EK. Genomic insights in moss gibberellin biosynthesis. The Bryologist. 2008;111:218–30.

    Article  Google Scholar 

  31. 31.

    Hedden P, Phillips AL, Rojas MC, Carrera E, Tudzynski B. Gibberellin biosynthesis in plants and fungi: a case of convergent evolution? J Plant Growth Regul. 2001;20:319–31.

    CAS  PubMed  Article  Google Scholar 

  32. 32.

    Tudzynski B. Gibberellin biosynthesis in fungi: genes, enzymes, evolution, and impact on biotechnology. Appl Microbiol Biotechnol. 2005;66:597–611.

    CAS  PubMed  Article  Google Scholar 

  33. 33.

    Pimenta Lange MJ, Lange T. Gibberellin biosynthesis and the regulation of plant development. Plant Biol. 2006;8:281–90.

    CAS  PubMed  Article  Google Scholar 

  34. 34.

    Aravind L, Koonin EV. The DNA-repair protein AlkB, EGL-9, and leprecan define new families of 2-oxoglutarate- and iron-dependent dioxygenases. Genome Biol. 2001;2:7.7–8.

    Article  Google Scholar 

  35. 35.

    Chiang HH, Hwang I, Goodman HM. Isolation of the Arabidopsis GA4 locus. Plant Cell. 1995;7(2):195–201.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  36. 36.

    Xu YL, Li L, Wu K, Peeterst Anton JM, Gage DA, Zeevaart JAD. The GA5 locus of Arabidopsis thaliana encodes a multifunctional gibberellin 20-oxidase: molecular cloning and functional expression. Proc Natl Acad Sci U S A. 1995;92(14):6640–4.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  37. 37.

    Wu K, Li L, Gage DA, Zeevaart JAD. Molecular cloning and photoperiod-regulated expression of gibberellin 20-oxidase from the long-day plant spinach. Plant Physiol. 1996;110(2):547–54.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  38. 38.

    Toyomasu T, Kawaide H, Sekimoto H, von Numers C, Phillips AL, Hedden P, et al. Cloning and characterization of a cDNA encoding gibberellin 20‐oxidase from rice (Oryza sativa) seedlings. Physiol Plant. 1997;99(1):111–8.

    CAS  Article  Google Scholar 

  39. 39.

    Kang HG, Jun SH, Kim J, Kawaide H, Kamiya Y, An G. Cloning and molecular analyses of a gibberellin 20-oxidase gene expressed specifically in developing seeds of watermelon. Plant Physiol. 1999;121(2):373–82.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  40. 40.

    Lo SF, Yang SY, Chen KT, Hsing YI, Zeevaart JAD, Chen LJ, et al. A novel class of gibberellin 2-oxidases control semidwarfism, tillering, and root development in rice. Plant Cell. 2008;20:2603–18.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  41. 41.

    Lange T, Hedden P, Graebe JE. Expression cloning of a gibberellin 20-oxidase, a multifunctional enzyme involved in gibberellin biosynthesis. Proc Natl Acad Sci U S A. 1994;91(18):8552–6.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  42. 42.

    Huerta L, Garcia-Lor A, Garcia-Martinez JL. Characterization of gibberellin 20-oxidases in the citrus hybrid Carrizo citrange. Tree Physiol. 2009;29(4):569–77.

    CAS  PubMed  Article  Google Scholar 

  43. 43.

    Busov VB, Meilan R, Pearce DW, Ma C, Rood SB, Strauss SH. Activation tagging of a dominant gibberellin catabolism gene (GA2-oxidase) frompoplar regulates tree stature. Plant Physiol. 2003;132:1283–91.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  44. 44.

    Itoh H, Ueguchi-Tanaka M, Sentoku N, Kitano H, Matsuoka M, Kobayashi M. Cloning and functional analysis of two gibberellin 3β-hydroxylase genes that are differently expressed during the growth of rice. Proc Natl Acad Sci U S A. 2001;98(15):8909–14.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  45. 45.

    Sakamoto T, Kobayashi M, Itoh H, Tagiri A, Kayano T, Tanaka H, et al. Expression of a gibberellin 2-oxidase gene around the shoot apex is related to phase transition in rice. Plant Physiol. 2001;125:1508–16.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  46. 46.

    Lee DJ, Zeevaart JAD. Differential regulation of RNA levels of gibberellin dioxygenases by photoperiod in spinach. Plant Physiol. 2002;130:2085–94.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  47. 47.

    Lester DR, Ross JJ, Davies PJ, Reid JB. Mendel’s stem length gene (Le) encodes a gibberellin 3 beta-hydroxylase. Plant Cell. 1997;9:1435–43.

    PubMed Central  CAS  PubMed  Google Scholar 

  48. 48.

    Martin DN, Proebsting WM, Hedden P. Mendel’s dwarfing gene: cDNAs from the Le alleles and function of the expressed proteins. Proc Natl Acad Sci U S A. 1997;94(16):8907–11.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  49. 49.

    Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nature Rev Genet. 2010;11:97–108.

    CAS  PubMed  Article  Google Scholar 

  50. 50.

    Semon M, Wolfe KH. Consequences of genome duplication. Curr Opin Genetics Dev. 2007;17:505–12.

    CAS  Article  Google Scholar 

  51. 51.

    Sun HZ, Ge S. Molecular evolution of the duplicated TFIIAγgenes in Oryzeae and its relatives. BMC Evol Biol. 2010;10:128.

    PubMed Central  PubMed  Article  CAS  Google Scholar 

  52. 52.

    Ohno S. Evolution by gene duplication. London: George Alien & Unwin Ltd. Berlin, Heidelberg and New York: Springer; 1970.

    Google Scholar 

  53. 53.

    Li W-H. Accelerated evolution following gene duplication and its implications for the neutralist-selectionist controversy. In: Otha T, Aoki K, editors. Population genetics and molecular evolution. Tokyo: Japan Scientific Press; 1985. p. 333–52.

    Google Scholar 

  54. 54.

    Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–5.

    CAS  PubMed  Article  Google Scholar 

  55. 55.

    Liu PL, Wan JN, Guo YP, Ge S, Rao GY. Adaptive evolution of the chrysanthemyl diphosphate synthase gene involved in irregular monoterpene metabolism. BMC Evol Biol. 2012;12:214.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  56. 56.

    Zhang J, Rosenberg HF, Nei M. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A. 1998;95:3708–13.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  57. 57.

    Shiu SH, Byrnes JK, Pan R, Zhang P, Li WH. Role of positive selection in the retention of duplicate genes in mammalian genomes. Proc Natl Acad Sci U S A. 2006;103:2232–6.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  58. 58.

    Dykhuizen D, Hartl DL. Selective neutrality of 6pgd allozymes in E. COLI and the effects of genetic background. Genetics. 1980;96(4):801–17.

    PubMed Central  CAS  PubMed  Google Scholar 

  59. 59.

    Zhang J. Evolution by gene duplication: an update. Trends Ecol Evolut. 2003;18:292–8.

    Article  Google Scholar 

  60. 60.

    Gaston D, Susko E, Roger AJ. A Phylogenetic Mixture Model for the Identification of Functionally Divergent Protein Residues. Bioinformatics. 2011;27:2655–63.

    CAS  PubMed  Article  Google Scholar 

  61. 61.

    Capra JA, Singh M. Characterization and prediction of residues determining protein functional specificity. Bioinformatics. 2008;24:1473–80.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  62. 62.

    Gaston D, Roger AJ. Functional divergence and convergent evolution in the plastid-targeted glyceraldehyde-3-phosphate dehydrogenases of diverse eukaryotic algae. PLoS One. 2013;8(7):e70396.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  63. 63.

    Valegård K, van Scheltinga ACT, Lloyd MD, Hara T, Ramaswamy S, Perrakis A, et al. Structure of a cephalosporin synthase. Nature. 1998;394:805–9.

    PubMed  Article  Google Scholar 

  64. 64.

    Altschul SF, Madden TL, Schäffer AA, Zhang JH, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  65. 65.

    Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser. 1999;41:95–8.

    CAS  Google Scholar 

  66. 66.

    Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  67. 67.

    Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876–82.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  68. 68.

    Xia X, Xie Z. DAMBE: software package for data analysis in molecular biology and evolution. J Hered. 2001;92:371–3.

    CAS  PubMed  Article  Google Scholar 

  69. 69.

    Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–90.

    CAS  PubMed  Article  Google Scholar 

  70. 70.

    Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics. 2011;27:1164–5.

    CAS  PubMed  Article  Google Scholar 

  71. 71.

    Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37:W202–8.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  72. 72.

    Marchler-Bauer A, Anderson JB, Cherukuri PF, DeWeese-Scott C, Geer LY, Gwadz M, et al. CDD: a Conserved Domain Database for protein classification. Nucleic Acids Res. 2005;33:D192–6.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  73. 73.

    Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res. 2003;31:3381–5.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  74. 74.

    Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, et al. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014;42:W252–8.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  75. 75.

    Benkert P, Kunzli M, Schwede T. QMEAN server for protein model quality estimation. Nucleic Acids Res. 2009;37:W510–4.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  76. 76.

    Seeliger D, de Groot BL. Ligand docking and binding site analysis with PyMOL and Autodock/Vina. J Comput Aided Mol Des. 2010;24:417–22.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  77. 77.

    Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, et al. Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–18.

    PubMed  Article  CAS  Google Scholar 

  78. 78.

    Löytynoja A, Goldman N. An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci U S A. 2005;102:10557–62.

    PubMed Central  PubMed  Article  CAS  Google Scholar 

  79. 79.

    Löytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320:1632–5.

    PubMed  Article  CAS  Google Scholar 

  80. 80.

    Markova-Raina P, Petrov D. High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes. Genome Res. 2011;21:863–74.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  81. 81.

    Penn O, Privman E, Ashkenazy H, Landan G, Graur D, Pupko T. GUIDANCE: a web server for assessing alignment confidence scores. Nucleic Acids Res. 2010;38 suppl 2:W23–8.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  82. 82.

    Yang Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.

    CAS  PubMed  Article  Google Scholar 

  83. 83.

    Yang Z, Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000;17:32–43.

    CAS  PubMed  Article  Google Scholar 

  84. 84.

    Bielawski JP, Yang Z. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. J Mol Evol. 2004;59:121–13.

    CAS  PubMed  Article  Google Scholar 

  85. 85.

    Bay RA, Bielawski JP. Inference of functional divergence among proteins when the evolutionary process is non-stationary. J Mol Evol. 2013;76(4):205–15.

    CAS  PubMed  Article  Google Scholar 

  86. 86.

    Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–9.

    CAS  PubMed  Article  Google Scholar 

  87. 87.

    Yang Z, Nielsen R. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol Biol Evol. 2002;19:908–17.

    CAS  PubMed  Article  Google Scholar 

  88. 88.

    Nielsen R, Yang Z. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics. 1998;148(3):929–36.

    PubMed Central  CAS  PubMed  Google Scholar 

  89. 89.

    Yang Z, Swanson WJ. Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes. Mol Biol Evol. 2002;19(1):49–57.

    PubMed  Article  Google Scholar 

  90. 90.

    Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–18.

    CAS  PubMed  Article  Google Scholar 

  91. 91.

    Wang HC, Li K, Susko E, Roger AJ. A class frequency mixture model that adjusts for site-specific amino acid frequencies and improves inference of protein phylogeny. BMC Evol Biol. 2008;8(1):331.

    PubMed Central  PubMed  Article  CAS  Google Scholar 

  92. 92.

    Hirano K, Nakajima M, Asano K, Nishiyama T, Sakakibara H, Kojima M, et al. The GID1-mediated gibberellin perception mechanism is conserved in the lycophyte Selaginella moellendorffii but not in the bryophyte Physcomitrella patens. Plant Cell. 2007;19:3058–79.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  93. 93.

    Bomke C, Tudzynski B. Diversity, regulation, and evolution of the gibberellin biosynthetic pathway in fungi compared to plants and bacteria. Phytochemistry. 2009;70:1876–93.

    PubMed  Article  CAS  Google Scholar 

  94. 94.

    Evans MMS, Poethig RS. Gibberellins promote vegetative phase change and reproductive maturity in maize. Plant Physiol. 1995;108(2):475–87.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  95. 95.

    Cho G, Doolittle RF. Intron distribution in ancient paralogs supports random insertion and not random loss. J Mol Evol. 1997;44:573–84.

    CAS  PubMed  Article  Google Scholar 

  96. 96.

    Boudet N, Aubourg S, Toffano-Nioche C, Kreis M, Lecharny A. Evolution of intron/exon structure of DEAD helicase family genes in Arabidopsis, Caenorhabditis, and Drosophila. Genome Res. 2001;11:2101–14.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  97. 97.

    Babenko VN, Rogozin IB, Mekhedov SL, Koonin EV. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res. 2004;32:3724–33.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  98. 98.

    Trotman CAN. Introns-early: Slipping lately? Trends Genet. 1998;14:132–4.

    CAS  PubMed  Article  Google Scholar 

  99. 99.

    Pichersky E, Gang DR. Genetics and biochemistry of secondary metabolites in plants: an evolutionary perspective. Trends Plant Sci. 2000;5(10):439–45.

    CAS  PubMed  Article  Google Scholar 

  100. 100.

    Ober D. Seeing double: gene duplication and diversification in plant secondary metabolism. Trends Plant Sci. 2005;10(9):444–9.

    CAS  PubMed  Article  Google Scholar 

  101. 101.

    Benderoth M, Textor S, Windsor AJ, Mitchell-Olds T, Gershenzon J, Kroymann J. Positive selection driving diversification in plant secondary metabolism. Proc Natl Acad Sci U S A. 2006;103(24):9118–23.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  102. 102.

    Kroymann J. Natural diversity and adaptation in plant secondary metabolism. Curr Opin Plant Biol. 2011;14(3):246–51.

    CAS  PubMed  Article  Google Scholar 

  103. 103.

    Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999;151(4):1531-45.

  104. 104.

    Zhang J. Frequent false detection of positive selection by the likelihood method with branch-site models. Mol Biol Evol. 2004;21:1332–9.

    CAS  PubMed  Article  Google Scholar 

  105. 105.

    Bielawski JP, Yang Z. Maximum likelihood methods for detecting adaptive evolution after gene duplication. J Struct Funct Genomics. 2003;3:201–12.

    CAS  PubMed  Article  Google Scholar 

  106. 106.

    Yang Z, Bielawski JP. Statistical methods for detecting molecular adaptation. TREE. 2000;15:496–503.

    PubMed  Google Scholar 

  107. 107.

    Wong WS, Yang Z, Goldman N, Nielsen R. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics. 2004;168:1041–51.

    PubMed Central  CAS  PubMed  Article  Google Scholar 

  108. 108.

    Jordan G, Goldman N. The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol. 2012;29:1125–39.

    CAS  PubMed  Article  Google Scholar 

  109. 109.

    Privman E, Penn O, Pupko T. Improving the performance of positive selection inference by filtering unreliable alignment regions. Mol Biol Evol. 2012;29:1–5.

    CAS  PubMed  Article  Google Scholar 

  110. 110.

    Levy Karin E, Susko E, Pupko T. Alignment errors strongly impact likelihood-based tests for comparing topologies. Mol Biol Evol. 2014;31:3057–67.

    PubMed  Article  Google Scholar 

Download references


We are grateful to Dr. Ping-Li Liu of Beijing Forestry University for her technical assistance. We thank Dr. Hua Gao of Peking University for his helps in the structural modeling analysis. Our deep gratitude also goes to the editor and anonymous reviewers for their valuable comments and constructive suggestions on the manuscript. This work is supported by the National Natural Science Foundation of China (Grant No. 30990240).

Author information



Corresponding author

Correspondence to Guang-Yuan Rao.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

GYR, SG, XW and YH designed the experiments. YH and XW performed the experiments. GYR, SG, YH and XW analyzed the data. YH and GYR wrote the manuscript. All authors read and approved the final manuscript.

Additional files

Additional file 1: Table S3.

MLEs of parameters and sites inferred to be under positive selection for the eight GAox subfamilies data set. (XLSX 15 kb)

Additional file 2: Table S2.

GA oxidases in the species analyzed in this study. (XLSX 19 kb)

Additional file 3: Data S1.

Alignment of putative GA oxidase sequences. (FASTA 1761 kb)

Additional file 4: Figure S1.

The GA oxidase structure. (PDF 1521 kb)

Additional file 5: Table S1.

List of templates used in this study. (XLSX 10 kb)

Additional file 6: Table S5.

A. Frequencies of nucleotides; B. Frequencies of codons; C. Nucleotide frequencies by codon sites. (RAR 24 kb)

Additional file 7: Data S2.

The aligned (protein-coding) DNA sequences (CDS). (RAR 238 kb)

Additional file 8: Table S4.

The accession numbers of the protein sequences used in the present analysis. (XLSX 34 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Huang, Y., Wang, X., Ge, S. et al. Divergence and adaptive evolution of the gibberellin oxidase genes in plants. BMC Evol Biol 15, 207 (2015).

Download citation


  • Characteristic motif
  • Functional divergence
  • Gibberellin
  • Phytohormones
  • Positive selection
  • 2-oxoglutarate-dependent dioxygenase