Bdelloid rotifers are the oldest, most diverse and successful animal taxon for which males, hermaphrodites, and traditional meiosis are unknown. Their degenerate tetraploid genome, with 2–4 copies of most loci, includes thousands of genes acquired from all domains of life by horizontal transfer. Many bdelloid species thrive in ephemerally aquatic habitats by surviving desiccation at any life stage with no loss of fecundity or lifespan. Their unique genomic diversity and the intense selective pressure of desiccation provide an exceptional opportunity to study the evolution of diversity and novelty in genes involved in DNA repair.
We used genomic data and RNA-Seq of the desiccation process in the bdelloid Adineta vaga to characterize DNA damage reversal, translesion synthesis, and the major DNA repair pathways: base, nucleotide, and alternate excision repair, mismatch repair (MMR), and double strand break repair by homologous recombination (HR) and classical non-homologous end joining (NHEJ). We identify multiple horizontally transferred DNA damage response genes otherwise unknown in animals (AlkD, Fpg, LigK UVDE), and the presence of genes often considered vertebrate specific, particularly in the NHEJ complex and X family polymerases. While 75–100% of genes involved in MMR and HR are present in 0–2 copies, genes involved in NHEJ, which are present in only a single copy in nearly all other animals, are retained in 3–8 copies. We present structural predictions and expression evidence of neo- or sub-functionalization of multiple copy genes involved in NHEJ and other repair processes.
The horizontally-acquired genes and duplicated genes in BER and NHEJ suggest resilience to oxidative damage is conferred in part by increased DNA damage recognition and efficient end repair capabilities. The pattern of gene loss and retention in MMR and HR may facilitate recombination and gene conversion between divergent sequences, thus providing at least some of the benefits of sex. The unique retention and divergence of duplicates genes in NHEJ may be facilitated by the lack of efficient selection in the absence of meiotic recombination and independent assortment, and may contribute to the evolutionary success of bdelloids.
Bdelloid rotifers are a diverse class of over 450 species of desiccation-tolerant, radiation-resistant, asexual microinvertebrates that inhabit diverse aquatic and limnoterrestrial habitats around the globe. Populations are wholly composed of parthenogenic females, with no evidence of males or hermaphrodites. Bdelloids reproduce through mitotic division of oocyte mother cells, with no chromosome pairing or reduction, producing clonal offspring [1, 2]. While evidence of some form of genetic exchange has been reported  the mechanism remains unknown and the group is generally considered to have evolved without sexual reproduction and to be the largest, most successful obligately asexual animal taxon [4,5,6].
The genome structure of bdelloid rotifers is degenerate tetraploid [5, 7, 8]. Under this model, a whole genome duplication in a common bdelloid ancestor created a tetraploid with two copies (ohnologs) of the genome. Within each ohnologous diploid set, (former) alleles diverged due to the lack of meiotic independent segregation and syngamy , but may also be routinely homogenized by gene conversion [5, 8]. Ohnologs and former alleles have diverged by an average of 27% and 1% at non-synonymous positions, respectively. Gene copies have been lost over time, so that roughly 40% of genes are still present in four copies. Additionally, 5–10% of bdelloid genes have been horizontally-acquired from non-metazoans, and the horizontal gene transfer (HGT) process is both ancient and ongoing. The horizontally-acquired, or “alien” genes are transcriptionally active and properly spliced, and produce functional proteins [5, 10,11,12].
Bdelloids can enter a state of anhydrobiosis at any life stage in response to desiccation, an ability first recognized by Leeuwenhoek more than 300 years ago . When water in the environment evaporates, a bdelloid loses nearly all unbound water, reducing its weight by 95% and effectively ceasing metabolism; when water returns to the environment, the bdelloid hydrates and resumes activity with no loss of lifespan or fecundity [14,15,16,17]. Bdelloid species that inhabit desiccation-prone environments enter anhydrobiosis on average once per generation [18, 19]. That bdelloids have adapted to this condition can be seen in the higher fecundity and increased total fitness of bdelloid populations that have been through repeated rounds of anhydrobiosis compared to those that remain constantly hydrated [20, 21].
The tolerance of bdelloids to desiccation likely underpins their extreme resistance to ionizing radiation (IR). Even when hydrated, exposure to 600 Gy—well in excess of the lethal dose for most other animals—causes minimal reduction of fertility in exposed mothers or their offspring . In bdelloids, resistance to the damage of anhydrobiosis and IR appears to rely on antioxidants rather than osmoprotectants such as trehalose [23,24,25], and damage resilience is seen in the DNA double strand break (DSB) repair during recovery from desiccation and IR [22, 26,27,28,29,30].
The extreme oxidizing conditions of desiccation and IR produce closely spaced DNA lesions that can lead to strand breaks. DSBs are a late stage of oxidative damage, with an estimated 25 single strand breaks for every observed DSB [31, 36]; thus, substantial additional DNA damage underlies the DSBs found in previous studies [22, 29]. Further, oxidatively-caused lesions create DNA ends with a variety of chemical alterations that can block DNA synthesis and/or ligation, the final two repair steps [31,32,33]. Such chemical blockages are removed by the end-processing enzymes of base excision repair and non-homologous end-joining, the main pathways for mending oxidatively-damaged bases and oxidatively-broken DNA, respectively.
Given the many unique aspects of bdelloid biology, we reasoned that understanding the evolution and function of bdelloid DNA damage response (DDR) genes would reveal functions selected for in oxidative damage resilience, and suggest mechanisms that may contribute genetic novelty in the absence of sex, helping to explain the persistent mystery of the success of bdelloid rotifers. Therefore, we identified and analyzed the multiple copies of DDR genes in the bdelloid Adineta vaga, defined as components of direct damage reversal, DNA repair, and damage tolerance by translesion synthesis (TLS) polymerases.
Direct damage reversal describes the removal of certain covalent adducts to DNA, particularly pyrimidine dimers and small alkyl groups bound to guanine, by single proteins without excising a base or incising the DNA backbone . DNA repair occurs through excision repair, with DSBs repaired by homologous recombination or non-homologous end joining. When unrepaired damage is encountered during DNA replication, the replicative polymerase may be replaced by a lower fidelity, error-prone TLS polymerase to continue synthesis across the damaged template, leaving it unrepaired .
There are four excision repair pathways that correct damaged or mismatched bases by lesion excision and DNA backbone incision, repair synthesis, and ligation . Base excision repair (BER) removes small, subtle base lesions caused by oxidation, alkylation, deamination or base loss. Nucleotide excision repair (NER) recognizes and removes bulky, helix-distorting lesions often caused by UV or alkylation. Mismatch repair (MMR) removes misincorporated bases and indel-caused small loops. Alternative excision repair (AER) is initiated by a single endonuclease incision and completed with downstream mechanisms of NER, BER or single-strand break repair .
Homologous recombination (HR) is a high-fidelity, template-dependent repair system that processes DSBs and DNA gaps and can lead to crossing-over and/or gene conversion [38, 39]. MMR is spatially and temporally coupled with HR, and largely prevents divergent sequences from participating in HR. Classical non-homologous end-joining (NHEJ) is an extremely robust, mechanistically flexible repair path that binds and stabilizes broken ends, repairs ends, fills gaps and ligates . NHEJ is primarily responsible for repairing breaks induced by oxidation because it has extensive capabilities to repair complex chemical damages at the broken ends [31, 40].
The synthesis step of most repair pathways is carried out by high-fidelity, high-processivity replicative polymerases. Synthesis as characterized in vertebrate short-patch BER and NHEJ is primarily performed by the X-family polymerases DNA polymerase beta (Polβ) and lambda (Polλ), respectively. Polβ and Polλ are lower-processivity and lower-fidelity gap-filling polymerases that have lyase domains to remove ligation-blocking groups often left at 5’ends in these pathways.
The results of our DDR inventory and of differential gene expression entering and recovering from desiccation focused our analyses primarily on DNA repair pathways rather than reversal or tolerance. We present evidence of the retention of genes often considered vertebrate-specific, horizontal gene transfer of genes novel to metazoans, the loss of some MMR and HR genes, expansion and diversification of NHEJ genes, and functional divergence based on codon differences, predicted protein structure, and differential gene expression during desiccation.
We examined the Adineta vaga genome for 116 major conserved metazoan genes involved in direct damage reversal repair, BER, NER, AER, MMR, HR, NHEJ, and replicative and translesion synthesis. We identified 107 of these genes representing a total of 270 gene copies (Fig. 1, Table 1, Additional file 1). We also identified four genes, three present in two copies and one in four copies, that have been acquired from non-metazoan sources. Through manual editing and RNA-Seq data mapping we improved the annotation of 58 copies. As shown in Fig. 1, gene copy number varied from zero to eight, with most genes present as either a pair of former alleles (2 copies) or a pair of allele pairs (4 copies) representing ohnologs of the ancestral genome duplication [5, 8]. Genes for which we identified an odd number of copies may have two identical copies that collapsed into a single contig during genome assembly, or may have a hemizygous copy due to deletion or decay of the allelic copy. Retention of ohnologous copies is highly variable across DDR pathways, ranging from zero in MMR and among replicative and trans-lesion polymerases, to 70% in NHEJ.
We identified 48 of the 52 major conserved genes associated in eukaryotes with excision repair. The four missing genes, MBD4, MPG, NEI-like, and SMUG1, are BER glycosylases with redundant or overlapping function with others, and it is not unusual for any one of them to be absent from a metazoan. Glycosylases are a diverse, relatively specialized suite of enzymes that typically perform a methodical search to recognize subtle alterations in the DNA duplex caused by damaged bases, and excise those bases, initiating BER. The MMR pathway is conspicuous for the absence of ohnologs and two orthologs, Mlh3 and Msh3. One or both of these genes are absent in many animals (Caenorhabditis elegans and Drosophila lack both). Monogonont rotifers also appear to lack Msh3, but do have Mlh3 , indicating a relatively recent loss in bdelloids. More than three quarters of the genes involved in HR lack ohnologs, and three conserved genes are missing: Nbs1, Rad52, and BRCA2. Nbs1 and Rad 52 are present in monogonont rotifers , but absent in many invertebrates (C. elegans lacks Nbs1 and both Drosophila and C. elegans lack Rad52). Most animals, including monogonont rotifers and other Lophotrochozoans, have maintained BRCA2.
More than half of the genes we investigated are significantly down-regulated during entry into desiccation or recovery from desiccation, or both. Only 8% of the genes are significantly up-regulated in either or both entry and recovery from desiccation. All gene copies, designations, genome coordinates, expression levels, and results of differential expression significance tests are listed in Additional file 1.
Horizontal transfer of excision repair genes
All four horizontally transferred genes we identified are involved in excision repair: formamidopyrimidine DNA glycosylase (Fpg), alkylpurine DNA glycosylase (AlkD), ultraviolet damage endonuclease (UVDE), and a kinetoplastid ATP-dependent polynucleotide ligase (LigK), classified in the cross pathway set in Fig. 1. There is evidence of all four genes in transcriptome data from other bdelloid species and no evidence that these genes are present in genome or transcriptome data from two monogonont species (not shown), thus these genes appear to have been acquired early in bdelloid evolution.
Adineta vaga has a pair of Fpg genes likely acquired from a fungus (Fig. 2a). This is the only known occurrence of Fpg in an animal. Fpg and endonuclease VIII (Nei) compose a family of glycosylases that perform the first two steps in BER: base excision and incision of the DNA backbone. Eubacteria possess both Fpg and Nei, which have different substrate specificities. Within eukaryotes, Nei (Nei-like, NEIL) is known only in some metazoans (primarily deuterostomes) and some protists, while Fpg is known only in some plants and fungi .
Bacterial Fpg is an extensively-studied functional analog of eukaryotic Ogg1, primarily recognizing the most common nucleotide oxidation product, 7,8-dihydro-8-oxoguanine (8-oxoG). In contrast, eukaryotic Fpg does not recognize 8-oxoG, rather it recognizes abasic sites, formamidopyrimidines, and two late-stage oxidation products of 8-oxoG: guanidinohydantoin and spiroiminodihydantoin [42, 43]. The substrate difference between bacterial and eukaryotic Fpg homologs has been attributed to the eukaryotic Fpg’s shorter αF-β9/10 loop, also called the 8-oxoG capping loop, which cannot cap and retain an 8-oxoG in the active site . Our structural comparisons of Fpg proteins from A. vaga, with those Geobacillus stearothermophilus, and Arabidopsis thaliana, as examined by , predict the A. vaga Fpg αF-β9/10 loop is intermediate sized (Fig. 2b). This suggests that A. vaga Fpg cannot cap 8-oxoG completely and its substrate affinities may be different from those of previously characterized homologs. Fpg is one of the more abundant glycosylase transcripts in the A. vaga transcriptome across all conditions tested.
Adineta vaga has a pair of UVDE genes likely acquired from an archaeon (Fig. 3). The only non-bdelloid animals with a UVDE-like gene in GenBank are two species of Trichuris, a genus of parasitic nematodes. Our phylogenetic analysis suggests that bdelloids and Trichuris represent independent occurrences of horizontal transfer of UVDE from different non-eukaryotic origins.
UVDE is an ATP-independent endonuclease that initiates AER. It recognizes and removes cyclobutane pyrimidine dimers (CPD) and 6–4 photoproducts (6–4)PP, both UV-induced damages normally recognized in eukaryotes by an approximately 30- protein recognition complex of NER. UVDE incises the DNA backbone 5′ to the damage, leaving a 5′ phosphate and 3′ hydroxyl ready for synthesis and ligation [37, 44, 45]. Evidence from in vitro and in vivo studies suggest UVDE can recognize a range of substrates typically targeted by NER, BER, or MMR including abasic sites, platinum adducts, uracil, dihydrouracil, abasic sites, mismatches, 3′-blocking groups, and short loops [46,47,48,49,50,51,52,53,54]. A. vaga UVDE is expressed at a low level under both hydrated and desiccating conditions.
Adineta vaga has a pair of AlkD genes likely acquired from bacteria (Fig. 4). AlkD is an unusual glycosylase that recognizes damage without intercalation or base-flipping and excises both inherently unstable cationic alkylation damage, particularly N3- and N7-alkylpurines, and large alkylation adducts normally repaired by NER [55, 56]. An AlkD ortholog is sporadically present in Metazoa and has not been identified in vertebrates or ecdysozoans. We have not found a metazoan AlkD ortholog in any rotifer, and the A. vaga AlkD is clearly distinct from the metazoan AlkD lineage (Fig. 4). AlkD is the only BER glycosylase that is upregulated in A. vaga entering desiccation.
Adineta vaga has a two ohnologous pairs of a unique form of DNA Ligase K (LigK, Fig. 5). LigK is an ATP-dependent polynucleotide ligase with a specific class of adenylation and OBF domains, both of which contain multiple DNA binding sites. Ligase K was named for its original characterization in kinetoplastids , and composes a distinct evolutionary lineage from eukaryotic Lig1 and Lig3/4. We have identified LigK throughout Eubacteria, in many fungi (though not Saccharomyces, Schizosaccharomycetes, or Neurospora), and sporadically in protists and Metazoa (though not in vertebrates, arthropods, or nematodes). The A. vaga LigK ohnologs are phylogenetically distinct from those in other Metazoa (Fig. 5a).
LigK contains a core of kDNA-adenylation and kDNA-OBF domains, usually at the C terminus, and generally contain other domains, such as poly(ADP-ribose) (PAR)-binding zinc finger (PBZ) domains, also called APLF-like zinc fingers, located on the N terminal side of the core domains. PBZ domains are strongly associated with nuclease activity and DNA metabolism, and possibly histone interactions [58, 59]. Each A. vaga ohnolog has a domain organization not seen in any other metazoan: the core domains occur at the N terminus, followed by a long disordered region containing one (ohnolog A) or two (ohnolog B) PBZ domains surrounded by a series of proline-serine-threonine rich degenerate repeats (Fig. 5b,c). The only other characterized LigK gene with a PBZ domain on the carboxy side of the core is from the soil fungus Mortierella, which lacks the serine-threonine rich degenerate repeats and has the PBZ domain at the C terminus. The difference in the number of PBZ domains between the A. vaga A and B ohnologs is likely due to expansion/contraction facilitated by PST-rich repeats rather than degeneration of a domain (Fig. 5d,e). The A and B ohnologs of LigK differ in their expression pattern during desiccation: transcript levels of ohnolog A do not change with desiccation, while transcript levels of ohnolog B increase significantly during recovery from desiccation (Fig. 5f).
Divergence of multiple copy DSB repair genes
Conspicuous among DDR genes in A. vaga are three present as duplicated ohnologs (more than 4 copies): Bloom helicase (Blm), DNA polymerase λ (Pol λ), and Aprataxin and PNK-like factor (APLF). In addition, 7 of 10 NHEJ are present as ohnologs or duplicated ohnologs, well above the genome average of 40% . Many NHEJ genes (DNAPKcs, Artemis, XRCC4, XLF, Ligase 4) were first described in mammalian V(D)J recombination and are absent in the model organisms Drosophila and C. elegans, and have therefore generally been considered “vertebrate specific.” However, homologs of DNAPKcs and Artemis were recently characterized in Dictyostelium  and have now been identified in automated annotation of multiple invertebrate genome assemblies.
Bloom helicase (Blm)
Adineta vaga has five copies of a Bloom-like helicase: two allele pairs and a fifth copy on an assembly scaffold without an allele partner (Fig. 6). In humans, Blm is involved in many aspects of HR and also interacts with MMR proteins Mlh1 and Msh6 . Human Blm contains a central RecQ domain flanked by an N-terminal BDHCT-box associated domain and a C-terminal Helicase and RNase Domain (HRDC); the BDHCT domain is weakly conserved in vertebrates and largely absent outside of chordates (Fig. 6a). Most metazoan Bloom-like proteins lack a BDHCT domain and many, such as the Drosophila homolog, also lack a HRDC domain but are active in homologous recombination . In A. vaga all five copies have an N-terminal RING-Ubox superfamily domain; the two pairs (B1,2 and C1,2) have the canonical RecQ family helicase domain while the fifth copy (A) has only the more general DEAD-like helicase superfamily domain in this region (Fig. 6b). Only the A copy has an identifiable C-terminal HRDC domain. This domain appears to function as a protein to DNA binding bridge, and is critical for topoisomerase dissolution of Holliday junctions . These Blm-like genes also present a rare case of potential functional divergence between allele copies: Copies C1 and C2 are 99% identical across the length of the protein, but only 95% identical in the first 270 aa. While the RING-Ubox domain is completely conserved, it is surrounded by regions where Ka/Ks are much greater than 1 Fig. 6c). The difference between the A copy and the B and C copies is reflected in profiles of differential transcript abundance entering or recovering from desiccation (Fig. 6d): expression of A increases more than 3-fold in recovery, while expression of B and C decrease about 2-fold both entering and recovering from desiccation.
Canonical NHEJ is initiated by the Ku70/Ku80 heterodimer threading onto exposed DNA ends. Adineta vaga has maintained ohnologous pairs of both Ku70 and of Ku80 and the conserved domains that characterize these proteins (Fig. 7). However, there is 20% amino acid difference between the ohnologs, and structural predictions suggests this divergence may affect function.
The Ku70 orthologs have similar structure through the N-terminal α/β domain and the central β barrel, but amino acid differences midway through the C-terminal arm cause divergent conformations that result in different positions of the respective SAP domains (Fig. 7a, c, d). The C-terminal arms of Ku70 and Ku80 embrace the β barrel of the opposite subunits  The SAP domain binds DNA and the affinity of Ku binding is modulated by sumoylation and acetylation of this region . The A ohnolog has two predicted sumoylation sites in its SAP domain, while the B ohnolog has one; however the B ohnolog has an additional predicted acetylation site. There has been an accelerated rate of amino acid divergence at the N terminus (Fig 7a), a negatively charged disordered region that flanks the DNA accepting ring and may modulate Ku-DNA interaction .
Like Ku70, the Ku80 orthologs also have potentially different tertiary structure (Fig. 7b, c, e). While the beginning of the N-terminal α/β domain is conserved, including the APLF binding sites, there are several indels at the end of the domain, including a Q3E4Q8 track present only in the A ohnolog. Amino acid differences at the end of the central β barrel and beyond predicted different orientations of both the C-terminal arm and DNAPKcs interaction region. There has been an accelerated rate of amino acid divergence at the C terminus, a region that interacts with DNAPKcs to result in different DNAPKcs activity depending on whether the DNA end has a 5′ overhang, a 3′ overhang, or is blunt .
All copies of Ku decreased in expression entering desiccation, but upon recovery Ku70A and Ku80A rebounded quickly to the expression level seen in the hydrated controls (Fig. 7f).
Adineta vaga has two orthologs of DNAPKcs (Fig. 8), a large DNA-dependent serine/threonine protein kinase that forms a ring structure that combines with Ku70 and Ku80 at either side of a DSB [67, 68]. The A ohnolog is 200 aa shorter than the B ohnolog in the N-terminal region, which builds the ring structure. The two ohnologs also differ in the presence of conserved domains: the A ohnolog has a NUC194 domain, a degenerated FAT domain, a phosphatidylinositol 3- and 4- kinase (PI3_PI4_kinase) domain, and an FATC domain. The B ohnolog shares the NUC194 and PI3_PI4_kinase domains but has an intact FAT domain and lacks the C terminal FATC domain. Although human DNAPKcs possesses all of these domains, the DNAPKcs of most eukaryotes do not. The domain structure of the A ohnolog (lacking an intact FAT domain) is also found in protists and fungi, while the domain structure of the B ohnolog (lacking the FATC domain) is found in some lophotrochozoans and in some mammals. Examples of both structures can be found in arthropods.
DNAPKcs undergoes conformational changes after DNA-binding and extensive autophosphorylation that in turn influence its kinase and repair activities . The ohnologs share 45 predicted autophosphorylation sites, including three positions where one ohnolog has a serine and the other a threonine. However, there are 34 predicted autophosphorylation sites in DNAPKcs copy A that are not in B and 26 predicted autophosphorylation sites in B that are not in A. Sliding window analysis of Ka/Ks along an alignment of the ohnologs reveals several regions of increased amino acid divergence, including the region where the FAT domain has degenerated in the A type. Transcript abundance of both ohnologs decreases during entry and recovery from desiccation (not shown).
In addition to direct action by the DNAPKcs-Ku70-Ku80 holoenzyme, accessory nucleases and phosphorylases are recruited to prepare broken DNA ends for ligation. Chief among these is Artemis, which acquires endonuclease activity upon phosphorylation by DNAPKcs to produce blunt ends from a wide variety of DNA structures . The two ohnologs of Artemis in A. vaga are conserved (93% amino acid identity) over the first two thirds of the protein (Fig. 9a), which contains the catalytic metallo-beta-lactamase (Lactamase_B) and DNA repair metallo-beta-lactamase (DRMBL) domains. However, there are differences in the predicted number and position of sites that would be phosphorylated by DNAPKcs, and the C terminal regions share only 45% amino acid identity. This region is poorly conserved across animals, but physical interactions with the N terminal portion are believed to regulate nucleolytic activity through autoinhibition . The transcript abundance of the A ohnolog is significantly decreased during recovery from desiccation while the B ohnolog is maintained at hydrated levels entering and recovering from desiccation (Fig. 9b). In addition, expression of the A ohnolog is 4–5 times more abundant than the B ohnolog in all conditions (Fig. 9c).
At the conclusion of NHEJ, ligation of DNA ends is achieved by the XRCC4-XLF-Lig4 holoenzyme. XRCC4 forms a homodimer that acts as a scaffold, interacting with Ku, DNAPKcs, XLF, Lig4, and the DNA. It is composed of an N terminal beta-barrel that interacts with XLF and other peptides, a ~ 80 aa helix that tightly binds Lig4 and interacts with DNA, and a poorly characterized, largely disordered C terminal domain that may contact the N terminal domain to stabilize protein interactions [71, 72]. The ohnologs of XRCC4 in A. vaga are more divergent (60% identity) than any other ohnologs in NHEJ (Fig. 10a). While the Ligase 4 binding domain is conserved, two kinks in the helix domain of the B ohnolog may affect Ligase 4 binding, DNA association, or both (Fig. 10b,c). The greatest difference between ohnologs occurs in the C terminal domain, with only 46% amino acid identity and several indels accounting for a difference in length of 34 amino acids, or nearly 20% of the region. Amino acid differences and indels result in a different pattern of predicted DNAPKcs phosphorylation sites and a substantial charge difference in the region, with net charges of − 11.5 and 0 for the A and B ohnologs, respectively. The transcript abundance of the A ohnolog is significantly decreased during recovery from desiccation while the transcript levels of the B ohnolog are maintained at hydrated levels entering and recovering from desiccation (Fig. 10d).
DNA polymerase λ
Both Polβ and Polλ are present in A. vaga, notable in part because both X-family polymerases were until recently thought to be absent from protostomes [73, 74]. There are six Polλ copies (three pairs) in A. vaga while every other replicative, repair, and TLS polymerase lacks ohnologs. Polλ is the primary gap-filling polymerase of NHEJ, and can participate in BER along with Polβ, the primary gap-filling polymerase of that pathway [75, 76]. Both Polβ and Polλ can also perform TLS , but they are not generally categorized as TLS polymerases.
Both Polβ and Polλ have 5′-deoxyribose-5-phosphate lyase domains to remove chemical groups that would block ligation. Polλ interacts with NHEJ proteins through its BRCT domain, while the serine-proline-rich region affects fidelity and may be post-translationally modified [78, 79]. The three pairs of Polλ in A. vaga all have the same domains but vary in length, and the relative location of the BRCT domain shifts between the copies, spanning different secondary structures, which themselves are not entirely conserved (Fig. 11a,b). Only the C pair shares the human Polλ secondary structures, but it also has the smallest BRCT domain. There is substantial sequence divergence between the copies in the serine-proline rich regions (Fig. 11c). Finally, we find evidence that multiple sites within the 8kD lyase domain, the flanking regions, and the palm domain are under positive selection in the A pair.
Although all copies appear capable of producing properly spliced coding sequence, one of the A copies and one of the C copies are not expressed during any condition of the RNA-Seq experiment. Transcript abundance of the expressed copy A is significantly decreased during recovery from desiccation, while the transcript abundance of the B copies remains unchanged entering or recovering from desiccation (Fig. 11e). Transcript abundance of the expressed C copy is significantly higher entering and recovering from desiccation.
APLF is an intrinsically disordered protein that participates in both BER and NHEJ. Typically identified as an accessory in NHEJ, APLF has remarkable spatial and temporal spans, interacting with core NHEJ members at the beginning and end of the break repair process and participating in multiprotein assemblages of up to six proteins. APLF has two DNA-damage dependent phosphorylation sites, a Ku80-binding domain, two poly(ADP-ribose) (PAR)-binding zinc fingers (PBZ) associated with both histone interactions and nuclease activity, and an acidic region with histone chaperone activity [58, 59, 71, 80,81,82] (Fig. 12a). APLF is able to recognize and incise at abasic sites and at some damaged sites, including hydroxyuracil, hydroxycytosine and thymine glycol . Remarkably, A. vaga has two sets of ohnologs of APLF (A:B and C:D), each with a different domain structure (Fig. 12b). The C and D ohnologs do not have the C-terminal histone-binding and chaperone region, the loss of which effectively places their non-canonical PBZ domains at the C-terminal end of the protein (Fig. 12b,c). The Ku80 binding domain is well conserved in the C and D ohnologs, less conserved in the A ohnologs, and poorly conserved in the B ohnologs (Fig.12d). Each ohnolog also has a distinct expression pattern during desiccation: abundance of ohnolog A transcripts decrease during entry and recovery, ohnolog B transcripts increase during recovery, ohnolog C transcripts are unchanged in desiccation, and ohnolog D transcripts decrease during recovery (Fig. 12e).
Bdelloids have evolved under selective pressure to cope with desiccation and its damaging effects to DNA. We found numerous DDR genes retained in A. vaga that have been lost in classic invertebrate models, genes incorporated by horizontal transfer, and genes present in unexpectedly high copy numbers. These patterns were non-randomly distributed in DDR: with the exception of Blm, all were in either BER or NHEJ, the two main pathways that repair oxidative damage. Within this collection of genes several themes emerge: resilience through increased damage recognition and end repair capability; modifications in DSB repair that may promote gene conversion between divergent sequences; and the potential functional divergence between ohnologs.
Damage recognition and end repair
Several A. vaga excision repair glycosylases and endonucleases, including three of non-metazoan origin (Fpg, UVDE, and AlkD), recognize multiple substrates in other systems. UVDE and AlkD can excise both small base lesions usually repaired by BER or MMR, and bulky lesions repaired by NER [47, 55]. UVDE and APLF can initiate AER by producing a single nick at abasic sites and at certain damaged bases . The horizontally-acquired Fpg of A. vaga may also act as an endonuclease to initiate AER if, like homologs from Candida albicans and A. thaliana, its preferred substrate is an abasic site . That APLF, the DDR gene present in the most copies, and three of the four horizontally-acquired DDR genes are all involved in damage recognition and removal suggests to us that enhancing those functions can improve resilience to oxidative damage.
End repair to remove chemical blockages preventing synthesis or ligation is rate-limiting in BER and in NHEJ [83, 84]. A phosphate, unsaturated sugar residue, or other non-hydroxyl group at the 3′ end of DNA is a block to repair synthesis by a DNA polymerase; a 5′ blocking group, often a 5′-deoxyribosephosphate (5-dRP), will prevent ligation. Artemis and APLF can remove 3′ synthesis blocking groups, while Artemis, Polλ, and the Ku heterodimer have 5’-dRP lyase activity. The shared PBZ domains of APLF and LigK suggest that LigK may have a 3′-end processing activity [58, 59]; LigK could also have a 5′ end processing function to remove blocks to ligation similar to some bacterial NHEJ ligases [85, 86]. These results suggest that end-repair genes have been incorporated into DDR by horizontal transfer or duplication because they improve rate limiting steps, augmenting end repair to improve resilience to oxidative damage.
DSB repair and gene conversion
Bdelloid oocytes are arrested at G0 , a condition which would favor DSB repair by NHEJ over HR due to the lack of sister chromatids. The proliferation of NHEJ genes compared to HR genes in A. vaga is consistent with a prominent role of NHEJ in bdelloid DDR. However, a role for HR in A. vaga is suggested by evidence of extensive gene conversion between former alleles that are much more divergent than sister chromatids [5, 8]. Higher rates of conversion between divergent sequences may be moderated by the absence of Rad52, BRCA2, and MMR genes, and possibly by the abundance of Blm helicases [88, 89]. Blm has roles promoting synthesis-dependent strand annealing-type HR and in dissolving double-Holliday junctions, both of which convert genes between homeologous sequences [88, 90]. We and others have hypothesized that gene conversion between divergent gene copies in bdelloid mimics meiotic recombination by exposing recessive alleles to selection, potentially allowing and escape from Muller’s Ratchet [5, 8, 19].
Functional divergence of multiple copy genes
We found that the multiple copies of LigK and the majority of NHEJ genes (Ku70, Ku80, DNAPKcs, Artemis, XRCC4, APLF, and Polλ) have regions of high primary sequence difference, differences in predicted modifications such as phosphorylation, differences in predicted secondary or tertiary structure, differences in expression pattern through the desiccation process, or a combination of these. In the case of LigK, the different number of PBZ domains is likely to confer very different functional properties to the two ohnologs. Studies of the PBZ domains of human APLF suggest that tandem PBZ domains act cooperatively, with a binding affinity to poly(ADP-ribose) 1000 times greater than single domains . Unlike the A ohnolog with a single PBZ domain, the B ohnolog with two PBZ domains is significantly overexpressed during recovery from desiccation.
While desiccation may be only one driver of functional divergence between ohnologs, and transcript abundance profiles are an imprecise measure of activity, it is remarkable that seven of the ten NHEJ genes we examined have ohnologs and that six of these show different expression profiles between ohnologs in desiccation. The expression profiles of Ku70 and Ku80 A and B ohnologs suggest specific pairing of these heterodimers. DNAPKcs is remarkable for having ohnologs that have different domain structures that have each evolved independently multiple times in different eukaryotic lineages. In the case of APLF, differences in the Ku binding domain between ohnologs may alter both Ku80-binding and XRCC1/4 binding by the nearby FHA domain, and the absence of the C-terminal histone binding region in C and D orthologs and non-canonical residues within their PBZ domains suggest they would bind PAR very differently than the A and B orthologs. Length differences between the short, compact structure of the C ohnolog versus the extended D ohnolog likely cause differences in scaffolding ability. The three ohnologs of Polλ show reduced expression, no change, and increased expression in response to desiccation. The proteins differ in a number of structural features and coding sequence show evidence of positive selection between copies. At this time, Polλ is the only DDR gene for which there are sequences available from enough non-bdelloid rotifers to perform these tests for selection. The four ohnologs of APLF also each have a different expression profiles.
The large number of NHEJ genes maintained in multiple copies is even more remarkable considering that when these genes are present in species other than bdelloid rotifers, they are invariably found only in single copies (the only exception we have identified are recent duplications of Ku70 and/or Ku80 in some fish genera). This suggests that other species are unable to maintain duplications of these genes as they inevitably diverge through mutation accumulation .
A unifying feature of all of the genes for which we find evidence of functional divergence is that they each have multiple roles in DNA repair. If these disparate functions cannot simultaneously be optimized by selection in a single copy of the gene, duplication would allow an escape from this adaptive conflict through subfunctionalization and potential eventual neofunctionalization . This raises the question of why, if the advantage is so clear, other animals have not retained duplicated copies of these genes. We suggest that given the complexity of these proteins and their importance to fitness it is difficult for duplicates to evolve neutrally as generally envisioned in models of subfunctionalization . If gene duplicates accumulate mildly deleterious mutations, sexual recombination will quickly eliminate them from a population before new beneficial structures can evolve. Because asexual species are much less efficient at removing deleterious mutations, it may be possible for them to cross fitness landscapes that sexual species cannot. This would suggest a potential long-term advantage to asexuality that could help explain both the large scale retention of duplicated genes in bdelloids and their evolutionary success.
Identification, annotation and analysis of DNA repair genes
We annotated the gene calls from the Adineta vaga genome assembly  available from http://www.genoscope.cns.fr/adineta/data/ or ftp://ftp.ensemblgenomes.org/pub/metazoa/release-41/fasta/adineta_vaga/dna/ using protein BLAST [95, 96] to KEGG  and NCBI refseq databases , considering evalue, aligned length, percent identity, bit scores, and reciprocal best BLAST confirmation. About one third of the identified genes appeared to have an odd number of copies, to have short coding sequences relative to alleles and/or ohnologs, or to be fragmented into multiple consecutive gene calls. We edited these using Genewise , manual curation, and RNA-Seq mapping results (described below), resulting in new or improved annotations for two-thirds of the inspected genes. We designate ohnologs as A, B, etc. and allele copies as A1, A2 or B1, B2 following .
We used the Phyre2 server  to predict protein secondary and tertiary structure and Chimera 1.11.2  to visualize and compare models. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIGMS P41-GM103311). We identified protein functional domains using hmmsearch and hmmscan tools in HMMER v3.1  to Pfam-A v30.0 , or the NCBI CD-Search tool  to the NCBI Conserved Domain Database . We used Illustrator for Biological Sequences  to visualize protein domain architecture. We used the Expresso mode of T-Coffee Version_11.00.8cbe486 [107, 108] to align predicted amino acid translations of A. vaga gene copies with orthologs and other homologs identified by BLASTP searches of NCBI databases.
We generated gene trees using RAxML 8.2.11  and MrBayes v3.2.6 . For RAxML, we chose the amino acid model with the best likelihood using the PROTGAMMAAUTO option, and searched for the best maximum likelihood tree with bootstrap support from 1000 replicates. For MrBayes, we estimated the appropriate evolutionary model (“prset aamodelpr = mixed”) and compared two runs of 4 chains each after 4 × 106 generations, sampling every 100 generations and discarding the first 30,000 samples as burn-in. Convergence diagnostics ESS, PSRF, and the average standard deviation of the split frequencies between runs were always > 100, 1.000–1.010, and < 0.10, respectively. To guard against erroneous placement of A. vaga sequences due to long-branch attraction effects we used the posterior Bayes factor approach of  as implemented in pbf v1.0 to adjust the posterior probability of each tree and split and examined the placement of A. vaga sequences relative to other clades.
For a list of all species and accession numbers used for each gene, RAxML command lines and MrBayes blocks, and complete maximum likelihood and Bayesian trees with bootstrap or posterior probability support, see Additional file 2.
We used Ka_Ks_Calculator 2.0 on sliding windows of coding sequence alignments to determine Ks and Ka using the Model Averaging method . Where we had constructed multiple alignments for phylogenetic analysis, we extracted aligned sequence pairs from the alignments described above and removed shared gaps; otherwise we aligned peptide pairs using Expresso as described above. We conducted both site and branch–site tests of positive selection using codeml in PAML v4.9c. Non-default parameters in our codeml control file for site tests were CodonFreq = 1, model = 0, NSites = 0 1 2 7 8, omega = 0.4, and we used the likelihood ratio test to compare NSites = 2 vs NSites = 1 with df = 2 and NSites = 8 and NSites = 7 with df = 1. For branch-site tests we changed to model = 2, NSites = 2 and tested fix_omega = 0, omega = 1.5 and omega = 1, fix_omega = 1.
Rotifer desiccation, recovery, and collection
We collected 3 biological replicates each of rotifers entering desiccation, recovering after 7 days of desiccation, and from hydrated controls. Our culture of A. vaga is derived from a single egg and grown in 6-well tissue culture plates in filtered spring water on a diet of E. coli, as previously described . For each replicate, we pooled nine 6-well dishes of A. vaga, added approximately equal number of animals to each of nine 150-mm tissue culture plates coated with 3% low-melting point agarose. A serological pipet with its tip wrapped in 10-um nylon mesh was used to remove 25-mL liquid from each plate. The plates were set bench-top at ambient temperature (~ 21 °C) and humidity with their lids raised ~ 0.5 cm on three paperclips attached to the sides of the lower dish. Plates were observed daily. Hydrated controls were collected after 2 days. Animals entering desiccation were collected after 4–6 days, when a water film remained and animals were often contracted, but still showed some movement. Desiccation started at or near day 10, when the agarose was dry and the rotifers had contracted into tuns. After 7 days in desiccation, 10 mL (final post-agarose-rehydration volume) sterile spring water was added to each plate. The rotifers began moving almost immediately and were harvested after 1 h of recovery by adding 20-ml HBSS to each dish and transferring rotifers, water, and buffer to sterile 35-mL centrifuge tubes on ice. The rotifers were pelleted 5 min at 5000 rcf at 4 °C, then all but 1-mL of supernatants were removed; rotifers were resuspended in the remaining liquid and transferred to 1.5-mL tubes on ice. Rotifers were then pelleted 1 min at 2000 rcf, room temperature, followed by complete removal of supernatant. Pellets were directly processed as described below. Culture dishes, serological pipets, and Autofil 0.22um PES filter bottle assemblies were from USA Scientific, Ocala, FL. Poland Spring water (Poland Spring, ME) is used for cultures. Gridded, 150-mm tissue culture plates were from Falcon Corning, Corning, NY. Low-melting point agarose was from Fisher Scientific, Pittsburgh, PA; and nylon 10-um Nitex mesh was from Sefar America, Inc., Kansas City, MO. Hank’s Buffered Saline Solution (HBSS) without calcium, magnesium or phenol red was from Thermo Fisher Scientific/ Gibco-Life Technologies, Waltham, MA.
RNA-Seq library construction and sequencing
For each replicate, RNA was extracted from 20-mg A. vaga pellets collected from hydrated rotifers, rotifers entering desiccation, and rotifers recovering from desiccation, using TRIzol Reagent per the manufacturer’s protocol, with the addition of linear acrylamide as a co-precipitant. An additional extraction with RNeasy MinElute prior to DNA removal by TURBO DNA-free Kit removed agarose carryover. RNA was quantified with RiboGreen and qualified by 260/230 ratio and Bioanalyzer RNA 6000 Pico Kit. Libraries were constructed using the KAPA Stranded mRNA-Seq Kit with Illumina TruSeq indexed adapters from 625 ng total RNA per sample. Magnetic bead binding steps in the library preparation protocol were performed in 1.5-mL tubes. Final libraries were assessed with PicoGreen and Bioanalyzer High-Sensitivity DNA chip prior to pooling and size selecting 445 bp using a Pippin Prep 1.5% agarose gel cassette. Final quantification of amplifiable library fragments was done by qPCR. Libraries were sequenced on three separate Illumina NextSeq runs with 150 bp paired-end reads and a dedicated index read. An average of 27,000,000 paired-end, 150-bp reads (over 7Gb) were obtained per library and 91% passed quality filtering . Reads are available from the NCBI Short Read Archive under accessions SRR7962065–SRR7962073 or as BioProject PRJNA494578. TRIzol Reagent, TURBO DNA-free Kit, Quant-iT RiboGreen RNA Assay, and Quant-iT PicoGreen dsDNA Assay Kit were from Thermo Fisher Scientific. Linear acrylamide was from AMRESCO, Solon, OH and the RNeasy MinElute Cleanup Kit was from Qiagen, Germantown, MD. RNA 6000 Pico and High-Sensitivity DNA kits were from Agilent Technologies, Santa Clara, CA. KAPA Stranded mRNA-Seq Kit and Library Quantification (qPCR) Kit for Illumina platforms were from Kapa Biosystems, Wilmington, MA. Illumina TruSeq indexed adapters were from Integrated DNA Technologies (IDT), Coralville, IA. Pippin Prep 1.5% agarose gel cassettes were from Sage Science, Beverly, MA.
To aid gene annotation, results from two of the replicate hydrated libraries were mapped to the annotated A. vaga genome using the RNA-Seq Analysis tool in CLC Genomics Workbench 8.5.1. Mapping was done to genes plus intergenic regions with the following parameters: 80% length, 90% similarity, mismatch cost = 2, insertion cost = 3, deletion cost = 3, no global alignment, reverse strand specific, auto-detect paired distances, maximum of 10 hits per read, count paired reads as two, and expression value = total counts. Final transcript validation was performed by mapping results from all nine libraries to the genome using the Large Gap Mapping tool in CLC Genomics Workbench 10.0.1 with the following parameters: max number hits per segment = 6, max distance from seed = 2000, multi-match mode = random, mismatch cost = 2, insertion cost = 3, deletion cost = 3, similarity = 0.96, length fraction = 0.9, override default distances = no.
After complete curation of DDR genes, each library was mapped using RSEM 1.3.0  to the complete A. vaga transcriptome reference edited by replacing 58 DDR coding sequences with improved annotation as describe above. Differential expression was assessed using EBSeq 1.2.0  invoked from the rsem-run-ebseq script in RSEM using default values; and DESeq2 v1.10.1, EdgeR v3.12.1, and limma v3.26.9 invoked from the run_DE_analysis.pl script in Trinity v2.2.0 using default values. Significant differential expression (SDE) was PPDE > 0.95 for EBSeq, padj < 0.05 for DESeq, or fdr < 0.05 for edgeR and limma-voom. The statistics underlying padj, PPDE, and fdr are different and an arbitrary alpha = 0.05 is not equivalent across methods. A bash script detailing all program invocations is available as Additional file 3.
We compared the reproducibility of our replicate libraries by examining scatterplots of transcript mapping reported as transcripts per million (TPM). R2 values were > 0.99; 0.92–0.97; and 0.88–0.99 for the hydrated, entering, and recovering replicates, respectively. The greater variance in entering and recovering libraries were each due to single libraries. To assess whether these libraries affected the geometric mean size factor used to normalize read counts in EBSeq and DESeq we examined the distribution of the ratio of the counts per gene and the geometric mean for each library, and in all cases found a unimodal histogram with the size factor at or near the mode. Repeating the differential expression analyses without these libraries had no significant effect on significant differential expression of DDR genes.
Alternate Excision Repair
Base Excision Repair
DNA damage response
Nucleotide Excision Repair
Non-Homologous End Joining
single strand break
Hsu WS. Oogenesis in the Bdelloidea rotifer Philodina roseola Ehrenberg. Cellule. 1956;57:283–96.
Debortoli N, Li X, Eyres I, Fontaneto D, Hespeels B, et al. Genetic exchange among Bdelloid rotifers is more likely due to horizontal gene transfer than to meiotic sex. Curr Biol. 2016:1–10. https://doi.org/10.1016/j.cub.2016.01.031.
Eyres I, Boschetti C, Crisp A, Smith TP, Fontaneto D, et al. Horizontal gene transfer in bdelloid rotifers is ancient, ongoing and more frequent in species from desiccating habitats. BMC Biol. 2015;13:90. https://doi.org/10.1186/s12915-015-0202-9.
Mark Welch DB, Ricci C, Meselson M. Bdelloid rotifers: Progress in understanding the success of an evolutionary scandal. In: Schoen I, Martens K, van Dijk P, editors. Lost sex: the evolutionary biology of parthenogenesis. Springer; 2009. p. 259–79.
Caprioli M, Katholm AK, Melone G, Ramlov H, Ricci C, et al. Trehalose in desiccated rotifers: a comparison between a bdelloid and a monogonont species. Comp Biochem Physiol Part A Mol Integr Physiol. 2004;139:527–32.
Hespeels B, Knapen M, Hanot-Mambres D, Heuskin A-C, Pineux F, et al. Gateway to genetic exchange? DNA double-strand breaks in the bdelloid rotifer Adineta vaga submitted to desiccation. J Evol Biol. 2014;27:1334–45. https://doi.org/10.1111/jeb.12326.
Kathe SD, Barrantes-Reynolds R, Jaruga P, Newton MR, Burrows CJ, et al. Plant and fungal homologs are formamidopyrimidine DNA glycosylases but not 8-oxoguanine DNA glycosylases. DNA Repair (Amst). 2009;8:643–53. https://doi.org/10.1016/j.dnarep.2008.12.013.Plant.
Duclos S, Aller P, Jaruga P, Dizdaroglu M, Wallace SS, et al. Structural and biochemical studies of a plant formamidopyrimidine-DNA glycosylase reveal why eukaryotic Fpg glycosylases do not excise 8-oxoguanine. DNA Repair (Amst). 2012;11:714–25. https://doi.org/10.1016/j.dnarep.2012.06.004.
Bowman KK, Sidik K, Smith CA, Taylor J-S, Doetsch PW, et al. A new ATP-independent DNA endonuclease from Schizosaccharomyces pombe that recognizes cyclobutane pyrimidine dimers and 6–4 photoproducts. Nucleic Acids Res. 1994;22:3026–32. https://doi.org/10.1093/nar/22.15.3026.
Doetsch PW, Beljanski V, Song B (2005) The ultraviolet damage endonuclease (UVDE) protein and alternative excision repair: a highly diverse system for damage recognition and processing. In: Siede W (University of NTHC, Doetsch PW, editors. DNA Damage Recognition. New York: CRC Press. pp. 223–238.
Yonemasu R, McCready SJ, Murray JM, Osman F, Takao M, et al. Characterization of the alternative excision repair pathway of UV-damaged DNA in Schizosaccharomyces pombe. Nucleic Acids Res. 1997;25:1553–8.
Paspaleva K, Thomassen E, Pannu NS, Iwai S, Moolenaar GF, et al. Crystal structure of the DNA repair enzyme ultraviolet damage endonuclease. Structure. 2007;15:1316–24. https://doi.org/10.1016/j.str.2007.05.010.
Kanno SI, Shigenori I, Takao M, Yasui A. Repair of apurinic/apyrimidinic sites by UV damage endonuclease; a repair protein for UV and oxidative damage. Nucleic Acids Res. 1999;27:3096–103. https://doi.org/10.1093/nar/27.15.3096.
Mullins EA, Shi R, Parsons ZD, Yuen PK, David SS, et al. The DNA glycosylase AlkD uses a non-base-flipping mechanism to excise bulky lesions. Nature. 2015;527:254–8. https://doi.org/10.1038/nature15728.
Kanno S, Kuzuoka H, Sasao S, Hong Z, Lan L, et al. A novel human AP endonuclease with conserved zinc-finger-like motifs involved in DNA strand break responses. EMBO J. 2007;26:2094–103. https://doi.org/10.1038/sj.emboj.7601663.
Li S, Kanno S, Watanabe R, Ogiwara H, Kohno T, et al. Polynucleotide kinase and aprataxin-like forkhead-associated protein (PALF) acts as both a single-stranded DNA endonuclease and a single-stranded DNA 3′ exonuclease and can participate in DNA end joining in a biochemical system. J Biol Chem. 2011;286:36368–77. https://doi.org/10.1074/jbc.M111.287797.
Ertl HA, Russo DP, Srivastava N, Brooks JT, Dao TN, et al. The role of Blm helicase in homologous recombination, gene conversion tract length, and recombination between diverged sequences in Drosophilamelanogaster. Genetics. 2017;207:923–33. https://doi.org/10.1534/genetics.117.300285.
Woods DS, Sears CR, Turchi JJ, Hendrickson E, Ramsden D, et al. Recognition of DNA termini by the C-terminal region of the Ku80 and the DNA-dependent protein kinase catalytic subunit. PLoS One. 2015;10:e0127321. https://doi.org/10.1371/journal.pone.0127321.
Spagnolo L, Rivera-Calzada A, Pearl LH, Llorca O. Three-dimensional structure of the human DNA-PKcs/Ku70/Ku80 complex assembled on DNA and its implications for DNA DSB repair. Mol Cell. 2006;22:511–9. https://doi.org/10.1016/J.MOLCEL.2006.04.013.
Ma Y, Pannicke U, Schwarz K, Lieber MR. Hairpin opening and overhang processing by an Artemis/DNA-dependent protein kinase complex in nonhomologous end joining and V(D)J recombination. Cell. 2002;108:781–94. https://doi.org/10.1016/S0092-8674(02)00671-2.
Niewolik D, Peter I, Butscher C, Schwarz K. Autoinhibition of the nuclease ARTEMIS is mediated by a physical interaction between its catalytic and C-terminal domains. J Biol Chem. 2017;292:3351–65. https://doi.org/10.1074/jbc.M116.770461.
Hammel M, Yu Y, Radhakrishnan SK, Chokshi C, Tsai M-S, et al. An intrinsically disordered APLF links Ku, DNA-PKcs and XRCC4-DNA ligase IV in an extended flexible non-homologous end joining complex. Jbc. 2016;291:1–36. https://doi.org/10.1074/jbc.M116.751867.
Lee JW, Blanco L, Zhou T, Garcia-Diaz M, Bebenek K, et al. Implication of DNA polymerase lambda in alignment-based gap filling for nonhomologous DNA end joining in human nuclear extracts. J Biol Chem. 2004;279:805–11. https://doi.org/10.1074/jbc.M307913200.
Braithwaite EK, Kedar PS, Stumpo DJ, Bertocci B, Freedman JH, et al. DNA polymerases beta and lambda mediate overlapping and independent roles in base excision repair in mouse embryonic fibroblasts. PLoS One. 2010;5:e12229. https://doi.org/10.1371/journal.pone.0012229.
Taggart DJ, Dayeh DM, Fredrickson SW, Suo Z. N-terminal domains of human DNA polymerase lambda promote primer realignment during translesion DNA synthesis. DNA Repair (Amst). 2014;22:41–52. https://doi.org/10.1016/j.dnarep.2014.07.008.
Fiala KA, Duym WW, Zhang J, Suo Z. Up-regulation of the fidelity of human DNA polymerase lambda by its non-enzymatic proline-rich domain. J Biol Chem. 2006;281:19038–44. https://doi.org/10.1074/jbc.M601178200.
Grundy GJ, Rulten SL, Zeng Z, Arribas-Bosacoma R, Iles N, et al. APLF promotes the assembly and activity of non-homologous end joining protein complexes. EMBO J. 2013;32:112–25. https://doi.org/10.1038/emboj.2012.304.
Iles N, Rulten S, El-Khamisy SF, Caldecott KW. APLF (C2orf13) is a novel human protein involved in the cellular response to chromosomal DNA strand breaks. Mol Cell Biol. 2007;27:3793–803. https://doi.org/10.1128/MCB.02269-06.
de Ory A, Nagler K, Carrasco B, Raguse M, Zafra O, et al. Identification of a conserved 5’-dRP lyase activity in bacterial DNA repair ligase D and its potential role in base excision repair. Nucleic Acids Res. 2016;44:1833–44. https://doi.org/10.1093/nar/gkw054.
Li G-Y, McCulloch RD, Fenton AL, Cheung M, Meng L, et al. Structure and identification of ADP-ribose recognition motifs of APLF and role in the DNA damage response. Proc Natl Acad Sci U S A. 2010;107:9129–34. https://doi.org/10.1073/pnas.1000556107.
Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44:D279–85. https://doi.org/10.1093/nar/gkv1344.
Armougom F, Moretti S, Poirot O, Audic S, Dumas P, et al. Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-coffee. Nucleic Acids Res. 2006;34:W604–8. https://doi.org/10.1093/nar/gkl092.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61:539–42. https://doi.org/10.1093/sysbio/sys029.
Zhang Z, Li J, Zhao X, Wang J, Wong GK, et al. KaKs Calculator : calculating Ka and Ks through model selection and model averaging. Genomics Proteomics Bioinforma. 2006;4:259–63. https://doi.org/10.1016/S1672-0229(07)60007-2.
Minoche AE, Dohm JC, Himmelbauer H. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol. 2011;12:R112. https://doi.org/10.1186/gb-2011-12-11-r112.
We thank Phyllis R. Strauss for her encouragement, advice, and guidance, and two anonymous reviewers whose comments greatly improved the manuscript.
This work was supported by NIH NIA R21AG046899 and by the G. Unger Vetlesen Foundation. Funding sources had no role in the study design or the collection, analysis, or interpretation of the date, or in the writing of the manuscript.
All Examined DNA Damage Response Genes. This Excel file is a table containing all gene names, A. vaga genome accession numbers, genome coordinates, expression levels, and results of differential expression significance tests. (XLSX 132 kb)
Details of Phylogenetic Analyses. This pdf contains scripts for RAxML and MrBayes analyses, complete gene trees and tables genes used for each tree (OTU designation, Accession, and species name). (PDF 692 kb)
Script for RNASeq Analysis. This Word file contains a bash script to run the rsem mapping and differential gene expression tests. (DOCX 113 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.