Sample collection
Adult seahorses (H. abdominalis) were purchased from a commercial breeding facility (Seahorse Australia, Beauty Point, Tasmania) and held at the University of Zurich under an animal care and experimentation permit from the Veterinäramt Zürich (Permit 103/2008). In addition, fin clips from 5 wild-caught seahorses from Sydney, Australia (2 individuals collected in 2003) and Tasmania (3 individuals collected from 3 populations in 2003 and 2004) [31] were included in analyses of MH class II genetic diversity.
Gene discovery via high-throughput transcriptome profiling
A full plate of 454 sequencing (GS FLX Titanium Chemistry, Roche) was performed on transcriptome libraries prepared from pouch and reference (brain, gills, liver, heart, kidney and testes) tissues of one pregnant and one non-pregnant seahorse. This screen recovered six partial transcripts of the MH IIα locus of the seahorse, which were assembled into a single 541 bp contig spanning the 5′ UTR, complete exons 1 and 2 and partial exon 3. Further details on this transcriptome screen are available elsewhere [11]. A TBLASTX search of the nucleotide collection of GenBank identified MH class IIα of Dicentrarchus labrax as the top hit (DQ821109.1: e-value = 3e-47), followed by class IIα genes from other teleost species (Morone saxatilis L35062.1: e-value = 8e-42, Larimichthys crocea EF681861.1: e-value = 5e-41, Miichthys miiuy GU936787.1: e-value = 2e-40, and Epinephelus coioides GU992883.1: e-value = 3e-40).
Full-length cDNA sequencing
Total RNA was extracted from brain, gill, kidney, liver, pouch and testes tissues of two pregnant seahorses with RNeasy extraction columns (Qiagen). RNA extractions were subsequently DNase treated, standardized to a common concentration of 85 ng, and pooled for library construction. 5′ and 3′ RACE libraries were prepared from 1 μg of total RNA using a SMARTer RACE cDNA amplification kit (Clontech).
The full-length cDNA sequence of the MH class IIα gene of the seahorse was obtained using separate 5′ and 3′ RACE reactions primed with gene-specific primers (MHIIαE2F and MHIIαE3R; Table 1), in 25 μL PCR reactions using 1.5 μL of 5′/3′ RACE-ready cDNA. Both reactions produced single products from the 5′ and 3′ ends of the class IIα gene, which were PCR-purified with Montage PCRμ96 Filter Plates (Millipore) and eluted in 20 μL ddH2O in preparation for sequencing.
Sequencing reactions were carried out in 10 μL volumes consisting of 2–4 μL purified PCR product, 0.2 μM primer, and 1 μL Big Dye v3.1 Terminator Cycle Sequencing mixture (Applied Biosystems). Cycling conditions consisted of 30 cycles of 10 sec at 96°C, 5 sec at 50°C and 4 min at 60°C. Ethanol-purified products were sequenced on an ABI 3730 automated sequencer (Applied Biosystems).
Full-length genomic sequencing of the Major Histocompatibility class II gene region
To elucidate the structure and distribution of variation across the MH region of the seahorse, genomic DNA of a single non-pregnant individual was extracted from muscle tissue (DNeasy, Qiagen). The quality of extracted DNA was assessed on a 1.5% agarose gel and spectrophotometrically quantified using a Nanodrop 2000 (Thermo Scientific). The gDNA sequence of the MH class IIβ for this individual has been previously published [12].
The full-length gDNA sequence of the seahorse MH class II gene region was determined using long-range PCR with gene-specific primers designed from the cDNA sequence of MH class IIα and the 5′ UTR region of the class IIβ locus (MHIIα-E1F3/MHIIβ-5UTRR4; Table 1).
Long-range PCR was performed in a 25 μL volume containing 3U LongAmp Taq (New England Biolabs), 1× LongAmp reaction buffer, 0.4 mM dNTPs, 0.2 μM primers and 250 ng DNA. PCR amplification involved a 2 min denaturation step at 92°C, followed by 30 cycles of 92°C (20 s), 65°C (20 s) and 65°C (10 min), and a final extension step for 10 min at 65°C. The PCR reaction was filter-purified in preparation for cloning.
A 4 μL aliquot of purified PCR product was cloned into a TOPO TA cloning vector (Invitrogen) following the manufacturer's recommendations. Following overnight culture of transformed chemically competent E. coli at 37°C, 5 positive colonies were picked and grown for 16 h in liquid culture on a 200 rpm horizontal shaker at 37°C. Liquid cultures were purified for downstream sequencing using a QIAprep Spin Miniprep kit (Qiagen).
Screening of plasmid DNA with primers for the hypervariable peptide-binding region of seahorse MH class IIα revealed the presence of two alleles, both of which were sequenced to completion using a nested sequencing strategy with primers distributed across the full length of the amplified region (Table 1) and the protocols outlined above. DNA sequencing revealed a 786 bp deletion between MH class IIα and IIβ in one of the two full-length alleles (Figure 1).
The intervening non-coding sequence between the class IIα and IIβ loci was PCR-amplified and sequenced from genomic DNA using two sets of PCR reactions, one using MHIIα-E4F/MHIIβ-5UTRR2 (60°C anneal) and a second using MHIIβ-5UTRF2/MHIIβ-E2R2 (55°C anneal), both of which spanned the deletion region, allowing the determination of allelic phase of sequences from IIα and IIβ loci.
RT-PCR screening of tissue-specific expression
Samples were obtained from captive H. abdominalis individuals (Seahorse Australia, Beauty Point, Australia) preserved in RNAlater (Sigma-Aldrich) and then stored at −80°C. Four reproductively active adult males were screened for MH II gene activity, with the stage of pregnancy estimated using a recently published developmental key for syngnathid fishes [32]. Total RNA was extracted from a panel of tissues (brain, gill, heart, kidney, liver, pouch, testis from one mid-pregnant animal; and pouch from one non-pregnant, one early pregnant, and one late pregnant individual) using an RNeasy Mini Kit with QiaShredder (Qiagen) and DNase I (Invitrogen) digestion. First-strand cDNA was synthesized with SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen) using random hexamer priming and 200 ng of RNA.
MH class II PCRs were carried out using intron-spanning primers indicated in Table 1. Beta-actin (ACTB) was used as the positive control to ensure uniform amplification for each tissue, and was amplified using ACTB-E2F - GTCATGGTCGGCATGGGAC and ACTB-E3R - AGGTAGTCTGTGAGGTCTCG. PCR reactions for ACTB were performed in 20 μL volumes containing 0.5U Taq (New England Biolabs), 1× NEB reaction buffer, 0.75 μM MgCl2, 0.2 mM dNTPs, 0.2 μM primers, and 1.0 μL DNA, with the following PCR cycling conditions: 95°C for 1:30, then 30 cycles of 95°C for 0:30, 53°C for 0:30, 68°C for 1:00, followed by 68°C for 5:00. MH amplifications were performed in 25 μL volumes containing 1U Taq (NEB), 1× NEB reaction buffer, 1.0 μM MgCl2, 0.4 mM dNTPs, 0.2 μM primers and 2.5 μL DNA, with the following cycling parameters: 92°C for 0:10, then 40 cycles of 92°C for 0:10, 55°C for 0:30, 68°C for 2:00 (for MH IIα) or 92°C for 5:00, then 40 cycles of 92°C for 0:30, 62°C for 0:30, 68°C for 4:00, followed by 68°C for 15:00 (for MH IIβ).
PCR products were subjected to electrophoresis at 100 V for 20 min in 1.5% agarose gels stained with ethidium bromide, and visualized using an AlphaImager gel documentation system (Alpha Innotech).
MH IIα inheritance and MH II linkage analysis
MH class IIα Exon 2, containing the immunologically active peptide-binding region of the gene, was PCR-amplified and sequenced in a sample of 47 F1 individuals from 5 families (n = 8–13 per family) which had previously been characterized for patterns of genetic variation at the MHIIβ peptide-binding region [11]. A comparison of parent-offspring genotype profiles allowed the inference of the mode of MHIIα inheritance and a means to test for linkage of IIα and IIβ loci in this species.
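The parent-offspring comparison can be illustrated with a minimal sketch, assuming codominant Mendelian inheritance at a single locus. The function name and the allele labels below are hypothetical and are not taken from the study data.

```python
def mendelian_consistent(father, mother, offspring):
    """Return True if the offspring genotype can be formed by taking
    one allele from each parent (single-locus Mendelian inheritance)."""
    a, b = offspring
    return (a in father and b in mother) or (b in father and a in mother)


# Hypothetical genotypes: each genotype is a pair of allele labels.
father = ("A01", "A02")
mother = ("A03", "A03")
brood = [("A01", "A03"), ("A02", "A03"), ("A01", "A02")]

# One flag per offspring; False marks a genotype the parents cannot produce.
flags = [mendelian_consistent(father, mother, o) for o in brood]
print(flags)
```

Offspring flagged as inconsistent would be candidates for re-genotyping before any conclusion about inheritance or linkage is drawn.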
Standard PCR was performed in 25 μL volumes containing 1U Taq (NEB), 1× NEB reaction buffer, 1.0 μM MgCl2, 0.4 mM dNTPs, 0.2 μM of either MHIIα-E2F/MHIIα-E3R or MHIIα-I1F3/MHIIα-E3R and 25–250 ng DNA. PCR amplifications involved a 10 s denaturation step at 92°C, followed by 40 cycles of 92°C (10 s), 55°C (30 s) and 68°C (2 min). All individuals were PCR-amplified and sequenced using both sets of primer pairs. PCR purification and sequencing followed the protocols outlined for the cDNA experiment above, producing the full-length sequence of the 249 bp exon. After trimming 2 bp from the 5′ end of the sequence alignment and 1 bp from the 3′ terminus to exclude incomplete amino acids, the analyzed exon 2 dataset included 246 bp/82 amino acids.
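The trimming arithmetic (249 bp minus 2 bp and 1 bp gives 246 bp, or 82 codons) can be checked with a short sketch. It assumes the standard genetic code and uses a synthetic placeholder sequence; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def trim_to_frame(seq, trim5=2, trim3=1):
    """Remove partial codons from the 5' and 3' ends of an exon sequence."""
    return seq[trim5:len(seq) - trim3]


def translate(seq):
    """Translate an in-frame nucleotide sequence; 'X' marks unknown codons."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


# Synthetic 249 bp placeholder, not a real MH class II alpha sequence.
exon2 = "GG" + "ATG" * 82 + "C"
trimmed = trim_to_frame(exon2)
protein = translate(trimmed)
print(len(exon2), len(trimmed), len(protein))
```

Codons containing IUPAC ambiguity codes fall through to 'X', so heterozygous genotypes would need to be phased before translation.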
Characterization of the MH IIα peptide-binding region (PBR)
Exon 2 of MH class IIα was also sequenced in a population of 101 seahorse individuals for which the peptide-binding region of the MH class IIβ locus had previously been characterized [11] to obtain an estimate of population-level variability of this region. PCR amplification and sequencing conditions were identical to those outlined above.
Sequence processing
All PCR reactions were sequenced in both directions, aligned using ClustalW [33] and visualized in BioEdit v.7.0.9 [34]. Heterozygous sites were coded using IUPAC nomenclature for degenerate positions, and allelic sequences were inferred using the default settings of PHASE V2.1.1 [35]. Individuals for which allelic phase could not be inferred statistically at the reliability threshold (PHASE probability ≥ 0.95) were re-amplified and cloned (MH IIα: 1 individual, MH IIβ: 1 individual, MH IIα/MH IIβ: 4 individuals). Four to five colonies were sequenced from each cloned individual, allowing the direct determination of individual alleles. All private alleles were separately re-amplified and sequenced to verify their identity (MH IIα: 6 individuals, MH IIβ: 2 individuals, MH IIα/MH IIβ: 13 individuals).
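A minimal sketch of IUPAC coding of heterozygous sites is given below, together with a count of the haplotype pairs compatible with an unphased genotype. This illustrates why statistical phasing or cloning is needed; it is not a reimplementation of PHASE, and the function names and sequences are hypothetical.

```python
# IUPAC ambiguity codes for two-base heterozygous sites.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def encode_genotype(allele1, allele2):
    """Collapse two aligned alleles into one IUPAC-coded genotype string."""
    if len(allele1) != len(allele2):
        raise ValueError("alleles must be aligned to the same length")
    return "".join(a if a == b else IUPAC[frozenset((a, b))]
                   for a, b in zip(allele1, allele2))


def heterozygous_sites(genotype):
    """Return 0-based positions carrying a two-base ambiguity code."""
    codes = set(IUPAC.values())
    return [i for i, base in enumerate(genotype) if base in codes]


def n_phase_configurations(n_het):
    """Number of distinct haplotype pairs compatible with n_het het sites."""
    return 1 if n_het == 0 else 2 ** (n_het - 1)


genotype = encode_genotype("ACGT", "ATGA")   # toy alleles
sites = heterozygous_sites(genotype)
print(genotype, sites, n_phase_configurations(len(sites)))
```

Genotypes with zero or one heterozygous site have a single possible haplotype pair, so phase uncertainty arises only when two or more sites are heterozygous.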
Analyses of sequence polymorphism/linkage
Nucleotide diversity (π) at MH IIα was estimated under the maximum composite likelihood model implemented in Mega v6.0 [36], with standard error estimates derived from 500 bootstrap replicates. Exact tests of Hardy-Weinberg equilibrium (1,000,000 step Markov Chain, 100,000 dememorization steps) were performed in Arlequin v3.5.1.2 [37].
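As a rough illustration of what π measures, the sketch below computes the mean pairwise proportion of differing sites across a set of aligned sequences. It uses uncorrected p-distances rather than the maximum composite likelihood model applied in Mega, so values will differ from those reported; the sequences are toy data and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def nucleotide_diversity(seqs):
    """Mean pairwise p-distance over all pairs of sampled sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment: one entry per sampled allele copy, not per unique allele.
sample = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi = nucleotide_diversity(sample)
print(round(pi, 4))
```

Because each sampled allele copy is listed separately, common alleles contribute to more pairwise comparisons and allele frequencies are weighted implicitly.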
Analysis of gametic phase of MH IIα and IIβ genotypes was performed using the Bayesian ELB approach [38] implemented in Arlequin v3.5.1.2. Pairwise linkage analysis of unphased MH IIα and IIβ data was also carried out using Arlequin (20,000 permutations, 5 EM replicates).
Site-specific tests of positive selection
Characterization of synonymous and non-synonymous substitutions across the peptide binding region of MH IIα was performed in Mega v6.0 [36] using the Nei-Gojobori method with Jukes-Cantor distances. A one-tailed Z-test of positive selection (500 bootstrap replicates) tested the null hypothesis of neutral evolution for putative peptide binding sites, non-binding sites, and the full peptide binding region.
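The two numerical steps in this test can be sketched as follows: the Jukes-Cantor correction of an observed proportion of differences, and a one-tailed Z-test of dN > dS. The variances are taken as inputs here (in Mega they come from bootstrap resampling), the Nei-Gojobori site counting itself is not reproduced, and all numeric values are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def z_test_positive_selection(d_n, d_s, var_n, var_s):
    """One-tailed Z-test of H0: dN = dS against H1: dN > dS."""
    z = (d_n - d_s) / math.sqrt(var_n + var_s)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical distances and bootstrap variances, for illustration only.
z, p_value = z_test_positive_selection(0.2, 0.1, 0.0008, 0.0008)
print(round(z, 2), round(p_value, 4))
```

The correction always returns a distance at least as large as the observed proportion, which matters most when alleles are highly divergent at peptide binding sites.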
A neighbor-joining tree was constructed from MH IIα alleles using the maximum composite likelihood method implemented in Mega v6.0 [36]. This tree served as a starting tree for a site-specific analysis of positive selection in Codeml v4.8 [39], which compared the fit of a null model that does not allow positive selection (M7) with one allowing for positive selection (M8), using a likelihood-ratio test (LRT). Sites experiencing positive selection were identified following a Bayes Empirical Bayes analysis (posterior probability ≥ 0.95) [40].
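The likelihood-ratio test can be sketched in a few lines, assuming the conventional comparison of twice the log-likelihood difference against a chi-square distribution with 2 degrees of freedom (M8 has two more free parameters than M7). For 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. The log-likelihood values are hypothetical and the function name is illustrative.

```python
import math


def lrt_m7_m8(lnl_m7, lnl_m8):
    """Likelihood-ratio test of M8 against M7 (chi-square, 2 df)."""
    # Guard against tiny negative values caused by optimizer noise.
    stat = max(0.0, 2 * (lnl_m8 - lnl_m7))
    # Chi-square survival function for exactly 2 degrees of freedom.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical log-likelihoods as would be read from Codeml output.
stat, p_value = lrt_m7_m8(-1000.0, -990.0)
print(stat, p_value)
```

A significant result only indicates that a class of sites with ω > 1 improves the fit; identifying the individual sites still relies on the Bayes Empirical Bayes posterior probabilities.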
Network construction
An allelic network was constructed to visualize genetic relationships among alleles of the MH IIα PBR using TCS v.1.21 [41], and prepared for publication using yEd v3.12.2 [42]. A second network was constructed using phased MH IIα and IIβ data, in order to visualize the frequency and distribution of MH class II composite genotypes.
Recombination
The presence of recombination in the seahorse MH class II region was investigated using RECCO v.0.93 [43] (10,000 permutations), using a minimum mutation savings criterion of 5 to identify recombinants. Recombination analyses were carried out independently for the MH IIα PBR dataset, and for a concatenated alignment of phased MH IIα/IIβ data, allowing the identification of intra- and interlocus recombination. Inferred recombination breakpoints for interlocus recombinants included an offset of 4,630 bp of unknown sequence separating the PBRs of the two genes (Figure 1), reflecting uncertainty in the location of breakpoints in the unsequenced region between the two PBRs.
Protein structure
The quaternary structure of the MH class II complex of the seahorse was reconstructed via homology modeling of the full-length MH class IIα and IIβ loci to the previously determined crystallographic structure of the mouse MH class II molecule, using Protinfo PPC [44]. Inferred protein surface models of target and database sequences were annotated and visualized in Chimera v1.6.2 [45].
Protinfo PPC returned five significant hits (structure confidence: 41–55%), all of which matched PDB models for the extracellular domain of the MH class II complex of Mus musculus. One of the three top hits (PDB ID: 1ES0; structure confidence: 55%) was selected as a model for the seahorse MH class II complex. The expression vector, peptide and linker of the mouse structure were omitted from the modeled data, as well as 8 aa of MH IIβ not resolved in the original model, resulting in a total of 182 aa and 180 aa for the MH class IIα and IIβ loci, respectively. Known peptide-binding sites for the human MH class II molecule [3] were annotated on the mouse model, along with sites under positive selection in the seahorse.
Availability of supporting data
Sequence data generated for this project have been deposited in GenBank (Accession #: KP259890-KP259909).