Figure 1 | BMC Evolutionary Biology

Figure 1

From: Evolution of protein indels in plants, animals and fungi

Semi-automated pipeline for identifying universal eukaryote protein orthologs. The diagram shows the workflow for identifying universal single or in-paralog-only orthologous protein clusters. Orthologous protein candidates were identified using InParanoid version 3 [59] with pairwise comparisons among three starting test proteomes: D. rerio, S. cerevisiae, and A. thaliana. The 477 orthologous protein candidates identified were used as seeds for BLASTp searches of 35 additional proteomes. The resulting putative orthologous clusters were aligned using MUSCLE version 3.6 [40, 41] and screened by eye to eliminate incomplete sequences. Neighbor-Joining (NJ) trees were used to screen for redundant and unusually long-branched sequences and to eliminate all but the shortest-branching sequence of each set of in-paralogs. Clusters found to include out-paralogs were partitioned into separate ortholog clusters. Clusters missing sequences from entire major taxa were discarded. For the remaining protein alignments, indels were extracted using the program SeqFIRE. Genome combinations for the initial pairwise comparisons are indicated as follows: S-D (S. cerevisiae × D. rerio), S-A (S. cerevisiae × A. thaliana), D-A (D. rerio × A. thaliana).
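
Two of the filtering steps in the caption, keeping only the shortest-branching in-paralog and discarding clusters that lack an entire major taxon, could be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the `Seq` record, the function names, and the example identifiers and branch lengths are all hypothetical, and it assumes for simplicity that every sequence from the same species within a cluster belongs to one in-paralog set, whereas the original screening was done by inspecting NJ trees.

```python
from typing import Iterable, List, NamedTuple, Set


class Seq(NamedTuple):
    """Hypothetical record for one sequence in a putative ortholog cluster."""
    seq_id: str
    species: str
    taxon: str            # major taxon label, e.g. "plants"
    branch_length: float  # terminal branch length taken from an NJ tree


def keep_shortest_inparalogs(cluster: Iterable[Seq]) -> List[Seq]:
    """Keep one sequence per species: the one with the shortest branch.

    Assumption: sequences sharing a species within a cluster are in-paralogs.
    """
    best = {}
    for seq in cluster:
        current = best.get(seq.species)
        if current is None or seq.branch_length < current.branch_length:
            best[seq.species] = seq
    # Sort by identifier so the result order is deterministic.
    return sorted(best.values(), key=lambda s: s.seq_id)


def covers_all_taxa(cluster: Iterable[Seq], required_taxa: Set[str]) -> bool:
    """Return True if every required major taxon has at least one sequence."""
    return set(required_taxa) <= {seq.taxon for seq in cluster}


# Made-up example cluster; identifiers and branch lengths are illustrative only.
example = [
    Seq("At1", "A. thaliana", "plants", 0.12),
    Seq("At2", "A. thaliana", "plants", 0.30),
    Seq("Dr1", "D. rerio", "animals", 0.20),
    Seq("Sc1", "S. cerevisiae", "fungi", 0.25),
]
filtered = keep_shortest_inparalogs(example)
kept_ids = [seq.seq_id for seq in filtered]
complete = covers_all_taxa(filtered, {"plants", "animals", "fungi"})
```

In this made-up cluster, the longer-branched A. thaliana copy is dropped and the cluster is retained because all three major taxa are still represented.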
