Genomes analyzed
To analyze representative species across all kingdoms, we first downloaded several publicly available, completely sequenced genomes from each of the major kingdoms (Opisthokonta, Plantae, and Chromalveolates) from two sources (JGI and NCBI). We also downloaded all of the available bacterial and archaeal genomes from NCBI. When LRR-RK sequences were found in one genome, we downloaded all of the available complete genomes in that phylum. In total, the complete proteomes of 77 eukaryotic, 50 archaeal and 884 bacterial species were downloaded from their respective databases. See Additional file 1 for details regarding the genomes analyzed.
Sequence retrieval and domain predictions
We retrieved genes containing leucine-rich repeats (LRRs) and a kinase domain (KD) by running the hmmsearch program (HMMER 2.3.2) to search for the kinase Hidden Markov Model (HMM) profile (PF00069.16) within the proteomic sequences of completely sequenced genomes. Within this set of kinase proteins, we then searched for LRR-domain HMM profiles (PF00560.24) (E-value cut-off < 1) [19, 20]. Signal peptides (SPs) and transmembrane domains (TMs) were predicted using the SignalP (http://www.cbs.dtu.dk/services/SignalP/) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM/) websites, respectively, hosted at the Center for Biological Sequence Analysis, Technical University of Denmark [21]. Proteins containing LRRs, a TM and a KD were then considered to be putative LRR-RKs. We used the SMART web site (http://smart.embl-heidelberg.de/) to check whether domains other than LRRs were predicted in the extracellular domain (ECD) of each protein [22]. If other domains were detected, the protein was rejected. Proteins containing only LRRs in their ECD, a TM domain and a KD were classified as LRR-RKs. For the phylogenetic analysis presented in Figure 2, we first retrieved all of the peptide sequences of eukaryotic protein kinases used in [5] to establish the phylogenetic relationship between plant and animal protein kinases. We next retrieved one LRR-RLK protein per subgroup from the Arabidopsis genome. Finally, we included all of the newly identified LRR-RK proteins from the oomycete, Monosiga, Ectocarpus and Chlorella genomes. In oomycetes, Ectocarpus, Monosiga and Chlorella, we also retrieved protein sequences of closely related kinases found using a Blastp search. This search was performed using the KD of the LRR-RKs, and we selected the best hit that was not an LRR-RK. Accession numbers of all of these sequences are listed in the 'Accession numbers' section below.
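The domain-based filter described in this section can be summarized as a small decision rule. The following Python sketch is illustrative only: the `Protein` container, its field names and the `classify` function are hypothetical, and it assumes that the domain predictions (kinase profile hit, TM prediction, and the list of domains predicted in the ECD) have already been collected from the tools named above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    """Hypothetical container for the predicted features of one protein."""
    name: str
    has_kinase: bool   # hit to the kinase HMM profile
    has_tm: bool       # transmembrane domain predicted
    # Domain labels predicted in the extracellular domain (assumed input).
    ecd_domains: List[str] = field(default_factory=list)


def classify(protein: Protein) -> str:
    """Apply the LRR / TM / KD filter and the ECD check described in the text."""
    has_lrr = "LRR" in protein.ecd_domains
    # A putative LRR-RK needs LRRs, a TM domain and a kinase domain.
    if not (protein.has_kinase and protein.has_tm and has_lrr):
        return "not a candidate"
    # Reject candidates whose ECD carries any domain other than LRRs.
    if any(domain != "LRR" for domain in protein.ecd_domains):
        return "rejected"
    return "LRR-RK"


if __name__ == "__main__":
    # Made-up example proteins, for illustration only.
    examples = [
        Protein("p1", True, True, ["LRR", "LRR", "LRR"]),
        Protein("p2", True, True, ["LRR", "OTHER"]),
        Protein("p3", True, False, ["LRR"]),
    ]
    for p in examples:
        print(p.name, classify(p))
```

Keeping the "rejected" outcome separate from "not a candidate" makes it easy to count how many putative LRR-RKs were discarded at the ECD check.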
Alignment and phylogenetic analysis
Peptide KD sequences of all of the kinases to be analyzed (plus four bacterial kinase genes [YP_003956736.1, ZP_04777056.1, ZP_06621294.1 and P0A5S4.1] used as the outgroup [23]) were aligned using the MAFFT program (v6.525b, einsi parameters, 1000 iterations maximum) and manually curated [24]. Phylogenetic trees were generated under the maximum likelihood criterion using PhyML 3.0 (LG model, NNI topological moves, optimizing branch lengths and branch supports). Branch support was taken as the minimum of the parametric approximate likelihood ratio test (aLRT, Chi2-based) and the non-parametric aLRT (based on a Shimodaira-Hasegawa-like procedure) [25, 26]. All of the branches with support values less than 90 were collapsed. All of the manipulations of phylogenetic trees were performed using the TreeDyn [27] and MEGA4 [28] programs.
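The two support-related steps (taking the minimum of the two aLRT values and collapsing weakly supported branches) can be sketched as follows. This is a minimal illustration, not the procedure implemented in PhyML or TreeDyn: the `Node` class and both functions are hypothetical, and support values are assumed to be expressed on a 0 to 100 scale so that the threshold of 90 applies directly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `support` refers to the branch above the node."""
    name: Optional[str] = None
    support: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def combined_support(chi2_alrt: float, sh_like_alrt: float) -> float:
    """Conservative support: the smaller of the two aLRT values."""
    return min(chi2_alrt, sh_like_alrt)


def collapse(node: Node, threshold: float = 90.0) -> Node:
    """Return a copy of the tree in which internal branches with support
    below `threshold` are removed, creating polytomies."""
    new_children: List[Node] = []
    for child in node.children:
        child = collapse(child, threshold)
        is_internal = bool(child.children)
        if is_internal and child.support is not None and child.support < threshold:
            # Attach the grandchildren directly to the current node.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    return Node(node.name, node.support, new_children)


if __name__ == "__main__":
    # Made-up tree: ((A,B),C) with a weakly supported (A,B) branch.
    weak = Node(support=combined_support(95.0, 80.0),
                children=[Node("A"), Node("B")])
    tree = Node(children=[weak, Node("C")])
    print([c.name for c in collapse(tree).children])
```

Leaves are never removed, and the root is left untouched because it has no branch above it.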
Expression analysis
We used the NCBI tBLASTn web interface to search for ESTs that were similar (identity > 95%) to our set of oomycete LRR-RK proteins [29]. Only the Phytophthora infestans, Phytophthora sojae, Pythium ultimum and Saprolegnia parasitica EST databases were searched, as the EST database of Phytophthora ramorum was not available. Each EST sequence retrieved was validated by a BLASTn search against the library of nucleotide sequences from that species and against the Phytophthora infestans nucleotide sequences in GenBank. To search for ESTs from Phytophthora parasitica, the 24 LRR-RK peptide sequences of Phytophthora infestans were used as queries for a tBLASTn search on the VBI microbial database (http://vmd.vbi.vt.edu/toolkit/index.php). We used the Phytophthora infestans LRR-RK proteins as queries because they form the most complete dataset thus far. This search revealed that at least 25 Phytophthora parasitica genes are homologous to the 24 Phytophthora infestans LRR-RK genes. We searched for ESTs for each of these 25 Phytophthora parasitica LRR-RKs in the NCBI Phytophthora parasitica EST database. The qRT-PCR expression analysis of 5 of the 6 Phytophthora parasitica ESTs retrieved was performed as described in Kebdani et al. (2010), with the UBC, WS21 and Mago nashi protein-encoding sequences as reference genes [30–32]. Note that one of the 6 ESTs (DR440392.1) was not analyzed by qRT-PCR because we did not succeed in designing oligonucleotide sets for this gene.
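The identity filter applied to the EST hits can be illustrated with a short script. The searches above were run through web interfaces, so this is only a sketch under an assumption: that the hits are available as BLAST tabular output (12 tab-separated columns, with the query identifier, subject identifier and percent identity in the first three). The function name `filter_hits` and the sample identifiers are hypothetical.

```python
from typing import Iterable, List, Tuple


def filter_hits(lines: Iterable[str],
                min_identity: float = 95.0) -> List[Tuple[str, str, float]]:
    """Keep (query, subject, percent identity) for hits above `min_identity`.

    Assumes BLAST tabular output: qseqid, sseqid, pident, ... (tab-separated).
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, pident = fields[0], fields[1], float(fields[2])
        if pident > min_identity:  # strict threshold, as in "identity > 95%"
            hits.append((query, subject, pident))
    return hits


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    sample = [
        "# hypothetical tBLASTn results",
        "query1\test1\t97.50\t120\t3\t0\t1\t120\t10\t369\t1e-60\t230",
        "query1\test2\t88.00\t100\t12\t0\t1\t100\t5\t304\t1e-30\t120",
    ]
    for hit in filter_hits(sample):
        print(hit)
```

Hits retained by such a filter would still need the BLASTn validation step described in the text.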
Accession numbers
The accession numbers for Arabidopsis thaliana are as follows: AtCKI1 [GenBank, CAA55395]; AtCDC2a [GenBank, AAB23643]; AtCPK7 [GenBank, AAB03247]; AtCKA1 [GenBank, BAA01090]; AtCTR1_Raf [GenBank, AAA32779]; AtAME2 [GenBank, BAA08215]; AtMKK3 [GenBank, BAA28829]; AtMEKK1 [GenBank, BAA09057]; AtNAK [GenBank, AAA18853]; AtNPH1 [GenBank, AAC01753]; AtPVPKlikePK5 [GenBank, BAA01715]; AtGSK3b [GenBank, CAA64408]; AtGSK3i [GenBank, CAA68027]; AtSnRK2 [GenBank, AAA32845]; AtMPK1 [TAIR, AT1G10210]; AtS6KlikePK1 [GenBank, AAA21142] and AtTousled [GenBank, AAA32874]. The accession numbers for Homo sapiens are as follows: hAXL [GenBank, NP_001690]; hRYK [GenBank, P34925]; hTRKalpha [GenBank, BAA34355]; hMuSK [GenBank, AAB63044]; hKLGlikePTK7 [GenBank, AAC50484]; hIR [GenBank, NP_000199]; hLTK [GenBank, P29376.3]; hRET [GenBank, AAH04257]; hTIE1 [GenBank, P35590]; hPDGFRbeta [GenBank, NM_002600.1]; hVGFR1 [GenBank, P17948.2]; hTousledLK1 [GenBank, NP_036422]; hMAPKKK1 [GenBank, Q13233]; hCLK1 [GenBank, P49759]; hMAPK1 [GenBank, NP_002736.3]; hCDK3 [GenBank, NP_001249]; hCKIalpha2 [GenBank, NP_001883]; hCaMK1 [GenBank, BAG70221]; hCK2a [GenBank, CAB65624]; hGRK6 [GenBank, P43250]; hEGFR [GenBank, P00533]; hFGFR2 [GenBank, P21802]; hHGFR [GenBank, P08581]; hEPH [GenBank, P21709]; hDDR [GenBank, Q08345]; hRaf1 [GenBank, AAA60247]; TGF beta receptors, hTGFbRI [GenBank, P36897] and hTGFbRII [GenBank, P37173]; hIRAK1 [GenBank, AAH54000] and hMAPKK1 [GenBank, Q02750]. Additional accession numbers are as follows: mCKIalpha [GenBank, NP_666199] for Mus musculus; xtPELLE [GenBank, NP_001006713] for Xenopus tropicalis; drIRAK1 [GenBank, CAP19555] for Danio rerio and dmPELLE [GenBank, NP_476971] for Drosophila melanogaster. 
The accessions of representative Arabidopsis thaliana LRR-RLK genes in each subfamily are as follows: [TAIR: AT4G29180] for LRRI, AtNIK1 for LRRII [TAIR: AT5G16000], AtIMK3 for LRRIII [TAIR: AT3G56100], [TAIR: AT2G45340] for LRRIV, AtSCM_SUB for LRRV [TAIR: AT1G11130], [TAIR: AT1G14390] for LRRVI-1, [TAIR: AT5G41180] for LRRVI-2, [TAIR: AT2G24230] for LRRVII, [TAIR: AT1G06840] for LRRVIII-1, [TAIR: AT3G14840] for LRRVIII-2, AtTMK1 for LRRIX [TAIR: AT1G66150], [TAIR: AT3G28450] for LRRXa, AtBRI1 for LRRXb [TAIR: AT4G39400], AtCLV1 for LRRXI [TAIR: AT1G75820], AtFLS2 for LRRXII [TAIR: AT5G46330], AtFEI1 for LRRXIIIa [TAIR: AT1G31420], AtER for LRRXIIIb [TAIR: AT2G26330], [TAIR: AT3G14840] for LRRXIV and AtRPK1 for LRRXV [TAIR: AT1G69270]. We followed the subfamily nomenclature of a previous report [33].