Evidence of positive selection at codon sites localized in extracellular domains of mammalian CC motif chemokine receptor proteins
BMC Evolutionary Biology volume 10, Article number: 139 (2010)
CC chemokine receptor proteins (CCR1 through CCR10) are seven-transmembrane G-protein coupled receptors whose signaling pathways are known for their important roles coordinating immune system responses through targeted trafficking of white blood cells. In addition, some of these receptors have been identified as fusion proteins for viral pathogens: for example, HIV-1 strains utilize CCR5, CCR2 and CCR3 proteins to obtain cellular entry in humans. The extracellular domains of these receptor proteins are involved in ligand-binding specificity as well as pathogen recognition interactions.
In mammals, the majority of chemokine receptor genes are clustered together; in humans, seven of the ten genes are clustered in the 3p21-24 chromosome region. Gene conversion events, or exchange of DNA sequence between genes, have been reported in chemokine receptor paralogs in various mammalian lineages, especially between the cytogenetically closely located pairs CCR2/5 and CCR1/3. Datasets of mammalian orthologs for each gene were analyzed separately to minimize the potential confounding impact of analyzing highly similar sequences resulting from gene conversion events.
Molecular evolution approaches and the software package Phylogenetic Analyses by Maximum Likelihood (PAML) were utilized to investigate the signature of selection that has acted on the mammalian CC chemokine receptor (CCR) gene family. The results of neutral vs. adaptive evolution (positive selection) hypothesis testing using Site Models are reported. In general, positive selection is defined by a ratio of nonsynonymous/synonymous nucleotide changes (dN/dS, or ω) >1.
Of the ten mammalian CC motif chemokine receptor sequence datasets analyzed, only CCR2 and CCR3 contain amino acid codon sites that exhibit evidence of positive selection using site based hypothesis testing in PAML. Nineteen of the twenty codon sites putatively indentified as likely to be under positive selection code for amino acid residues located in extracellular domains of the receptor protein products.
These results suggest that amino acid residues present in intracellular and membrane-bound domains are more selectively constrained for functional signal transduction and homo- or heterodimerization, whereas amino acid residues in extracellular domains of these receptor proteins evolve more quickly, perhaps due to heightened selective pressure resulting from ligand-binding and pathogen interactions of extracellular domains.
Chemotactic or chemoattractant cytokine (chemokine) proteins are a unique division of cytokines characterized by their roles in cell signaling through the use of heterotrimeric GTP-binding (G protein-coupled) 7-transmembrane receptors [1–3]. Chemokines are the largest family of cytokines . Currently, 42 ligand molecules and 19 receptors belong to the chemokine superfamily of cytokines . Within the chemokine superfamily, protein sub-families are distinguished by differences in amino acid sequence motif of four conserved cysteines residues . Chemokine ligands in the CC (β) sub-family have two adjacent cysteines that are both involved in intra-chain disulfide bridges . There are ten CC motif chemokine receptors (CCR1 through CCR10) that bind these ligands with differing ligand binding specificity [2, 5].
Chemokines, in conjunction with adhesion molecules, recruit specific subpopulations of leukocytes by activating various receptors and initiating signaling through G protein-coupled pathways [8, 9]. The signaling of chemokines through their receptors affect various cellular outcomes including leukocyte trafficking, gene transcription, degranulation of immune response cells, mitogenic processes, and apoptosis [2, 5, 6, 8]. Chemokines generally act as secondary pro-inflammatory mediators that are induced by primary pro-inflammatory mediators .
Nucleotide mutations in the open reading frame coding for chemokine receptors can have a dramatic effect on receptor activity or little effect, depending on the location of the substitution and the nature of the amino acid replacement: amino acid substitutions resulting in alterations at key ligand binding extracellular domains or intracellular G-protein coupled domains in particular are known to result in disrupted or abnormal receptor activity . Deficient signal transduction can result in increased susceptibility to infectious diseases as a result of the lack of a robust signaling response to pathogenic infection [10–16]. Mutations in the regulatory nucleotide sequence of chemokine receptors can also result in changes in gene expression and subsequent protein activity [17, 18].
In addition to their roles as pro-inflammatory agents in innate immune response through binding of endogenous ligands and subsequent receptor activation, some chemokine receptors have been co-opted to act as receptors or fusion proteins for a number of pathogens including HIV-1 strains [19, 20], protozoan parasites Plasmodium knowlesi and P. vivax , and Epstein-Barr virus ; herpesviruses mimic host chemokine receptors to elude host immune responses . Mutations in regulatory or coding sequences for chemokine receptors that are used as pathogen fusion proteins can alter host-pathogen interactions: amino acid substitutions in extracellular domains can prevent recognition by the pathogen or interfere with the pathogen's ability to utilize the receptor as a gateway into the cell. Genetic markers associated with disease resistance have been found in regulatory and coding sequences of chemokine receptors [24–29].
Because of their critical role in signaling immune responses, chemokine receptors are subjects of intense selection to accommodate signaling molecules, and are expected to experience purifying selection to maintain conformation and functionality of ligand binding and signaling . However, because of their role as targets of pathogen entry, chemokine receptors are also expected to experience positive selection pressure in response to viral/pathogen hijacking . Loci that are involved in responses to a variety of pathogens may experience balancing selection as a result of diverging selection pressures acting simultaneously on genes which may result in the maintenance of polymorphism . Given this apparent evolutionary tension, it is of interest to investigate the signature of selection on this subfamily of chemokine receptors.
Hypothesis Testing with Site Models
This paper presents the results of PAML hypothesis testing on chemokine receptor sequence datasets with "Site" models only [33, 34]. These models analyze sequence data at the level of the codon, and test whether a hypothesis (model) that allows for positive selection (dN/dS > 1 for some codons) is better fit to the data when compared to a null neutral hypothesis (model), determined through performing a likelihood ratio test between the likelihood scores of the null neutral and selection models. Each set of orthologous gene sequences was analyzed independently of one another such that only CCR1 gene sequences were included in the first data set, only CCR2 gene sequences were included in the second data set, and so on for each of the ten CCR genes (i.e., paralogous genes were not in the same data set).
When testing the hypothesis that some codon sites within chemokine receptor coding sequences have experienced positive selection pressure, significant results were obtained for some codons within the genes CCR2 and CCR3 (Table 1). For CCR2, the comparison between a null neutral site model which does not allow positive selection (M1a) and a selection site model (M2a) yielded a likelihood ratio test statistic of 5.45, which did not allow for rejection of the null hypothesis of neutral selection. However, the comparison between an additional pair of site models, M7 (null, neutral) and M8 (selection) for CCR2 yielded a likelihood test ratio statistic of 15.22, significant at p = 0.001, and a proportion of sites (0.02430, 2.4%) with ω = 2.83526. The analysis for CCR2 had a total of 380 amino acid sites, and nine amino acid sites were identified as sites of positive selection (Table 2) using Bayes empirical Bayes (BEB) analysis . One of the nine amino acid sites identified under positive selection had strong support with BEB posterior probability >95%. Eight of the nine amino acids identified as having experienced positive selection in the coding sequence of CCR2 are located in extracellular domains of the protein; one positively selected amino acid residue is located in the second transmembrane domain, near the transmembrane/extracellular boundary (Figure 1).
For hypothesis testing of site models with CCR3 gene sequences, the comparison of M1a (neutral) vs. M2a (selection) site models led to rejection of the null, neutral hypothesis in favor of selection with a likelihood test ratio statistic of 19.84, significant at p = 0.001, with a proportion of sites (0.01346 1%), and ω = 5.44115, indicating positive selection for those codon sites (Table 1). Five specific codon sites within the coding sequence were reported under positive selection with a BEB analysis following site testing: one site had strong support with BEB posterior probability greater than 95%, and one site had very strong support with BEB posterior probability greater than 99% (Table 2). A significant result was also obtained in the comparison of the second set of site models, M7 versus M8, with a test statistic of 22.3, significant at p = 0.001, with a proportion of sites (0.01637, ~1.6%) having a dN/dS value of 5.44 (Table 1). The analysis for CCR3 included 361 amino acid sites, and eleven specific amino acid sites were reported under positive selection with a BEB analysis following the site test. Three of these positively selected amino acid sites had posterior probabilities greater than 95%, one site had a posterior probability greater than 99% (Table 2). All of the positively selected sites identified for CCR3 in this analysis were located in extracellular domains of the receptor protein (Figure 2). A summary of results (significant and non significant) for all site model tests is presented in Additional File 1.
None of the ten CC motif chemokine receptors has a signature of positive selection as indicated by an ω value (ratio of nonsynonymous substitutions/synonymous substitutions, dN/dS) greater than one averaged over all codons as determined by hypothesis testing using branch models in PAML (unpublished results). Site tests, which analyze the sequence at the unit of the codon, revealed a proportion of codon sites that display evidence of positive selection (ω > 1) within the coding sequences of CCR2 and CCR3. The results obtained under the two sets of site models (M1a vs. M2a and M7 vs. M8) differ in some aspects; for example, the more conservative M1a vs. M2a comparison did not reveal statistically significant results for CCR2 while the M7 vs. M8 comparison did reveal significant differences, allowing for the identification of positively selected sites. For CCR3, while both M1a vs. M2a and M7 vs. M8 were both statistically significant comparisons, the comparison between M7 and M8 identified the same five codon sites that had been identified under M1a vs. M2a comparison as well as additional positively selected sites that were not identified in the M1a vs. M2a comparison. The differences in the results obtained using different models reflect that the M1a vs. M2a comparison is a more conservative test which may fail to detect positively selected sites identified by the less conservative M7 vs. M8 comparison.
It is interesting to note that in the results obtained for CCR2 and CCR3, nineteen out of the twenty amino acid sites that are identified as having experienced positive selection are located in the extracellular domains of the chemokine receptor proteins, suggesting that nonsynonymous substitutions are occurring, and more often being selected for, in the ligand binding and pathogen interaction regions of the receptors. Previous studies on CCR2, CCR3 and other CC chemokine receptors have identified the amine-terminus and extracellular domains as being important for both the endogenous ligand-binding functions [36–42] as well as for binding efficacy for pathogens in situations where these receptors have been co-opted as fusion proteins [24, 43–45]. These previous results coupled with the findings presented here point to the extracellular domains of CC chemokine receptor proteins being especially relevant to studies of the evolution of structure and function of receptors for endogenous ligand binding ability and as targets of pathogen interaction.
Recombination or gene conversion between paralogs in the chemokine receptor family has been investigated and described, particularly for CCR2/5 conversion in several orders of mammals [31, 46–50]. Conversion events between CCR1/3 in rodents have also been reported . These conversion events have primarily involved transmembrane regions, but some conversion events have occurred more extensively over CCR gene sequences and have impacted extracellular domains; for example the first extracellular loop of CCR5 converted by recombination with CCR2 in Mus , extracellular loop 2 of CCR5 converted by recombination with CCR2 in Homo and Oryctolagus [31, 46, 50], and extracellular loop 3 of CCR3 converted by recombination with CCR1 in Mus .
Analysis of sequences that have undergone gene conversion can lead to higher rate of false-positives when using maximum likelihood methods to detect positive selection, particularly in small data sets with only a few sequences, although the rate of false positives is only increased moderately . To minimize the impact of gene conversion events on the results of this study and as an alternative to eliminating sequences or parts of sequences that have undergone conversion events, each set of orthologous genes was analyzed independently of one another such that highly similar sequences resulting from gene conversion events within a species involving paralogous genes were not included in the same analysis, but rather were analyzed independently in separate data sets (e.g., only CCR1 gene sequences were analyzed together in one data set; CCR2 gene sequences were analyzed in a separate data set, and so on). In addition, a Bayes Empirical Bayes (BEB) analysis was used rather than Naïve Empirical Bayes (NEB) to identify putative codons under positive selection as NEB is less conservative and can be more prone to error in smaller data sets [34, 35] whereas BEB produces a low rate of false-positives with sequences that have experienced gene conversion .
Both of the genes containing positively selected codon sites (CCR2 and CCR3) have been reported to have undergone gene conversion events as discussed above; however, in the case of CCR2, gene conversion events have led to the conversion of CCR5 by CCR2, whereas in the case of CCR3, it is CCR3 that has been converted by CCR1. The results presented here, in which only one gene of a gene conversion pair displays evidence of positive selection through hypothesis testing, indicate that independent analyses of sequences that have undergone gene conversion may mitigate the detection of false-positives due to gene conversion.
Site tests, which analyze genetic sequences at the unit of the codon, revealed a proportion of codon sites that display evidence of positive selection (ω > 1) within the coding sequences of mammalian CC motif chemokine receptor genes CCR2 and CCR3. Nineteen of the twenty amino acid sites identified as having experienced positive selection are located in extracellular domains of the chemokine receptor proteins CCR2 and CCR3. These results suggest that amino acid residues present in intracellular and membrane-bound domains of mammalian CC motif chemokine receptor proteins are more selectively constrained, whereas amino acid residues in extracellular domains of these receptor proteins evolve more quickly, perhaps due to heightened selective pressure resulting from ligand-binding and pathogen interactions of extracellular domains.
Genomic coding sequences for CCRs from a number of placental mammals were obtained through searches of the online database NCBI Gene . Taxa included in the data set were chosen using the recently updated placental mammal phylogeny  and the online software application TimeTree [54, 55] to estimate divergence times between taxa (estimates given for nuclear genes were used). For PAML analyses, taxa that have diverged more than ~100 MYA may lead to decreased analytical power due to highly divergent sequences, difficulty in sequence alignment, and saturation of substitutions [56–58]; therefore, only taxa that have diverged less than 100 MYA were included in the data set.
For genes that displayed alternative splicing patterns, the presumed ancestral isoform sequence was identified through alignment methods and included in the dataset while the alternative isoforms were not. For CCR2, the human isoform that localizes to the plasma membrane was included in the CCR2 dataset, while the cytoplasmic variant was not. Species and GenBank accession numbers for sequences used in the analyses are listed in Additional File 2.
Nucleotide alignments of chemokine receptor sequences were generated using amino acid sequence alignments and the software program TranAlign . The output from TranAlign was converted to Nexus/PAUP format and submitted to the software program ModelTest  for selection of the most appropriate model of evolution for each dataset by testing the fit of 56 different evolutionary models with the data set. ModelTest uses both hierarchical likelihood ratio testing and Akaike Information Criterion (AIC). The best model was chosen based on AIC score and the number of estimated parameters. If there was a statistically insignificant difference between two models, the model with the fewest number of estimated parameters was chosen to introduce the least amount of uncertainty to the evolutionary analyses. Models used for analyses are summarized in Additional File 3.
Phylogenetic Analysis Using Parsimony* (PAUP*) and Phylogenetic Analysis by Maximum Likelihood (PAML) Methods
Maximum likelihood phylogenetic trees for each data set (Additional Files 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13) were constructed with the software package PAUP* . Data sets and maximum likelihood phylogenetic trees for each gene were submitted to PAML CODEML version 4.1 under different models and parameters to test for adaptive evolution either at codon sites ("Site Model"), along lineages ("Branch Model"), or at sites within lineages ("Branch-Site Model") . This paper presents the results of testing the data sets described with "Site" models only.
"Site" models allow the dN/dS ratio to vary across codons within a sequence for a lineage. Proportions of sites within each lineage were estimated to be in different categories: positive selection is indicated by some codons having a dN/dS > 1. The null, neutral model does not allow positive selection and is compared to the alternative hypothesis in which in which positive selection is allowed.
Two sets of site models are commonly used to test hypotheses of selection, and have been used here: M1a vs. M2a and M7 vs. M8. In the first set of models, the model M1a: Nearly Neutral allows 2 categories of codon sites in p0, and p1 proportions, with ω0 < 1, and ω1 = 1, whereas the model M2a: Selection allows an additional category of codons (p2) with ω2 > 1, indicating positive selection. The second set of site models compared is M7 and M8, in which M7 specifies a neutral model with dN/dS ratios across a continuous beta distribution with estimated parameters p and q of the beta distribution, and M8 specifies a similar model with an additional category for sites that have dN/dS > 1, indicating positive selection. M7 assumes a beta distribution of ω values between 0 and 1, and therefore does not allow any sites under positive selection (ω > 1). The M8 model is similar to M7 in that it also assumes a beta distribution for omega values, but allows another category of sites in which ω > 1. The comparison between M7: beta and M8: beta +ω is less conservative, and may indicate positive selection even when none is detected by the M1a: M2a comparison.
The PAML settings for the null (neutral) model M1a were model = 0, NSsites = 1, and for the alternative (selection) model M2a were model = 0, NSsites = 2. The PAML settings for the null model M7 were model = 0, NSsites = 7, and for the alternative (selection) model M8 were model = 0, NSsites = 8.
The likelihood estimates for each were compared using a hierarchical Likelihood Ratio Test (hLRT) of twice the difference in log likelihood values of the models being compared (2ΔlnL), with the result approximating chi-square distribution with degrees of freedom for the test statistic determined by the difference in estimated parameters between the models being compared. For both the M1a (neutral) vs. M2a (selection) and M7 (beta) vs. M8 (beta + selection) comparisons, the null model has two estimated parameters, while the alternative estimates four, resulting in two degrees of freedom and chi-square critical values of 5.99, 9.21, and 13.82 at 5%, 1%, and 0.1% significance, respectively .
Graves DT, Jiang Y: Chemokines, a family of chemotactic cytokines. Crit Rev Oral Biol Med. 1995, 6: 109-118. 10.1177/10454411950060020101.
Rosenkilde MM, Schwartz TW: The chemokine system - a major regulator of angiogenesis in health and disease. APMIS. 2004, 112: 481-495. 10.1111/j.1600-0463.2004.apm11207-0808.x.
Goncharova LB, Tarakanov AO: Why chemokines are cytokines while their receptors are not cytokine ones?. Curr Med Chem. 2008, 15: 1297-1304. 10.2174/092986708784535009.
Fernandez EJ, Lolis E: Structure, function, and inhibition of chemokines. Annu Rev Pharmacol Toxicol. 2002, 42: 469-499. 10.1146/annurev.pharmtox.42.091901.115838.
Bonecchi R, Galliera E, Borroni EM, Corsi MM, Locati M, Mantovani A: Chemokines and chemokine receptors: an overview. Front Biosci. 2009, 14: 540-551. 10.2741/3261.
Baggiolini M, Dewald B, Moser B: Human chemokines: An update. Annu Rev Immunol. 1997, 15: 675-705. 10.1146/annurev.immunol.15.1.675.
Lodi PJ, Garrett DS, Kuszewski J, Tsang ML, Weatherbee JA, Leonard WJ, Gronenborn AM, Clore GM: High-resolution solution structure of the beta-chemokine HMIP-1-beta by multidimensional NMR. Science. 1994, 263: 1762-1767. 10.1126/science.8134838.
Rossi D, Zlotnik A: The biology of chemokines and their receptors. Annu Rev Immunol. 2000, 18: 217-242. 10.1146/annurev.immunol.18.1.217.
de Paz JL, Moseman EA, Noti C, Polito L, von Andrian UH, Seeberger PH: Profiling heparin-chemokine interactions using synthetic tools. ACS Chem Biol. 2007, 2: 735-744. 10.1021/cb700159m.
Carrington M, Dean M, Martin MP, O'Brien SJ: Genetics of HIV-1 infection: chemokine receptor CCR5 polymorphism and its consequences. Hum Mol Genet. 1999, 8: 1939-1945. 10.1093/hmg/8.10.1939.
Lim JK, Glass WG, McDermott DH, Murphy PM: CCR5: No longer a 'good for nothing' gene - chemokine control of West Nile Virus infection. Trends Immunol. 2006, 27: 308-312. 10.1016/j.it.2006.05.007.
Glass WG, Lim JK, McDermott DH, Pletnev A, Cholera R, Gao J, Lekhong S, Yu SF, Frank WA, Pape J, Cheshier RC, Murphy PM: CCR5 saves lives: the protective role of CCR5 during West Nile virus infection. J Neurochem. 2006, 96: 94-94.
Glass WG, McDermott DH, Lim JK, Lekhong S, Yu SF, Frank WA, Pape J, Cheshier RC, Murphy PM: CCR5 deficiency increases risk of symptomatic West Nile virus infection. J Exp Med. 2006, 203: 35-40. 10.1084/jem.20051970.
Navratilova Z: Polymorphisms in CCL2 and CCL5 chemokines/chemokine receptors genes and their association with diseases. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub. 2006, 150: 191-204.
Balestrieri ML, Balestrieri A, Mancini FP, Napoli C: Understanding the immunoangiostatic CXC chemokine network. Cardiovasc Res. 2008, 78: 250-256. 10.1093/cvr/cvn029.
Lim JK, Louie CY, Glaser C, Jean C, Johnson B, Johnson H, McDermott DH, Murphy PM: Genetic deficiency of chemokine receptor CCR5 is a strong risk factor for symptomatic West Nile Virus infection: A meta-analysis of 4 cohorts in the US epidemic. J Infect Dis. 2008, 197: 262-265. 10.1086/524691.
Mummidi S, Bamshad M, Ahuja SS, Gonzalez E, Feuillet PM, Begum K, Galvis MC, Kostecki V, Valente AJ, Murthy KK, Haro L, Dolan MJ, Allan JS, Ahuja SK: Evolution of human and non-human primate CC chemokine receptor 5 gene and mRNA - Potential roles for haplotype and mRNA diversity, differential haplotype-specific transcriptional activity, and altered transcription factor binding to polymorphic nucleotides in the pathogenesis of HIV-1 and Simian immunodeficiency virus. J Biol Chem. 2000, 275: 18946-18961. 10.1074/jbc.M000169200.
Nadif R, Mintz M, Rivas-Fuentes S, Jedlicka A, Lavergne E, Rodero M, Kauffmann F, Combadiere C, Kleeberger SR: Polymorphisms in chemokine and chemokine receptor genes and the development of coal workers' pneumoconiosis. Cytokine. 2006, 33: 171-178. 10.1016/j.cyto.2006.01.001.
Alkhatib G, Combadiere C, Broder CC, Feng Y, Kennedy PE, Murphy PM, Berger EA: CC CKRS: A RANTES, MIP-1 alpha, MIP-1 beta receptor as a fusion cofactor for macrophage-tropic HIV-1. Science. 1996, 272: 1955-1958. 10.1126/science.272.5270.1955.
Broder CC, Collman RG: Chemokine receptors and HIV. J Leukocyte Biol. 1997, 62: 20-29.
Horuk R, Chitnis CE, Darbonne WC, Colby TJ, Rybicki A, Hadley TJ, Miller LH: A receptor for the malarial parasite Plasmodium vivax - the erythrocyte chemokine receptor. Science. 1993, 261: 1182-1184. 10.1126/science.7689250.
Vischer HF, Nijmeijer S, Smit MJ, Leurs R: Viral hijacking of human receptors through heterodimerization. Biochem and Biophys Res Commun. 2008, 377: 93-97. 10.1016/j.bbrc.2008.09.082.
Murphy PM: Molecular piracy of chemokine receptors by herepesviruses. Infect Agents Dis Rev Issues Comment. 1994, 3: 137-154.
Howard OMZ, Shirakawa AK, Turpin JA, Maynard A, Tobin GJ, Carrington M, Oppenheim JJ, Dean M: Naturally occurring CCR5 extracellular and transmembrane domain variants affect HIV-1 co-receptor and ligand binding function. J Biol Chem. 1999, 274: 16228-16234. 10.1074/jbc.274.23.16228.
Singh KK, Barroga CF, Hughes MD, Chen J, Raskino C, McKinney RE, Spector SA: Prevalence of chemokine and chemokine receptor polymorphisms in seroprevalent children with symptomatic HIV-1 infection in the United States. J Acquir Immune Defic Syndr. 2004, 35: 309-313. 10.1097/00126334-200403010-00013.
Wang CB, Song W, Lobashevsky E, Wilson CM, Douglas SD, Mytilineos J, Schoenbaum EE, Tang JM, Kaslow RA: Cytokine and chemokine gene polymorphisms among ethnically diverse North Americans with HIV-1 infection. J Acquir Immune Defic Syndr. 2004, 35: 446-454. 10.1097/00126334-200404150-00002.
Kaur G, Singh P, Kumar N, Rapthap CC, Sharma G, Vajpayee M, Wig N, Sharma SK, Mehra NK: Distribution of CCR2 polymorphism in HIV-1-infected and healthy subjects in North India. Int J Immunogenet. 2007, 34: 153-156. 10.1111/j.1744-313X.2007.00667.x.
Suresh P, Wanchu A, Sachdeva RK, Bhatnagar A: Gene polymorphisms in CCR5, CCR2, CX3CR1, SDF-1 and RANTES in exposed but uninfected partners of HIV-1 infected individuals in North India. J Clin Immunol. 2007, 27: 131-131. 10.1007/s10875-006-9054-y.
Zhao WY, Lee SS, Wong KH, Chan KCW, Ng T, Chan CCS, Han D, Yam WC, Yuen KY, Ng MH, Zheng BJ: Functional analysis of naturally occurring mutations in the open reading frame of CCR5 in HIV-infected Chinese patients and healthy controls. J Acquir Immune Defic Syndr. 2005, 38: 509-517. 10.1097/01.qai.0000151004.19128.4a.
Kunstman KJ, Puffer B, Korber BT, Kuiken C, Smith UR, Kunstman J, Stanton J, Agy M, Shibata R, Yoder AD, Pillai S, Doms RW, Marx P, Wolinsky SM: Structure and function of CC-chemokine receptor 5 homologues derived from representative primate species and subspecies of the taxonomic suborders Prosimii and Anthropoidea. J Virol. 2003, 77: 12310-12318. 10.1128/JVI.77.22.12310-12318.2003.
Shields DC: Gene conversion among chemokine receptors. Gene. 2000, 246: 239-245. 10.1016/S0378-1119(00)00072-X.
Hedrick PW: Balancing selection. Curr Biol. 2007, 17: R230-R231. 10.1016/j.cub.2007.01.012.
Yang Z: PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl BioSci. 1997, 13: 555-556.
Yang Z: PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol. 2007, 24: 1586-1591. 10.1093/molbev/msm088.
Yang ZH, Wong WSW, Nielsen R: Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005, 22: 1107-1118. 10.1093/molbev/msi097.
Pease JE, Wang J, Ponath PD, Murphy PM: The N-terminal extracellular segments of the chemokine receptors CCR1 and CCR3 are determinants for MIP-1 alpha and eotaxin binding, respectively, but a second domain is essential for efficient receptor activation. J Biol Chem. 1998, 273: 19972-19976. 10.1074/jbc.273.32.19972.
Han KH, Green SR, Tangirala RK, Tanaka S, Quehenberger O: Role of the first extracellular loop in the functional activation of CCR2 - The first extracellular loop contains distinct domains necessary for both agonist binding and transmembrane signaling. J Biol Chem. 1999, 274: 32055-32062. 10.1074/jbc.274.45.32055.
Zoffmann S, Chollet A, Galzi JL: Identification of the extracellular loop 2 as the point of interaction between the n terminus of the chemokine MIP-1 alpha and its CCR1 receptor. Mol Pharmacol. 2002, 62: 729-736. 10.1124/mol.62.3.729.
Blanpain C, Doranz BJ, Bondue A, Govaerts C, De Leener A, Vassart G, Doms RW, Proudfoot A, Parmentier M: The core domain of chemokines binds CCR5 extracellular domains while their amino terminus interacts with the transmembrane helix bundle. J Biol Chem. 2003, 278: 5179-5187. 10.1074/jbc.M205684200.
Sabroe I, Jorritsma A, Stubbs VEL, Xanthou G, Jopling LA, Panath PD, Williams TJ, Murphy PM, Pease JE: The carboxyl terminus of the chemokine receptor CCR3 contains distinct domains which regulate chemotactic signaling and receptor down-regulation in a ligand-dependent manner. Eur J Immunol. 2005, 35: 1301-1310. 10.1002/eji.200425171.
Duchesnes UE, Murphy PM, Williams TJ, Pease JE: Alanine scanning mutagenesis of the chemokine receptor CCR3 reveals distinct extracellular residues involved in recognition of the eotaxin family of chemokines. Mol Immunol. 2006, 43: 1221-1231. 10.1016/j.molimm.2005.07.015.
Rajagopalan L, Rajarathnam K: Structural basis of chemokine receptor function - A model for binding affinity and ligand selectivity. Biosci Rep. 2006, 26: 325-339. 10.1007/s10540-006-9025-9.
Frade JMR, Llorente M, Mellado M, Alcami J, GutierrezRamos JC, Zaballos A, delReal G, Martinez AC: The amino-terminal domain of the CCR2 chemokine receptor acts as coreceptor for HIV-1 infection. J Clin Invest. 1997, 100: 497-502. 10.1172/JCI119558.
Liu SQ, Fan SX, Sun ZR: Structural and functional characterization of the human CCR5 receptor in complex with HIV gp120 envelope glycoprotein and CD4 receptor by molecular modeling studies. J Mol Modeling. 2003, 9: 329-336. 10.1007/s00894-003-0154-9.
Ho PT, Teal BE, Ross TM: Multiple residues in the extracellular domains of CCR3 are critical for coreceptor activity. Virology. 2004, 329: 109-118. 10.1016/j.virol.2004.07.028.
Carmo CR, Esteves PJ, Ferrand N, Loo van der W: Genetic variation at chemokine receptor CCR5 in leporids: alternation at the 2nd extracellular domain by gene conversion with CCR2 in Oryctolagus, but not in Sylvilagus and Lepus species. Immunogenetics. 2006, 58: 494-501. 10.1007/s00251-006-0095-4.
Esteves PJ, Abrantes J, Loo van der W: Extensive gene conversion between CCR2 and CCR5 in domestic cat (Felis catus). Int J Immunogenet. 2007, 34: 321-324. 10.1111/j.1744-313X.2007.00716.x.
Vazquez-Salat N, Yuhki N, Beck T, O'Brien SJ, Murphy WJ: Gene conversion between mammalian CCR2 and CCR5 chemokine receptor genes: A potential mechanism for receptor dimerization. Genomics. 2007, 90: 213-224. 10.1016/j.ygeno.2007.04.009.
Perelygin AA, Zharkikh AA, Astakhova NM, Lear TL, Brinton MA: Converted evolution of vertebrate CCR2 and CCR5 genes and the origin of a recombinant equine CCR5/2 gene. J Hered. 2008, 99: 500-511. 10.1093/jhered/esn029.
Abrantes J, Carmo CR, Matthee CA, Yamada F, Loo van der W, Esteves PJ: A shared unusual genetic change at the chemokine receptor type 5 between Oryctolagus, Bunolagus and Pentalagus. Conserv Genet. 2009, DOI 10.1007/s10592-009-9990-1.
Casola C, Hahn MW: Gene Conversion Among Paralogs Results in Moderate False Detection of Positive Selection Using Likelihood Methods. J Mol Evol. 2009, 68: 679-687. 10.1007/s00239-009-9241-6.
National Center for Biotechnology Information Entrez Gene. [http://www.ncbi.nlm.nih.gov/gene]
Springer MS, Burk-Herrick A, Meredith R, Eizirik E, Teeling E, O'Brien SJ, Murphy WJ: The adequacy of morphology for reconstructing the early history of placental mammals. Syst Biol. 2007, 56: 673-684. 10.1080/10635150701491149.
TimeTree:: The Timescale of Life. [http://www.timetree.org]
Hedges SB, Dudley J, Kumar S: TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics. 2006, 22: 2971-2972. 10.1093/bioinformatics/btl505.
Yang ZH: On the best evolutionary rate for phylogenetic analysis. Syst Biol. 1998, 47: 125-133. 10.1080/106351598261067.
Anisimova M, Bielawski JP, Yang ZH: Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol Evol. 2001, 18: 1585-1592.
Bielawski JP, Yang ZH: Maximum likelihood methods for detecting adaptive evolution after gene duplication. J Struct Funct Genomics. 2003, 3: 201-212. 10.1023/A:1022642807731.
Rice P, Longden I, Bleasby A: EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 2000, 16: 276-277. 10.1016/S0168-9525(00)02024-2.
Posada D, Crandall KA: Modeltest: testing the model of DNA substitution. Bioinformatics. 1998, 14: 817-818. 10.1093/bioinformatics/14.9.817.
Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. 2002, Sinauer Associates, Sunderland, MA
Skrabanek L, Campagne F, Weinstein H: Building protein diagrams on the web with the residue-based diagram editor RdBe. Nucleic Acids Res. 2003, 31: 3856-3858. 10.1093/nar/gkg552.
KM carried out dataset construction, molecular evolution analyses, and drafting of the manuscript. MT contributed to the conception and design of the study, and participated in critical review and revision of the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Supplementary Table 1: Model Parameter Estimates, dN/dS Ratios, Log Likelihood Values and Test Statistics for PAML Site Models. Summary of results for all PAML hypothesis testing using site models. Includes significant and non-significant results. (DOCX 74 KB)
Additional file 2: Supplementary Table 2: Taxa and NCBI GenBank accession numbers for loci included in CC chemokine receptor data sets. List of species and NCBI GenBank accession numbers for sequences used to construct the ten datasets (for each of the ten sets of orthologous genes) for hypothesis testing. Species and accession numbers for each dataset are grouped together on the table. (DOCX 94 KB)
Additional file 3: Supplementary Table 3: Parameters for ModelTest models of evolution used in PAML hypothesis testing of CC chemokine receptor sequences. Summary of evolutionary model parameters used in PAML hypothesis testing. (DOCX 81 KB)
Additional file 4: Supplementary Figure 1: Maximum likelihood phylogeny of mammalian CCR1 gene sequences. A phylogeny of mammalian CCR1 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.6525. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 44 KB)
Additional file 5: Supplementary Figure 2: Maximum likelihood phylogeny of mammalian CCR2 gene sequences. A phylogeny of mammalian CCR2 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.5867. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 42 KB)
Additional file 6: Supplementary Figure 3: Maximum likelihood phylogeny of mammalian CCR3 gene sequences. A phylogeny of mammalian CCR3 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, proportion of invariable sites 0.3247 and a shape parameter of 2.8971. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 52 KB)
Additional file 7: Supplementary Figure 4: Maximum likelihood phylogeny of mammalian CCR4 gene sequences. A phylogeny of mammalian CCR4 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, proportion of invariable sites 0.5901 and an estimated shape parameter of 33.284. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 32 KB)
Additional file 8: Supplementary Figure 5: Maximum likelihood phylogeny of mammalian CCR5 gene sequences. A phylogeny of mammalian CCR5 gene sequences. The tree was produced using PHYML with the HKY nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.451. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 52 KB)
Additional file 9: Supplementary Figure 6: Maximum likelihood phylogeny of mammalian CCR6 gene sequences. A phylogeny of mammalian CCR6 gene sequences. The tree was produced using PHYML with the JC69 nucleotide model, a discrete gamma model with four categories, proportion of invariable sites 0.4667, and an estimated shape parameter of 100. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 29 KB)
Additional file 10: Supplementary Figure 7: Maximum likelihood phylogeny of mammalian CCR7 gene sequences. A phylogeny of mammalian CCR7 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.257. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 37 KB)
Additional file 11: Supplementary Figure 8: Maximum likelihood phylogeny of mammalian CCR8 gene sequences. A phylogeny of mammalian CCR8 gene sequences. The tree was produced using PHYML with the HKY nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.6482. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 32 KB)
Additional file 12: Supplementary Figure 9: Maximum likelihood phylogeny of mammalian CCR9 gene sequences. A phylogeny of mammalian CCR9 gene sequences. The tree was produced using PHYML with the HKY nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.3547. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 33 KB)
Additional file 13: Supplementary Figure 10: Maximum likelihood phylogeny of mammalian CCR10 gene sequences. A phylogeny of mammalian CCR10 gene sequences. The tree was produced using PHYML with the GTR nucleotide model, a discrete gamma model with four categories, and a shape parameter of 0.2439. Bootstrapping was performed with 100 replicates. Bootstrap support is indicated at nodes. (PDF 38 KB)
About this article
Cite this article
Metzger, K.J., Thomas, M.A. Evidence of positive selection at codon sites localized in extracellular domains of mammalian CC motif chemokine receptor proteins. BMC Evol Biol 10, 139 (2010). https://doi.org/10.1186/1471-2148-10-139