Molecular evolutionary and structural analysis of human UCHL1 gene demonstrates the relevant role of intragenic epistasis in Parkinson’s disease and other neurological disorders
BMC Evolutionary Biology volume 20, Article number: 130 (2020)
Parkinson’s disease (PD) is the second most common neurodegenerative disorder. The PD-associated human UCHL1 (Ubiquitin C-terminal hydrolase L1) gene belongs to the family of deubiquitinases and is highly expressed in neurons (1–2% of the soluble protein fraction). Several functions of UCHL1 have been proposed, including ubiquitin hydrolase activity, ubiquitin ligase activity and stabilization of mono-ubiquitin. Mutations in the human UCHL1 gene have been associated with PD and other neurodegenerative disorders. The present study aims to decipher the sequence evolutionary pattern and structural dynamics of UCHL1. Furthermore, structural and interaction analyses of UCHL1 were performed to help elucidate the pathogenesis of PD.
The phylogenetic tree topology suggests that the UCHL1 gene originated early in gnathostome evolutionary history. Evolutionary rate analysis of orthologous sequences reveals strong purifying selection on UCHL1. Comparative structural analysis of UCHL1 pinpoints an important protein segment spanning amino acid residues 32 to 39 within the secretion site, with crucial implications in evolution and PD pathogenesis through a well-known phenomenon called intragenic epistasis. The identified critical protein segment appears to play an indispensable role in protein stability and proper protein conformation, and to harbor critical interaction sites.
Conclusively, the critical protein segment of UCHL1 identified in the present study not only demonstrates the relevant role of intraprotein conformational epistasis in the pathophysiology of PD but also offers a novel therapeutic target for the disease.
Parkinson’s disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease (AD) and is known to affect the normal function of motor neurons. PD prevalence is 1–2% of the population above the age of 65 years and 4–5% above the age of 85 years. PD has two major pathological hallmarks: selective degeneration of dopaminergic neurons in the substantia nigra (a basal ganglia structure located in the midbrain that plays a key role in reward, addiction and movement) and the presence of intracellular protein inclusions, Lewy bodies (LBs) and Lewy neurites [2, 3]. PD-associated symptoms include motor features (rigidity, resting tremor, bradykinesia and postural instability) and non-motor features (olfactory dysfunction, autonomic dysfunction, cognitive impairment and psychiatric symptoms). Since 1997, 27 PD-associated genes have been identified with autosomal dominant (UCHL1, GBA, GIGYF2, DNAJC13, LRRK2, TMEM230, GCH1, EIF4G1, SNCA, HTRA2, RIC3, ATXN2, VPS35, CHCHD2), autosomal recessive (PRKN, PTRHD1, DJ1, PLA2G6, SPG11, FBXO7, DNAJC6, SYNJ1, ATP13A2, VPS13C, PODXL, PINK1) or X-linked (RAB39B) modes of transmission. UCHL1 (Ubiquitin C-Terminal Hydrolase L1) is identified as a major causal gene involved in early-onset familial and sporadic PD and in other neurodegenerative disorders such as AD [6, 7] and Huntington’s disease. To date, five missense mutations in UCHL1 have been associated with PD and other neurological disorders: E7A, S18Y, I93M, R178Q and A216D.
Ubiquitin C-terminal hydrolase L1 (UCHL1) is a 24.8 kDa acidic protein (pI 5.3) consisting of 223 amino acids; it is encoded by 9 exons with a transcript 1172 bp in length and is located on human chromosome 4 (4p14). UCHL1 (also known as PGP 9.5 or PARK5) is among the most abundant brain proteins, constituting 1–2% of the total brain soluble fraction; it is normally expressed exclusively in neurons and testis and is known to play a key role in ubiquitin turnover through its C-terminal hydrolase activity. However, abnormal expression of UCHL1 is found in many primary lung tumors, lung tumor cell lines and colorectal cancers [14, 15]. UCHL1 functions as a de-ubiquitinating (hydrolase) enzyme in the ubiquitin-proteasome pathway. Furthermore, it also shows ubiquitin ligase activity for monoubiquitinated α-synuclein in a cell-free system. In addition, UCHL1 stabilizes monoubiquitin (free ubiquitin) and thus maintains the availability of ubiquitin for various other cellular events, independently of its enzymatic activity. In a secondary structure comparison, circular dichroism analysis revealed that the I93M mutant of UCHL1 has a decreased α-helix content compared to wild-type UCHL1 and therefore has a tendency to aggregate in neurons and cause an autosomal dominant form of PD. Transgenic mice expressing the I93M mutant exhibit physiological phenotypes related to PD and degeneration of dopaminergic neurons by the age of 20 weeks. The I93M mutant also exhibits increased insolubility, aberrant interactions with other proteins (HSP90 and HSC70) in mammalian cells and decreased interaction with monoubiquitin, suggesting that this mutant plays a causative role in familial PD.
Furthermore, in vitro analysis of recombinant UCHL1 (I93M) indicated a 50% decline in hydrolase (deubiquitinating) activity compared to the wild type [18, 20], which also contributes to the oxidative modification of UCHL1 [16, 19]. Intriguingly, the S18Y variant of UCHL1 is associated with a lower risk of PD owing to its reduced ligase activity, which leads to a reduced level of ubiquitinated α-synuclein. However, some studies failed to identify this association between the S18Y variant and a reduced risk of PD [22, 23]. At the molecular level, the E7A missense mutant of UCHL1 exhibits extensive loss of ubiquitin-binding ability, leading to completely abolished UCHL1 hydrolase activity [3, 7]. This loss of function can be rationalized at the structural level because the glutamic acid at position 7 of wild-type UCHL1 is located at the ubiquitin-binding interface, where it forms electrostatic interactions with ubiquitin residues Arg42 and Arg72. A recent study analyzed the biochemical impact of two PD-causing UCHL1 missense mutations and suggested that the R178Q mutant exhibits a 4-fold increase in hydrolase activity compared to the wild type. This increased enzymatic activity provides a protective effect on cognitive function. Another mutant, A216D, showed insolubility and consequently a complete loss of function.
Given the vital role of UCHL1 in both familial and sporadic PD and other neurological disorders, the present study offers an insight into the evolutionary history of UCHL1 through evolutionary rate analysis and phylogenetic investigation. Furthermore, the present study provides a comparative analysis of the UCHL1 protein at the sequence, structural and interaction levels. Comparative sequence and structural information revealed a strong epistatic influence of evolutionary and disease-causing amino acid substitutions on a small protein segment (amino acids 32 to 39) within the N-terminal part of the C-12 peptidase domain of UCHL1. Functional implications of the identified critical protein segment were further evaluated through interaction analysis of UCHL1 with its interacting partners SNCA (PARK1) and PARKIN (PARK2).
In order to analyze the evolutionary history of the PD-associated UCHL1 gene, Neighbor-joining (NJ) and Maximum likelihood (ML) trees were constructed. The phylogenetic tree topology suggests that UCHL1 is a gnathostomata (jawed vertebrate) specific gene, present in tetrapods, bony fishes and cartilaginous fishes (Fig. 1 and Additional file 1: Figure S1). Bidirectional BLAST-based similarity searches did not identify an orthologous counterpart of this gene in any of the non-gnathostomata vertebrates or invertebrate animals analyzed in the present study. Furthermore, similarity searches failed to detect any paralogous copy of UCHL1 in any of the gnathostomata clades analyzed.
Evolutionary rate analysis of UCHL1 gene in sarcopterygians
In order to analyze evolutionary rate differences of the UCHL1 gene among different groups of the sarcopterygian lineage, key animals were selected from four groups: hominoids (human, chimpanzee, gorilla and orangutan), non-hominoids (macaque, squirrel monkey, marmoset and Otolemur), non-primate placental mammals (cow, cat, elephant and mouse) and non-mammalian tetrapods (chicken, zebra finch, turtle and coelacanth). The rationale behind choosing only sarcopterygians for evolutionary rate analysis is that animals of this lineage are more closely related to humans, and show more homology to them, than teleost and cartilaginous fishes. To evaluate the selection constraints on the selected subgroups of animals, the rates of non-synonymous (Ka/dN) and synonymous (Ks/dS) substitutions were estimated and their difference was tested using a z-test. In general, a Ka value lower than Ks (Ka < Ks) suggests negative selection, i.e. non-silent substitutions have been purged by natural selection, whereas the inverse scenario (Ka > Ks) implies positive selection, i.e. advantageous mutations have accumulated during the course of evolution. However, evidence for positive or negative selection requires the two values to be significantly different from each other [26, 27].
The Ka-Ks (dN-dS) difference was −2.747 (P = 0.007) for hominoids, −6.501 (P = 0) for non-hominoids, −8.945 (P = 0) for non-primate placental mammals and −8.199 (P = 0) for non-mammalian tetrapods (Additional file 2: Table S1). These data suggest that during the course of sarcopterygian evolution, UCHL1 has deviated significantly from neutrality and evolved under strong negative selection.
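The logic of the z-test described above can be illustrated with a minimal sketch. It assumes that estimates of dN and dS and of their variances are already available (in practice the variances are usually obtained by bootstrapping), ignores any covariance between the two estimates, and uses a normal approximation for the P value. The function names and the input numbers are hypothetical and are not taken from Additional file 2: Table S1.

```python
import math


def z_test_dn_ds(dn, ds, var_dn, var_ds):
    """Return (z, two-sided p) for the null hypothesis dN == dS.

    Assumption: the variance of (dN - dS) is approximated by
    Var(dN) + Var(dS), i.e. covariance is ignored.
    """
    se = math.sqrt(var_dn + var_ds)
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (dn - ds) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def interpret(z, p, alpha=0.05):
    """Translate the test result into the vocabulary used in the text."""
    if p >= alpha:
        return "no significant deviation from neutrality"
    return "purifying (negative) selection" if z < 0 else "positive selection"


# Hypothetical inputs, chosen only to illustrate the calculation.
z, p = z_test_dn_ds(dn=0.01, ds=0.09, var_dn=0.0001, var_ds=0.0003)
print(f"z = {z:.3f}, p = {p:.2e}, verdict: {interpret(z, p)}")
```

A negative, significant statistic corresponds to the Ka < Ks scenario, which is the pattern reported for all four sarcopterygian groups.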
Domain organization of UCHL1 protein
In order to investigate the comparative domain organization of UCHL1, the complete domain, motifs and sub-motifs were identified through an extensive literature survey as well as by using different tools/databases [28, 29]. UCHL1 contains a single domain that spans a large portion of the protein (amino acids 3 to 206), named the C-12 peptidase domain. It regulates substrate access to the catalytic site. The C-12 peptidase domain comprises the cysteine active site (84–100), N-myristoylation sites (87–92, 94–99), casein kinase II (CK2) phosphorylation sites (119–122, 125–128, 188–191, 205–208), protein kinase C (PKC) phosphorylation sites (76–78, 121–123, 205–207) and the unconventional pathway secretion site of UCHL1 (32–39). The farnesylation site (220–223) resides outside the C-12 domain at the C-terminal of UCHL1 (Fig. 2a). These domains, motifs and sub-motifs were comparatively mapped on orthologs from major representative species of sarcopterygians: a primate (human), non-primate placental mammals (mouse and cat), a non-mammalian tetrapod (chicken) and a lobe-finned fish (coelacanth) (Fig. 2a).
Annotation and comparative analyses at the sequence level revealed that the C-12 peptidase domain and its major motifs, such as the unconventional pathway secretion site of UCHL1 (32–39), the cysteine active site (84–100) and the N-myristoylation sites (87–92, 94–99), are highly conserved within the selected subgroup of sarcopterygian animals (Fig. 2a). Three PKC phosphorylation sites (76–78, 121–123, 205–207) and four CK2 phosphorylation sites (119–122, 125–128, 188–191, 205–208) are also found to be highly conserved in mammals (human, mouse and cat), whereas in chicken a PKC phosphorylation site appears to have been translocated from the C-terminal to the N-terminal of UCHL1 and an additional PKC phosphorylation site (198–200) is detected. Furthermore, in both chicken and coelacanth one N-myristoylation site appears to have been translocated from the C-terminal to the N-terminal of UCHL1 (at position 21–26). In both chicken and coelacanth, a CK2 phosphorylation site (119–122) is not detected and some translocations of PKC phosphorylation sites are also observed (Fig. 2a).
Using an ancestral reconstruction technique, five mammal-specific evolutionary substitutions were identified and mapped onto the human UCHL1 protein. One of them occurred at the root of the primate lineage, three at the root of the simian lineage and one specifically in ape history (Fig. 2a and Table 1). Analysis of the physicochemical properties of these five mammal-specific amino acid substitutions suggests that all of them are of the radical type (Table 1). All these evolutionary substitutions are localized within the C-12 peptidase domain (Fig. 2a). In addition, the five previously reported missense mutations associated with PD and other neurological disorders (E7A, S18Y, I93M, R178Q and A216D) are also localized to the C-12 peptidase domain, except A216D, which resides at the C-terminal of human UCHL1. These data reveal that the highly conserved C-12 peptidase domain has a significant role not only from the evolutionary/functional perspective but also in disease pathogenesis. The SLAC-window analysis identified 73 negatively constrained sites within the C-12 peptidase domain of UCHL1 (Fig. 2b and Additional file 2: Table S2).
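The localization of the five disease-associated mutations relative to the annotated regions can be checked with a short lookup. The sketch below uses only the residue ranges and mutation names quoted in this section; the dictionary layout and the function names are illustrative assumptions, not part of the original analysis pipeline.

```python
# Residue ranges (1-based, inclusive) as quoted in the text for human UCHL1.
REGIONS = {
    "C-12 peptidase domain": [(3, 206)],
    "secretion site": [(32, 39)],
    "cysteine active site": [(84, 100)],
    "N-myristoylation site": [(87, 92), (94, 99)],
    "CK2 phosphorylation site": [(119, 122), (125, 128), (188, 191), (205, 208)],
    "PKC phosphorylation site": [(76, 78), (121, 123), (205, 207)],
    "farnesylation site": [(220, 223)],
}

MUTATIONS = ["E7A", "S18Y", "I93M", "R178Q", "A216D"]


def position(mutation):
    """Extract the residue number from a label such as 'I93M'."""
    return int(mutation[1:-1])


def regions_for(pos):
    """List every annotated region whose range contains the residue."""
    return [
        name
        for name, spans in REGIONS.items()
        if any(start <= pos <= end for start, end in spans)
    ]


for mutation in MUTATIONS:
    hits = regions_for(position(mutation))
    print(mutation, "->", ", ".join(hits) if hits else "outside annotated regions")
```

With these ranges, E7A, S18Y, I93M and R178Q fall inside the C-12 peptidase domain (I93M also lies within the cysteine active site), whereas A216D lies between the domain and the farnesylation site.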
Protein structural evolution of UCHL1
Comparative protein structure modeling was performed to inspect how negative selection defines the spatial constraints on ancestral UCHL1 proteins. Ancestral protein structures (mammals, primates, simians and apes) obtained from the MODELLER program were superimposed at an appropriate evolutionary scope, i.e. mammals-primates ancestors, primates-simians ancestors and simians-apes/human ancestors (Fig. 3). Protein structural deviations were examined with the help of Chimera and root mean square deviation (RMSD) values (Fig. 3 and Table 2). Comparative structural investigations revealed notable aspects of UCHL1 evolution that were not anticipated by comparative analysis at the sequence level alone. It appears that during the course of mammalian evolution UCHL1 has undergone strong intragenic epistatic interactions to acquire its favorable protein conformation. The aforementioned superimposed protein models revealed a common deviated region composed of amino acids 32 to 39 within the secretion site at the N-terminal of the C-12 peptidase domain (Table 2). These structural deviations were also measured by quantifying backbone torsions, which likewise suggests that the protein segment comprising amino acids 32 to 39 has evolved during mammalian history through a well-known phenomenon called intragenic epistasis. It appears that during the course of evolution, mammalian UCHL1 has incorporated destabilizing substitutions to obtain its intrinsically disordered conformation through radical structural shifts in this identified critical region. This critical region (amino acids 32–39) is recognised as crucial for the proper conformation not only of the unconventional pathway secretion site of the UCHL1 protein but also of the entire C-12 peptidase domain.
Interestingly, all the previously reported human-specific missense mutations associated with familial and sporadic PD and other neurological disorders are confined to the C-12 peptidase domain except A216D. To investigate the impact of these previously reported missense mutations at the protein structure level, the structures of all disease-causing mutants of UCHL1 were predicted through MODELLER and superimposed on 2ETL (the wild-type version of UCHL1) (Fig. 4). In addition, the wild-type protein structure of UCHL1 and the disease-causing mutant versions were also predicted through the I-TASSER server (Additional file 1: Figure S2 and Additional file 2: Table S3) and the Robetta server (Additional file 1: Figure S3 and Additional file 2: Table S4). The I-TASSER and Robetta predicted wild-type and mutant protein structures of UCHL1 were also superimposed (Additional file 1: Figures S2 and S3; Additional file 2: Tables S3 and S4). Structural deviations between the wild-type and mutant versions were evaluated using RMSD values. Intriguingly, even though the disease-causing mutations are spread across UCHL1, all of them appear to impact the structure of a common protein region spanning amino acids 32–39 (the secretion site of UCHL1). In addition, another protein region (amino acids 222–223) within the farnesylation site is deviated in all disease-causing mutant models except E7A (Fig. 4 and Table 3; Additional file 1: Figure S2 and Figure S3; Additional file 2: Table S3 and Table S4). Therefore, the UCHL1 protein segment spanning amino acids 32–39 is considered critical not only from an evolutionary perspective but also in disease pathogenesis.
Protein-protein interaction analysis of UCHL1
In order to further investigate the significance of the identified critical protein segment, we performed an interaction analysis of human UCHL1 with its biochemically and genetically verified interacting partner proteins. The identification of pathogenic mutations in three human genes, i.e. SNCA (PARK1), PARKIN (PARK2) and UCHL1 (PARK5), has implicated the ubiquitin proteasome system (UPS) as a potential causal pathway in PD. Furthermore, the STRING database indicates that SNCA and PARKIN are interacting partners of UCHL1. For the protein-protein interaction analyses, the domain architecture of SNCA and PARKIN was also annotated (Fig. 5). The human SNCA protein comprises 140 amino acids and contains three major domains: the A2 lipid-binding alpha helix domain, the NAC domain and the C-terminal acidic domain. PARKIN comprises 465 amino acids and contains five domains: the Ubl domain, RING0, RING1, the in-between RING (IBR) domain and the RING2 domain.
Interaction analysis between UCHL1 and SNCA identifies only five interacting residues, which involve the C-12 peptidase domain of UCHL1 and the NAC domain of SNCA (Fig. 5 and Additional file 2: Table S5). Interaction analysis of the five human disease-causing mutant versions of UCHL1 with SNCA revealed altered interaction patterns, although some of the wild-type interactions are also retained (Additional file 1: Fig. S4 and Additional file 2: Table S6). These data signify that UCHL1 and SNCA interact not only in normal individuals but also in PD patients, with an altered interaction pattern (Fig. 5; Additional file 1: Figure S4 and Additional file 2: Table S6).
Intriguingly, in the UCHL1 (R178Q)-SNCA docked complex, two altered interactions were found within the identified critical region (amino acids 32–39 within the secretion site) of UCHL1, which could potentially affect the normal secretion of UCHL1 to neurons and thus could contribute to disease pathogenesis (Additional file 1: Figure S4 and Additional file 2: Table S6).
The advent of high-throughput genome annotation and the broadened availability of genomic sequence data have made it possible to investigate the evolutionary history of genes of interest and link it to human disease-associated phenotypic traits. The ubiquitin proteasome pathway (UPP) is the most important molecular mechanism participating in neurodegenerative diseases such as PD and AD. The major pathological hallmark of these two important neurodegenerative disorders is the accumulation of abnormal protein aggregates (AD: extracellular Aβ amyloid plaque; PD: intracellular α-synuclein in the Lewy body) within the neurons [37, 38]. The UPP is the major mechanism that recognizes damaged or misfolded proteins and transports them to the proteasome for degradation. In this way the UPP maintains the normal concentration of proteins in the neurons and thus prevents neurodegeneration [39, 40]. The UPP contains several components, including the 26S proteasome, ubiquitin, ubiquitin-activating enzyme (E1), ubiquitin-conjugating enzyme (E2) and ubiquitin-ligating enzyme (E3). The UCHL1 gene product is a key component of the UPP; it functions as a deubiquitinating enzyme that removes ubiquitin from proteins, and it also stabilizes monomeric ubiquitin in a cell-free system [11, 41]. Furthermore, UCHL1 is also known to facilitate E3 ligase in the UPP [42,43,44]. The UCHL1 gene product has also been implicated in processes such as apoptosis, protection against oxidative stress, long-term potentiation and chaperone-mediated autophagy. However, the role of UCHL1 in these processes has not yet been fully understood.
UCHL1 consists of 223 amino acids with a single domain. It belongs to an α/β fold protein family with six β-strands inside the hydrophobic core, which is surrounded by seven α-helices. Five missense mutations in the human UCHL1 gene have so far been associated with autosomal dominant PD, recessive hereditary spastic paraplegia (SPG79) and early onset of progressive neurodegeneration. The present study is an attempt to investigate the sequence and structural bases of UCHL1 evolution within the sarcopterygian lineage. Furthermore, this study elucidates the protein structural and interactional basis of UCHL1-associated PD pathogenesis.
The ML- and NJ-based phylogenetic tree topologies of the UCHL1 gene, supported by high bootstrap values, revealed that it is a gnathostomata (jawed vertebrate) specific gene, present in tetrapods, bony fishes and cartilaginous fishes. Bidirectional BLAST-based similarity searches did not identify an orthologous counterpart of this gene in any of the non-gnathostomata vertebrates or invertebrate animals analyzed. Furthermore, similarity searches failed to detect any paralogous copy of UCHL1 in any of the gnathostomata (jawed vertebrate) clades analyzed. Based on these phylogenetic data, it is speculated that human UCHL1 might have originated at the root of jawed vertebrates (Fig. 1 and Additional file 1: Figure S1). These observations prompted us to evaluate the selection constraints on this gene within different selected subgroups of animals in the sarcopterygian lineage. For this purpose, the rates of non-synonymous (Ka/dN) and synonymous (Ks/dS) substitutions were estimated within each selected subgroup of animals and their difference was tested using a z-test. These statistical estimates corroborate the speculation that UCHL1 has evolved under strong purifying selection throughout sarcopterygian history, which might have discouraged any functional modification through gene duplication (Additional file 2: Table S1).
The C-12 peptidase domain is the major domain of UCHL1 and spans a large portion of the protein, starting at residue 3 (at the N-terminal) and ending at residue 206 (at the C-terminal). Comparative domain organization data suggest that the C-12 domain and its motifs/sub-motifs are similarly present in all of the animals analyzed, with no major variations in protein primary organization. In addition to the C-12 peptidase domain, the extreme N-terminal and C-terminal sites also appear to be highly preserved among all the analyzed sarcopterygian animals (Fig. 2). Intriguingly, the conserved domain/motif organization of UCHL1 among distantly related orthologs corroborates the signal of strong purifying selection identified through z-statistics, as well as previously reported functional data on human UCHL1. For instance, a recent study suggests that removal of only eleven residues from the N-terminal of UCHL1 is sufficient for the protein to lose its affinity for ubiquitin and its deubiquitinating activity, ultimately leading to the formation of insoluble aggregates [46, 47]. Minor truncations at the N- or C-terminus of UCHL1 are reported to denature the protein and therefore render it functionless. In vitro mutagenesis and in silico simulation studies revealed that removal of a few amino acids from either the C-terminal or the N-terminal of UCHL1 can destabilize its three-dimensional structure, resulting in unfolding or loss of solubility consistent with protein aggregation [46, 48, 49].
Comparative structural analyses were performed to inspect how negative selection defines the spatial constraints on ancestral UCHL1 at the structural level (Fig. 3 and Table 2). The comparative structural analysis of the predicted ancestral proteins of UCHL1 (mammalian ancestor, primate ancestor, simian ancestor and ape/human ancestor) showed multiple deviated regions in each of the predicted ancestral UCHL1 proteins. Comparative analysis of the predicted ancestral structures revealed a common deviated region comprising amino acids 32 to 39 (Fig. 3). This particular protein segment appears to have experienced structural shifts repeatedly during the course of mammalian evolution through strong intragenic epistatic interactions within UCHL1. Furthermore, the protein structural impacts of the five identified lineage-specific substitutions suggest that UCHL1 has undergone structural destabilization during the course of mammalian history (Table 1). This structural destabilization can best be explained by speculating that during the course of mammalian evolution UCHL1 has acquired the ability to attain a favorable conformation upon binding to its targets (Table 1).
Furthermore, we also performed a comparative structural analysis of the previously reported UCHL1 missense mutations causing PD and other neurological disorders (between the wild type and all five mutant versions). This comparative structural analysis (with structures predicted through MODELLER, I-TASSER and Robetta) revealed multiple distinct deviated regions in each comparison as well as a commonly deviated segment comprising amino acids 32 to 39 within the secretion site at the N-terminal of the C-12 peptidase domain (Fig. 4 and Table 3; Additional file 1: Figure S2 and Figure S3; Additional file 2: Table S3 and Table S4). Interestingly, this commonly deviated region (32 to 39) in disease-causing variants of UCHL1 also appears to have evolved structurally during mammalian evolution (described in preceding sections). Therefore, the protein segment 32–39 of UCHL1 is both important and indispensable in evolution and PD pathogenesis through intraprotein conformational epistasis. The functional significance of this critical region is supported by previously reported data, which suggest that substitution of the leucine residues within this region (Leu-32 to Leu-39) leads to reduced secretion of UCHL1 in the cytoplasm of neuronal cells and consequently causes the PD phenotype. In addition, the comparative analysis of the human UCHL1 structure with all five PD-causing mutant versions (except E7A) revealed another commonly deviated region at the C-terminal, comprising protein segment 220–223 within the farnesylation site (Fig. 4 and Table 3; Additional file 1: Figure S2 and Figure S3; Additional file 2: Table S3 and Table S4). The functional significance of this second commonly deviated region corroborates previously reported data, which suggest that the loss of just four amino acids from the C-terminal of UCHL1 is sufficient to induce protein aggregation and consequently neuronal cell death [32, 49].
Taken together, the critical regions of UCHL1 identified in the present study (amino acids 32–39 and amino acids 220–223) are important for maintaining normal neuronal physiology, and any conformational change in either of these protein regions could lead to PD.
The interaction analysis of UCHL1 with its major interacting partners, i.e. SNCA and PARKIN, further highlights the structural, functional and disease significance of the identified critical region (amino acids 32–39). For instance, the protein-protein interaction analysis revealed that the C-12 peptidase domain of UCHL1 interacts with the NAC domain of SNCA, and with the RING1 and IBR domains of PARKIN (Fig. 5). Interestingly, both SNCA and PARKIN appear to interact physically with the identified critical region (amino acids 32–39) of UCHL1 (Fig. 5 and Additional file 2: Table S5). In addition, interaction analysis of the five human disease-causing mutant versions of UCHL1 with SNCA revealed an altered interaction pattern, although some of the wild-type interactions are also retained. For instance, the docked complex of wild-type UCHL1 with a disease-causing mutant version of SNCA (A30P) revealed no altered interactions (same as the wild-type interaction of the UCHL1-SNCA complex) (Additional file 1: Figure S5 and Additional file 2: Table S6). In contrast, the docked complex of wild-type UCHL1 with another disease-causing mutant version of SNCA (A53T) revealed an altered interaction pattern (Additional file 1: Figure S5 and Additional file 2: Table S6). These interaction data corroborate previously reported biochemical data, which suggest that the A53T mutant version of SNCA impacts the normal secretion of UCHL1 in the neuronal cytoplasm. Based on these interaction analyses, it is speculated that UCHL1 and SNCA interact physically not only in normal individuals but also in PD patients, with an altered interaction pattern (Fig. 5; Additional file 1: Figure S4 and Additional file 2: Table S6).
Human UCHL1 is known to play an important role in ubiquitin stability within neurons, which is critical for the ubiquitin–proteasome system and neuronal survival. Mutations in the human UCHL1 gene have been associated with various neurodegenerative disorders such as PD, recessive hereditary spastic paraplegia (SPG79), AD and Huntington’s disease. Considering the indispensable role of the UCHL1 gene product in neuronal physiology and pathophysiology, the current study investigates the sequence evolutionary pattern and structural dynamics of UCHL1. Phylogenetic data suggest an ancient origin of UCHL1 at the root of gnathostome (jawed vertebrate) history. Furthermore, molecular sequence evolutionary analysis reveals that UCHL1 has remained under strong functional constraints throughout gnathostome history, which might have discouraged the duplication of this gene in any of the animal lineages analyzed in the present study. Comparative structural analysis of UCHL1 pinpointed a critical protein segment (amino acids 32 to 39 within the secretion site) with crucial implications in evolution and PD pathogenesis through a well-known phenomenon of intraprotein conformational epistasis. This critical protein segment of UCHL1 can be targeted in future drug design and investigation for the treatment of PD.
The putative orthologous protein sequences of human UCHL1 were retrieved from the protein databases accessible at Ensembl and the National Center for Biotechnology Information by using the BLASTp bidirectional best hit approach. Further confirmation of the common ancestry of the putative orthologs was obtained by clustering homologous proteins within phylogenetic trees [53, 54]. Sequences whose position within a tree was in sharp conflict with the uncontested animal phylogeny were excluded from the analysis. All protein sequences used in this study are provided in Additional file 3.
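The bidirectional best hit criterion can be sketched as follows. The example assumes that BLASTp results have already been parsed into (query, subject, bit score) tuples; the sequence identifiers and scores are hypothetical and serve only to show how a reciprocal pair is accepted and a one-way hit is rejected.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical search results: human proteins against a mouse proteome ...
forward = [
    ("human_UCHL1", "mouse_A", 450.0),
    ("human_UCHL1", "mouse_B", 210.0),
]
# ... and the reverse search of the mouse candidates against human proteins.
reverse = [
    ("mouse_A", "human_UCHL1", 448.0),
    ("mouse_A", "human_UCHL3", 205.0),
    ("mouse_B", "human_UCHL3", 400.0),
]

print(reciprocal_best_hits(forward, reverse))
```

Only the pair whose members retrieve each other as top hit is kept as a putative ortholog; as the text notes, such candidates were then confirmed by their placement in the phylogenetic tree.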
The species selected for sequence analysis of UCHL1 were Homo sapiens (Human), Pan troglodytes (Chimpanzee), Gorilla gorilla (Gorilla), Macaca mulatta (Macaque), Cebus capucinus (white-faced sapajou), Otolemur garnettii (Northern greater galago), Mus musculus (Mouse), Rattus norvegicus (Rat), Equus caballus (Horse), Tursiops truncatus (Dolphin), Bos taurus (Cow), Felis catus (Cat), Canis lupus familiaris (Dog), Pteropus vampyrus (Megabat), Myotis lucifugus (Microbat), Erinaceus europaeus (Hedgehog), Echinops telfairi (lesser hedgehog tenrec), Monodelphis domestica (Opossum), Gallus gallus (Chicken), Anolis carolinensis (Anole lizard), Latimeria chalumnae (Coelacanth), Oryzias latipes (Medaka), Gasterosteus aculeatus (Stickleback), Tetraodon nigroviridis (Tetraodon), Takifugu rubripes (Fugu), Danio rerio (Zebrafish), Lepisosteus oculatus (Spotted gar), Callorhinchus milii (Elephant shark) and Rhincodon typus (Whale shark).
Protein sequences were aligned by using CLUSTAL W through MEGA5. The phylogenetic tree of UCHL1 was reconstructed by applying the NJ method [55, 56]. The complete deletion option was used to remove gaps and missing data from the protein sequences. Poisson-corrected (PC) amino acid distance and uncorrected p-distance were used as amino acid substitution models. Because similar results were obtained with both models, only the NJ tree based on uncorrected p-distance is presented (Fig. 1). An ML tree was also constructed by using the Whelan and Goldman (WAG) model of amino acid substitutions (Additional file 1: Figure S1). To assess the reliability of both the NJ and ML tree topologies, the bootstrap method was used (1000 pseudo-replicates), which assigns a bootstrap value to each branch of the tree.
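The two distance measures named above are related in a simple way: the uncorrected p-distance is the proportion of differing sites, and the Poisson correction converts it to d = −ln(1 − p). The following sketch computes both after complete deletion of gapped columns. The three short sequences are a made-up toy alignment, not UCHL1 sequences, and the function names are illustrative.

```python
import math


def complete_deletion(alignment, gap_chars="-?"):
    """Drop every column that has a gap or missing character in any sequence."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    keep = [
        i for i in range(length)
        if all(seq[i] not in gap_chars for seq in alignment)
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def poisson_distance(p):
    """Poisson-corrected distance, d = -ln(1 - p)."""
    return -math.log(1.0 - p)


# Toy alignment (hypothetical sequences, one gapped column).
toy = ["MQLKPMEINPEM", "MQLKP-EINPEL", "MHLKPMEVNPEM"]
s1, s2, s3 = complete_deletion(toy)

for name, other in (("s2", s2), ("s3", s3)):
    p = p_distance(s1, other)
    print(f"s1 vs {name}: p = {p:.4f}, Poisson-corrected d = {poisson_distance(p):.4f}")
```

For closely related sequences p is small and the two measures are nearly identical, which is consistent with the observation that both models gave similar trees.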
The ML method and the WAG model of amino acid substitution were used to predict ancestral sequences of UCHL1. A Z-test was executed with MEGA to examine selection constraint within hominoids (human, chimpanzee, gorilla and orangutan), non-hominoids (macaque, marmoset, squirrel monkey and bush baby), non-primate placental mammals (mouse, cat, cow and elephant) and non-mammalian tetrapods (chicken, turtle, frog and coelacanth). The Goldman and Yang (GY-94) method (a codon-based model) implemented in the HyPhy program was employed to calculate dN-dS for each of the aforementioned sarcopterygian groups.
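As a rough illustration of a codon-based Z-test, the statistic can be written as Z = (dN - dS) / sqrt(Var(dN) + Var(dS)). The sketch below assumes that dN, dS and their variances have already been estimated elsewhere and that a normal approximation is acceptable for the p-value. The input numbers and the function name are hypothetical and are not results from this study.

```python
import math
from typing import Tuple


def z_test_selection(dn: float, ds: float,
                     var_dn: float, var_ds: float) -> Tuple[float, float]:
    """Return (Z, one-sided p-value for purifying selection, dN < dS).

    Assumes a standard normal reference distribution for Z.
    """
    se = math.sqrt(var_dn + var_ds)
    if se == 0.0:
        raise ValueError("combined variance must be positive")
    z = (dn - ds) / se
    # Lower-tail probability of the standard normal at z.
    p_purifying = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return z, p_purifying


# Hypothetical estimates, for illustration only.
z, p = z_test_selection(dn=0.01, ds=0.20, var_dn=0.00001, var_ds=0.0016)
print(round(z, 3), p)
```

A strongly negative Z with a small p-value indicates that synonymous substitutions significantly outnumber non-synonymous ones per site, the signature of purifying selection.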
Domains, motifs and sub-motifs were assigned to human UCHL1 by consulting databases such as Pfam and the MyHits tool. Clustal Omega was employed for multiple sequence alignment to map the putative positions of domains, motifs and sub-motifs on human UCHL1 and on orthologous sequences from selected sarcopterygian animals (Fig. 2a). Identified evolutionary substitutions and previously reported human PD- and SPG79-causing missense mutations (E7A, S18Y, I93M, R178Q, A216D) were mapped on human UCHL1 (Fig. 2a). To estimate the negatively constrained residues of UCHL1 among sarcopterygians, we employed the Single Likelihood Ancestor Counting (SLAC) method through HyPhy, which uses a global codon model and maximum likelihood to reconstruct the evolutionary history. All substitutions identified within the mammalian history of UCHL1 were also classified as neutral or radical on the basis of their physicochemical properties, i.e. charge, volume and polarity [62, 63].
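One way to picture the neutral versus radical classification is a lookup against amino acid property groups. The grouping tables below are an illustrative assumption (a commonly used charge, polarity and volume partition) and are not necessarily the scheme applied in this study. The example substitutions are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed property groups; other published partitions exist.
PROPERTIES: Dict[str, Dict[str, str]] = {
    "charge": {"positive": "RHK", "negative": "DE",
               "neutral": "ANCQGILMFPSTWYV"},
    "polarity": {"polar": "RNDCQEGHKSTY", "nonpolar": "AILMFPWV"},
    "volume": {"special": "C", "neutral_small": "AGPST",
               "polar_small": "NDQE", "polar_large": "RHK",
               "nonpolar_small": "ILMV", "nonpolar_large": "FWY"},
}


def _group(table: Dict[str, str], residue: str) -> str:
    """Return the name of the group that contains the residue."""
    for name, members in table.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def classify(substitution: str) -> Tuple[str, List[str]]:
    """Classify e.g. 'K10R' and list the properties that change group."""
    ref, alt = substitution[0], substitution[-1]
    changed = [prop for prop, table in PROPERTIES.items()
               if _group(table, ref) != _group(table, alt)]
    return ("radical" if changed else "neutral"), changed


# Hypothetical substitutions, for illustration only.
for sub in ("K10R", "D20A", "L5F"):
    print(sub, classify(sub))
```

Under this scheme a substitution is radical if it crosses a group boundary in any one property, so the outcome depends entirely on the chosen partition.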
The X-ray structure of human UCHL1 (2ETL) was retrieved from the RCSB Protein Data Bank (PDB). This structure was used as the reference in the comparative structural analysis to evaluate structural deviations from both evolutionary and disease perspectives. Ancestral protein sequences (mammalian ancestor, primate ancestor, simian ancestor and ape/human ancestor) were predicted by ancestral reconstruction, and the corresponding ancestral proteins were modeled with the homology modeling program MODELLER9, using the wild-type UCHL1 structure as the template. The best structures were selected on the basis of the Discrete Optimized Protein Energy (DOPE) score. To improve the quality of the modeled protein structures, energy minimization was performed through the YASARA energy minimization server. For further quality validation of the modeled structures, RAMPAGE and ERRAT were employed (Additional file 1: Figure S6). MuPro was used to investigate the impact of lineage-specific substitutions on the modeled ancestral UCHL1. The X-ray structure of human UCHL1 was also used as the template to model, with MODELLER9, the structures of the UCHL1 missense variants that cause PD and other neurological disorders (E7A, S18Y, I93M, R178Q and A216D). Furthermore, we predicted the structures of the wild-type and disease-causing mutant versions of UCHL1 through the I-TASSER server (Additional file 1: Figure S3) and the Robetta server (https://robetta.bakerlab.org/) (Additional file 1: Figure S4). The aforementioned protocol was used to minimize, validate and check the quality of the mutant models of UCHL1 (Additional file 1: Figure S7). All modeled disease-causing mutant versions of UCHL1 were superimposed on the wild-type structure using Chimera, and root mean square deviation (RMSD) values were calculated.
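For orientation, RMSD over N matched atoms is sqrt((1/N) * sum of squared inter-atomic distances). The sketch below assumes the two coordinate sets are already superimposed and matched atom for atom. It does not perform the optimal fitting that a tool such as Chimera carries out, and the coordinates are made up for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root mean square deviation between two matched coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
                  for (x1, y1, z1), (x2, y2, z2) in zip(a, b))
    return math.sqrt(squared / len(a))


# Hypothetical C-alpha coordinates in angstroms, for illustration only.
wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
print(round(rmsd(wild_type, mutant), 4))
```

A value of zero means the two structures coincide at every matched atom; larger values indicate greater overall deviation of the mutant model from the wild type.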
For protein-protein interaction analysis of human UCHL1, the ClusPro protein-protein docking server was utilized. X-ray structures of the interacting partners of UCHL1, i.e. SNCA and PRKN, were obtained from the PDB. Interactions among human UCHL1, human SNCA and human PRKN were examined with the help of LIGPLOT and PyMOL.
UCHL1: Ubiquitin C-Terminal Hydrolase 1
WAG: Whelan and Goldman
RMSD: Root Mean Square Deviation
SLAC: Single Likelihood Ancestor Counting
PDB: Protein Data Bank
PKC: Protein Kinase C
CK2: Casein Kinase II
UPP: Ubiquitin Proteasome Pathway
Kumar R, Jangir DK, Verma G, Shekhar S, Hanpude P, Kumar S, et al. S-nitrosylation of UCHL1 induces its structural instability and promotes α-synuclein aggregation. Sci Rep. 2017;7:44558.
Siddiqui IJ, Pervaiz N, Abbasi AA. The Parkinson disease gene SNCA: evolutionary and structural insights with pathological implication. Sci Rep. 2016;6:24475.
Lee Y-T C, Hsu S-T D. Familial mutations and post-translational modifications of UCH-L1 in Parkinson's disease and neurodegenerative disorders. Curr Protein Pept Sci. 2017;18(7):733–45.
De Virgilio A, Greco A, Fabbrini G, Inghilleri M, Rizzo MI, Gallo A, et al. Parkinson's disease: autoimmunity and neuroinflammation. Autoimmun Rev. 2016;15(10):1005–11.
Lunati A, Lesage S, Brice A. The genetic landscape of Parkinson’s disease. Rev Neurol. 2018;174(9):628.
Zhang M, Cai F, Zhang S, Zhang S, Song W. Overexpression of ubiquitin carboxyl-terminal hydrolase L1 (UCHL1) delays Alzheimer's progression in vivo. Sci Rep. 2014;4:7298.
Bilguvar K, Tyagi NK, Ozkara C, Tuysuz B, Bakircioglu M, Choi M, et al. Recessive loss of function of the neuronal ubiquitin hydrolase UCHL1 leads to early-onset progressive neurodegeneration. Proc Natl Acad Sci U S A. 2013;110(9):3489–94.
Nazé P, Vuillaume I, Destée A, Pasquier F, Sablonnière B. Mutation analysis and association studies of the ubiquitin carboxy-terminal hydrolase L1 gene in Huntington's disease. Neurosci Lett. 2002;328(1):1–4.
Belin AC, Westerlund M, Bergman O, Nissbrandt H, Lind C, Sydow O, et al. S18Y in ubiquitin carboxy-terminal hydrolase L1 (UCH-L1) associated with decreased risk of Parkinson's disease in Sweden. Parkinsonism Relat Disord. 2007;13(5):295–8.
Leroy E, Boyer R, Auburger G, Leube B, Ulm G, Mezey E, et al. The ubiquitin pathway in Parkinson's disease. Nature. 1998;395(6701):451.
Rydning SL, Backe PH, Sousa MM, Iqbal Z, Øye A-M, Sheng Y, et al. Novel UCHL1 mutations reveal new insights into ubiquitin processing. Hum Mol Genet. 2016;26(6):1031–40.
Das C, Hoang QQ, Kreinbring CA, Luchansky SJ, Meray RK, Ray SS, et al. Structural basis for conformational plasticity of the Parkinson's disease-associated ubiquitin hydrolase UCH-L1. Proc Natl Acad Sci U S A. 2006;103(12):4675–80.
Ragland M, Hutter C, Zabetian C, Edwards K. Association between the ubiquitin carboxyl-terminal esterase L1 gene (UCHL1) S18Y variant and Parkinson's disease: a HuGE review and meta-analysis. Am J Epidemiol. 2009;170(11):1344–57.
Sasaki H, Yukiue H, Moiriyama S, Kobayashi Y, Nakashima Y, Kaji M, et al. Clinical significance of matrix metalloproteinase-7 and Ets-1 gene expression in patients with lung cancer. J Surg Res. 2001;101(2):242–7.
Yamazaki T, Hibi K, Takase T, Tezel E, Nakayama H, Kasai Y, et al. PGP9.5 as a marker for invasive colorectal cancer. Clin Cancer Res. 2002;8(1):192–5.
Kabuta T, Furuta A, Aoki S, Furuta K, Wada K. Aberrant interaction between Parkinson disease-associated mutant UCH-L1 and the lysosomal receptor for chaperone-mediated autophagy. J Biol Chem. 2008;283(35):23731–8.
Li H, Kiyama H, Osaka H, Kimura I, Nishikawa K, Namikawa K, et al. Ubiquitin carboxy-terminal hydrolase L1 binds to and stabilizes monoubiquitin in neuron. Hum Mol Genet. 2003;12(16):1945–58.
Nishikawa K, Li H, Kawamura R, Osaka H, Wang Y-L, Hara Y, et al. Alterations of structure and hydrolase activity of parkinsonism-associated human ubiquitin carboxyl-terminal hydrolase L1 variants. Biochem Biophys Res Commun. 2003;304(1):176–83.
Setsuie R, Wang Y-L, Mochizuki H, Osaka H, Hayakawa H, Ichihara N, et al. Dopaminergic neuronal loss in transgenic mice expressing the Parkinson's disease-associated UCH-L1 I93M mutant. Neurochem Int. 2007;50(1):119–29.
Setsuie R, Wada K. The functions of UCH-L1 and its relation to neurodegenerative diseases. Neurochem Int. 2007;51(2–4):105–11.
Cartier AE, Ubhi K, Spencer B, Vazquez-Roque RA, Kosberg KA, Fourgeaud L, et al. Differential effects of UCHL1 modulation on alpha-synuclein in PD-like models of alpha-synucleinopathy. PLoS One. 2012;7(4):e34713.
Mellick G, Silburn P. The ubiquitin carboxy-terminal hydrolase-L1 gene S18Y polymorphism does not confer protection against idiopathic Parkinson's disease. Neurosci Lett. 2000;293(2):127–30.
Miyake Y, Tanaka K, Fukushima W, Kiyohara C, Sasaki S, Tsuboi Y, et al. UCHL1 S18Y variant is a risk factor for Parkinson’s disease in Japan. BMC Neurol. 2012;12(1):62.
Yousaf A, Sohail Raza M, Ali AA. The evolution of bony vertebrate enhancers at odds with their coding sequence landscape. Genome Biol Evol. 2015;7(8):2333–43.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28(10):2731–9.
Abbasi AA. Molecular evolution of HR, a gene that regulates the postnatal cycle of the hair follicle. Sci Rep. 2011;1:32.
Abbasi AA, Goode DK, Amir S, Grzeschik K-H. Evolution and functional diversification of the GLI family of transcription factors in vertebrates. Evol Bioinforma. 2009;5:S2322.
Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al. The Pfam protein families database. Nucleic Acids Res. 2011;40(D1):D290–301.
Pagni M, Ioannidis V, Cerutti L, Zahn-Zabal M, Jongeneel CV, Falquet L. MyHits: a new interactive resource for protein annotation and domain identification. Nucleic Acids Res. 2004;32(suppl_2):W332–5.
Bett JS, Ritorto MS, Ewan R, Jaffray EG, Virdee S, Chin JW, et al. Ubiquitin C-terminal hydrolases cleave isopeptide-and peptide-linked ubiquitin from structured proteins but do not edit ubiquitin homopolymers. Biochem J. 2015;466(3):489–98.
Konya C, Hatanaka Y, Fujiwara Y, Uchida K, Nagai Y, Wada K, et al. Parkinson’s disease-associated mutations in α-synuclein and UCH-L1 inhibit the unconventional secretion of UCH-L1. Neurochem Int. 2011;59(2):251–8.
Liu Z, Meray RK, Grammatopoulos TN, Fredenburg RA, Cookson MR, Liu Y, et al. Membrane-associated farnesylated UCH-L1 promotes α-synuclein neurotoxicity and is a therapeutic target for Parkinson's disease. Proc Natl Acad Sci. 2009;106(12):4635–40.
Ortlund EA, Bridgham JT, Redinbo MR, Thornton JW. Crystal structure of an ancient protein: evolution by conformational epistasis. Science. 2007;317(5844):1544–8.
Lim K-L, Tan JM. Role of the ubiquitin proteasome system in Parkinson's disease. BMC Biochem. 2007;8(1):1–10.
Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res. 2016;1:gkw937.
Trempe J-F, Sauvé V, Grenier K, Seirafi M, Tang MY, Ménade M, et al. Structure of parkin reveals mechanisms for ubiquitin ligase activation. Science. 2013;340(6139):1451–5.
Ciechanover A, Kwon YT. Degradation of misfolded proteins in neurodegenerative diseases: therapeutic targets and strategies. Exp Mol Med. 2015;47(3):e147.
Sulistio YA, Heese K. The ubiquitin-proteasome system and molecular chaperone deregulation in Alzheimer’s disease. Mol Neurobiol. 2016;53(2):905–31.
Huang Q, Figueiredo-Pereira ME. Ubiquitin/proteasome pathway impairment in neurodegeneration: therapeutic implications. Apoptosis. 2010;15(11):1292–311.
Schwartz AL, Ciechanover A. Targeting proteins for destruction by the ubiquitin system: implications for human pathobiology. Annu Rev Pharmacol Toxicol. 2009;49:73–96.
Tramutola A, Di Domenico F, Barone E, Perluigi M, Butterfield DA. It is all about (U)biquitin: role of altered ubiquitin-proteasome system and UCHL1 in Alzheimer disease. Oxidative Med Cell Longev. 2016;2016:1.
Gadhave K, Bolshette N, Ahire A, Pardeshi R, Thakur K, Trandafir C, et al. The ubiquitin proteasomal system: a potential target for the management of Alzheimer's disease. J Cell Mol Med. 2016;20(7):1392–407.
McNaught KSP, Jenner P. Proteasomal function is impaired in substantia nigra in Parkinson's disease. Neurosci Lett. 2001;297(3):191–4.
Ma S, Attarwala IY, Xie X-Q. SQSTM1/p62: a potential target for neurodegenerative disease. ACS Chem Neurosci. 2019;10(5):2094–114.
Kyratzi E, Pavlaki M, Stefanis L. The S18Y polymorphic variant of UCH-L1 confers an antioxidant function to neuronal cells. Hum Mol Genet. 2008;17(14):2160–71.
Bishop P, Rocca D, Henley JM. Ubiquitin C-terminal hydrolase L1 (UCH-L1): structure, distribution and roles in brain function and dysfunction. Biochem J. 2016;473(16):2453–62.
Kim H-J, Kim HJ, Jeong J-E, Baek JY, Jeong J, Kim S, et al. N-terminal truncated UCH-L1 prevents Parkinson's disease associated damage. PLoS One. 2014;9(6):e99654.
Sułkowska JI, Rawdon EJ, Millett KC, Onuchic JN, Stasiak A. Conservation of complex knotting and slipknotting patterns in proteins. Proc Natl Acad Sci U S A. 2012;109(26):E1715–23.
Bishop P, Rubin P, Thomson AR, Rocca D, Henley JM. The ubiquitin C-terminal hydrolase L1 (UCH-L1) C terminus plays a key role in protein stability, but its farnesylation is not required for membrane association in primary neurons. J Biol Chem. 2014;289(52):36140–9.
Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, et al. Ensembl 2018. Nucleic Acids Res. 2017;46(D1):D754–61.
Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, et al. Database resources of the national center for biotechnology information. Nucleic Acids Res. 2007;36(suppl_1):D13–21.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
Pervaiz N, Shakeel N, Qasim A, Zehra R, Anwar S, Rana N, et al. Evolutionary history of the human multigene families reveals widespread gene duplications throughout the history of animals. BMC Evol Biol. 2019;19(1):128.
Seemab S, Pervaiz N, Zehra R, Anwar S, Bao Y, Abbasi AA. Molecular evolutionary and structural analysis of familial exudative vitreoretinopathy associated FZD4 gene. BMC Evol Biol. 2019;19(1):72.
Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–25.
Russo C. Efficiencies of different statistical tests in supporting a known vertebrate phylogeny. Mol Biol Evol. 1997;14(10):1078–80.
Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc Natl Acad Sci U S A. 2004;101(30):11030–5.
Whelan S, Goldman N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 2001;18(5):691–9.
Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39(4):783–91.
Goldman N, Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994;11(5):725–36.
Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7(1):539.
Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185(4154):862–4.
Betts MJ, Russell RB. Amino acid properties and consequences of substitutions. Bioinformatics Genet. 2003;1:289–316.
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28(1):235–42.
Webb B, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Bioinformatics. 2014;47(1):1–5.6.
Krieger E, Joo K, Lee J, Lee J, Raman S, Thompson J, et al. Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: four approaches that performed well in CASP8. Proteins. 2009;77(S9):114–22.
Lovell SC, Davis IW, Arendall WB III, De Bakker PI, Word JM, Prisant MG, et al. Structure validation by Cα geometry: ϕ, ψ and Cβ deviation. Proteins. 2003;50(3):437–50.
Colovos C, Yeates TO. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci. 1993;2(9):1511–9.
Cheng J, Randall A, Baldi P. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins. 2006;62(4):1125–32.
Yang J, Zhang Y. I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Res. 2015;43(W1):W174–81.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF chimera—a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605–12.
Comeau SR, Gatchell DW, Vajda S, Camacho CJ. ClusPro: a fully automated algorithm for protein–protein docking. Nucleic Acids Res. 2004;32(suppl_2):W96–9.
Wallace AC, Laskowski RA, Thornton JM. LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Eng Des Sel. 1995;8(2):127–34.
DeLano W. The PyMOL molecular graphics system, version 1.3 r1. Schrödinger, LLC, New York; 2010. p. 1–10.
The authors thank Yasir Mahmood Abbasi (computer programmer) for technical support.
This work was supported by the National Key Research and Development Program of China [2016YFE0206600 to Y.B.]; the 13th Five-year Informatization Plan of Chinese Academy of Sciences [XXH13505-05 to Y.B.]; the Professional Association of the Alliance of International Science Organizations [ANSO-PA-2020-07 to Y.B.]; and the Open Biodiversity and Health Big Data Program of IUBS [to Y.B.].
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Nawaz, M.S., Asghar, R., Pervaiz, N. et al. Molecular evolutionary and structural analysis of human UCHL1 gene demonstrates the relevant role of intragenic epistasis in Parkinson’s disease and other neurological disorders. BMC Evol Biol 20, 130 (2020). https://doi.org/10.1186/s12862-020-01684-7