
Molecular evolutionary analysis of human primary microcephaly genes



There has been a rapid increase in brain size relative to body size during mammalian evolutionary history. In particular, the enlarged and globular brain is the most distinctive anatomical feature of modern humans and sets us apart from other extinct and extant primate species. The genetic basis of large brain size in modern humans has largely remained enigmatic. Genes associated with the pathological reduction of brain size (primary microcephaly, MCPH) have the characteristics and functions to be considered ideal candidates for unraveling the genetic basis of the evolutionary enlargement of human brain size. For instance, the brain size of microcephaly patients is similar to that of Pan troglodytes and of very early hominids such as Sahelanthropus tchadensis and Australopithecus afarensis.


The present study investigates the molecular evolutionary history of a subset of autosomal recessive primary microcephaly (MCPH) genes, CEP135, ZNF335, PHC1, SASS6, CDK6, MFSD2A, CIT, and KIF14, across 48 mammalian species. Codon-based substitution site analysis indicated that ZNF335, SASS6, CIT, and KIF14 have experienced positive selection in eutherian evolutionary history. Estimation of divergent selection pressure revealed that almost all of the MCPH genes analyzed in the present study have maintained their functions throughout the history of placental mammals. Contrary to our expectations, human-specific adaptive evolution was not detected for any of the MCPH genes analyzed in the present study.


Based on these data it can be inferred that protein-coding sequence of MCPH genes might not be the sole determinant of increase in relative brain size during primate evolutionary history.


The enlarged and globular brain distinguishes modern humans not only from extant non-human primates but also from their extinct Homo relatives [1]. The modern human brain is approximately three-fold larger than that of our closest extant relative, the chimpanzee, and of extinct early hominids. From a developmental perspective, the larger brain size in humans is attributed to human-specific fast and prolonged neonatal and postnatal brain growth and patterning [2,3,4]. The evolutionary expansion of human brain size is heterogeneous across brain regions. For instance, the most notable expansion occurred in the neocortex, which has been directly related to the emergence of higher cognitive capabilities such as language, intelligence, and social learning [5]. Most of the expansion in brain size occurred in the last 2–3 million years of human evolution [6, 7]. The genetic basis of divergence between the highly cognitive human brain and the supposedly less cognitive non-human primate brain has largely remained enigmatic. It has been speculated that the complexity of the modern human brain arose through changes in protein-coding genes and non-coding regulatory elements [8]. In particular, in relation to the evolutionary expansion of the cerebral cortex, genes associated with primary microcephaly (MCPH) have been the focus of immense attention in the past couple of decades [9]. Primary microcephaly is an autosomal congenital disorder characterized by a significant reduction in brain mass, particularly of the cerebral cortex, with no other neuroanatomical abnormalities [10]. The brain size of microcephaly patients is similar to that of the very primitive extinct hominids Sahelanthropus tchadensis (brain size: 370 cm³) and Australopithecus afarensis (brain size: 450 cm³). Therefore, the small brain size in microcephaly patients is considered an example of evolutionary retrogression [7, 11]. Furthermore, microcephaly patients have mild to severe cognitive impairment [10].
MCPH is a heterogeneous disorder associated with mutations in at least sixteen gene loci (MCPH1–16) [12]. The underlying MCPH genes are MCPH1, WDR62, CDK5RAP2, CASC5, ASPM, CENPJ, STIL, CEP135, CEP152, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14 [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]. Almost all MCPH-associated genes are expressed predominantly in the fetal brain and regulate neurogenesis and brain size through participation in several important cellular mechanisms, including DNA replication, repair, cytokinesis, intracellular transport and autophagy [27]. Therefore, investigation of MCPH genes can reveal molecular mechanisms that are critical in the regulation of brain size and complexity [28, 29]. Indeed, many previously reported molecular evolutionary investigations have implicated MCPH genes (such as ASPM and CDK5RAP2) in the expansion and reduction of brain size during primate history [30, 31].

In this study, the role of MCPH genes in the evolutionary enlargement of human brain size was explored through molecular evolutionary analysis of eight newly identified microcephaly genes: CEP135, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14. Sequence data from 48 eutherian species were employed to investigate the signatures of episodic positive selection and diversifying selection during mammalian evolutionary history. The results obtained in the present study provide a broader perspective on the evolutionary link between primary microcephaly genes and human brain size.


Molecular evolutionary analysis of MCPH genes

To investigate the genetic basis of the evolutionary expansion of the human brain, eight newly identified MCPH genes (CEP135, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14) were considered as candidates for evolutionary analysis. We assembled the coding sequence of each of these genes from a broad range of eutherian genomes comprising 21 primatomorphans, 9 glirans, 6 carnivores, 2 perissodactylans, 5 cetartiodactylans, 3 chiropterans, and one species each from Eulipotyphla and Proboscidea (Additional file 1: Figure S1). These 48 eutherian genomes provide enough genomic coverage and evolutionary breadth to perform a thorough molecular evolutionary genetic analysis. For each gene, orthologous sequence information was classified into three datasets, i.e., primates (20 genomes), nonprimate placental mammals (28 genomes), and placental mammals (48 genomes). Each dataset was subjected to maximum likelihood based codon substitution models.

Signature of pervasive positive selection in MCPH genes

To examine whether pervasive positive selection has operated on the selected subset of MCPH genes (CEP135, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14), three pairs of codon substitution site models (M1 & M2, M7 & M8, and M8a & M8) were used. These site models assume that the selective pressure ω varies among amino acid sites but not across lineages [32]. A signature of positive selection is considered robust only if at least two of the three null models (M1, M7, and M8a) are rejected in favor of the more complex alternative models (M2 and M8). The results of the site model pairs (M1/M2, M7/M8, and M8a/M8) revealed a consistent signature of pervasive positive selection only for the KIF14 gene in primates (Table 1 and Additional file 2: Table S1). Furthermore, significant signatures of positive selection were also detected in nonprimate placental mammals for three MCPH genes (ZNF335, SASS6, and CIT) and in placental mammals for two MCPH genes (ZNF335 and KIF14) (Table 1 and Additional file 2: Table S1). Positively selected codon sites were identified by using the Bayes Empirical Bayes (BEB) method implemented in the M8 codon substitution site model [33] (Table 2).

Table 1 Signature of pervasive positive selection in primary microcephaly genes
Table 2 Positively selected sites detected by using the M8 codon substitution site model

Signatures of episodic positive selection

We next examined the imprints of transient or episodic positive selection in different lineages from the ancestral primate branch to the human terminal branch by employing the branch-site model (Additional file 2: Table S2). The branch-site model allows the ratio of nonsynonymous (dN) to synonymous (dS) substitution rates, ω, to vary not only across branches in the phylogeny but also across sites [34]. The significance of the transient imprint of adaptive evolution was determined by likelihood ratio tests (LRTs) comparing the log likelihood score of the null model (identical to the branch-site model except that ω2 is restricted to one for the predefined lineage of interest) against that of the branch-site model. False positive results from the branch-site model were controlled by estimating the false discovery rate q value. Branch-site calculations revealed no significant pattern of episodic positive selection in any lineage from the primate ancestor to the human terminal branch for any of the MCPH genes analyzed, except for KIF14 in the Homininae ancestral branch (Additional file 2: Table S2).

Divergent selection of MCPH genes across mammals

Functional divergence among orthologous proteins during evolution may not necessarily be reflected as a signature of positive selection at the sequence level. Instead, variation in sequence evolutionary rate among different clades of a phylogeny can also be taken as a metric of adaptive functional diversity. For each of the eight MCPH coding regions, we estimated the divergent selective pressure between different partitions of the mammalian phylogeny (primates vs. nonprimates, simians vs. nonsimians, Catarrhini vs. non-Catarrhini, Hominidae vs. non-Hominidae, and Hominini vs. non-Hominini placental mammals) by employing clade model C (CmC) (Table 3). As in the site and branch-site analyses, false positives in the CmC analysis were controlled by the q value [35]. The CmC analysis revealed that selective pressure on different parts of the phylogeny is not significantly divergent for any of the analyzed microcephaly loci except SASS6 in the comparison of simian and nonsimian placental mammals (Table 3). In this particular case of SASS6, approximately 34% of sites evolved under divergent selective pressure between simians (ωs = 0.499) and nonsimian placental mammals (ωns = 0.214) (Table 3).

Table 3 Parameter estimates and likelihood scores for divergent selective constraint in eight MCPH genes on different partitions of the mammalian phylogeny


Compared to other primates, including great apes, humans have very large brains. Weighing approximately 1400 g, our brains are roughly three times larger than those of other great apes such as chimpanzees (395 g) and gorillas (490 g) [36]. In particular, during 4–5 million years of human evolution, an enormous increase in brain size has occurred, from a brain mass of 450 g in Australopithecines to about 1400 g in modern Homo sapiens [37]. Similar to the overall increase in brain size and mass, the neocortex (which is involved in higher-order brain functions such as cognition, reasoning and language) has significantly enlarged in the hominin lineage after its divergence from the closely related chimpanzee lineage approximately 6–7 million years ago [7]. This expansion in neocortex size in the hominin lineage might have occurred prior to the split of anatomically modern humans from archaic hominins (Neanderthals and Denisovans) approximately 550,000–750,000 years ago, as both modern humans and Neanderthals exhibit comparable overall brain size [38, 39]. However, cranial lobe size (which demarcates different regions of the neocortex) does differ between anatomically modern humans and Neanderthals; most prominently, the parieto-temporal lobe of the neocortex is enlarged and the orbitofrontal cortex is wider in modern humans than in Neanderthals [38, 40]. This indicates that certain neocortical regions have evolved after the split of Neanderthals and anatomically modern humans. The size of the neocortex is predominantly determined by the magnitude of neurogenesis and cytokinesis during fetal development. During neurogenesis, cortical neurons originate from progenitor cells in the ventricular zone of the developing brain [41]. The progenitor cells undergo successive cycles of proliferative division before entering neurogenic division and forming the subventricular zone [42,43,44,45].
The massive expansion in neocortex size during human evolution has been attributed to an extraordinary expansion in neuron number through an increased rate of neural progenitor cell division [5]. This expansion can be explained in the context of the “radial unit hypothesis” of cortical development [46]. This hypothesis proposes a general mechanism for a rapid increase in neocortical surface area during evolution through a prolonged period of proliferative symmetric division. This could yield an increased number of radial columnar units that ultimately generate neurons and consequently expand the neocortical surface area [46]. An alternative hypothesis, the “intermediate progenitor model”, suggests that during mammalian evolution the expansion in neocortical surface area and folding occurred due to an increase in the size of the basal progenitor (BP) pool (BPs originate from apical radial glia, the main neural progenitor cells in the ventricular zone) and its subsequent expansion in the subventricular zone [47]. Recently, another hypothesis linked the evolutionary expansion of the neocortex to the increased proliferative capacity of basal radial glia (bRG) [48]. Although the timing of brain development is conserved across mammals, species-specific differences in the duration of cortical neurogenesis (6 days in mice, 60 days in macaque, and 100 days in humans) might have contributed to the differences in neocortex size and complexity during primate history [5, 49]. Species-specific differences in BP pool size and in temporal aspects of neocorticogenesis might have lineage-specific genetic underpinnings [48].

The majority of MCPH gene products are known to play an important role in regulating the duration and mode of cell division and have hence been identified as prime suspects in the evolutionary expansion of the mammalian and primate cerebral cortex [43, 50, 51]. Initial evaluation of MCPH genes revealed that MCPH1, CDK5RAP2, ASPM, and CENPJ evolved adaptively in the human lineage [9, 52, 53]. Further investigations extended the signatures of positive selection in these MCPH loci (MCPH1, CDK5RAP2, ASPM, and CENPJ) from humans to all anthropoid primates [31]. Recently, the inclusion of non-primate eutherian species in evolutionary genetic studies of MCPH genes revealed signatures of pervasive positive selection on ASPM, CDK5RAP2, MCPH1, CENPJ, CEP152 and WDR62 throughout eutherian history [54].

The present study investigates the molecular evolutionary history of eight newly identified MCPH genes by employing sequence data from 48 eutherian genomes and rigorous maximum likelihood based codon substitution models. The codon substitution site analysis indicated that positive selection occurred during different stages of eutherian evolution in four MCPH genes, ZNF335, SASS6, CIT, and KIF14 (Table 1 and Additional file 2: Table S1). Furthermore, the codon-based site models revealed an inconsistent signature of pervasive positive selection in primates for all of the microcephaly genes analyzed in the present study except KIF14 (Table 1 and Additional file 2: Table S1). Positive selection often acts on protein-coding intervals transiently, for a brief period of evolutionary time [34]. During the course of primate history, brain enlargement seems to have happened episodically; therefore, transient or episodic positive selection could be of particular relevance to genes involved in brain size expansion [7]. Intriguingly, for the majority of MCPH genes analyzed in the present study, the branch-site model was unable to identify any signatures of episodic positive selection from the primate ancestor to the human terminal branch (Additional file 2: Table S2). The results obtained through the site and branch-site models were corroborated further by employing clade model C (CmC), which showed that the majority of the MCPH genes have maintained their conserved functions throughout the history of placental mammals (Table 3).

Multiple sequence alignments of the MCPH proteins analyzed in the present study have shown human-specific substitutions for ZNF335, KIF14, PHC1, MFSD2A and CIT (Additional file 2: Table S3). The majority of these human-specific substitutions appear to have been fixed prior to the divergence of fully modern humans from archaic humans (Neandertals and Denisovans) some 450,000 years ago (Additional file 2: Table S3). Although human-specific adaptive evolution has not been detected in any of the MCPH genes analyzed here, we speculate that human-specific replacements in a subset of MCPH proteins could potentially have been important in modifying their functions during the course of hominin evolution. The evolutionary relevance of these hominin-specific amino acid substitutions to the evolution of brain size and complexity needs to be validated through further studies.


The present study demonstrates that the evolutionary enlargement of human brain cannot be attributed solely to the protein-coding sequences of MCPH genes [55]. Instead, complex conditional effects of human specific coding and non-coding regulatory changes in MCPH and other brain related loci might have been instrumental in the evolution of human brain size during the Pliocene–Pleistocene era.


Sequence acquisition and alignment

Full-length coding and amino acid sequences of eight MCPH genes (CEP135, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14) for 48 eutherian species were retrieved from Ensembl and NCBI (National Center for Biotechnology Information) by using a bidirectional BLAST hit strategy [56,57,58]. These sequence data include 20 primates and 28 nonprimate eutherian species (Additional file 1: Figure S1 and Additional file 3: Data S1).
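
The bidirectional BLAST hit strategy can be illustrated with a minimal sketch. It assumes the best hit of every query has already been extracted from BLAST output in both directions; the function name and the gene identifiers below are hypothetical and are not taken from the study's data.

```python
def reciprocal_best_hits(forward, reverse):
    """Keep only pairs that are each other's best hit.

    forward: best hit in the target genome for each query gene
    reverse: best hit in the query genome for each target gene
    """
    return {q: t for q, t in forward.items() if reverse.get(t) == q}


# Hypothetical best-hit tables (illustrative identifiers only).
human_to_mouse = {
    "KIF14_human": "Kif14_mouse",
    "CIT_human": "Cit_mouse",
    "CDK6_human": "Cdk4_mouse",  # forward hit lands on a paralog
}
mouse_to_human = {
    "Kif14_mouse": "KIF14_human",
    "Cit_mouse": "CIT_human",
    "Cdk4_mouse": "CDK4_human",  # reverse hit does not return to CDK6
}

orthologs = reciprocal_best_hits(human_to_mouse, mouse_to_human)
```

In this toy input the CDK6 pair is dropped because the reverse search does not return the original query, which is the behavior that makes the strategy conservative about paralogs.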

The complete genomic sequences of the archaic humans, Neandertals and Denisovans, were downloaded from the Max Planck Institute for Evolutionary Anthropology website in binary alignment map (BAM) file format with 50× and 30× sequence coverage, respectively [59, 60]. MCPH gene sequences were retrieved by calculating the consensus sequences of the respective chromosomes from the BAM files of the archaic genomes and were compared with human MCPH genes by using UGENE software [61].

The orthologous coding sequences were aligned with PRANK using default parameters for the empirical codon model and phylogenetic information as a guide [62]. PRANK treats insertions and deletions as distinct evolutionary events and introduces gaps instead of aligning overly divergent sequences, and therefore reduces the number of false positives in evolutionary analysis [63, 64].
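
As a rough illustration of such an alignment step, the sketch below only assembles a PRANK command line for a codon-aware alignment with a guide tree. The file names are hypothetical, the helper function is not part of PRANK, and the exact invocation used in the study is not stated in the text, so this is an assumption about a plausible call rather than a record of it.

```python
def prank_codon_command(cds_fasta, guide_tree, out_prefix):
    """Build (but do not run) a PRANK call using the empirical codon model.

    -d= input sequences, -t= guide tree, -o= output prefix, -codon codon model.
    """
    return [
        "prank",
        f"-d={cds_fasta}",
        f"-t={guide_tree}",
        f"-o={out_prefix}",
        "-codon",
    ]


# Hypothetical file names for one gene.
cmd = prank_codon_command("KIF14_cds.fasta", "eutherians.nwk", "KIF14_prank")
# To execute, with PRANK installed: subprocess.run(cmd, check=True)
```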

Analysis of substitution parameters and sites under positive selection

At the protein level, the ratio of nonsynonymous (dN) to synonymous (dS) substitution rates, ω, measures the selective pressure [32, 65]. The value of ω delineates the strength and direction of natural selection operating on a protein-coding sequence: ω = 1, ω < 1, and ω > 1 indicate neutral evolution, negative selection and positive selection, respectively. For each of the eight MCPH genes within each of the three datasets (primates, nonprimate eutherians, all eutherians), the selective pressure ω was estimated using five different codon substitution site models (M1, M2, M7, M8 and M8a) implemented in the CodeML program of the PAML 4.7 software [32, 66, 67]. In this study, a well-accepted phylogeny of eutherians was used for each gene [68]. To test for sites under positive selection, we performed LRTs on three pairs of site models. The first pair compares the null model M1 (a nearly neutral model that assumes two classes of sites with ω = 1 and ω < 1) with the alternative model M2 (a positive selection model that assumes an additional third class of sites with ω > 1) [33, 67]. The second pair compares the null model M7 (beta) with the alternative model M8 (beta and ω2 > 1), and the third pair compares the null model M8a (beta and ω2 = 1) with the alternative model M8 [65, 69]. For this study, positive selection is inferred if at least two out of the three model pairs significantly reject the null model in favor of the alternative model. In addition, positively selected sites were identified by using the Bayes Empirical Bayes method implemented in the M8 codon substitution site model [33].
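
The likelihood ratio test and the two-out-of-three decision rule can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the log-likelihood values are invented, the degrees of freedom (2 for M1/M2 and M7/M8, 1 for M8a/M8) follow common practice rather than anything stated in the text, and a plain 0.05 threshold on the P value stands in for the q value correction described under Statistical analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


def lrt_pvalue(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2 * (lnL_alt - lnL_null) and its P value."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


def positive_selection_supported(lnl, alpha=0.05):
    """Apply the rule: at least two of the three null models are rejected."""
    tests = [("M1", "M2", 2), ("M7", "M8", 2), ("M8a", "M8", 1)]  # df assumed
    rejected = sum(
        lrt_pvalue(lnl[null], lnl[alt], df)[1] < alpha
        for null, alt, df in tests
    )
    return rejected >= 2


# Hypothetical log-likelihoods for one gene in one dataset.
example_lnl = {"M1": -10250.0, "M2": -10245.0,
               "M7": -10252.0, "M8": -10244.5, "M8a": -10245.0}
supported = positive_selection_supported(example_lnl)
```

With these invented values, M1 and M7 are rejected while M8a is not, so the rule is satisfied by two of the three comparisons.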

Analysis of episodic positive selection

To evaluate whether the signature of positive selection was restricted to specific lineages, we used the branch-site approach implemented in CodeML [34]. This model allows ω to vary not only among prespecified branches but also among sites. Under the branch-site model, the phylogeny is divided into a prespecified foreground branch (ω2 ≥ 1, where a proportion of sites may be under positive selection) and background branches (where sites have experienced either purifying selection or neutral evolution, 0 < ω2 ≤ 1). Positive selection was inferred by an LRT between this branch-site model and the null model (identical to the branch-site model but with ω2 = 1 for the foreground branch) [34]. We used the eutherian phylogenetic tree as input for the detection of episodic selection at different evolutionary time points from the primate ancestral branch to the human terminal branch.
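
For orientation, the sketch below writes out the handful of CodeML control settings that distinguish the branch-site alternative model from its null (model = 2 and NSsites = 2 in both, with ω2 fixed at 1 only in the null). The helper function, the file names, the codon frequency setting and the starting value of ω are assumptions made for illustration; the study's actual control files are not given in the text. The foreground branch itself is marked in the tree file (with a `#1` label in PAML's Newick convention), not in the control file.

```python
def branch_site_ctl(seqfile, treefile, outfile, null=False):
    """Return minimal CodeML control-file text for branch-site model A."""
    settings = {
        "seqfile": seqfile,
        "treefile": treefile,          # foreground branch labelled #1 in the tree
        "outfile": outfile,
        "seqtype": 1,                  # codon sequences
        "CodonFreq": 2,                # F3x4 (assumed, not stated in the text)
        "model": 2,                    # branch labels define foreground/background
        "NSsites": 2,                  # with model = 2: branch-site model A
        "fix_omega": 1 if null else 0, # null fixes omega2 at the value below
        "omega": 1,                    # fixed value (null) or starting value (alt)
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"


# Hypothetical file names for one gene and one foreground branch.
alt_ctl = branch_site_ctl("KIF14.phy", "KIF14_fg.nwk", "KIF14_alt.out")
null_ctl = branch_site_ctl("KIF14.phy", "KIF14_fg.nwk", "KIF14_null.out", null=True)
```

The two runs differ in one parameter only, which is why the resulting LRT is usually evaluated with one degree of freedom.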

Analysis of divergent selection constraint

To determine the pattern of divergence in selective constraint across the phylogeny, we used the clade model C (CmC) approach implemented in CodeML [70]. Clade model C assumes that a proportion of sites have evolved under divergent selective pressure, but not necessarily under positive selection, in two or more partitions of the phylogeny defined a priori. The LRT was conducted by comparing the CmC model with the null model M2a-rel [71]. Both the alternative CmC and the null M2a-rel models have three classes of sites, the first two with ω < 1 and ω = 1. The third site class in M2a-rel has a single ω ratio (ω2 > 0) that is shared among all clades of the phylogeny, whereas the third site class in CmC has a separate ω ratio for each partition of the phylogeny [70, 71].
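
The site-class structure of the two models can be laid out side by side. The notation below is an illustrative summary under the assumption of K predefined partitions; the symbols ω2,k and K are introduced here and are not the authors' notation.

```latex
\begin{array}{lll}
\text{Site class} & \text{M2a-rel (null)} & \text{CmC (alternative)} \\
\hline
0 & 0 < \omega_0 < 1 & 0 < \omega_0 < 1 \\
1 & \omega_1 = 1 & \omega_1 = 1 \\
2 & \omega_2 > 0 \ \text{(shared by all clades)} &
    \omega_{2,k} > 0 \ \text{for partition } k = 1, \dots, K
\end{array}
\qquad
2\,\Delta\ell = 2\left(\ell_{\mathrm{CmC}} - \ell_{\mathrm{M2a\text{-}rel}}\right)
\ \sim\ \chi^2_{K-1}
```

With two partitions (for example simians vs. nonsimians), CmC has one more free ω than M2a-rel, so the test has one degree of freedom.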

Statistical analysis

P values were calculated with the Chi-square program of the PAMLX 1.2 package [72]. The P values were corrected for multiple testing by controlling the false discovery rate with the qvalue package in R 3.5.0 [35, 73]. The q values were calculated for each analysis and ranked in ascending order, applying the bootstrap method for π0 estimation and specifying fdr.level = 0.05 in the qvalue package [74].
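
The idea behind the q value correction can be shown with a simplified sketch. It is not a reimplementation of the R qvalue package: the proportion of true nulls π0 is passed in as a fixed number instead of being estimated by bootstrap, and with π0 = 1 the result coincides with Benjamini-Hochberg adjusted P values. The P values in the example are invented.

```python
def q_values(p_values, pi0=1.0):
    """Step-up q values for a list of P values, given a fixed pi0 (assumed)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical P values from four LRTs.
example_p = [0.001, 0.02, 0.04, 0.5]
example_q = q_values(example_p)
significant = [value <= 0.05 for value in example_q]
```

In this toy example the third test has P = 0.04 but a q value above 0.05, which illustrates why a nominally significant LRT can be discarded after correction.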

Availability of data and materials

The datasets analyzed during the current study are available in the Ensembl database, the NCBI database, and Additional file 3: Data S1.



MCPH: Autosomal recessive primary microcephaly
BEB: Bayes Empirical Bayes
dN: Nonsynonymous substitution rate
dS: Synonymous substitution rate
LRTs: Likelihood ratio tests
BP: Basal progenitor
bRG: Basal radial glia
CmC: Clade model C
NCBI: National Center for Biotechnology Information
BAM: Binary alignment map


  1. 1.

    Neubauer S, Hublin J-J, Gunz P. The evolution of modern human brain shape. Sci Adv. 2018;4(1):5961.

    Google Scholar 

  2. 2.

    Sakai T, Hirata S, Fuwa K, Sugama K, Kusunoki K, Makishima H, Eguchi T, Yamada S, Ogihara N, Takeshita H. Fetal brain development in chimpanzees versus humans. Curr Biol. 2012;22(18):R791–2.

    Google Scholar 

  3. 3.

    Leigh SR. Brain growth, life history, and cognition in primate and human evolution. Am J Primatol. 2004;62(3):139–64.

    Google Scholar 

  4. 4.

    Ardesch DJ, Scholtens LH, Li L, Preuss TM, Rilling JK, van den Heuvel MP. Evolutionary expansion of connectivity between multimodal association areas in the human brain compared with chimpanzees. Proc Natl Acad Sci. 2019.

    Google Scholar 

  5. 5.

    Geschwind DH, Rakic P. Cortical evolution: judge the brain by its cover. Neuron. 2013;80(3):633–47.

    Google Scholar 

  6. 6.

    Shultz S, Maslin M. Early human speciation, brain expansion and dispersal influenced by African climate pulses. PLoS ONE. 2013;8(10):e76750.

    Google Scholar 

  7. 7.

    McHenry HM. Tempo and mode in human evolution. Proc Natl Acad Sci. 1994;91(15):6780–6.

    Google Scholar 

  8. 8.

    Vallender EJ, Mekel-Bobrov N, Lahn BT. Genetic basis of human brain evolution. Trends Neurosci. 2008;31(12):637–44.

    Google Scholar 

  9. 9.

    Evans PD, Anderson JR, Vallender EJ, Gilbert SL, Malcom CM, Dorus S, Lahn BT. Adaptive evolution of ASPM, a major determinant of cerebral cortical size in humans. Hum Mol Genet. 2004;13(5):489–94.

    Google Scholar 

  10. 10.

    Jamieson CR, Fryns J-P, Jacobs J, Matthijs G, Abramowicz MJ. Primary autosomal recessive microcephaly: MCPH5 maps to 1q25-q32. Am J Hum Genet. 2000;67(6):1575–7.

    Google Scholar 

  11. 11.

    Zollikofer CP, de León MSP, Lieberman DE, Guy F, Pilbeam D, Likius A, Mackaye HT, Vignaud P, Brunet M. Virtual cranial reconstruction of Sahelanthropus tchadensis. Nature. 2005;434(7034):755.

    Google Scholar 

  12. 12.

    Li H, Bielas SL, Zaki MS, Ismail S, Farfara D, Um K, Rosti RO, Scott EC, Tu S, Chi NC. Biallelic mutations in citron kinase link mitotic cytokinesis to human primary microcephaly. Am J Hum Genet. 2016;99(2):501–10.

    Google Scholar 

  13. 13.

    Jackson AP, Eastwood H, Bell SM, Adu J, Toomes C, Carr IM, Roberts E, Hampshire DJ, Crow YJ, Mighell AJ. Identification of microcephalin, a protein implicated in determining the size of the human brain. Am J Hum Genet. 2002;71(1):136–42.

    Google Scholar 

  14. 14.

    Nicholas AK, Khurshid M, Désir J, Carvalho OP, Cox JJ, Thornton G, Kausar R, Ansar M, Ahmad W, Verloes A. WDR62 is associated with the spindle pole and is mutated in human microcephaly. Nat Genet. 2010;42(11):1010.

    Google Scholar 

  15. 15.

    Bond J, Roberts E, Springell K, Lizarraga S, Scott S, Higgins J, Hampshire DJ, Morrison EE, Leal GF, Silva EO. A centrosomal mechanism involving CDK5RAP2 and CENPJ controls brain size. Nat Genet. 2005;37(4):353.

    Google Scholar 

  16. 16.

    Guernsey DL, Jiang H, Hussin J, Arnold M, Bouyakdan K, Perry S, Babineau-Sturk T, Beis J, Dumas N, Evans SC. Mutations in centrosomal protein CEP152 in primary microcephaly families linked to MCPH4. Am J Hum Genet. 2010;87(1):40–51.

    Google Scholar 

  17. 17.

    Bond J, Roberts E, Mochida GH, Hampshire DJ, Scott S, Askham JM, Springell K, Mahadevan M, Crow YJ, Markham AF. ASPM is a major determinant of cerebral cortical size. Nat Genet. 2002;32(2):316.

    Google Scholar 

  18. 18.

    Kumar A, Girimaji SC, Duvvari MR, Blanton SH. Mutations in STIL, encoding a pericentriolar and centrosomal protein, cause primary microcephaly. Am J Hum Genet. 2009;84(2):286–90.

    Google Scholar 

  19. 19.

    Hussain MS, Baig SM, Neumann S, Nürnberg G, Farooq M, Ahmad I, Alef T, Hennies HC, Technau M, Altmüller J. A truncating mutation of CEP135 causes primary microcephaly and disturbed centrosomal function. Am J Hum Genet. 2012;90(5):871–8.

    Google Scholar 

  20. 20.

    Genin A, Desir J, Lambert N, Biervliet M, Van Der Aa N, Pierquin G, Killian A, Tosi M, Urbina M, Lefort A. Kinetochore KMN network gene CASC5 mutated in primary microcephaly. Hum Mol Genet. 2012;21(24):5306–17.

    Google Scholar 

  21. 21.

    Awad S, Al-Dosari MS, Al-Yacoub N, Colak D, Salih MA, Alkuraya FS, Poizat C. Mutation in PHC1 implicates chromatin remodeling in primary microcephaly pathogenesis. Hum Mol Genet. 2013;22(11):2200–13.

    Google Scholar 

  22. 22.

    Khan MA, Rupp VM, Orpinell M, Hussain MS, Altmüller J, Steinmetz MO, Enzinger C, Thiele H, Höhne W, Nürnberg G. A missense mutation in the PISA domain of HsSAS-6 causes autosomal recessive primary microcephaly in a large consanguineous Pakistani family. Hum Mol Genet. 2014;23(22):5940–9.

    Google Scholar 

  23. 23.

    Hussain MS, Baig SM, Neumann S, Peche VS, Szczepanski S, Nürnberg G, Tariq M, Jameel M, Khan TN, Fatima A. CDK6 associates with the centrosome during mitosis and is mutated in a large Pakistani family with primary microcephaly. Hum Mol Genet. 2013;22(25):5199–214.

    Google Scholar 

  24. 24.

    Yang YJ, Baltus AE, Mathew RS, Murphy EA, Evrony GD, Gonzalez DM, Wang EP, Marshall-Walker CA, Barry BJ, Murn J. Microcephaly gene links trithorax and REST/NRSF to control neural stem cell proliferation and differentiation. Cell. 2012;151(5):1097–112.

    Google Scholar 

  25. 25.

    Gul A, Hassan MJ, Hussain S, Raza SI, Chishti MS, Ahmad W. A novel deletion mutation in CENPJ gene in a Pakistani family with autosomal recessive primary microcephaly. J Hum Genet. 2006;51(9):760–4.

    Google Scholar 

  26. 26.

    Moawia A, Shaheen R, Rasool S, Waseem SS, Ewida N, Budde B, Kawalia A, Motameny S, Khan K, Fatima A. Mutations of KIF14 cause primary microcephaly by impairing cytokinesis. Ann Neurol. 2017;82(4):562–77.

    Google Scholar 

  27. 27.

    Basit S, Al-Harbi KM, Alhijji SA, Albalawi AM, Alharby E, Eldardear A, Samman MI. CIT, a gene involved in neurogenic cytokinesis, is mutated in human primary microcephaly. Hum Genet. 2016;135(10):1199–207.

    Google Scholar 

  28. 28.

    Pervaiz N, Abbasi AA. Molecular evolution of WDR62, a gene that regulates neocorticogenesis. Meta gene. 2016;9:1–9.

    Google Scholar 

  29. 29.

    Mochida GH. Genetics and biology of microcephaly and lissencephaly. In: Seminars in pediatric neurology: 2009. Amsterdam: Elsevier; 2009. p. 120–6.

    Google Scholar 

  30. 30.

    Montgomery SH, Mundy NI. Evolution of ASPM is associated with both increases and decreases in brain size in primates. Evolution. 2012;66(3):927–32.

    Google Scholar 

  31. 31.

    Montgomery SH, Capellini I, Venditti C, Barton RA, Mundy NI. Adaptive evolution of four microcephaly genes and the evolution of brain size in anthropoid primates. Mol Biol Evol. 2010;28(1):625–38.

    Google Scholar 

  32. 32.

    Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000;155(1):431–49.

    Google Scholar 

  33. 33.

    Yang Z, Wong WS, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22(4):1107–18.

    Google Scholar 

  34. 34.

    Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22(12):2472–9.

    Google Scholar 

  35. 35.

    Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci. 2003;100(16):9440–5.

    Google Scholar 

  36. 36.

    Semendeferi K, Damasio H. The brain and its main anatomical subdivisions in living hominoids using magnetic resonance imaging. J Hum Evol. 2000;38(2):317–32.

    Google Scholar 

  37. 37.

    Carroll SB. Genetics and the making of Homo sapiens. Nature. 2003;422(6934):849–57.

    Google Scholar 

  38. 38.

    Florio M, Borrell V, Huttner WB. Human-specific genomic signatures of neocortical expansion. Curr Opin Neurobiol. 2017;42:33–44.

    Google Scholar 

  39. 39.

    Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A, Renaud G, Sudmant PH, De Filippo C. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014;505(7481):43.

    Google Scholar 

  40. 40.

    Bastir M, Rosas A, Gunz P, Peña-Melian A, Manzi G, Harvati K, Kruszynski R, Stringer C, Hublin J-J. Evolution of the base of the brain in highly encephalized human species. Nat Commun. 2011;2:588.

    Google Scholar 

  41. 41.

    Mo Z, Moore AR, Filipovic R, Ogawa Y, Kazuhiro I, Antic SD, Zecevic N. Human cortical neurons originate from radial glia and neuron-restricted progenitors. J Neurosci. 2007;27(15):4132–45.

  42.

    Rakic P. Specification of cerebral cortical areas. Science. 1988;241(4862):170–6.

  43.

    Rakic P. A small step for the cell, a giant leap for mankind: a hypothesis of neocortical expansion during evolution. Trends Neurosci. 1995;18(9):383–8.

  44.

    Bystron I, Blakemore C, Rakic P. Development of the human cerebral cortex: Boulder Committee revisited. Nat Rev Neurosci. 2008;9(2):110.

  45.

    Stancik EK, Navarro-Quiroga I, Sellke R, Haydar TF. Heterogeneity in ventricular zone neural precursors contributes to neuronal fate diversity in the postnatal neocortex. J Neurosci. 2010;30(20):7028–36.

  46.

    Rakic P. Radial unit hypothesis of neocortical expansion. In: Novartis foundation symposium: 2000. Chichester: Wiley; 1999. p. 30–52.

  47.

    Kriegstein A, Noctor S, Martínez-Cerdeño V. Patterns of neural stem and progenitor cell division may underlie evolutionary cortical expansion. Nat Rev Neurosci. 2006;7(11):883.

  48.

    Nonaka-Kinoshita M, Reillo I, Artegiani B, Martínez-Martínez MÁ, Nelson M, Borrell V, Calegari F. Regulation of cerebral cortex size and folding by expansion of basal progenitors. EMBO J. 2013;32(13):1817–28.

  49.

    Finlay BL, Darlington RB. Linked regularities in the development and evolution of mammalian brains. Science. 1995;268(5217):1578–84.

  50.

    Caviness V Jr, Goto T, Tarui T, Takahashi T, Bhide P, Nowakowski R. Cell output, cell cycle duration and neuronal specification: a model of integrated mechanisms of the neocortical proliferative process. Cereb Cortex. 2003;13(6):592–8.

  51.

    Kornack DR, Rakic P. Changes in cell-cycle kinetics during the development and evolution of primate neocortex. Proc Natl Acad Sci. 1998;95(3):1242–6.

  52.

    Evans PD, Gilbert SL, Mekel-Bobrov N, Vallender EJ, Anderson JR, Vaez-Azizi LM, Tishkoff SA, Hudson RR, Lahn BT. Microcephalin, a gene regulating brain size, continues to evolve adaptively in humans. Science. 2005;309(5741):1717–20.

  53.

    Evans PD, Vallender EJ, Lahn BT. Molecular evolution of the brain size regulator genes CDK5RAP2 and CENPJ. Gene. 2006;375:75–9.

  54.

    Montgomery SH, Mundy NI. Microcephaly genes evolved adaptively throughout the evolution of eutherian mammals. BMC Evol Biol. 2014;14(1):120.

  55.

    Boyd JL, Skove SL, Rouanet JP, Pilaz L-J, Bepler T, Gordân R, Wray GA, Silver DL. Human-chimpanzee differences in a FZD8 enhancer alter cell-cycle dynamics in the developing neocortex. Curr Biol. 2015;25(6):772–9.

  56.

    Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.

  57.

    Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T. The Ensembl genome database project. Nucleic Acids Res. 2002;30(1):38–41.

  58.

    Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005;33(1):D501–4.

  59.

    Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A, Renaud G, Sudmant PH, de Filippo C, et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014;505(7481):43.

  60.

    Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012;338(6104):222–6.

  61.

    Okonechnikov K, Golosova O, Fursov M, UGENE Team. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics. 2012;28(8):1166–7.

  62.

    Löytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320(5883):1632–5.

  63.

    Löytynoja A. Phylogeny-aware alignment with PRANK. In: Multiple sequence alignment methods. Berlin: Springer; 2014. p. 155–70.

  64.

    Fletcher W, Yang Z. The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol Biol Evol. 2010;27(10):2257–67.

  65.

    Yang Z, Swanson WJ. Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes. Mol Biol Evol. 2002;19(1):49–57.

  66.

    Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91.

  67.

    Wong WS, Yang Z, Goldman N, Nielsen R. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics. 2004;168(2):1041–51.

  68.

    Nikolaev S, Montoya-Burgos JI, Margulies EH, Rougemont J, Nyffeler B, Antonarakis SE, NISC Comparative Sequencing Program. Early history of mammals is elucidated with the ENCODE multiple species sequencing data. PLoS Genet. 2007;3(1):e2.

  69.

    Swanson WJ, Nielsen R, Yang Q. Pervasive adaptive evolution in mammalian fertilization proteins. Mol Biol Evol. 2003;20(1):18–20.

  70.

    Bielawski JP, Yang Z. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. J Mol Evol. 2004;59(1):121–32.

  71.

    Weadick CJ, Chang BS. An improved likelihood ratio test for detecting site-specific functional divergence among clades of protein-coding genes. Mol Biol Evol. 2011;29(5):1297–300.

  72.

    Xu B, Yang Z. PAMLX: a graphical user interface for PAML. Mol Biol Evol. 2013;30(12):2723–4.

  73.

    R Core Team. R: a language and environment for statistical computing. 2018.

  74.

    Storey JD, Taylor JE, Siegmund D. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J R Stat Soc Ser B (Stat Methodol). 2004;66(1):187–205.



The authors thank Yasir Mahmood Abbasi (computer programmer) for technical support.


This work was supported by the National Key Research and Development Program of China [2016YFE0206600 to Y.B.]; the 13th Five-year Informatization Plan of Chinese Academy of Sciences [XXH13505-05 to Y.B.]; the 100-Talent Programme of Chinese Academy of Sciences [to Y.B.]; and the Open Biodiversity and Health Big Data Programme of IUBS [to Y.B.]. The funding bodies had no role in the design of the study; the collection, analysis, or interpretation of data; or the writing of the manuscript.

Author information




N.P. designed and performed the experiments, analyzed the data, and wrote the paper. H.K. analyzed the data. Y.B. designed the experiment, analyzed the data, and wrote the paper. A.A.A. conceived and designed the experiment, analyzed the data, and wrote the paper. All authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Yiming Bao or Amir Ali Abbasi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplemental Figures.

Additional file 2: Supplemental Tables.

Additional file 3: Data S1.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Pervaiz, N., Kang, H., Bao, Y. et al. Molecular evolutionary analysis of human primary microcephaly genes. BMC Ecol Evo 21, 76 (2021).


  • Molecular evolution
  • Brain
  • Primary microcephaly
  • MCPH
  • Primates
  • Positive selection