  • Research article
  • Open access

High correlation between the turnover of nucleotides under mutational pressure and the DNA composition



Any DNA sequence is a result of a compromise between the selection and mutation pressures exerted on it during evolution. It is difficult to estimate the relative influence of each of these pressures on the rate of accumulation of substitutions. However, it is important to discriminate between the effect of mutations and the effect of selection when studying the phylogenetic relations between taxa.


We have tested, in computer simulations and analytically, the available substitution matrices for many genomes, and we have found that DNA strands in equilibrium under mutational pressure have a unique feature: the fraction of each type of nucleotide is linearly dependent on the time needed for substitution of half of the nucleotides of a given type, with a correlation coefficient close to 1. Substitution matrices found for sequences under selection pressure do not have this property. A substitution matrix for the leading strand of the Borrelia burgdorferi genome, having reached equilibrium in computer simulation, gives a DNA sequence with nucleotide composition and asymmetry corresponding precisely to the third positions in codons of protein coding genes located on the leading strand.


Parameters of mutational pressure allow us to calculate the DNA composition in equilibrium with this mutational pressure. By comparing any real DNA sequence with the sequence in equilibrium, it is possible to estimate the distance between these sequences, which could be used as a measure of the selection pressure. Furthermore, the parameters of the mutational pressure enable direct estimation of the relative mutation rates in any DNA sequence in the studied genome.


Nucleotide sequences of all contemporary genomes are results of a compromise between mutational pressure and selection [1]. Many mutations which took place in the past have been eliminated by genetic death. Even so, there are differences in the nucleotide composition of protein coding sequences and intergenic sequences, and it is very difficult to discriminate between the effects of selection and mutations on their composition. Furthermore, many prokaryotic genomes have a very asymmetric nucleotide composition of chromosomes [e.g. [2–9]]. The composition of a DNA strand depends on the role which the strand plays in the replication process: leading or lagging. Usually the leading strand is richer in Guanine (G) than in Cytosine (C) and richer in Thymine (T) than in Adenine (A). The replication-associated mutational pressure is thought to be the most probable cause of this asymmetry [10–12].

Analyses of long range correlations in DNA sequences revealed that a very strong triplet signal can be detected in the intergenic sequences [13, 14]. This signal can be created by fragments of coding sequences transferred into intergenic space by recombination mechanisms. Since the nucleotide compositions of the first, the second and the third nucleotide positions in coding sequences are strongly correlated, these correlations are seen even in some noncoding intergenic sequences. We have assumed that some intergenic sequences are derived from coding sequences and could freely accumulate mutations with frequencies determined by the replication-associated mutational pressure. If the time of divergence has not been very long, the homology between the intergenic sequences and their original protein coding sequences can be found (we have called these original coding sequences the reference sequences).

We have assumed that mutations have accumulated only in the intergenic sequences and not in the reference sequences, which is not exactly true, but which made our studies feasible. This assumption could give a good approximation of the mutational pressure exerted on intergenic sequences. Many other authors, who have constructed matrices of substitutions using the mutations accumulated in pseudogene sequences, have made the same assumption [15, 16]. Such an assumption could give a higher estimated mutational rate than the real one. Nevertheless, the substitution rates in the matrices are expressed as relative values; thus, it should not change the values in the matrix.

We have chosen the B. burgdorferi genome for our analyses because there are many premises indicating that this genome is in the steady state. The B. burgdorferi genome is very asymmetric, which suggests its structural conservation [17]. Either there are not many inversions of genes between the leading and lagging strands, or the mutational pressure has had enough time to make the inverted genes resemble the genes of the new strand [18]. The nucleotide composition of the third positions in codons testifies to the very conserved structure of the chromosome. These positions follow precisely the sign of the asymmetry of intergenic sequences, and the third positions of Open Reading Frames (ORFs) situated on the leading and lagging strands have a precisely mirrored asymmetry, which is even stronger than that of intergenic sequences [17]. This paradox could be explained by assuming that the highly degenerate third positions have accumulated more neutral or near neutral mutations introduced by the replication-associated processes because they stay at their positions longer than intergenic sequences. There are constraints on inversions of coding sequences but no constraints on inversions of intergenic sequences. Thus, some newly inverted intergenic sequences could complement the asymmetry of the "new host" strand.

Results and Discussion

Testing the table of substitutions and verifying the assumptions

Having found experimentally the rates of all types of substitutions (Table 1, Borrelia burgdorferi Table of Substitutions (BbTS)), we were able to test these data and to verify our previous assumptions. In equilibrium, the number of nucleotides of a given type substituted by other nucleotides should be balanced by the number of nucleotides of that type substituting the other nucleotides. The following four equations should be fulfilled:

Table 1 Tables of substitutions, DNA composition in the equilibrium with the mutational pressure and half times of nucleotide substitutions.

$N_{A>G} + N_{A>C} + N_{A>T} = N_{G>A} + N_{C>A} + N_{T>A}$     (1)

$N_{G>A} + N_{G>C} + N_{G>T} = N_{A>G} + N_{C>G} + N_{T>G}$     (2)

$N_{C>A} + N_{C>G} + N_{C>T} = N_{A>C} + N_{G>C} + N_{T>C}$     (3)

$N_{T>A} + N_{T>G} + N_{T>C} = N_{A>T} + N_{G>T} + N_{C>T}$     (4)

where $N_{A>G} = N_A \cdot p(A>G)$, $N_A$ is the number of A nucleotides in the sequence, and $p(A>G)$ is the probability of substitution of A by G, taken from the BbTS (other symbols are defined analogously).
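The balance conditions (1)–(4) can be checked numerically. The sketch below assumes a hypothetical table of substitution probabilities (the values in `P` are illustrative only, not the BbTS from Table 1). It iterates the expected composition until it stops changing and then reports, for each nucleotide, the difference between the left and right sides of the equations, expressed as fractions; multiplying by the sequence length gives numbers.

```python
NUCS = "ATGC"

# Hypothetical substitution probabilities p(X>Y); illustrative values only,
# NOT the BbTS from Table 1. The 12 values sum to 1.
P = {
    "A": {"T": 0.06, "G": 0.10, "C": 0.04},
    "T": {"A": 0.05, "G": 0.04, "C": 0.08},
    "G": {"A": 0.16, "T": 0.08, "C": 0.04},
    "C": {"A": 0.07, "T": 0.22, "G": 0.06},
}


def imbalance(freq, p):
    """Left side minus right side of equations (1)-(4), per nucleotide."""
    result = {}
    for x in NUCS:
        out_flow = freq[x] * sum(p[x].values())
        in_flow = sum(freq[y] * p[y][x] for y in NUCS if y != x)
        result[x] = out_flow - in_flow
    return result


def equilibrium(p, tol=1e-12, max_iter=100_000):
    """Iterate the expected composition until it no longer changes."""
    freq = {x: 0.25 for x in NUCS}  # start from an equimolar sequence
    for _ in range(max_iter):
        diff = imbalance(freq, p)
        new = {x: freq[x] - diff[x] for x in NUCS}
        if max(abs(new[x] - freq[x]) for x in NUCS) < tol:
            return new
        freq = new
    raise RuntimeError("composition did not converge")


eq = equilibrium(P)
for x in NUCS:
    print(x, round(eq[x], 4), f"imbalance = {imbalance(eq, P)[x]:.1e}")
```

At convergence all four imbalances are numerically zero, which is the condition expressed by equations (1)–(4).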

Note that these equations contain numbers, not frequencies. Fulfilling these equations means that the nucleotide composition of the sequences subjected to the mutational pressure determined by the parameters of BbTS is in equilibrium. We have assumed that in the case of the B. burgdorferi genome the best approximation of such sequences is the composition of the third positions of codons of ORFs, as has been argued in the Introduction section. Thus, the nucleotide composition of this set of nucleotides should not change significantly under such mutational pressure. To test this, we simulated the mutational pressure on a sequence of the same composition, as described previously [19], and after 10,000 Monte Carlo Steps (MCS), when the sequence was in equilibrium, we compared it to the sequence before the simulation. The ratios of nucleotide numbers between the two sequences were 0.994, 1.008, 0.992 and 0.988 for A, T, G, and C, respectively (note that the ratios do not sum to 1 because they are not weighted). There are no significant changes in the nucleotide composition of the third positions after prolonged exposure to the mutational pressure described by BbTS (Chi square test, p = 0.99987). Thus, BbTS generates a DNA sequence with a nucleotide composition corresponding precisely to the nucleotide composition of the third codon positions. In Fig. 1 we have shown the evolution of two DNA sequences, of which one originally had an equimolar nucleotide composition and the other one had the nucleotide composition of the third codon positions of ORFs from the leading strand. Both sequences reach the same final nucleotide composition. Furthermore, a sequence obtained after long evolution in the computer has a very similar asymmetry, in terms of GC skew and AT skew, to the sequence of the third codon positions before evolution. GC skew is [G-C]/[G+C] and AT skew is [A-T]/[A+T]. The AT skew is -0.23 and -0.22 for the sequences before and after simulation, respectively. The GC skew is 0.34 for the sequences before and after simulation. Note that the most frequent substitution is the C->T transition, which is in agreement with the cytosine deamination theory (see ref. 10 and references therein), and the average transition frequency is twofold higher than the transversion frequency.
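The two skew measures defined above can be computed directly from nucleotide counts. The following is a minimal sketch; the toy sequence is hypothetical and was chosen only so that its skews have the same values as those reported in the text (the real nucleotide counts are not given here).

```python
def skews(seq):
    """Return (GC skew, AT skew) = ([G-C]/[G+C], [A-T]/[A+T]) of a sequence."""
    a, t, g, c = (seq.count(n) for n in "ATGC")
    gc_skew = (g - c) / (g + c)
    at_skew = (a - t) / (a + t)
    return gc_skew, at_skew


# Hypothetical composition: 67 G, 33 C, 385 A, 615 T.
toy = "G" * 67 + "C" * 33 + "A" * 385 + "T" * 615
gc_skew, at_skew = skews(toy)
print(round(gc_skew, 2), round(at_skew, 2))
```

A positive GC skew and a negative AT skew mean that the strand is richer in G than in C and richer in T than in A, as described for the leading strand.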

Figure 1

Evolution of DNA sequences under the mutational pressure described by the "real BbTS". Light lines indicate the fractions of nucleotides in the sequence which initially has been composed of equal numbers of each nucleotide. Bold lines show the fractions of nucleotides in a sequence of nucleotide composition of the third positions in codons of the B. burgdorferi coding sequences from the leading strand. x-axis – the number of Monte Carlo Steps (MCS), y-axis – fraction of nucleotides in the evolving strand.

Properties of the substitution matrices

Let us consider only the nucleotides existing in the original sequence, which is already in the steady state. The substitution of each of the four nucleotides follows the same rule as the decay of a radioactive isotope, with a "half time of substitutions" characteristic for each nucleotide ($\tau_A$, $\tau_G$, $\tau_T$, $\tau_C$ for A, G, T, and C, respectively), determined by the sum of the probabilities of substitution of a given nucleotide by the other three nucleotides. In a more formal language:

$\tau_A = \ln 2 / \left( p_{mut} \cdot (p(A>G) + p(A>T) + p(A>C)) \right)$

(analogously for nucleotides other than A), where $p_{mut}$ is a parameter which denotes the overall rate of mutations and does not influence the ratios between τ for different nucleotides.
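As an illustration of the half time formula, the sketch below computes τ for each nucleotide from a hypothetical table of substitution probabilities (the values in `P` are illustrative, not the BbTS) and for two arbitrary values of the overall mutation rate. It also evaluates the first-order decay of the original nucleotides, which by construction leaves one half of them after the time τ.

```python
import math

# Hypothetical substitution probabilities p(X>Y); illustrative values only.
P = {
    "A": {"T": 0.06, "G": 0.10, "C": 0.04},
    "T": {"A": 0.05, "G": 0.04, "C": 0.08},
    "G": {"A": 0.16, "T": 0.08, "C": 0.04},
    "C": {"A": 0.07, "T": 0.22, "G": 0.06},
}


def half_times(p, pmut):
    """tau_X = ln2 / (pmut * sum of p(X>Y) over the three other nucleotides)."""
    return {x: math.log(2) / (pmut * sum(p[x].values())) for x in p}


def fraction_unsubstituted(p, pmut, x, t):
    """Fraction of the original nucleotides x still present after time t."""
    return math.exp(-pmut * sum(p[x].values()) * t)


tau_slow = half_times(P, 0.001)  # arbitrary overall mutation rates
tau_fast = half_times(P, 0.01)

for x in P:
    print(x, round(tau_fast[x], 1), round(tau_slow[x] / tau_slow["A"], 3))
```

Changing `pmut` rescales all four half times by the same factor, so their ratios stay the same.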

It also follows that in equilibrium the fraction of a nucleotide which has been substituted is exactly the same as the fraction of this very nucleotide substituting the other ones (the left and right sides of equations 1–4, respectively). Thus, after the half time of substitutions the ratio between the "old" nucleotides and "new" nucleotides is 1:1 (see Fig. 2A and Fig. 2C). This is a general property of any table of substitutions in the equilibrium state. But BbTS has another property: the half time of substitutions is precisely correlated with the frequency of the given nucleotide in the sequence in equilibrium, with a correlation coefficient of 0.999 (p = 0.0007, Fig. 2B). This is not a feature of just any matrix of substitutions. We have tried to find analytically a table of substitutions which would generate a DNA sequence with the nucleotide composition of the analysed sequence of B. burgdorferi (see also [19]). One such table is presented in Table 1. This "artificial table" generates a DNA sequence of the same nucleotide composition as BbTS does, but the correlation coefficient between the half time of substitutions and the fraction of nucleotides in the sequence is close to zero (Fig. 2C and 2D).
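The correlation between the half time of substitutions and the equilibrium fraction can be computed for any table of substitutions. The sketch below does this for one hypothetical table, constructed for illustration so that each nucleotide is replaced by the three others with equal probability; in that special case the equilibrium fractions are exactly proportional to the half times. This construction is an assumption made for the example, it is not the BbTS, and other tables need not give a high correlation.

```python
import math

NUCS = "ATGC"


def equilibrium(p, n_iter=5000):
    """Expected equilibrium composition, found by repeated iteration."""
    freq = {x: 0.25 for x in NUCS}
    for _ in range(n_iter):
        freq = {
            x: freq[x] * (1 - sum(p[x].values()))
            + sum(freq[y] * p[y][x] for y in NUCS if y != x)
            for x in NUCS
        }
    return freq


def half_times(p, pmut=1.0):
    return {x: math.log(2) / (pmut * sum(p[x].values())) for x in NUCS}


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def tau_fraction_correlation(p):
    eq, tau = equilibrium(p), half_times(p)
    return pearson([tau[x] for x in NUCS], [eq[x] for x in NUCS])


# Hypothetical total substitution probabilities per nucleotide (sum = 1),
# split equally among the three possible replacements.
RATES = {"A": 0.20, "T": 0.17, "G": 0.28, "C": 0.35}
EQUAL_SPLIT = {x: {y: RATES[x] / 3 for y in NUCS if y != x} for x in NUCS}

r_equal = tau_fraction_correlation(EQUAL_SPLIT)
print(round(r_equal, 6))
```

The same `tau_fraction_correlation` function can be applied to any other table written in the same dictionary form.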

Figure 2

The rate of substitution of nucleotides in the DNA sequence in equilibrium under the mutational pressure described by: A – the "real BbTS" and, C – the "artificial BbTS". Bold lines show the fractions of nucleotides which have not been substituted yet, light lines indicate the fractions of nucleotides which appeared when substituting other nucleotides. Plots B and D represent the relations between half time of substitutions and the sizes of nucleotide fractions for sequences in equilibrium under the mutational pressure of "real BbTS" and "artificial BbTS" respectively. See also text and description for Fig. 1.

We claim that the mutational pressure leading to the nucleotide substitutions is extremely highly correlated with the DNA composition of the genome in such a way that the higher substitution turnover of a nucleotide determines the lower fraction of this nucleotide in the DNA sequence.

It seems highly unlikely that such a correlation in the B. burgdorferi genome arose by chance. We have tested many other tables of substitutions which had been published for different genomes and different sequences (data collected in Table 1). As long as such matrices describe the substitutions for sequences which are not under selection pressure (i.e. pseudogenes or the third positions in codons), they follow the same rule, with an extremely high correlation between $\tau_N$ and the fraction of the nucleotide N in the DNA sequence in equilibrium (all correlations were statistically significant). This rule holds for asymmetric DNA, as in the B. burgdorferi genome, and for much less biased eukaryotic DNA. Matrices found for the third positions in the four-fold degenerate codons in Drosophila mitochondrial DNA [20] fulfil this rule more precisely than those for all third positions in codons in that organelle's genome (the same results were obtained for matrices of primates' mtDNA published by [21], data not shown). These differences could be expected if some mutations in the third positions, leading to amino acid substitutions, are not neutral. Furthermore, in some instances, for example for the table describing substitution rates in sequences under strong selection [22], we have not found the correlation between $\tau_N$ and the fraction of nucleotide N (see the last column in Table 1). That supports the hypothesis that the rule is a specific property of pure mutational pressure. One can also notice that matrices found by analysis of substitutions in different pseudogenes in the same organism or in very closely related organisms give a different DNA composition in equilibrium, which supports the thesis that the mutational pressure varies for different regions of the same eukaryotic genome [23–25].

We have no clear answer to the question of what selection forces have tuned the mutational pressure in such a way that it follows the strict rules for sequences released from selection. It is logical that nucleotides with higher turnover destabilise the genetic information and selection would tend to eliminate them from the DNA molecule. On the other hand, a lower frequency of a nucleotide gives it a higher informative value, while at the same time the deviation from the equimolar fractions of nucleotides in DNA diminishes the coding capacity of the whole molecule. Perhaps a mathematical analysis of this phenomenon, taking into consideration the properties of the universal genetic code, will show that the optimum for information transfer by the DNA molecule is just at such points. Further studies would show other properties of these strategic points where τ determines a very specific balance between the DNA composition and mutational pressure.

The implications of such evolutionarily established relations between the DNA composition and the turnover rates of nucleotides would have a great impact on the understanding of genome evolution itself. They make it possible to estimate analytically the relations between the mutational pressures exerted on specific nucleotides of each genome, i.e. by simply computing the nucleotide composition of sequences which are not subject to selection pressure. Having the mutational pressure in terms of nucleotide turnover, one can estimate the selection pressure exerted on any sequence or position in codons. For example, see what would be the fate of the first positions of ORFs from the leading strand of the B. burgdorferi genome under the BbTS mutational pressure, without selection (Fig. 3). Note that the half time of substitutions of each nucleotide is the same as for other sequences under such mutational pressure, but the rate of appearance of new nucleotides is different and the composition of the sequence would change non-linearly during evolution. It is also simple to calculate, from the results of computer simulations, the corrections for multiple substitutions and reversions, which is important for estimating the real divergence time. It is clear that such corrections should be calculated considering the different contributions of each nucleotide turnover to the overall frequencies of multiple substitutions. Having a precise mutational pressure, one can not only predict the selection pressure but also find the history of the sequence.

Figure 3

Changes in the nucleotide composition of the first positions in codons of B. burgdorferi coding sequences from the leading strand under BbTS mutational pressure. Descriptions as for Fig. 1 and 2. Note that the fractions of nucleotides which have not been substituted are exactly as in Fig. 2A but the fractions of nucleotides which substituted other nucleotides are far from being symmetrical to the first ones.

Keeping in mind the precise relations between the fraction of a nucleotide and its turnover time, symmetric DNA (with A=T and G=C) is a specific case where the turnover times of nucleotides in pairs equal each other. Posing the question of which type of substitution should be blamed for the DNA asymmetry makes sense for the mutational pressure exerted on DNA released from the selection pressure. Now a simple test for such a mutational pressure is available: it should generate DNA in equilibrium whose nucleotide composition fulfils the rule of linear interdependence between the sizes of the nucleotide fractions and their turnover times.


Substitution matrices make it possible to calculate the DNA composition in equilibrium with a given mutational pressure. It is possible to test whether a given substitution matrix is a pure mutational matrix or whether it is "contaminated" with the effects of selection. The difference between the DNA composition in equilibrium with mutational pressure and a DNA sequence under both mutational and selection pressures allows for estimation of the effect of selection pressure exerted on the particular sequence.

Materials and Methods

Construction of the substitution table

To estimate the frequency of substitutions, we have analysed the differences between coding sequences of the B. burgdorferi genome and sequences homologous to them found in the intergenic regions. For the data, see Additional file 1. The sequence of the B. burgdorferi genome [26] was downloaded from To accomplish our analysis, we extracted all intergenic sequences longer than 90 nucleotides. We translated them into amino acid sequences in all six reading frames. The amber and ochre stop codons were translated as tyrosine residues and opal as tryptophan. Then we searched databases for homology with the B. burgdorferi protein sequences using the FASTA program [27]. For detailed amino acid alignment data see Additional file 2. After selecting homologues (with E < 0.05) whose previously (presumably) coding strands were duplicated on the leading strand, we made alignments of the nucleotide sequences of these intergenic sequences with the reference ORFs' sequences using the CLUSTAL X programme [28] and we counted the nucleotide substitutions. The number of analysed alignment sites was 3737 and the average number of substitutions per site was 0.46. For detailed nucleotide alignment data see Additional file 3. The observed numbers of nucleotide substitutions from nucleotide i to j (where i and j stand for A, T, G or C, and i ≠ j) were converted to relative substitution frequencies according to Gojobori, Li, and Graur [29] and Francino and Ochman [30]. That allowed us to calculate the frequency of each of the twelve possible substitutions on the leading strand. Since the observed substitution rates were different for each of the four nucleotides, we introduced corrections for multiple substitutions and reversions for each type of substituted nucleotide instead of one general correction. This means that for each type of nucleotide we calculated the fraction substituted (observed number) and used it to estimate the corrected substitution number according to Kimura's formula [31]. The frequencies of substitutions, normalised in such a way that the sum of all 12 frequencies equals 1, are shown in Table 1 (Table of Substitutions – BbTS).
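The normalisation step can be sketched as follows. The example assumes that a relative substitution frequency is the observed number of substitutions from i to j divided by the number of sites occupied by nucleotide i in the reference sequence, and that the twelve values are then scaled to sum to 1. All counts are hypothetical, and the corrections for multiple substitutions and reversions are omitted.

```python
# Hypothetical numbers of sites occupied by each nucleotide in the
# reference sequences (illustrative values only).
SITES = {"A": 1000, "T": 1000, "G": 500, "C": 500}

# Hypothetical observed numbers of substitutions from i to j.
OBSERVED = {
    ("A", "T"): 50, ("A", "G"): 80, ("A", "C"): 30,
    ("T", "A"): 40, ("T", "G"): 30, ("T", "C"): 70,
    ("G", "A"): 90, ("G", "T"): 40, ("G", "C"): 20,
    ("C", "A"): 30, ("C", "T"): 50, ("C", "G"): 20,
}


def relative_frequencies(observed, sites):
    """Per-site substitution frequencies, normalised so that they sum to 1."""
    per_site = {(i, j): n / sites[i] for (i, j), n in observed.items()}
    total = sum(per_site.values())
    return {pair: value / total for pair, value in per_site.items()}


freqs = relative_frequencies(OBSERVED, SITES)
for (i, j), f in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{i}>{j}: {f:.3f}")
```

Dividing by the number of sites of the original nucleotide matters here: in this toy data A>T and C>T were observed equally often, but C is half as abundant as A, so the relative frequency of C>T is twice that of A>T.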

Computer simulations

Computer simulations were performed on DNA sequences corresponding to the real protein coding sequences of the B. burgdorferi leading strand or on random DNA sequences generated by computer. In the first case, all ORFs longer than 100 codons situated on the leading strand of the Borrelia genome were spliced together. In the second case, the DNA sequence was constructed by computer by drawing consecutive nucleotides with a probability described by the assumed composition of this artificial sequence.

During the simulation of mutational pressure, in one Monte Carlo Step (MCS) each nucleotide in the sequence was drawn with a probability equal to pmut, and then it was substituted with the probability described by the substitution matrix. Note that a nucleotide drawn for substitution does not necessarily have to be substituted. After each MCS, each substitution was counted in its specific type of substitution; additionally, the evolving sequence was compared with the original sequence and the accumulated substitutions were counted. This allowed us to measure not only the divergence rate but also the mutational rate subdivided into 12 different classes. The dynamics of substitution were also calculated analytically using the equations describing the first order reaction rate. For more details on computing methods see Kowalczuk et al. [19, 32].
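The procedure described above can be sketched as follows. This is a minimal illustration rather than the simulation code used in the study: the substitution probabilities in `P`, the sequence length, the value of pmut, the number of steps and the random seed are all hypothetical choices made for the example.

```python
import random

NUCS = "ATGC"

# Hypothetical substitution probabilities p(X>Y); illustrative values only,
# NOT the BbTS from Table 1.
P = {
    "A": {"T": 0.06, "G": 0.10, "C": 0.04},
    "T": {"A": 0.05, "G": 0.04, "C": 0.08},
    "G": {"A": 0.16, "T": 0.08, "C": 0.04},
    "C": {"A": 0.07, "T": 0.22, "G": 0.06},
}


def monte_carlo_step(seq, p, pmut, rng, counts):
    """Apply one MCS to `seq` (a list of nucleotides) in place.

    `counts` accumulates the number of substitutions in each of the
    12 classes; the number of substitutions in this step is returned.
    """
    n_substituted = 0
    for i, old in enumerate(seq):
        if rng.random() >= pmut:
            continue  # nucleotide not drawn in this step
        u = rng.random()
        threshold = 0.0
        for new, prob in p[old].items():
            threshold += prob
            if u < threshold:
                seq[i] = new
                counts[(old, new)] = counts.get((old, new), 0) + 1
                n_substituted += 1
                break
        # If no branch was taken, the drawn nucleotide stays unchanged.
    return n_substituted


def composition(seq):
    return {x: seq.count(x) / len(seq) for x in NUCS}


def divergence(seq, original):
    """Fraction of sites that differ from the original sequence."""
    return sum(a != b for a, b in zip(seq, original)) / len(original)


rng = random.Random(1)
original = list(NUCS * 500)  # 2000 nucleotides, equimolar composition
evolving = list(original)
counts = {}
for _ in range(300):
    monte_carlo_step(evolving, P, 0.2, rng, counts)

print(composition(evolving))
print(divergence(evolving, original))
```

Comparing `evolving` with `original` gives the accumulated (observed) divergence, while `counts` records every substitution event by class, including multiple substitutions at the same site and reversions.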



BbTS – table of probabilities of nucleotide substitutions in the Borrelia burgdorferi leading DNA strand.


  1. Sueoka N: On the genetic basis of variation and heterogeneity of DNA base composition. Proc Natl Acad Sci USA. 1962, 48: 582-591.

  2. Lobry JR: Asymmetric substitution patterns in the two DNA strands of bacteria. Mol Biol Evol. 1996, 13: 660-665.

  3. Freeman JM, Plasterer TN, Smith TF, Mohr SC: Patterns of genome organization in bacteria. Science. 1998, 279: 1827-1830. 10.1126/science.279.5358.1827a.

  4. Grigoriev A: Analyzing genomes with cumulative skew diagrams. Nucleic Acids Res. 1998, 26: 2286-2290. 10.1093/nar/26.10.2286.

  5. McLean MJ, Wolfe KH, Devine K: Base composition skews, replication orientation, and gene orientation in 12 prokaryote genomes. J Mol Evol. 1998, 47: 691-696.

  6. Mrazek J, Karlin S: Strand compositional asymmetry in bacterial and large viral genomes. Proc Natl Acad Sci USA. 1998, 95: 3720-3725. 10.1073/pnas.95.7.3720.

  7. Francino MP, Ochman H: A comparative genomics approach to DNA asymmetry. Ann N Y Acad Sci. 1999, 870: 428-431.

  8. Mackiewicz P, Gierlik A, Kowalczuk M, Dudek MR, Cebrat S: Asymmetry of nucleotide composition of prokaryotic chromosomes. J Appl Genet. 1999, 40: 1-14.

  9. Rocha EP, Danchin A, Viari A: Universal replication biases in bacteria. Mol Microbiol. 1999, 32: 11-16. 10.1046/j.1365-2958.1999.01334.x.

  10. Frank AC, Lobry JR: Asymmetric substitution patterns: a review of possible underlying mutational or selective mechanisms. Gene. 1999, 238: 65-77. 10.1016/S0378-1119(99)00297-8.

  11. Mackiewicz P, Gierlik A, Kowalczuk M, Dudek MR, Cebrat S: How does replication-associated mutational pressure influence amino acid composition of proteins?. Genome Res. 1999, 9: 409-416.

  12. Tillier ER, Collins RA: The contributions of replication orientation, gene direction, and signal sequences to base-composition asymmetries in bacterial genomes. J Mol Evol. 2000, 50: 249-257.

  13. Voss R: Evolution of long-range fractal correlations and 1/f noise in DNA base sequences. Phys Rev Lett. 1992, 68: 3805-3808. 10.1103/PhysRevLett.68.3805.

  14. Gierlik A, Mackiewicz P, Kowalczuk M, Dudek MR, Cebrat S: Some hints on Open Reading Frame statistics – how ORF length depends on selection. Int J Modern Phys C. 1999, 10: 635-643. 10.1142/S0129183199000474.

  15. Li WH, Wu CI, Luo CC: Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J Mol Evol. 1984, 21: 58-71.

  16. Yang Z: Estimating the pattern of nucleotide substitution. J Mol Evol. 1994, 39: 105-111.

  17. Mackiewicz P, Gierlik A, Kowalczuk M, Szczepanik D, Dudek MR, Cebrat S: Mechanisms generating long-range correlation in nucleotide composition of the Borrelia burgdorferi. Physica A. 1999, 273: 103-115. 10.1016/S0378-4371(99)00345-3.

  18. Mackiewicz P, Mackiewicz D, Gierlik A, Kowalczuk M, Nowicka A, Dudkiewicz M, Dudek MR, Cebrat S: The Differential Killing of Genes by Inversions in Prokaryotic Genomes. J Mol Evol. 2001, 53: 615-621. 10.1007/s002390010248.

  19. Kowalczuk M, Gierlik A, Mackiewicz P, Cebrat S, Dudek MR: Optimization of gene sequences under constant mutational pressure and selection. Physica A. 1999, 273: 116-131. 10.1016/S0378-4371(99)00346-5.

  20. Tamura K: The rate and pattern of nucleotide substitution in Drosophila mitochondrial DNA. Mol Biol Evol. 1992, 9: 814-825.

  21. Adachi J, Hasegawa M: Tempo and mode of synonymous substitutions in mitochondrial DNA of primates. Mol Biol Evol. 1996, 13: 200-208.

  22. Zhang J: Performance of likelihood ratio tests of evolutionary hypotheses under inadequate substitution models. Mol Biol Evol. 1999, 16: 868-875.

  23. Filipski J: Why the rate of silent codon substitutions is variable within a vertebrate's genome. J Theor Biol. 1988, 134: 159-164.

  24. Wolfe KH, Sharp PM, Li WH: Mutation rates differ among regions of the mammalian genome. Nature. 1989, 337: 283-285. 10.1038/337283a0.

  25. Matassi G, Sharp PM, Gautier C: Chromosomal location effects on gene sequence evolution in mammals. Curr Biol. 1999, 9: 786-791. 10.1016/S0960-9822(99)80361-3.

  26. Fraser CM, Casjens S, Huang WM, Sutton GG, Clayton R, Lathigra R, White O, Ketchum KA, Dodson R, Hickey EK: Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature. 1997, 390: 580-586. 10.1038/37551.

  27. Pearson WR, Lipman DJ: Improved tools for biological sequence comparison. Proc Natl Acad Sci USA. 1988, 85: 2444-2448.

  28. Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ: Multiple sequence alignment with Clustal X. Trends Biochem Sci. 1998, 23: 403-405. 10.1016/S0968-0004(98)01285-7.

  29. Gojobori T, Li WH, Graur D: Patterns of nucleotide substitution in pseudogenes and functional genes. J Mol Evol. 1982, 18: 360-369.

  30. Francino MP, Ochman H: Strand symmetry around the beta-globin origin of replication in primates. Mol Biol Evol. 2000, 17: 416-422.

  31. Kimura M: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 1980, 16: 111-120.

  32. Kowalczuk M, Mackiewicz P, Mackiewicz D, Nowicka A, Dudkiewicz M, Dudek MR, Cebrat S: Multiple base substitution corrections in DNA sequence evolution. Int J Modern Phys C. 2001, 12: 1043-1053. 10.1142/S0129183101002279.


Funding sources: This work was supported by The State Committee for Scientific Research, grant numbers 6 P04A 025-18 and 6 P04A 016 20. P.M. was supported by Foundation for Polish Science.

Author information

Corresponding author

Correspondence to Stanislaw Cebrat.

Electronic supplementary material


Additional file 1: Contains information on intergenic sequences and homologous ORFs used to obtain the table of substitutions. This file is to be opened in Excel with tab as separator. (XLS)

Additional file 2: Contains the relevant fragments of letters from FASTA with amino acid alignments. (TXT 34 KB)

Additional file 3: Contains nucleotide alignments by CLUSTALX program. (TXT 39 KB)


Kowalczuk, M., Mackiewicz, P., Mackiewicz, D. et al. High correlation between the turnover of nucleotides under mutational pressure and the DNA composition. BMC Evol Biol 1, 13 (2001).
