
Table 3 False positive rates (null scenarios) and true positive rates (alternative scenarios) for three LRTs when the evolutionary process includes both ω variability among sites and MNRs

From: Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates

    

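Column groups: SNR (\( e^{\beta_{HI}} = 1 \)), Low MNR (\( e^{\beta_{HI}} = 0.4 \)), High MNR (\( e^{\beta_{HI}} = 0.05 \)). Blank cells are blank in the original table.

| Scenario | \( \omega_0 \) | \( \omega_1 \) | \( \omega_2 \) | SNR LRT-1 | SNR LRT-2 | SNR LRT-3 | Low MNR LRT-1 | Low MNR LRT-2 | Low MNR LRT-3 | High MNR LRT-1 | High MNR LRT-2 | High MNR LRT-3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Null scenarios (false positives) | | | | | | | | | | | | |
| 2a | 0.05 | 0.5 | 1.0 | 0.00 | 0.01 | 0.01* | 0.00 | 0.01 | 0.00* | 0.00 | 0.00 | 0.00* |
| 2b | | 1.0 | 1.0 | 0.01 | 0.04 | 0.03* | 0.00 | 0.03 | 0.00* | 0.00 | 0.03 | 0.00* |
| Alternative scenarios (true positives) | | | | | | | | | | | | |
| 2c | 0.05 | 0.5 | 1.5 | 0.03 | 0.36 | 0.44 | 0.01 | 0.24 | 0.20* | 0.00 | 0.09 | 0.00* |
| 2d | | | 2.0 | 0.52 | 0.82 | 0.85 | 0.05 | 0.65 | 0.61* | 0.00 | 0.45 | 0.14* |
| 2e | | | 5.0 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 0.14 | 0.99 | 1.00* |
| 2f | 0.05 | 1.0 | 1.5 | 0.06 | 0.10 | 0.08 | 0.00 | 0.14 | 0.05 | 0.00 | 0.14 | 0.01* |
| 2g | | | 2.0 | 0.33 | 0.46 | 0.37 | 0.00 | 0.46 | 0.24 | 0.00 | 0.31 | 0.09* |
| 2h | | | 5.0 | 1.00 | 0.99 | 1.00 | 0.98 | 1.00 | 1.00 | 0.09 | 1.00 | 0.99* |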

  1. LRT-1 compares M1a to M2a (under-fit models). LRT-2 compares G1ax to G2ax (perfect-fit models). LRT-3 compares G1a13 to G2a13 (over-fit models). An asterisk (*) marks scenarios in which convergence problems or suboptimal peaks were encountered for the LRT-3 models. To overcome these problems, the models were re-fit to the same dataset multiple times, each time from a different set of initial parameter values. The numbers of problematic datasets were: for SNR, 2a = 21 and 2b = 1; for low MNR, 2a = 27, 2b = 16, 2c = 16 and 2f = 10; and for high MNR, 2a = 29, 2b = 20, 2c = 35, 2e = 15, 2f = 15 and 2g = 1. Because re-fitting from multiple sets of initial values was successful for the problematic datasets, the results above are based on all 100 replicates.
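The re-fitting strategy in the note (several fits from different initial values, keeping the best one) can be illustrated with a minimal multi-start sketch. Everything here is an assumption made for illustration: the objective is a hypothetical one-dimensional surface with one local and one global minimum, not a codon-model likelihood, and the function names `toy_neg_log_lik`, `local_fit` and `multi_start_fit` are invented for this example rather than taken from the paper or any software it used.

```python
import random


def toy_neg_log_lik(x):
    # Hypothetical stand-in for a negative log-likelihood surface.
    # It has a local minimum near x = 0.96 and a global minimum near x = -1.04.
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_fit(x0, step=0.01, iters=2000, h=1e-6):
    """Plain gradient descent from x0; returns (x, objective value)."""
    x = x0
    for _ in range(iters):
        # Central-difference estimate of the derivative at x.
        grad = (toy_neg_log_lik(x + h) - toy_neg_log_lik(x - h)) / (2.0 * h)
        x -= step * grad
    return x, toy_neg_log_lik(x)


def multi_start_fit(n_starts=20, low=-2.0, high=2.0, seed=0):
    """Run local_fit from several random initial values and keep the best fit."""
    rng = random.Random(seed)
    fits = [local_fit(rng.uniform(low, high)) for _ in range(n_starts)]
    return min(fits, key=lambda fit: fit[1])


if __name__ == "__main__":
    print("single start from 1.5:", local_fit(1.5))
    print("best of 20 starts:   ", multi_start_fit())
```

A single fit started at 1.5 settles in the suboptimal peak on the positive side, while the multi-start wrapper returns the fit with the lower objective value. In a likelihood ratio test, a suboptimal fit of the alternative model would understate its log-likelihood, which is why re-fitting matters for the reported rates.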