Muller's ratchet and fixation of beneficial alleles: the soliton approach to the many-site problem. Igor Rouzine, Department of Molecular Biology and Microbiology, Tufts University, Boston, USA
Basic terminology and notation
N: haploid population size (number of genomes).
Allele: variant of a site in a genome; can be better-fit or less-fit (2-allele model).
Fitness w: relative average number of progeny of a genome.
Mutation event: DNA transcription error; can be deleterious or beneficial.
U: mutation rate per genome per generation.
U_b: beneficial mutation rate.
Mutation load (mutation number) k: the number of less-fit alleles in a genome compared with the best possible genome.
Selection coefficient s: small relative fitness gain or loss per mutation.
Special notation:
V = dk_av/dt: average substitution rate of beneficial mutations.
v = (1/U) dk_av/dt: normalized ratchet rate (substitution rate of deleterious mutations).
σ = s/U.
f(k,t): frequency of the class of genomes with mutation number k.
ρ(f,t): probability of a class having frequency f at time t.
φ = ln f.
α = U_b/U: ratio of the beneficial mutation rate to the total mutation rate.
x = k - k_av: relative mutation number.
k_0, x_0: the minimum values of k and x in a population (for the best-fit class).
u = exp[φ'(x_0)].
Experiment on steady-state fitness of a virus (VSV) versus population size. Experiment measuring the steady-state fitness of vesicular stomatitis virus passaged at a fixed number of infectious units N. One-site theory based on selection-mutation balance does not work. k_1: mutation load of the reference strain of virus; L: total number of sites; M_gen: generations per passage; r: expansion ratio per generation.
Fixation of deleterious mutations at very small population sizes. The mutation rate per genome is usually small for all organisms, U = [value not preserved]. At very small population sizes N, mutation events are rare and separated in time. Fixation of separate deleterious mutations is effectively opposed by selection. One-site, 2-allele model, diffusion equation (Lande 1994, 1998, based on Crow and Kimura 1970): a single genome containing a deleterious mutation with selection coefficient s will be fixed (spread to the whole population) with probability [equation not preserved]. The average substitution rate is exponentially small at Ns >> 1: [equation not preserved], where k_av is the average number of deleterious alleles in a genome.
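The fixation probability and substitution rate discussed on this slide can be sketched numerically. The slide's own formulas are not preserved, so this sketch assumes the classical Kimura diffusion result for a haploid Wright-Fisher population, P_fix = (exp(2s) - 1)/(exp(2Ns) - 1) for one copy of a deleterious allele, and a substitution rate of N*U*P_fix. The function names are hypothetical, and a different population model (for example Moran) would change the numerical factors.

```python
import math

def fixation_probability(N, s):
    """Fixation probability of a single deleterious copy (assumed Kimura form).

    N: haploid population size, s >= 0: deleterious selection coefficient.
    Computes (e^{2s} - 1) / (e^{2Ns} - 1) in a form that cannot overflow.
    """
    if s == 0:
        return 1.0 / N  # neutral limit
    numerator = math.expm1(2.0 * s) * math.exp(-2.0 * N * s)
    denominator = -math.expm1(-2.0 * N * s)
    return numerator / denominator

def substitution_rate(N, U, s):
    """Substitutions per generation: N*U new mutations, each fixing with P_fix."""
    return N * U * fixation_probability(N, s)

if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, substitution_rate(N, U=0.1, s=0.01))
```

With s = 0.01 the printed rate stays near U while Ns is below 1 and collapses once Ns exceeds 1, which is the exponential suppression the slide refers to.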
Many-site effect: Muller's ratchet at N >> exp(U/s). N not too small: fixation events at different sites overlap in time, and accumulation of deleterious mutations is rapid even at N >> 1/s, provided U >> s. Case N >> exp(U/s) (Stephan 1993; Charlesworth and Charlesworth 1997; Gordo and Charlesworth 2000): selection-mutation equilibrium, Poisson distribution of genomes f(k) with k_av = U/s. The zero-mutation class contains, on average, n_0 = N f(0) = N exp(-U/s) genomes. Random fluctuations cause its eventual loss. The distribution then shifts by one notch in k: one click of Muller's ratchet (Muller 1964; Felsenstein 1974). Stopping the ratchet: recombination (absent in the Y chromosome and in asexual organisms); beneficial mutations (not efficient at small k/L); epistasis (biological interaction between sites).
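The click mechanism described on this slide can be illustrated with a small simulation. This is a minimal sketch, not the author's simulation code: it assumes a haploid Wright-Fisher population, multiplicative fitness (1 - s)^k, Poisson-distributed new deleterious mutations with mean U, and no beneficial mutations. All function names are hypothetical.

```python
import math
import random

def zero_class_size(N, U, s):
    """Expected size of the mutation-free class at Poisson equilibrium, n_0 = N exp(-U/s)."""
    return N * math.exp(-U / s)

def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_ratchet(N, U, s, generations, seed=0):
    """Return the minimum mutation number k_0 of the population in each generation."""
    rng = random.Random(seed)
    population = [0] * N  # mutation number k of every genome
    k0_history = []
    for _ in range(generations):
        # Assumed multiplicative fitness: each mutation costs a factor (1 - s).
        weights = [(1.0 - s) ** k for k in population]
        parents = rng.choices(population, weights=weights, k=N)
        # Deleterious mutations only, so k can never decrease along a lineage.
        population = [k + poisson(rng, U) for k in parents]
        k0_history.append(min(population))
    return k0_history

if __name__ == "__main__":
    print("n_0 =", zero_class_size(N=50, U=0.5, s=0.05))
    print(simulate_ratchet(N=50, U=0.5, s=0.05, generations=200)[::20])
```

Each increase of k_0 in the returned history is one ratchet click. With the parameters in the usage example, n_0 is far below one genome, so the clicks come quickly.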
Diffusion approach at N >> exp(U/s). Clicks are infrequent due to large N f(0). Calculating the average time between ratchet clicks. Assumptions: all classes but the zero class are at deterministic equilibrium with the current k_av. In the transitional time interval between clicks, the zero class is out of equilibrium. Diffusion equation for f = f(0), the random frequency of the zero class, with f_eq = exp(-U/s): [equation not preserved], where a(f) is the average change of f per generation, a(f_eq) = 0. A drop of f below f_eq lowers the average fitness of the population, which raises the relative fitness of the zero class, so a(f) > 0 and f increases. If f falls to 0, it never comes back. Estimating a(f) for f far from f_eq is far from trivial (Gordo and Charlesworth 2000). The average time between clicks is a complex function of N, s, and U, not only of the zero-class size N f(0).
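For orientation, a diffusion equation of the kind named on this slide can be written in the standard forward Kolmogorov form. The slide's own equation is not preserved, so the form below is an assumption: drift a(f), haploid Wright-Fisher sampling variance f(1 - f)/N per generation, and ρ(f,t) as the probability density of the zero-class frequency, with an absorbing boundary at f = 0.

```latex
\frac{\partial \rho(f,t)}{\partial t}
  = -\frac{\partial}{\partial f}\bigl[a(f)\,\rho(f,t)\bigr]
  + \frac{1}{2N}\,\frac{\partial^{2}}{\partial f^{2}}\bigl[f(1-f)\,\rho(f,t)\bigr],
\qquad a(f_{\mathrm{eq}}) = 0 .
```

The mean time between clicks is then the mean first-passage time from f near f_eq to f = 0 under this equation.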
Far from equilibrium: ratchet at N < exp(U/s) and fixation of beneficial mutations. All classes are far out of equilibrium (e.g., ratchet clicks overlap in time). Soliton approach (Tsimring et al., Phys. Rev. Lett. 1996; Rouzine et al. 2003). Basic model including beneficial mutations. Deterministic detailed balance equation for the class frequencies: [equation not preserved].
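A balance equation of the kind this slide refers to can be sketched as follows. The slide's equation is not preserved, so this is an assumed standard mutation-selection form: a fraction α = U_b/U of mutations is beneficial (treated here as constant), each mutation changes k by one, and a class grows or shrinks relative to the mean at rate -s(k - k_av).

```latex
\frac{d f_k}{dt}
  = U(1-\alpha)\, f_{k-1} + U\alpha\, f_{k+1} - U f_k
  - s\,(k - k_{\mathrm{av}})\, f_k ,
\qquad k_{\mathrm{av}} = \sum_k k\, f_k .
```

Summing over k gives zero on the right-hand side, so the total frequency is conserved.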
Early approach. Tsimring, Levine and Kessler, Phys. Rev. Lett., 1996: very similar model. Approximation: f_k(t) is smooth in k. Continuous set of soliton-like solutions f_k(t) = F_V(k - k_av(t)) labeled by the velocity, V = dk_av/dt, related to the soliton width, std_k. Choice of the solution (physics: lifting the degeneracy): cutoff of the distribution at the high-fitness edge at f(k) < 1/N.
Smooth logarithm of the distribution and the diffusion equation for the best-fit class. The distribution f_k(t) is not smooth in k in the tail (which is very important): f_k/f_(k-1) ~ 1 or >> 1 (Rouzine et al. 2003). Better: ln f_k(t) is smooth in k as long as the scale in k is large, k_av - k_0 >> 1, where k_0 = min(k) is the mutation number of the best-fit class. All classes except the best-fit class are deterministic. Diffusion equation for the best-fit class frequency f = f_k0: [equation not preserved], which yields [equation not preserved]. Stochastic threshold: [equation not preserved]. Best-fit class is lost or established: [equation not preserved]. Note: S is not zero; the effect of a change in f on S can be neglected.
Solitary wave solution. Seeking a solution in the soliton form [equation not preserved], the balance equation becomes [equation not preserved]. I cannot solve it for φ(x), but can find all I need without solving it. A continuous set of solutions exists at any v < 1 - 2α (with α = U_b/U and σ = s/U), with different widths std_k. Variance std_k^2 = (1 - 2α)/σ = (1 - 2α)(U/s): equilibrium, v = 0. Broader distribution: fixation of beneficial mutations dominates, v < 0. Narrower distribution: ratchet dominates, 0 < v < 1 - 2α.
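The step from the balance equation to the soliton equation can be sketched under stated assumptions. The slide's equations are not preserved; the derivation below assumes the standard balance equation df_k/dt = U(1-α)f_{k-1} + Uα f_{k+1} - U f_k - s(k - k_av) f_k, a smooth logarithm φ = ln f, and a wave moving with dk_av/dt = Uv.

```latex
% Smooth logarithm: f_{k \mp 1}/f_k \approx e^{\mp \varphi'(x)}, with x = k - k_{av}(t)
% Soliton ansatz: f_k(t) = \exp[\varphi(x)], so \partial_t \ln f_k = -U v\, \varphi'(x)
-v\,\varphi'(x)
  = (1-\alpha)\,e^{-\varphi'(x)} + \alpha\,e^{\varphi'(x)} - 1 - \sigma x ,
\qquad \sigma = s/U .

% Near the maximum (|\varphi'| \ll 1) the exponentials can be linearized:
\varphi'(x) \approx -\frac{\sigma x}{1 - 2\alpha - v}
\quad\Longrightarrow\quad
\mathrm{std}_k^{2} \approx \frac{1 - 2\alpha - v}{\sigma} .
```

In this sketch the variance reduces to (1 - 2α)/σ at v = 0, is larger for v < 0 and smaller for v > 0, and stays positive only for v < 1 - 2α, in line with the three cases listed on the slide.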
General expression for the substitution rate. Each solution exists in a finite interval x > x_0, where [dx/dφ']_(x=x_0) = 0. At the boundary, φ'(x_0) = ln u, where [equation not preserved]. Thus, the deterministic distribution has a high-fitness edge at the relative mutation number x_0. We can integrate the balance equation in [text not preserved]. How φ(x_0) depends on N is determined by the stochastic best-fit class.
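The edge condition can be made concrete with a short numerical sketch. It assumes the soliton equation σx = (1-α)exp(-φ') + α exp(φ') - 1 + vφ', which is a standard form and not recovered from the slide. Setting dx/dφ' = 0 then gives the quadratic α u^2 + v u - (1 - α) = 0 for u = exp[φ'(x_0)]. The function names are hypothetical.

```python
import math

def edge_u(v, alpha):
    """Positive root u = exp(phi'(x_0)) of alpha*u**2 + v*u - (1 - alpha) = 0.

    alpha must be > 0 when v < 0; v = 0 with alpha = 0 has no finite edge.
    """
    disc = math.sqrt(v * v + 4.0 * alpha * (1.0 - alpha))
    if v >= 0:
        # Algebraically equal to (-v + disc) / (2 alpha), but free of cancellation.
        return 2.0 * (1.0 - alpha) / (v + disc)
    return (-v + disc) / (2.0 * alpha)

def edge_position(v, alpha, sigma):
    """Relative mutation number x_0 of the high-fitness edge under the assumed equation."""
    u = edge_u(v, alpha)
    return ((1.0 - alpha) / u + alpha * u - 1.0 + v * math.log(u)) / sigma

if __name__ == "__main__":
    print(edge_u(0.5, 0.0), edge_position(0.5, 0.0, 0.1))
```

In the pure ratchet limit (α = 0) the root is u = 1/v and x_0 = -[1 - v ln(e/v)]/σ. When deleterious mutations are negligible and v < 0, the root approaches -v/α, that is V/U_b with V = -vU.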
Muller's ratchet at s << U. Muller's ratchet with rate Uv: 0 < v < 1. Beneficial mutations are not important at low k_av: [condition not preserved]. The general result for v simplifies: [equation not preserved]. For the continuous approach to work, we need |x_0| >> 1, hence σ = s/U << 1. The wave is also broad: std_k ~ 1/σ^(1/2). At s >> U, the distribution of genomes in k is not formed. Single fixation events rule?
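The deterministic part of the ratchet result can be checked numerically. This sketch assumes the soliton equation without beneficial mutations, σx = exp(-φ') - 1 + vφ', and integrates φ' dx from the maximum to the edge, where φ' = ln(1/v). It gives only the leading-order value of φ(x_0) - φ(0); the normalization of the distribution and the stochastic edge condition that ties φ(x_0) to N are not included. The function names are hypothetical.

```python
import math

def log_edge_frequency(v, sigma):
    """Closed form of phi(x_0) - phi(0) for alpha = 0 (leading order, assumed model)."""
    ln_v = math.log(v)
    return -(1.0 - v * (1.0 - ln_v) - 0.5 * v * ln_v ** 2) / sigma

def log_edge_frequency_numeric(v, sigma, steps=100000):
    """Midpoint-rule integral of p * dx/dp from p = 0 to p_0 = ln(1/v)."""
    p0 = math.log(1.0 / v)
    h = p0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        # dx/dp = (v - exp(-p)) / sigma for the assumed soliton equation
        total += p * (v - math.exp(-p)) / sigma * h
    return total

if __name__ == "__main__":
    print(log_edge_frequency(0.3, 0.1), log_edge_frequency_numeric(0.3, 0.1))
```

As v approaches 0 the closed form tends to -1/σ = -U/s, the logarithm of the Poisson zero-class frequency exp(-U/s) from the equilibrium slides.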
Best-fit class. Stochastic threshold condition (Rouzine et al. 2003): [equation not preserved]. Solving the equations for the variance and the average without beneficial mutations: [equation not preserved] (well-known stochastic threshold from 1-site theory). Note: the effective selection coefficient S = 0 at v = 0.
Best-fit class: another method. Finding the average time to the loss of the class. A best-fit class with k_0 - 1 mutations is lost at t = 0. Ratchet click time: [equation not preserved]. Answer: [equation not preserved]. Cf. previous method: extra factor v^(1/2) in the logarithm.
Edge effect on continuity. The continuity approximation requires [condition not preserved]. At x ~ x_0 but |x - x_0| > 1, the condition is met. At x = x_0, dx/dφ' = 0, so the condition is violated close to the edge. The edge creates a perturbation that spreads inward. The effect is deterministic. Balance equations near the edge: [equation not preserved]. Periodic initial condition: [equation not preserved]. Numeric solution: [figure not preserved; labels: edge correction to φ(x_0), continuous result].
Final result. [equation not preserved], with σ = s/U, v = V/U.
Simulation vs analytic theory
Simulation vs analytic theory: 2
Simulation vs analytic theory: 3 Equilibrium best-fit class size = 1
Simulation vs analytic theory: 4
Simulation vs analytic theory: 5
Conclusions: Muller's ratchet
1. At U >> s and high average fitness, an approach based on continuous deterministic treatment of the logarithm of the mutation number distribution, combined with stochastic treatment of the best-fit class, has been developed.
2. In a broad interval of population sizes, 1/s << N << exp(U/s), we predict enhanced (versus one-site theory) accumulation of deleterious mutations (Muller's ratchet) in the form of a travelling wave of the mutation number distribution.
3. At moderately small s/U, the edge correction to the continuity approximation is important for numeric accuracy.
4. Two methods of edge treatment based on the diffusion equation yield different factors multiplying N, with a small numeric difference for relevant parameter values.
5. In the entire range of N, very good agreement with Monte Carlo simulation results is obtained.
6. At larger N, the distribution is close to equilibrium, and the earlier separate-click approach applies. At small N ~ 1/s, the results match those of the single-mutation model.
Fixation of beneficial alleles: v < 0. Deleterious mutations are not important in the general formula for v if [condition not preserved]. The result simplifies to (Rouzine et al. 2003): [equation not preserved]. Compare to the ratchet result: [equation not preserved]. Because U is no longer important, we return to the notation s, U_b, and V = -vU: [equation not preserved]. The high-fitness tail length, the edge derivative, and the distribution maximum: [equations not preserved] and 1 otherwise.
Beneficial mutations: the high-fitness edge. Again, from the diffusion equation for f_k0 = f: [equation not preserved]. Unlike in the ratchet case, S > 0, and M is not zero: [equation not preserved]. Stochastic threshold approach (Rouzine et al. 2003): [equation not preserved].
High-fitness edge: method 2. Note: for an established class, beneficial mutations are not critically important: [equation not preserved]. We have 1/S = (1/V)/ln(V/U_b) (from the continuous part) << 1/V. A beneficial mutation creates a new class k_0 within a time interval ~ 1/S: [equation not preserved] (Rouzine & Coffin 2005, recombination model; Desai & Fisher 2006, this model, preprint online). New: [equation not preserved]. Example: V/U_b = 10^4, N = 10^3: the change in ln N is 0.18.
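The time-scale comparison on this slide can be checked with the slide's own relation 1/S = (1/V)/ln(V/U_b). This is a minimal sketch with a hypothetical function name; it evaluates only the ratio of the two time scales and does not reproduce the ln N correction quoted in the example.

```python
import math

def establishment_time_ratio(V_over_Ub):
    """Ratio (1/S)/(1/V) = 1/ln(V/U_b), from 1/S = (1/V)/ln(V/U_b)."""
    return 1.0 / math.log(V_over_Ub)

if __name__ == "__main__":
    # Parameter value from the slide's example: V/U_b = 10^4
    print(establishment_time_ratio(1e4))
```

For any V/U_b larger than e the ratio is below one, so a new best-fit class is created in a time shorter than the interval 1/V over which the wave advances by one class.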
Final answer in the limit when U is not important. Intermediate V (Desai & Fisher 2006, preprint online): the same as for large V, except that N is replaced with N(V/s)^(1/2) ~ N [ln^(1/2)(N U_b)]/ln(s/U_b), i.e., a relatively small difference in V. Large V: [equation not preserved]. The transition to 1-site theory starts at |x_0| ~ k_av and ends at std_k ~ k_av^(1/2): from V ~ s k_av/ln(s k_av/U_b) to V ~ s k_av (1-site result).
Contrasting with the two-clone approach. If only two competing beneficial clones within a pre-existing virus variant are considered at a time, saturation of the fixation speed is predicted: V = [value not preserved] at large N (Maynard Smith, What use is sex? 1971; Gerrish and Lenski 1998; Orr 2000). Variation in s is essential: a clone with larger s pushes out the previous one. Mutations with larger and larger s win as N increases. Hence the effective increase in s for fixed mutations, and the increase in the adaptation rate, sV. Solitary wave approach: additional mutations at other sites resolve clonal interference. Variation in s is not vitally important.
Comparison with simulation. The difference between the green and blue lines is mostly due to neglecting U. The dependence on k_av in simulation at U = 0.05 and small k_av is due to |x_0| = 53 at N = [value not preserved]; we assumed |x_0| << k_av. No transition to 1-site theory yet (expected at V/s = k_av = 50). Possible reasons for the difference with simulation at k_av = 500: 1) S = s|x_0| = 0.5, whereas we assumed S << 1. 2) Edge effects on the continuity of ln f_k.
Conclusions: accumulation of beneficial alleles
1. Using the same approach as in the ratchet case, at large population sizes or low fitness, we predict accumulation of beneficial mutations under the Fisher-Muller-Hill-Robertson effect, in the form of a traveling wave of the mutation number distribution.
2. In a very broad range of N, the substitution rate V is proportional to the logarithm of the population size (in contrast to the two-clone interference model result).
3. In the limit of large N, a transition to the one-site deterministic theory is predicted (in contrast to the two-clone interference model result).
4. A more accurate treatment of the best-fit class based on the diffusion equation affects the factor multiplying N; the difference may be numerically detectable at moderately large N and large s/U_b.
5. Good agreement with Monte Carlo simulation is obtained for some parameters relevant to viral populations.
Current work and future directions
1) Asexual populations:
- variation of s among sites
- linkage disequilibrium
2) Partly sexual haploid populations and sexual diploid populations:
- accumulation rate of pre-existing beneficial alleles
- correlations between genomes in fitness and site-site correlations
- coalescent time
- linkage disequilibrium
- the fitness distribution of a far ancestor of a site
- synergy between beneficial mutations and recombination
Acknowledgements
John Coffin, Tufts University, Boston, USA
Alex Kondrashov, National Institutes of Health, MD, USA
John Wakeley, Harvard University, Boston, USA
Isabel Novella, Medical College of Ohio, Toledo, OH, USA