How to interpret the forward diffusion equation

** How to interpret the forward diffusion equation**
General forward equation:
∂P/∂t = -∂/∂x[<Δx> P] + (1/2) ∂²/∂x²[<(Δx)²> P]
“Translation term” (first derivative): if <Δx> > 0, this term causes P to slide to the right, with speed <Δx> (we previously called this “translation”). We’ll see that this term corresponds to natural selection.
“Diffusion term” (second derivative): represents a random walk. It causes P(x) to become broader and shorter through time. We’ll see that this term represents “genetic drift.”
Cartoon/example (x is on the horizontal axis and xo = 0): pure diffusion (<Δx> = 0), pure translation (<(Δx)²> = 0), and diffusion plus translation.
Dan: ask “can <(Δx)²> ever be < 0?”
Dan: tell them they will have a HW problem on the classic random walk.
So, <(Δx)²> tells us how strong diffusion (drift) is and <Δx> says how strong translation (selection) is.
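To make the two terms concrete, here is a minimal sketch of the classic random walk in Python. The step probabilities, step size, and function names are illustrative assumptions, not part of the slides. It checks that the mean position moves at speed <Δx> per step, while the spread grows at a rate set by the per-step variance <(Δx)²> - <Δx>².

```python
import random

def simulate_walk(p_up, p_down, step, n_steps, n_walkers, seed=0):
    """Return the final positions of independent walkers that all start at x = 0."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_steps):
            u = rng.random()
            if u < p_up:              # step to the right
                x += step
            elif u < p_up + p_down:   # step to the left
                x -= step
            # otherwise the walker stays put this step
        finals.append(x)
    return finals

def mean_and_variance(values):
    m = sum(values) / len(values)
    v = sum((val - m) ** 2 for val in values) / len(values)
    return m, v

# Illustrative (assumed) parameters: a slightly right-biased walk.
p_up, p_down, step, n_steps = 0.3, 0.2, 1.0, 100
mean_dx = (p_up - p_down) * step        # <Δx> per step
mean_dx2 = (p_up + p_down) * step ** 2  # <(Δx)²> per step

finals = simulate_walk(p_up, p_down, step, n_steps, n_walkers=5000)
mean_x, var_x = mean_and_variance(finals)

print("mean position:", mean_x, "expected:", n_steps * mean_dx)
print("variance:     ", var_x, "expected:", n_steps * (mean_dx2 - mean_dx ** 2))
```

Setting p_up equal to p_down gives the pure-diffusion cartoon. Pure translation would need a deterministic step, which this walk can only approximate.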

Challenge: can anyone provide an interpretation of the backward equation?
General backward equation (with <Δx> and <(Δx)²> evaluated at xo):
∂P/∂t = <Δx> ∂P/∂xo + (1/2) <(Δx)²> ∂²P/∂xo²
Like the forward equation, there’s both a “diffusion term” and a “translation term”, but the translation term has the opposite sign from the one in the forward equation.
Crucial point: now the independent variable is the starting frequency (xo), and we’re thinking of the final frequency as a fixed value. This speaks to a student’s question from the other day: “Can you think of the Markov chain as moving backward in time?” This is probably the clearest reason why this is called the backward equation.

-- (totally optional slide): Faster alternative to deriving forward and backward diffusion equations --
As we said, the name of the game is to trade recursive terms for derivatives. A far faster, though probably more mysterious, approach is to use Taylor expansions. We can stop at first order in t, but we need to keep up to second order in x. Why? (Try it out and see!)
For any “well-behaved” function f(x), the Taylor expansion, up to second order, is:
f(x + ε) ≈ f(x) + ε f′(x) + (ε²/2) f″(x)
We need to expand functions like P(x|xo, t+1), P(x ± 1/N|xo, t), etc. Like so:
P(x|xo, t+1) ≈ P(x|xo, t) + ∂P/∂t
P(x ± 1/N|xo, t) ≈ P(x|xo, t) ± (1/N) ∂P/∂x + (1/(2N²)) ∂²P/∂x²
Using these expansions, you can get from the backward recursion eq. to the backward diffusion eq. in just a few lines.
Note: for the forward equation, expand the product (PT), not just P.
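As a sketch of how those few lines might go for the backward case: the recursion below is an assumed form of the one-step backward recursion, and T_+ and T_- are hypothetical names for the probabilities of stepping up or down by 1/N from the starting frequency.

```latex
% Assumed one-step backward recursion; T_+ and T_- are hypothetical names for the
% probabilities that the starting frequency x_o moves up or down by 1/N in one timestep.
\begin{align*}
P(x \mid x_o, t+1) &= T_+\, P(x \mid x_o + \tfrac{1}{N}, t)
                    + T_-\, P(x \mid x_o - \tfrac{1}{N}, t)
                    + (1 - T_+ - T_-)\, P(x \mid x_o, t) \\
% Expand the left side to first order in t, the right side to second order in x_o:
P + \frac{\partial P}{\partial t} &\approx P
    + \frac{T_+ - T_-}{N}\, \frac{\partial P}{\partial x_o}
    + \frac{T_+ + T_-}{2N^2}\, \frac{\partial^2 P}{\partial x_o^2} \\
% Identify <\Delta x> = (T_+ - T_-)/N and <(\Delta x)^2> = (T_+ + T_-)/N^2:
\frac{\partial P}{\partial t} &\approx \langle \Delta x \rangle\, \frac{\partial P}{\partial x_o}
    + \frac{1}{2}\, \langle (\Delta x)^2 \rangle\, \frac{\partial^2 P}{\partial x_o^2}
\end{align*}
```

The zeroth-order terms cancel because the three probabilities sum to 1. Without the second-order term in x, the diffusion term would be lost entirely.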

sidebar on the generality of forward/backward diffusion equations
For a while now, we’ve been operating on the assumption that we’re studying a 1-step process: large jumps (e.g. x → 2/x) were forbidden, and T was tri-diagonal.
Some stochastic processes aren’t 1-step. For example, a very popular alternative to Moran’s model is the “Wright-Fisher” model (homework problem).
Do our diffusion equations apply to non-one-step processes? YES! … as long as large jumps (e.g. x → 2/x) are much less likely than small jumps. But we won’t bother re-deriving everything for those models. The spirit of those derivations is the same as ours anyway…

sidebar: A “heads up” on an error (?) in most textbooks
The factor <(Δx)²> appears in both the forward and backward equations. This is related to the variance in Δx (during 1 time unit):
Var[Δx] = <(Δx)²> - <Δx>²
(prove this for “fun”?)
So another way to write the forward equation (for example) is:
∂P/∂t = -∂/∂x[<Δx> P] + (1/2) ∂²/∂x²[(Var[Δx] + <Δx>²) P]
But what you see in all the textbooks is
∂P/∂t = -∂/∂x[<Δx> P] + (1/2) ∂²/∂x²[Var[Δx] P]
So it looks to me like the <Δx>² term got erroneously discarded. To be fair, we’ll see that this term is usually much smaller than the <(Δx)²> term, so dropping it is usually an OK approximation. But it is an approximation… just a heads up.
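For anyone who wants the “fun” proof, one short route is the sketch below. It only uses the definition of variance and the fact that <Δx> is a constant once x is fixed.

```latex
% Variance of the one-step change, expanded from its definition.
\begin{align*}
\mathrm{Var}[\Delta x]
  &= \bigl\langle (\Delta x - \langle \Delta x \rangle)^2 \bigr\rangle \\
  &= \langle (\Delta x)^2 \rangle - 2 \langle \Delta x \rangle \langle \Delta x \rangle + \langle \Delta x \rangle^2 \\
  &= \langle (\Delta x)^2 \rangle - \langle \Delta x \rangle^2
\end{align*}
```

So <(Δx)²> and Var[Δx] coincide exactly when <Δx> = 0, which is the neutral case.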

Orientation: where are we and where are we going?
Plan of attack:
1. develop matrix formalism
2. use matrix formalism to derive forward / backward diffusion equations
3. use forward/backward equations to solve key problems
We started by writing P(x|xo, t) as a matrix. We showed that the matrix T turns P(x|xo, t) into P(x|xo, t+1).
When we multiplied by T on the left, P(x|xo, t+1) = T P(x|xo, t), we got the forward recursion equation.
When we multiplied by T on the right, P(x|xo, t+1) = P(x|xo, t) T, we got the backward recursion equation.
Next, we turned the forward/backward recursion equations into forward/backward diffusion equations. This was good because (a) diffusion equations can be easily interpreted (second derivative term represents diffusion, first derivative term represents translation), and (b) diffusion equations can often be solved analytically (i.e. w/out a computer).
And this is where we’re going: next we’ll solve some seminal problems in population genetics.
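The matrix statements above can be checked numerically with a minimal sketch. It assumes the neutral Moran step probabilities x(1-x) for both up and down moves, and a convention where rows index the final copy number and columns index the starting one. The function names and N = 6 are illustrative choices.

```python
def moran_transition_matrix(N):
    """T[i][j] = probability of going from j copies to i copies in one timestep."""
    T = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N + 1):
        x = j / N
        step = x * (1 - x)        # probability of +1 copy; same for -1 copy (neutral)
        if j < N:
            T[j + 1][j] = step
        if j > 0:
            T[j - 1][j] = step
        T[j][j] = 1 - 2 * step    # probability of no change
    return T

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N = 6
T = moran_transition_matrix(N)

# P after 5 timesteps: start from the identity (x = xo with certainty at t = 0).
P = [[1.0 if i == j else 0.0 for j in range(N + 1)] for i in range(N + 1)]
for _ in range(5):
    P = matmul(T, P)

forward = matmul(T, P)    # T on the left: forward recursion
backward = matmul(P, T)   # T on the right: backward recursion

max_diff = max(abs(forward[i][j] - backward[i][j])
               for i in range(N + 1) for j in range(N + 1))
column_sums = [sum(forward[i][j] for i in range(N + 1)) for j in range(N + 1)]
row_sums = [sum(forward[i][j] for j in range(N + 1)) for i in range(N + 1)]

print("max |TP - PT| =", max_diff)
print("column sums:", column_sums)   # each column is a probability distribution over x
print("row sums:   ", row_sums)      # rows need not sum to 1
```

Both products give the same P(x|xo, t+1) because P at time t is itself a power of T.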

Forward diffusion equation for Moran model (with no mutation and no selection). Part 1
How can we tailor the (general) diffusion equations to a specific example (e.g. the neutral, no-mutation Moran model)? All we need to do is compute <Δx> and <(Δx)²>.
First <Δx>: Δx can only take two nonzero values, +1/N and -1/N. Both options have equal probability, x(1-x) (b/c this is the neutral case). Thus, <Δx> = x(1-x)(1/N) + x(1-x)(-1/N) = 0.
How about <(Δx)²>? Whether x increases or decreases, (Δx)² will equal (1/N)². So, <(Δx)²> = x(1-x)(1/N)² + x(1-x)(1/N)² = 2x(1-x)(1/N)².
We’ve done all the work we need to: now just plug in for <Δx> and <(Δx)²>:
∂P/∂t = (1/N²) ∂²/∂x²[x(1-x) P]   (t in timesteps)
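The two averages can be verified with exact arithmetic. This is a small sketch: the function name and the choice N = 10 are illustrative, and the up/down probabilities x(1-x) are the ones stated on the slide.

```python
from fractions import Fraction

def moran_step_moments(i, N):
    """Exact <Δx> and <(Δx)²> for one timestep, starting from i copies out of N."""
    x = Fraction(i, N)
    p_up = x * (1 - x)      # probability that x increases by 1/N
    p_down = x * (1 - x)    # probability that x decreases by 1/N (neutral case)
    dx = Fraction(1, N)
    mean_dx = p_up * dx + p_down * (-dx)
    mean_dx2 = p_up * dx ** 2 + p_down * (-dx) ** 2
    return mean_dx, mean_dx2

N = 10
moments = [moran_step_moments(i, N) for i in range(N + 1)]

mean_dx, mean_dx2 = moments[3]   # x = 3/10
print(mean_dx, mean_dx2)         # prints: 0 21/5000
```

With x = 3/10 and N = 10, the formula 2x(1-x)/N² gives 2 · (21/100) / 100 = 21/5000, matching the printed value.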

Forward diffusion equation for Moran model (with no mutation and no selection). Part 2
We need just 1 more small tweak before our forward equation is in its usual form. The left-hand side (LHS) gives the rate of change of P with respect to time. But in which time units (think: timesteps or generations)?
On the RHS, we computed <Δx> and <(Δx)²> during 1 timestep, so the LHS is also in units of timesteps. But most people work in units of generations (an experimentalist doesn’t need or care to know what a “timestep” is).
To convert to generations, note that N timesteps = 1 generation. So N times more action happens during 1 generation than during 1 timestep: ∂P/∂t is N times larger when measured in generations. (You can formalize this argument by defining a new variable τ = t/N, then taking the derivative with respect to τ.)
So, drumroll… Moran model, no selection or mutation:
∂P/∂t = (1/N) ∂²/∂x²[x(1-x) P]   (t in generations)
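The change of units can be written out with the chain rule. This sketch assumes τ denotes time in generations and t time in timesteps, and uses the neutral Moran values <Δx> = 0 and <(Δx)²> = 2x(1-x)/N².

```latex
% Assumption: \tau is time in generations, t is time in timesteps, so \tau = t / N.
\begin{align*}
\frac{\partial P}{\partial \tau}
  = \frac{\partial P}{\partial t}\,\frac{\partial t}{\partial \tau}
  = N\,\frac{\partial P}{\partial t}
  = N \cdot \frac{1}{N^2}\,\frac{\partial^2}{\partial x^2}\bigl[x(1-x)\,P\bigr]
  = \frac{1}{N}\,\frac{\partial^2}{\partial x^2}\bigl[x(1-x)\,P\bigr]
\end{align*}
```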

** Forward diffusion equation for Moran model (with no mutation and no selection). Part 3 **
Last slide we got this equation:
∂P/∂t = (1/N) ∂²/∂x²[x(1-x) P]   (t in generations)
So what?
There is a second derivative term (representing diffusion), but there’s no first derivative term (representing translation). So: allele frequency takes an (unbiased) random walk. This random walk is called GENETIC DRIFT.
The strength of the diffusion term (random walk) is proportional to 1/N. So when N is small, diffusion is strong. In other words, the strength of genetic drift scales as 1/N.
When x→0 or x→1, the strength of diffusion → 0: the random walk “freezes” at x = 0, 1.
Motoo Kimura solved this equation analytically (i.e. w/out a computer). But the formulas are very complicated and don’t add much to our intuitive understanding. Nevertheless, we can learn a lot from numeric solutions (see next slide).

** Solutions to forward diffusion equation: a mathematical portrait of genetic drift **
Drift causes alleles to “segregate”.
[Figure: numerical solutions of the forward equation, with xo = 0.5]
P(x) gets broader through time: we lose certainty in x.
Probability “piles up” on the edges: x eventually “chooses a side”, i.e. segregates.
Note that P(x) can exceed 1. That’s OK: P(x) is a probability density. The probability that x is between x and x+dx is P(x)dx ≈ 0, and the probability that x is between xlow and xhigh is the integral of P(x) dx from xlow to xhigh.
This solution is like the columns of the P(x|xo, t) matrix: it integrates to 1.
“Clicker” questions: As time progresses, describe how the following change:
1. The average allele frequency (<x>) (not <Δx>!)
2. The variance in allele frequency (Var[x]) (not Var[Δx]!)
3. The genetic diversity within the population (can you remember how we measure this?)
What does this look like at t = ∞?
Answer to 3: the measure is the “average heterozygosity” = integral of 2x(1-x) P(x) dx.
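One way to check your clicker answers numerically is to iterate the discrete neutral Moran chain itself rather than the diffusion equation. The sketch below does that; N = 20, the 40-generation horizon, and the function names are illustrative assumptions. It tracks <x>, Var[x], and the average heterozygosity once per generation.

```python
def forward_step(P, N):
    """One timestep of the neutral Moran chain; P[i] = probability of i copies."""
    new = [0.0] * (N + 1)
    for j, pj in enumerate(P):
        x = j / N
        step = x * (1 - x)               # probability of +1 copy; same for -1 copy
        new[j] += pj * (1 - 2 * step)    # no change
        if j < N:
            new[j + 1] += pj * step
        if j > 0:
            new[j - 1] += pj * step
    return new

def summarize(P, N):
    """Return (<x>, Var[x], average heterozygosity <2x(1-x)>)."""
    xs = [i / N for i in range(N + 1)]
    mean = sum(p * x for p, x in zip(P, xs))
    var = sum(p * (x - mean) ** 2 for p, x in zip(P, xs))
    het = sum(p * 2 * x * (1 - x) for p, x in zip(P, xs))
    return mean, var, het

N = 20
P = [0.0] * (N + 1)
P[N // 2] = 1.0                          # start at xo = 0.5 with certainty

history = [summarize(P, N)]
for generation in range(1, 41):
    for _ in range(N):                   # N timesteps = 1 generation
        P = forward_step(P, N)
    history.append(summarize(P, N))

for generation in (0, 10, 20, 40):
    mean, var, het = history[generation]
    print(generation, round(mean, 4), round(var, 4), round(het, 4))
print("probability at the edges:", P[0] + P[N])
```

For this discrete chain, the heterozygosity shrinks by a factor of exactly (1 - 2/N²) each timestep, which gives a convenient check on the output.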

Backward diffusion equation for Moran model (with no mutation and no selection).
A few slides ago we derived the forward diffusion equation. How about the backward one? Easy! Since we already figured out <Δx> = 0 and <(Δx)²> = 2xo(1-xo)/N², this is a breeze:
∂P/∂t = (xo(1-xo)/N²) ∂²P/∂xo²   (t in timesteps)
Compared to the forward equation, the x’s now carry the subscript “o”, and xo(1-xo) is outside the derivative.
Once again, generations are better units than timesteps. As before, to change to units of generations, just multiply the RHS by N:
∂P/∂t = (xo(1-xo)/N) ∂²P/∂xo²   (t in generations)
Pro-Tip: if you see an N² in your equation, you probably forgot to change units to generations. I don’t mind if you just memorize the rule “to convert from timesteps to generations, multiply every term on the RHS by N”.

Backward diffusion equation for Moran model (with no mutation and no selection).
[Figure: numerical solutions of the backward equation, with x = 1]
As with the forward equation, analytic solutions are possible but ugly. We won’t discuss them.
Left: numerical solutions. These are NOT probability densities (they don’t integrate to 1: they are rows of the P matrix). Rather, two interpretations:
#1: P(x=1|xo, t): fixation probability at time t, given that we started at frequency xo.
#2: retrospective: likelihood that x = xo at t = 0, given that x = 1 at time t.
Some observations:
1. Pfix increases with xo, and with t.
2. For small t (blue), fixation ≈ impossible.
3. Pfix(xo, t→∞) approaches a linear function. Intuitive?

Eventual fixation probability: t→∞ solution to backward diffusion equation (w/ no selection or mutation)
In the last slide, it looked like P(x=1|xo, t) ≡ Pfix(xo, t) was tending toward a linear function as t→∞. We prove this below…
The key is to note that ∂P/∂t → 0 as t→∞. This is exactly the same argument we made when analyzing the continuous-time branching process (back then, we said Pextinct(t=∞) = Pextinct(t=∞+dt)). So the backward equation becomes:
0 = (xo(1-xo)/N) ∂²Pfix/∂xo²
If xo ≠ 0 and xo ≠ 1, we can divide both sides by xo(1-xo)/N, resulting in
∂²Pfix/∂xo² = 0
This is a super-easy differential equation to solve. It’s best done “by inspection.” The eq. says that the second derivative = 0. And the second derivative is like curvature. So Pfix(xo, t→∞) must be a line: “Pfix = m xo + b”. (I got tired of writing t→∞.)
But we also know the “boundary conditions”: Pfix(xo=0) = b = 0 and Pfix(xo=1) = m + b = m = 1.
So, Pfix(xo, t→∞) = xo
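The result Pfix = xo can be cross-checked without the diffusion approximation by iterating the backward recursion of the discrete neutral Moran chain. The sketch below assumes up/down probabilities xo(1-xo) per timestep. The names, N = 10, and the two time horizons are illustrative choices.

```python
def backward_step(u, N):
    """One timestep of the backward recursion.

    u[i] = probability of being fixed (all N copies) by time t, starting from i copies.
    """
    new = u[:]                        # u[0] = 0 and u[N] = 1 never change (absorbing states)
    for i in range(1, N):
        x0 = i / N
        step = x0 * (1 - x0)          # probability the first step is +1 copy; same for -1
        new[i] = step * u[i + 1] + step * u[i - 1] + (1 - 2 * step) * u[i]
    return new

def fixation_probability(N, n_steps):
    u = [0.0] * N + [1.0]             # at t = 0, only xo = 1 is already fixed
    for _ in range(n_steps):
        u = backward_step(u, N)
    return u

N = 10
early = fixation_probability(N, n_steps=N)         # after 1 generation
late = fixation_probability(N, n_steps=300 * N)    # after 300 generations

for i in range(N + 1):
    print(i / N, early[i], late[i])
```

After one generation, fixation from a single copy is still nearly impossible, while at long times the values settle onto the line Pfix = xo.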

Interpretation and consequence of Pfix=xo
30 second clicker question: what is the most interesting value of xo? Answer: xo = 1/N (b/c new alleles originate as a single copy). In this case, Pfix = 1/N.
Interpretation: After infinite time, one of the N lineages will have taken over the population, and the other N-1 lineages will have gone extinct. In the neutral case, all N lineages are equivalent (in terms of fitness). So, by symmetry, the probability that a new allele is the lucky one (that takes over the population) = 1/N.
Consequence: few minute clicker question: Suppose that every time a genome is duplicated, Uneut neutral mutations are introduced, on average.
1. Each generation, how many mutations are introduced in the whole population? (assume we’re dealing with unicellulars, like bacteria)
2. What fraction of the new, neutral mutations will eventually achieve fixation?
3. Multiply your answers to obtain the substitution rate of neutral mutations.
The exercise you just did has played a huge role in population genetics. It says that the tempo of neutral evolutionary change is independent of everything except Uneut. So, Uneut sets the pace of a “molecular clock” that can date evolutionary divergence.
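The arithmetic of the exercise can be sketched in a few lines. The assumption here is that a generation involves N genome duplications (N timesteps = 1 generation, one birth per timestep). The function name and the value of Uneut are illustrative.

```python
from fractions import Fraction

def neutral_substitution_rate(N, U_neut):
    """Substitutions per generation = (new mutations per generation) x (fixation probability)."""
    new_mutations_per_generation = N * U_neut   # assumes N genome duplications per generation
    fixation_probability = Fraction(1, N)       # Pfix = xo = 1/N for a single new copy
    return new_mutations_per_generation * fixation_probability

U_neut = Fraction(1, 100)                       # illustrative value, not from the slides
rates = {N: neutral_substitution_rate(N, U_neut) for N in (10, 1000, 10 ** 6)}
for N, rate in rates.items():
    print(N, rate)
```

The factor of N in the mutation supply cancels the 1/N in the fixation probability, which is why population size drops out.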
