Presentation on theme: "Monte Carlo Methods and Statistical Physics Mathematical Biology Lecture 4 James A. Glazier (Partially Based on Koonin and Meredith, Computational Physics,"— Presentation transcript:
Monte Carlo Methods and Statistical Physics Mathematical Biology Lecture 4 James A. Glazier (Partially Based on Koonin and Meredith, Computational Physics, Chapter 8)
Monte Carlo Methods Use Statistical Physics Techniques to Solve Problems that are Difficult or Inconvenient to Solve Deterministically. Two Basic Applications: –Evaluation of Complex Multidimensional Integrals (e.g. in Statistical Mechanics) [1950s] –Optimization of Problems where the Deterministic Problem is Algorithmically Hard (NP Complete—e.g. the Traveling Salesman Problem) [1970s]. Both Applications Important in Biology.
Example: Thermodynamic Partition Function For a Gas of N Atoms at Temperature 1/ , Interacting Pairwise through a Potential V(r), the Partition Function: Suppose We need to Evaluate Z Numerically with 10 Steps/Integration. Then Have 10 3N Exponentials to Evaluate. The Current Fastest Computer is About Operations/Second. One Year ~ 3 x 10 7 Seconds. So One Year ~ 3 x Operations. In One Year Could Evaluate Z for about 7 atoms! This Result is Pretty Hopeless. There Must Be a Better Way.
Normal Deterministic Integration Consider the Integral: Subdivide [0,1] into N Evenly Spaced Intervals of width x=1/N. Then:
2) Convergence is Slow: while for Normal Deterministic Integration: Error Estimate—Continued However, Comparison Isn’t Fair. Suppose You Fix the Number of Subdomains in the Integral to be N. In d Dimensions Each Deterministic Sub-Integral has N 1/d Intervals. So the Net Error is So, if d >4 the Monte Carlo Method is Better!
For a constant Function, the Error is 0 for Both Deterministic and Monte Carlo Integration. Two Rather Strange Consequences: 1)In Normal Integration, Error is 0 for Straight Lines. In Monte Carlo Integration, Errors Differ for Straight Lines Depending on Slope (Worse for Steeper Lines). If Error Estimate How Good is the Estimate?
Monte Carlo Integration Use the Idea of the Integral as an Average: Before We Solved by Subdividing [0,1] into Evenly Spaced Intervals, but could Equally Well Pick Positions Where We Evaluate f(x) Randomly: Chosen to be Uniform Random. So: Approximates I Note: Need a Good Random Number Generator for this Method to Work. See (Vetterling, Press…, Numerical Recipies)
Pathology Like Normal Integration, Monte Carlo Integration Can Have Problems. Suppose You have N Delta Functions Scattered over the Unit Interval. However, the Probability of Hitting a Delta Function is 0, so I N =0. For Sharply-Peaked Functions, the Random Sample is a Bad Estimate (Standard Numerical Integration doesn’t Work Well Either)
Can Improve Estimates by Picking the ‘Random’ Points Intelligently, to Have More Points Where f(x) is Large and Few Where f(x) is Small. Let w(x) be a Weight Function Such That: For Deterministic Integration, the Weight Function has No Effect: Let: Then: Alternatively, Pick: So All We have to do is Pick x According to the Distribution w(x) and Divide f(x) by that Distribution: Weight Functions
If Why Not Just Let w(x)= f(x) ? Then Need to Solve the Integral to Invert y(x) to Obtain x(y) or to Pick x According to w(x). But Stripping Linear Drift is Easy and Always Helps. In d dimensions have: So: Which is Hard to Invert, so Need to Pick Directly (Though, Again Can Strip Drift). Weight Functions—Continued
Let And Then When You Can’t Invert y(x) Refer to Large Literature on How to Generate With the Needed Distribution. Example
Metropolis Algorithm Originally a Way to Derive Statistics for Canonical Ensemble in Statistical Mechanics. A Way to Pick the According to the Weight Function in a very high dimensional space. Idea: Pick any x 0 and do a Random Walk: Subject to Constraints Such that the Probability of a Walker at has: Problems: 1) Convergence can be Very Slow. 2) Result can be Wrong. 3) Variance Not Known.
For any State Generate a New Trial State. Usually (Not Necessary) Assume that is Not Too Far From. I.e. that it lies within a ball of Radius >0 of : Let: If r ≥1 then Accept the Trial: If r <1 then Accept the Trial with Probability r. I.e. Pick a Random Number [0,1]. –If < r then Accept: –Otherwise Reject: Repeat. Algorithm
Problems May Not Sample Entire Space. –If Too Small Explore only Small Region Around. –If Too Big Probability of Acceptance Near 0. Inefficient. –If Regions of High w Linked by Regions of Very Low w Never See Other Regions. –If w Sharply Peaked Tend to Get Stuck Near Maximum. Sequence of Not Statistically Independent, so Cannot Estimate Error. Fixes: –Use Multiple Replicas. Many Different, Which Together Sample the Whole Space. –Pick So that the Acceptance Probability is ~ ½ (Optimal). –Run Many Steps Before Starting to Sample.
Convergence Theorem Theorem: Proof: Consider Many Independent Walkers Starting from Every Possible Point and Let Them Run For a Long Time. Let be the Density of Walkers at Point. Consider Two Points and. Let be the Probability for a Single Walker to Jump from to. Rate of Jumps from to is: So the Net Change in is:
At Equilibrium: So if And if So Always Tends to its Equilibrium Value Monotonically and the Rate of Convergence is Linear in the Deviation: Implies that System is Perfectly Damped. This Result is the Fundamental Justification for Using the Metropolis Algorithm to Calculate Nonequilibrium and Kinetic Phenomena. Convergence Theorem—Continued
Need to Evaluate If is Allowed, So is. (Detailed Balance). I.e. So While if So Always. Normalizing by the Total Number of Walkers □ Note that This Result is Independent of How You Choose Given, As Long as Your Algorithm Has Nonzero Transition Probabilities for All Initial Conditions and Obeys Detailed Balance (Second Line Above). Convergence Theorem—Conclusion