Computational Methods in Physics PHYS 3437 Dr Rob Thacker Dept of Astronomy & Physics (MM-301C)

Today's Lecture Introduction to Monte Carlo methods: background; integration techniques.

Introduction Monte Carlo refers to the use of random numbers to model random events, which may in turn model a mathematical or physical problem. Typically, MC methods require many millions of random numbers. Of course, computers cannot actually generate truly random numbers; however, we can make the period of repetition absolutely enormous. Such pseudo-random number generators are based on arithmetic that truncates numbers to a limited set of significant digits. See Numerical Recipes (2nd edition, FORTRAN).
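As a minimal sketch of the idea (not the specific generators recommended in Numerical Recipes), a linear congruential generator produces each new value by integer arithmetic truncated modulo m, so the sequence is deterministic and eventually repeats. The constants below are one common illustrative choice, not a recommendation.

```python
# Minimal linear congruential generator (LCG) sketch -- illustrative only.
class LCG:
    """x_{n+1} = (a*x_n + c) mod m, returned as a float in [0,1)."""

    def __init__(self, seed=12345,
                 a=1664525, c=1013904223, m=2**32):  # one common parameter choice
        self.state = seed % m
        self.a, self.c, self.m = a, c, m

    def next(self):
        # Keeping only the result mod m is the truncation that makes the
        # sequence deterministic but apparently random.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

rng = LCG(seed=2025)
print([round(rng.next(), 4) for _ in range(5)])
# The period here is at most m = 2**32; modern generators (e.g. the
# Mersenne Twister) make the period of repetition astronomically larger.
```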

"Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." – John von Neumann

History of numerical Monte Carlo methods Another contribution to numerical methods related to research at Los Alamos. Late 1940s: scientists wanted to follow the paths of neutrons through various sub-atomic collision events. Ulam & von Neumann suggested using random sampling to estimate this process; 100 events could be calculated in 5 hours on ENIAC. The method was given the name Monte Carlo by Nicholas Metropolis. An explosion of inappropriate use in the 1950s gave the technique a bad name; subsequent research illuminated when the method was appropriate.

Terminology Random deviate – a number chosen uniformly at random on [0,1]. Normal deviate – a number chosen randomly on (-∞,∞), weighted by a Gaussian.
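For illustration, assuming NumPy is available, both kinds of deviate can be drawn as follows (the seed and sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

uniform_deviates = rng.uniform(0.0, 1.0, size=5)   # random deviates on [0,1)
normal_deviates  = rng.normal(0.0, 1.0, size=5)    # normal deviates, Gaussian weighted

print(uniform_deviates)
print(normal_deviates)
```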

Background to MC integration Suppose we have a definite integral I = ∫_a^b f(x) dx. Given a good set of N sample points {x_i} we can estimate the integral as I ≈ (b-a)/N · Σ_i f(x_i): each sample point yields an element of the integral of width (b-a)/N and height f(x_i). [Figure: sample points, e.g. x_3 and x_9, scattered between a and b under the curve f(x).]

What MC integration really does While the previous explanation is a reasonable interpretation of the way MC integration works, the most popular explanation is that the random samples estimate the average height of f(x) on [a,b]. [Figure: heights given by random samples of f(x) on [a,b], with their average marked.]

Mathematical Applications Let's formalize this just a little bit. Since, by the mean value theorem, ∫_a^b f(x) dx = (b-a)⟨f⟩, we can approximate the integral by calculating (b-a)⟨f⟩, and we can calculate ⟨f⟩ by averaging many values of f(x): ⟨f⟩ ≈ (1/N) Σ_{i=1}^{N} f(x_i), where x_i ∈ [a,b] and the values are chosen randomly.

Example Consider evaluating I = ∫_0^1 e^x dx. Let's take N=1000, then evaluate f(x)=e^x at 1000 random points x ∈ [0,1]. For this set of points define I_1 = (b-a)⟨f⟩_{N,1} = ⟨f⟩_{N,1}, since b-a=1. Next choose 1000 different x ∈ [0,1] and create a new estimate I_2 = ⟨f⟩_{N,2}. Next choose another 1000 different x ∈ [0,1] and create a new estimate I_3 = ⟨f⟩_{N,3}.
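A minimal sketch of this procedure, assuming NumPy; the random seed and the number of repeated estimates are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_estimate(n=1000, a=0.0, b=1.0):
    """One Monte Carlo estimate I_k = (b-a) * <f>_N for f(x) = exp(x)."""
    x = rng.uniform(a, b, size=n)
    return (b - a) * np.exp(x).mean()

estimates = [mc_estimate() for _ in range(3)]   # I_1, I_2, I_3 as above
print(estimates, "exact:", np.e - 1.0)
```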

Distribution of the estimates We can carry on doing this, say 10,000 times, at which point we'll have 10,000 values estimating the integral, and the distribution of these values will be a normal distribution. The distribution of all of the I_N integrals constrains the errors we would expect on a single I_N estimate. This is the Central Limit Theorem: for any given I_N estimate, the sum of the random variables within it will converge toward a normal distribution. Specifically, the standard deviation σ_N will be the estimate of the error in a single I_N estimate. The mean, x_0, will approach e − 1. [Figure: Gaussian of the estimates centred on x_0, with x_0 ± σ_N marked.]

Calculating σ_N The formula for the standard deviation of N samples is σ = √(⟨f²⟩ − ⟨f⟩²), where ⟨f^k⟩ = (1/N) Σ_i f(x_i)^k. If there is no deviation in the data then the RHS is zero. Given some deviation, as N→∞ the RHS settles to some constant value > 0 (in this case ≈ 0.49). Thus we can write σ_N ≈ σ/√N for the error on a single I_N estimate. A rough measure of how good a random number generator is: how well does a histogram of the 10,000 estimates fit a Gaussian?
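A quick numerical check of this claim, assuming NumPy: generate many independent I_N estimates and compare their scatter with σ/√N (for f(x) = e^x on [0,1] the exact σ is available in closed form):

```python
import numpy as np

rng = np.random.default_rng(7)
N, repeats = 1000, 10000

# 10,000 independent estimates of I = int_0^1 exp(x) dx, each built from N samples
f_values  = np.exp(rng.uniform(0.0, 1.0, size=(repeats, N)))
estimates = f_values.mean(axis=1)                     # each row -> one I_N

sigma_f = np.sqrt((np.e**2 - 1) / 2 - (np.e - 1)**2)  # exact std of e^x on [0,1], ~0.49
print("predicted sigma_N =", sigma_f / np.sqrt(N))    # sigma / sqrt(N)
print("measured scatter  =", estimates.std())         # should agree closely
print("mean of estimates =", estimates.mean(), "exact:", np.e - 1)
```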

[Figure: mc.ps – histogram of the I_N estimates.] 1000 samples per I integration; the standard deviation is 0.491/√1000. Increasing the number of integral estimates makes the distribution closer and closer to the infinite limit.

Resulting statistics For data that fits a Gaussian, the theory of probability distribution functions asserts that: 68.3% of the data will fall within ±σ_N of the mean; 95.4% of the data (19/20) will fall within ±2σ_N of the mean; 99.7% of the data will fall within ±3σ_N, etc. Interpretation of poll data: "These results will be accurate to ±4%, 19 times out of 20." The ±4% corresponds to ±2σ_N. Since σ_N ∝ 1/√N, this highlights one of the difficulties with random sampling: to improve the result by a factor of 2 we must increase N by a factor of 4!

Why would we use this method to evaluate integrals? For 1D it doesn't make a lot of sense: taking h ~ 1/N, the composite trapezoid rule error ~ h² ~ N^(-2), so doubling N gives a result 4 times better. In 2D, we can use an extension of the trapezoid rule to squares: taking h ~ 1/N^(1/2), the error ~ h² ~ N^(-1). In 3D we get h ~ 1/N^(1/3) and error ~ h² ~ N^(-2/3). In 4D we get h ~ 1/N^(1/4) and error ~ h² ~ N^(-1/2).

MC beneficial for d ≥ 4 Monte Carlo methods always have σ_N ~ N^(-1/2) regardless of the dimension. Comparing to the 4D convergence behaviour, we see that MC integration becomes practical at this point; it wouldn't make any sense for 3D though. For anything higher than 4D (e.g. 6D, 9D, which are possible!) MC methods tend to be the only way of doing these calculations. MC methods also have the useful property of being comparatively immune to singularities, provided that the random generator doesn't hit the singularity and the integral does indeed exist!

Importance sampling In reality many integrals have functions that vary rapidly in one part of the number line and more slowly in others. To capture this behaviour with MC methods requires that we introduce some way of putting our points where we need them the most. We really want to introduce a new function into the problem, one that allows us to put the samples in the right places.

General outline Suppose we have two similar functions g(x) & f(x), and g(x) is easy to integrate. Then we can write I = ∫_a^b f(x) dx = ∫_a^b [f(x)/g(x)] g(x) dx, and with the change of variable y(x) = ∫_a^x g(x') dx' (so dy = g(x) dx) this becomes I = ∫_y(a)^y(b) [f(x(y))/g(x(y))] dy.

General outline cont. The integral we have derived has some nice properties: because g(x) ~ f(x) (i.e. g(x) is a reasonable approximation of f(x) that is easy to integrate), the integrand f/g should be approximately 1, and so the integrand shouldn't vary much! It should therefore be possible to calculate a good approximation with a fairly small number of samples. Thus by applying the change of variables and mapping our sample points we get a better answer with fewer samples.

Example Let's look at integrating f(x)=e^x again on [0,1]. The MC random samples are 0.23, 0.69, 0.51, 0.93. Our integral estimate is then I ≈ (1/4)(e^0.23 + e^0.69 + e^0.51 + e^0.93) ≈ 1.86, compared with the exact value e − 1 ≈ 1.718.

Apply importance sampling We first need to decide on our g(x) function; as a guess let us take g(x) = 1+x. Well, it isn't really a guess – we know this is the first two terms of the Taylor expansion of e^x! y(x) is thus given by y(x) = ∫_0^x (1+x') dx' = x + x²/2. For the end points we get y(0)=0, y(1)=3/2. Rearranging y(x) to give x(y): x(y) = -1 + √(1+2y).

Set up integral & evaluate samples The integral to evaluate is now I = ∫_0^(3/2) e^(x(y)) / (1 + x(y)) dy, with x(y) = -1 + √(1+2y). We must now choose y values on the interval [0,3/2]. [Table: sampled y values and the corresponding integrand values e^x/(1+x), all close to 1 because g(x) ~ f(x).]

Evaluate For the new integral the estimate comes out much closer to the true value of e − 1. Clearly this technique of ensuring the integrand doesn't vary too much is extremely powerful. Importance sampling is particularly important in multidimensional integrals and can add 1 or 2 significant figures of accuracy for a minimal amount of effort.
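A sketch comparing plain MC with the importance-sampled version for this example, assuming NumPy; the sample size and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1000

# Plain Monte Carlo estimate of I = int_0^1 exp(x) dx
x = rng.uniform(0.0, 1.0, size=N)
plain = np.exp(x).mean()

# Importance sampling with g(x) = 1 + x:
#   y(x) = x + x^2/2 maps [0,1] -> [0,3/2],  x(y) = -1 + sqrt(1 + 2y)
y  = rng.uniform(0.0, 1.5, size=N)
xy = -1.0 + np.sqrt(1.0 + 2.0 * y)
weighted = 1.5 * (np.exp(xy) / (1.0 + xy)).mean()   # (length of y-interval) * <f/g>

exact = np.e - 1.0
print(f"plain MC:            {plain:.5f}  (error {abs(plain - exact):.5f})")
print(f"importance sampling: {weighted:.5f}  (error {abs(weighted - exact):.5f})")
```

Because the integrand e^x/(1+x) only varies between 1 and e/2 ≈ 1.36 on this interval, its variance is far smaller than that of e^x, which is where the improvement comes from.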

Rejection technique Thus far we've looked in detail at the effect of changing sample points on the overall estimate of the integral. An alternative approach may be necessary when you cannot easily sample the desired region, which we'll call W. This is particularly important in multi-dimensional integrals where you can calculate the integral for a simple boundary but not a complex one. We define a larger region V that includes W (note you must also be able to calculate the size of V easily). The sample function is then redefined to be zero outside the desired region, but have its normal value within it.

Rejection technique diagram [Figure: region W that we want to calculate, bounded by f(x), enclosed in a larger region V.] Area of W = area (integral) of region V multiplied by the fraction of points falling below f(x) within V. Algorithm: just count the total number of points generated & the number falling in W!
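A minimal sketch of the rejection algorithm for the running example f(x) = e^x on [0,1], assuming NumPy; the enclosing rectangle V chosen here is one convenient choice, not the only one:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 100_000

# Enclosing region V: the rectangle [0,1] x [0,e] (its area is easy to compute).
# Target region W: the area under f(x) = e^x on [0,1].
x = rng.uniform(0.0, 1.0, size=N)
y = rng.uniform(0.0, np.e, size=N)

in_W   = y < np.exp(x)              # points falling below the curve
area_V = 1.0 * np.e
area_W = area_V * in_W.mean()       # area(V) * fraction of points inside W

print("rejection estimate:", area_W, "exact:", np.e - 1.0)
```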

Better selection of points: sub-random sequences Choosing N points using a uniform deviate produces an error that converges as N^(-0.5). If we could choose points better we could make convergence faster. For example, using a Cartesian grid of points leads to a method that converges as N^(-1): think of Cartesian points as avoiding one another and thus sampling a given region more completely. However, we don't know a priori how fine the grid should be, and we want to avoid short-range correlations – points shouldn't be too close to one another. A better solution is to choose points that attempt to maximally avoid one another.

A list of sub-random sequences Tore-SQRT sequences; Van der Corput & Halton sequences; Faure sequence; generalized Faure sequence; nets & (t,s)-sequences; Sobol sequence; Niederreiter sequence. Many to choose from! We'll look very briefly at Halton & Sobol sequences, both of which are covered in detail in Numerical Recipes.

Halton's sequence Suppose in 1D we obtain the jth number in the sequence, denoted H_j, via: (1) write j as a number in base b, where b is prime, e.g. 17 in base 3 is 122; (2) reverse the digits and place a radix point in front, e.g. 0.221 in base 3. It should be clear why this works: adding an additional digit makes the mesh of numbers progressively finer. For a sequence of points in n dimensions (x_i^1, …, x_i^n) we would typically use the first n primes to generate separate sequences for each of the x_i^j components.
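A short sketch of the digit-reversal construction in plain Python; halton() is just an illustrative helper written for this example, not a library routine:

```python
def halton(j, base):
    """j-th element of the base-b Halton sequence via digit reversal."""
    h, f = 0.0, 1.0 / base
    while j > 0:
        j, digit = divmod(j, base)   # peel off base-b digits, least significant first
        h += digit * f               # ...and place them after the radix point
        f /= base
    return h

# 17 in base 3 is 122 -> reversed and read as 0.221 (base 3) = 25/27
print(halton(17, 3), 25 / 27)

# 2-D points: use one prime base per coordinate (here 2 and 3)
points_2d = [(halton(j, 2), halton(j, 3)) for j in range(1, 9)]
print(points_2d)
```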

2D Halton sequence example Pairs of points constructed from base 3 & 5 Halton sequences.

Sobol (1967) sequence A useful method, described in Numerical Recipes as providing close to an N^(-1) convergence rate. Algorithms are also available online. [Figure: from Numerical Recipes.]
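For practical use, a Sobol generator is available in SciPy's quasi-Monte Carlo module (scipy.stats.qmc, SciPy 1.7 or later); a brief sketch of using it for a 2D integral:

```python
import numpy as np
from scipy.stats import qmc   # requires SciPy >= 1.7

sampler = qmc.Sobol(d=2, scramble=False)   # 2-dimensional Sobol sequence
points  = sampler.random(n=1024)           # 1024 quasi-random points in [0,1)^2
                                           # (powers of 2 keep the sequence balanced)

# Quasi-Monte Carlo estimate of the double integral of exp(x+y) over the unit square
estimate = np.exp(points.sum(axis=1)).mean()
print(estimate, "exact:", (np.e - 1.0)**2)
```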

Summary MC methods are a useful way of numerically integrating systems that are not tractable by other methods. The key property of MC methods is the N^(-0.5) convergence rate. Numerical integration techniques can be greatly improved using importance sampling. If you cannot sample a region easily then the rejection technique can often be employed.

Next Lecture More on MC methods – simulating random walks.