Random Numbers and Simulation

Presentation transcript:

Random Numbers and Simulation
- Generating truly random numbers is not possible
- Programs have been developed to generate pseudo-random numbers
- Values are generated from deterministic algorithms
© Fall 2011 John Grego and the University of South Carolina

Random Numbers
- Good pseudo-random deviates can pass standard statistical tests for randomness
- They appear to be independent and identically distributed
- Random number generators for common distributions are available in R
- Special techniques (STAT 740) may be needed as well
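Because the values come from a deterministic algorithm, the same starting state reproduces the same sequence. The slides point to R for random number generation; the sketch below uses Python's standard `random` module instead, purely as an illustration. The function name `draw_uniforms` and the seed values are made up for this example.

```python
import random

def draw_uniforms(seed, n=5):
    """Return n pseudo-random U(0, 1) values from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator with a fixed starting state
    return [rng.random() for _ in range(n)]

first = draw_uniforms(seed=2011)
second = draw_uniforms(seed=2011)  # same seed, so the sequence is repeated exactly
other = draw_uniforms(seed=2012)   # a different seed gives a different sequence

print(first)
print(first == second)
print(first == other)
```

Fixing the seed is what makes a simulation study reproducible: rerunning the script gives the same draws and therefore the same estimates.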

Monte Carlo Simulation
- Some common uses of simulation:
  - Modeling stochastic behavior
  - Calculating definite integrals
  - Approximating the sampling distribution of a statistic (e.g., the maximum of a random sample)

Modeling Stochastic Behavior
- Buffon’s needle
- Random walk
- Observe \(X_1, X_2, \ldots\), where \(p = P(X_i = 1) = P(X_i = -1) = 0.5\), and study \(S_1, S_2, \ldots\), where \(S_n = \sum_{i=1}^{n} X_i\)
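A random walk of this kind can be simulated by drawing each step as +1 or -1 and accumulating the partial sums. The following is a minimal Python sketch, not the course's own code; `random_walk` and its parameters are names chosen for this illustration.

```python
import random

def random_walk(n_steps, p=0.5, rng=None):
    """Simulate S_1, ..., S_n where each step is +1 with probability p, else -1."""
    if rng is None:
        rng = random.Random()
    position = 0
    path = []
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1  # X_i
        position += step                      # S_i = S_{i-1} + X_i
        path.append(position)
    return path

path = random_walk(20, rng=random.Random(1))
print(path)
```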

Modeling Stochastic Behavior
- This is also called Gambler’s ruin; each \(X_i\) represents a $1 bet with a return of $2 for a win and $0 for a loss.

Gambler’s Ruin
- The properties of a fair game (p = 0.5) are a lot more interesting than the properties of an unfair game (p ≠ 0.5)
- Some properties of this process are easy to anticipate (e.g., \(E(S_n)\))

Gambler’s Ruin
- Some properties are difficult to anticipate, and simulation can help us study them:
  - Expected number of returns to 0
  - Expected length of a winning streak
  - Probability of going broke given an initial bank
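Two of these quantities can be estimated by repeating the game many times and averaging. The Python sketch below makes assumptions that are not in the slides: the walk is observed for a fixed 100 bets when counting returns to 0, and the ruin simulation stops when the gambler either goes broke or reaches a target fortune (here a bank of 5 and a target of 20). All function names are hypothetical.

```python
import random

def count_returns_to_zero(n_steps, p=0.5, rng=None):
    """Count how often the walk S_1, ..., S_n revisits 0."""
    if rng is None:
        rng = random.Random()
    position, returns = 0, 0
    for _ in range(n_steps):
        position += 1 if rng.random() < p else -1
        if position == 0:
            returns += 1
    return returns

def goes_broke(bank, target, p=0.5, rng=None):
    """Bet $1 at a time until the bank hits 0 or the target; True if it hits 0."""
    if rng is None:
        rng = random.Random()
    while 0 < bank < target:
        bank += 1 if rng.random() < p else -1
    return bank == 0

rng = random.Random(740)
n_sims = 2000

# Average number of returns to 0 over 100 bets (assumed horizon).
mean_returns = sum(count_returns_to_zero(100, rng=rng) for _ in range(n_sims)) / n_sims

# Proportion of games ending in ruin, starting with 5 and stopping at 20 (assumed).
ruin_prob = sum(goes_broke(5, 20, rng=rng) for _ in range(n_sims)) / n_sims

print(mean_returns, ruin_prob)
```

For a fair game with a stopping target, the ruin probability has the known closed form 1 - bank/target, which gives 0.75 here and offers a check on the simulation.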

Calculating Definite Integrals
- In statistics, we often have to calculate difficult definite integrals (posterior distributions, expected values):
\[ I = \int_a^b h(x)\,dx \]
(here, x could be multidimensional)

Calculating Definite Integrals
- Example 1
- Example 2

Hit-or-Miss Monte Carlo
- Example 1
- Determine c such that c ≥ h(x) across the entire region of interest (here, c = 4)

Hit-or-Miss Monte Carlo
- Generate n random uniform \((X_i, Y_i)\) pairs, with the \(X_i\) from U[a, b] (here, U[0, 1]) and the \(Y_i\) from U[0, c] (here, U[0, 4])
- Count the number of times (call this m) that \(Y_i\) is less than \(h(X_i)\)
- Then \(I_1 \approx c(b-a)m/n\)
- I.e., (height)(width)(proportion under curve)
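The steps above can be sketched in Python as follows. The integrand of Example 1 did not survive in this transcript, so the code uses a stand-in, h(x) = 4/(1 + x²) on [0, 1], chosen only because it is bounded by c = 4 and its integral is known to equal π. That choice is an assumption for illustration, not the slide's example.

```python
import math
import random

def hit_or_miss(h, a, b, c, n, rng=None):
    """Estimate the integral of h over [a, b], assuming 0 <= h(x) <= c there."""
    if rng is None:
        rng = random.Random()
    hits = 0  # m in the slide's notation
    for _ in range(n):
        x = rng.uniform(a, b)  # X_i ~ U[a, b]
        y = rng.uniform(0, c)  # Y_i ~ U[0, c]
        if y < h(x):
            hits += 1
    return c * (b - a) * hits / n  # (height)(width)(proportion under curve)

def h(x):
    """Stand-in integrand (hypothetical): integrates to pi on [0, 1]."""
    return 4.0 / (1.0 + x * x)

estimate = hit_or_miss(h, a=0.0, b=1.0, c=4.0, n=100_000, rng=random.Random(1))
print(estimate, math.pi)
```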

Classical Monte Carlo Integration
- Take n random uniform values \(U_1, \ldots, U_n\) over [a, b] and estimate I using
\[ \hat{I} = \frac{b-a}{n} \sum_{i=1}^{n} h(U_i) \]
- This method is straightforward, and it is also more efficient than Hit-or-Miss Monte Carlo
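A minimal Python sketch of the classical estimator, which averages h at uniform draws and scales by the interval width. As before, the integrand h(x) = 4/(1 + x²) on [0, 1] is a hypothetical stand-in (its integral is π), and `mc_integrate` is a name invented for this example.

```python
import math
import random

def mc_integrate(h, a, b, n, rng=None):
    """Estimate the integral of h over [a, b] as (b - a) times the mean of h(U_i)."""
    if rng is None:
        rng = random.Random()
    total = sum(h(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

def h(x):
    """Stand-in integrand (hypothetical): integrates to pi on [0, 1]."""
    return 4.0 / (1.0 + x * x)

estimate = mc_integrate(h, a=0.0, b=1.0, n=100_000, rng=random.Random(1))
print(estimate, math.pi)
```

For this particular stand-in integrand, the per-draw variance of the classical estimator works out to roughly 0.41, against roughly 2.70 for hit-or-miss with c = 4, which is one way to see the efficiency gain.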

Expected Value of a Function of a Random Variable
- Suppose X is a random variable with density f. Find E[h(X)] for some function h:
\[ E[h(X)] = \int h(x) f(x)\,dx \]

Expected Value of a Function of a Random Variable
- For n random values \(X_1, X_2, \ldots, X_n\) from the distribution of X (i.e., with density f),
\[ E[h(X)] \approx \frac{1}{n} \sum_{i=1}^{n} h(X_i) \]
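One way to connect this slide with Classical Monte Carlo Integration is to note that an integral over [a, b] is itself an expected value under the uniform density 1/(b - a). The short derivation below is a sketch of that link, with U denoting a U[a, b] random variable; it is added here for illustration and is not taken from the slides.

```latex
\begin{align*}
I &= \int_a^b h(x)\,dx
   = (b-a)\int_a^b h(x)\,\frac{1}{b-a}\,dx
   = (b-a)\,E[h(U)], \qquad U \sim U[a,b], \\
\hat{I} &= (b-a)\cdot\frac{1}{n}\sum_{i=1}^{n} h(U_i).
\end{align*}
```

Replacing the expectation by a sample average over uniform draws gives the estimate, so uniform-based integration is the special case f(x) = 1/(b - a) of the sample-mean approximation.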

Examples
- Example 3: If X is a random variable with a N(10, 1) distribution, find \(E(X^2)\)
- Example 4: If Y is a random variable with a Beta(5, 1) distribution, find \(E(-\ln Y)\)
- There are more advanced methods of integration using simulation (Importance Sampling)
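Both examples can be approximated by averaging the function over simulated draws. The Python sketch below uses `random.gauss` and `random.betavariate` from the standard library; the sample size and seed are arbitrary choices. Both answers are also available in closed form, which gives a check: E(X²) = Var(X) + [E(X)]² = 1 + 100 = 101, and since -ln Y is exponential with rate 5 when Y is Beta(5, 1), E(-ln Y) = 1/5.

```python
import math
import random

rng = random.Random(2011)
n = 200_000  # number of simulated draws (arbitrary)

# Example 3: X ~ N(10, 1); average X^2 over the draws.
e_x2 = sum(rng.gauss(10, 1) ** 2 for _ in range(n)) / n

# Example 4: Y ~ Beta(5, 1); average -ln(Y) over the draws.
e_neg_log_y = sum(-math.log(rng.betavariate(5, 1)) for _ in range(n)) / n

print(e_x2)         # should be close to 101
print(e_neg_log_y)  # should be close to 0.2
```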

Integration
- integrate() performs numerical integration for functions of a single variable (not using simulation techniques)
- adapt() in the adapt package performs multivariate numerical integration

Approximating the Sampling Distribution of a Statistic
- To perform inference (CIs, hypothesis tests) based on sample statistics, we need to know the sampling distribution of the statistics, at least up to an approximation
- Example: \(X_1, X_2, \ldots, X_n \sim\) iid \(N(\mu, \sigma^2)\).

Approximating the Sampling Distribution of a Statistic
- What if the data’s distribution is not known?
- Large sample: Central Limit Theorem
- Small sample: Normal theory or nonparametric procedures based on permutation distributions

Approximating the Sampling Distribution of a Statistic
- If the population distribution is known, we can approximate the sampling distribution with simulation.
- Repeatedly (m times) generate random samples of size n from the population distribution
- Calculate a statistic (say, S) each time
- The empirical (observed) distribution of S-values approximates the true distribution of S

Example
- \(X_1, X_2, X_3, X_4 \sim\) Expon(1)
- What is the sampling distribution of:
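The simulation recipe (draw m samples of size n, compute the statistic each time, and study the resulting values) can be sketched in Python for the Expon(1), n = 4 setting. The statistics of interest are missing from this transcript, so the sample mean and the sample maximum are used here as assumed examples; `simulate_statistic` is a hypothetical helper name.

```python
import random
import statistics

def simulate_statistic(stat, n, m, rng):
    """Return m simulated values of stat(sample) for Expon(1) samples of size n."""
    return [stat([rng.expovariate(1.0) for _ in range(n)]) for _ in range(m)]

rng = random.Random(4)
m = 20_000  # number of simulated samples (arbitrary)

means = simulate_statistic(statistics.fmean, n=4, m=m, rng=rng)
maxima = simulate_statistic(max, n=4, m=m, rng=rng)

# Summaries of the two empirical sampling distributions.
print(statistics.fmean(means), statistics.stdev(means))
print(statistics.fmean(maxima), statistics.stdev(maxima))
print(statistics.quantiles(maxima, n=4))  # quartiles of the simulated maxima
```

A histogram of `means` or `maxima` would show the shape of each approximate sampling distribution. As a check, the expected maximum of four Expon(1) values is 1 + 1/2 + 1/3 + 1/4 = 25/12.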