# Materials for Lecture 11 Chapters 3 and 6 Chapter 16 Section 4.0 and 5.0 Lecture 11 Pseudo Random LHC.xls Lecture 11 Validation Tests.xls Next 4 slides.


Materials for Lecture 11
- Chapters 3 and 6
- Chapter 16, Sections 4.0 and 5.0
- Lecture 11 Pseudo Random LHC.xls
- Lecture 11 Validation Tests.xls
- The next 4 slides were added because, right about now, most students are confused about PDF parameters and which functions to use

Parameter Estimation
- Parameters for a distribution define its shape and position on the number scale
  - Uniform(Min, Max)
  - Norm(Mean, Std Dev)
  - Mean (Ỹ or Ῡ) and risk as Empirical(S_i, P(S_i))
- Shape can be skewed right or left, and can be tall or squatty (kurtosis)
- Parameters reflect the amount of variability in the stochastic variable
- Must validate random variables against their parameters
- We use the parameters to simulate the distributions

Same Mean Different Std Dev

Review Steps for Parameter Estimation
- Step 1: Check for the presence of a trend, cycle, or structural pattern
  - If there is a trend or structural model, work with the residuals (ẽ_t)
  - If there is no trend, use the actual data (X’s)
- Step 2: Estimate parameters for several assumed distributions using the X’s or the residuals (ẽ_t)
- Step 3: Simulate the different distributions
- Step 4: Pick the best match based on
  - Mean and variability (use validation tests)
  - Minimum and maximum
  - Shape of the CDF vs. the historical series
  - Penalty function CDFDEV() to quantify differences
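The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the `history` series is made up, the function names (`detrend`, `estimate_parameters`, `simulate`, `summarize`) are hypothetical, a simple linear trend stands in for whatever trend or structural model applies, and only Normal and Uniform candidates are tried. It does not reproduce Simetar's own routines or its CDFDEV() function.

```python
import random
import statistics

def detrend(y):
    """Step 1: fit y = a + b*t by least squares; return (a, b, residuals)."""
    t = list(range(1, len(y) + 1))
    t_bar, y_bar = statistics.fmean(t), statistics.fmean(y)
    b = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
         / sum((ti - t_bar) ** 2 for ti in t))
    a = y_bar - b * t_bar
    return a, b, [yi - (a + b * ti) for ti, yi in zip(t, y)]

def estimate_parameters(x):
    """Step 2: parameters for two assumed distributions."""
    return {
        "normal": (statistics.fmean(x), statistics.stdev(x)),
        "uniform": (min(x), max(x)),
    }

def simulate(params, n=500, seed=31517):
    """Step 3: draw n values from each candidate distribution."""
    rng = random.Random(seed)
    mean, std_dev = params["normal"]
    low, high = params["uniform"]
    return {
        "normal": [rng.gauss(mean, std_dev) for _ in range(n)],
        "uniform": [rng.uniform(low, high) for _ in range(n)],
    }

def summarize(x):
    """Step 4: the statistics to compare against the historical series."""
    return {
        "mean": statistics.fmean(x),
        "std_dev": statistics.stdev(x),
        "min": min(x),
        "max": max(x),
    }

# Hypothetical yearly data with an upward trend (illustration only).
history = [10.2, 11.9, 14.3, 15.8, 18.4, 19.7, 22.1, 24.0, 25.6, 28.3]
a, b, residuals = detrend(history)
params = estimate_parameters(residuals)
simulated = simulate(params)

print("residuals", summarize(residuals))
for name, draws in simulated.items():
    print(name, summarize(draws))
```

Comparing the printed summaries side by side is the informal version of Step 4; the formal version uses the validation tests covered later in the lecture.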

Univariate Parameter Estimation
- When do you use UPES?
  - When there is no trend in the data
  - When you want to use the historical mean as your forecasted Y-hat
  - To test an unknown random variable for its shape
  - Or use residuals

Univariate Parameter Estimation
- The Empirical distribution fits your data best because it lets the data define the shape
- Prefer to use the EMP with deviations expressed as a percent or fraction of Y-hat
  - If there is a trend, account for it with deviations from trend
  - Otherwise, use deviations from the mean
- EMP allows us to model low-probability events
- Test with =CDFDEV(original data, sim data)
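A minimal sketch of the idea behind an empirical distribution built from fractional deviations, assuming no trend so that deviations are taken from the mean. The function names and the data are hypothetical, and the equally spaced cumulative probabilities with linear interpolation are an assumption made for illustration; Simetar's EMP function may assign probabilities differently.

```python
import random
import statistics

def fractional_deviates(history):
    """Return the mean and the sorted deviations as fractions of the mean."""
    mean = statistics.fmean(history)
    return mean, sorted((x - mean) / mean for x in history)

def simulate_emp(y_hat, sorted_devs, n, seed=31517):
    """Inverse-transform sampling from the sorted deviates.

    Assumption: the sorted deviates sit at equally spaced cumulative
    probabilities from 0 to 1, and values between them are interpolated.
    """
    rng = random.Random(seed)
    m = len(sorted_devs)
    draws = []
    for _ in range(n):
        pos = rng.random() * (m - 1)      # position along the sorted deviates
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        dev = sorted_devs[lo] + (pos - lo) * (sorted_devs[hi] - sorted_devs[lo])
        draws.append(y_hat * (1 + dev))   # stochastic value = Y-hat * (1 + deviate)
    return draws

# Hypothetical history with one low-probability bad year (illustration only).
history = [52.0, 48.5, 50.3, 31.0, 49.8, 51.6, 47.9, 53.2]
y_hat, devs = fractional_deviates(history)
simulated = simulate_emp(y_hat, devs, n=500)
print(min(simulated), max(simulated))
```

Because the draws are interpolated between observed deviates, the simulated minimum and maximum cannot fall outside Y-hat × (1 + smallest deviate) and Y-hat × (1 + largest deviate), which is why the bad year stays in the simulated distribution instead of being smoothed away.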

Model Validation
- Do the simulated values for the random variables reproduce their parameters?
- Does the model accurately forecast the system?
- Do the results conform to theoretical expectations?
- Do the results conform to the expectations of experts?
  - Turing Test of simulation model results
  - Show the results to experts, using alternative assumptions about the input values

Four P’s for Validation
- Planning: in the initial model preparation mode, the developer should plan how to validate the model
- Personal: it is the developer’s responsibility to verify every equation, coefficient, and random variable, and to check that the results are theoretically correct
- Peers: use experts in the field to review model results with a Turing Test; use sensitivity testing of the model
- Prospective Clients: do the results conform to their expectations? Are the results useful to the client?

Model Verification
- Check all equations for arithmetic accuracy
  - Use Excel’s “Trace Precedents” and “Trace Dependents” tools
- Check the linkage of variables coming into each equation
  - Check the model in “Expected Value” and “Stochastic” modes
- Ensure that the variables in each equation are theoretically correct
- Make sure the model contains all of the equations necessary to calculate the key output variables (KOVs)

Model Validation
- Use statistical tests of each random variable to ensure that it:
  - reproduces the historical distribution
  - reproduces the historical correlation matrix among random variables
- Statistical tests
  - Student t test
  - F test
  - Chi Square test

Statistical Tests for Validation
- Test the means of the random variables against their historical values
  - Statistically equal at the 95% level based on a t-test?
- Test the variance against historical values
  - Statistically equal at the 95% level based on an F-test?
- Check the historical vs. simulated coefficient of variation
  - Needs to be constant over time
- Check the minimum and maximum
  - For a Normal distribution, are they reasonable? They should be:
    - Min ≈ Mean − 3 × Std Dev
    - Max ≈ Mean + 3 × Std Dev
  - For an Empirical distribution, compare the simulated min and max to the values the model “should” simulate:
    - Xmin should equal Y-hat × (1 + Minimum Fractional Deviate)
    - Xmax should equal Y-hat × (1 + Maximum Fractional Deviate)
- Check the correlation matrix for the simulated variables vs. the historical correlation matrix using t-tests
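The mean, variance, and min/max checks listed above reduce to a few statistics. The sketch below uses only the Python standard library and hypothetical data and function names; it computes the pooled-variance Student t statistic and the F ratio but leaves the comparison with critical values (from a table or a statistics package) to the reader, since the standard library has no t or F distribution.

```python
import math
import random
import statistics

def two_sample_t(history, simulated):
    """Pooled-variance Student t statistic and its degrees of freedom."""
    n1, n2 = len(history), len(simulated)
    v1, v2 = statistics.variance(history), statistics.variance(simulated)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    diff = statistics.fmean(history) - statistics.fmean(simulated)
    return diff / math.sqrt(pooled * (1 / n1 + 1 / n2)), n1 + n2 - 2

def variance_ratio_f(history, simulated):
    """F statistic (ratio of sample variances) and its degrees of freedom."""
    ratio = statistics.variance(history) / statistics.variance(simulated)
    return ratio, (len(history) - 1, len(simulated) - 1)

def normal_min_max(mean, std_dev):
    """Approximate bounds a Normal sample should reach: mean -/+ 3 std dev."""
    return mean - 3 * std_dev, mean + 3 * std_dev

# Hypothetical history and a Normal simulation based on its parameters.
history = [9.1, 12.4, 7.8, 10.9, 13.2, 8.5, 11.0, 10.3, 6.9, 12.8]
mean, std_dev = statistics.fmean(history), statistics.stdev(history)
rng = random.Random(31517)
simulated = [rng.gauss(mean, std_dev) for _ in range(500)]

print("t statistic, df:", two_sample_t(history, simulated))
print("F statistic, df:", variance_ratio_f(history, simulated))
print("expected min/max:", normal_min_max(mean, std_dev))
print("simulated min/max:", (min(simulated), max(simulated)))
```

A t statistic near zero and an F ratio near one are what a well-behaved simulation should produce; how far from those values counts as a failure depends on the degrees of freedom returned alongside each statistic.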

Validation Tests in Simetar
- Verification/validation tests in Simetar
  - Hypothesis tests icon
- Compare Two Series: Historical Data vs. Simulated Values
  - 1st Data Series is history
  - 2nd Data Series is simulated
  - Tests the means and variances of the two series, i.e., are they statistically equal?
  - The test works for a pair of variables and for comparing two multivariate distributions (matrices)

Statistical Tests for Validation
- Compare Two Series: Historical Data vs. Simulated Values
  - 1st Data Series is history
  - 2nd Data Series is simulated

Validation Tests in Simetar
- Compare the mean and standard deviation of simulated data to the user’s specified values
  - “Data Series” is the simulated values
  - Type in the mean or a cell location
  - Specify the Std Dev as a value or a cell location
- The test is used when
  - Only the mean and std dev are known, i.e., there is no history for the variable
  - The mean is a projected value that differs from the history

Validation Tests in Simetar
- Compare the mean and standard deviation of simulated data to the user’s specified values
- The test is used when only the mean and std dev are known, i.e., there is no history for the variable
- Or when the mean is a projected value different from history
- Note that the Given Values are Mean = 10 and Std Dev = 3
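When there is no history to compare against, the simulated values can be tested directly against the given parameters. The sketch below uses the Given Values from the slide (Mean = 10, Std Dev = 3) and assumes a one-sample t statistic for the mean and a chi-square statistic for the standard deviation; these are standard textbook choices, not a confirmed description of what Simetar computes, and the function names are hypothetical.

```python
import math
import random
import statistics

def mean_t_statistic(simulated, given_mean):
    """One-sample t statistic for H0: simulated mean equals given_mean."""
    n = len(simulated)
    std_error = statistics.stdev(simulated) / math.sqrt(n)
    return (statistics.fmean(simulated) - given_mean) / std_error, n - 1

def std_dev_chi_square(simulated, given_std_dev):
    """Chi-square statistic for H0: simulated std dev equals given_std_dev."""
    n = len(simulated)
    chi2 = (n - 1) * statistics.variance(simulated) / given_std_dev ** 2
    return chi2, n - 1

given_mean, given_std_dev = 10, 3
rng = random.Random(31517)
simulated = [rng.gauss(given_mean, given_std_dev) for _ in range(500)]

print("t statistic, df:", mean_t_statistic(simulated, given_mean))
print("chi-square statistic, df:", std_dev_chi_square(simulated, given_std_dev))
```

Under the null hypothesis the t statistic should be close to zero and the chi-square statistic close to its degrees of freedom (here 499).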

Validation Tests in Simetar
- Test simulated values for Multivariate Distributions (MVE and MVN) to check whether the historical correlation matrix is reproduced in the simulation
  - Data Series is the simulated values for all random variables in the MV distribution, a matrix of variables in SimData
  - The original correlation matrix is the one used to simulate the MVE or MVN distribution
- The result is OK if the majority of the simulated correlation coefficients are statistically the same as those in the historical correlation matrix
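For a single pair of variables, the correlation check can be sketched as follows. Note the substitution: this example uses the Fisher z-transformation to compare a simulated correlation with the historical one, which is a common textbook approach and not necessarily the t-test Simetar applies. The correlation of 0.6, the function names, and the bivariate normal simulation are all assumptions for illustration.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r, rho, n):
    """Approximately standard normal under H0: true correlation equals rho."""
    return (math.atanh(r) - math.atanh(rho)) * math.sqrt(n - 3)

def simulate_pair(rho, n, seed=31517):
    """Two standard normal series with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return xs, ys

historical_rho = 0.6          # hypothetical historical correlation
xs, ys = simulate_pair(historical_rho, n=2000)
r = pearson(xs, ys)
print("simulated correlation:", r)
print("z statistic:", fisher_z(r, historical_rho, len(xs)))
```

Repeating the comparison for every off-diagonal element of the correlation matrix gives the element-by-element result the slide describes, where the test passes if most coefficients are statistically the same as their historical values.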

Charts for Validation Test simulated values for Multivariate Distributions (MVE and MVN) to test if the historical correlation matrix is reproduced in the simulation

Using Charts for Visual Validation
- Use a CDF to compare the historical series to the simulated series; tests the min and max
- Use a PDF to compare the historical series to the simulated series; tests the shape
- Use a Box Plot to compare the historical series to the simulated series; checks the variability
- Use a Probability graph to compare the historical series to the simulated series, P(x) vs. F(x)
- Use a Fan graph to show the range of the risk and the level of the mean over time; a visual test of whether the CV is constant over time

How Simetar Simulates Random Numbers
- A pseudo random number generator is used so we can reproduce the simulation results from day to day with the same inputs
- The pseudo random number generator uses a seed to start the sampling sequence
  - The default seed in Simetar is 31517
  - Change the seed if you like
- If you do not use a pseudo random number generator, then every time you simulate the model you get different answers, even if the input has not changed
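The role of the seed is easy to demonstrate with any pseudo random number generator. The sketch below uses Python's `random` module as a stand-in; it borrows the seed value 31517 from the slide but does not reproduce Simetar's generator or its number sequence.

```python
import random

def simulate(seed, n=5):
    """Draw n uniform(0, 1) values from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

run_1 = simulate(31517)
run_2 = simulate(31517)    # same seed: identical sequence
run_3 = simulate(12345)    # different seed: different sequence

print(run_1 == run_2)
print(run_1 == run_3)
```

Keeping the seed fixed while changing one model input is what makes it possible to attribute a change in the results to that input instead of to sampling noise.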

Latin Hypercube vs. Monte Carlo Simulated Numbers
- The Monte Carlo simulation procedure samples randomly from the full range of possible values for a random variable
  - Requires a large number of iterations for adequate coverage of the possible range of a variable
  - With a small number of iterations, it does not sample the range adequately

Latin Hypercube vs. Monte Carlo Simulated Numbers
- Latin Hypercube (LHC) systematically samples all segments of the distribution for a random variable
  - If 100 iterations are to be simulated, LHC samples one value randomly from each of 100 intervals of equal length on the 0 to 1 USD (uniform standard deviate) scale
  - Ensures all segments of the distribution are sampled, even with a small number of iterations
- With LHC you get “adequate” sampling coverage of a distribution with fewer iterations
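The difference between the two sampling procedures for a single uniform standard deviate can be sketched in a few lines. The function names are hypothetical and the code illustrates the general one-dimensional stratified sampling idea, not Simetar's implementation.

```python
import random

def monte_carlo_usd(n, rng):
    """n independent draws from anywhere on the 0 to 1 scale."""
    return [rng.random() for _ in range(n)]

def latin_hypercube_usd(n, rng):
    """One random draw from each of n equal-length intervals, in shuffled order."""
    draws = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(draws)
    return draws

def intervals_hit(draws, n):
    """Number of the n equal-length intervals containing at least one draw."""
    return len({min(int(d * n), n - 1) for d in draws})

rng = random.Random(31517)
n = 10
print("Monte Carlo intervals hit:", intervals_hit(monte_carlo_usd(n, rng), n))
print("Latin Hypercube intervals hit:", intervals_hit(latin_hypercube_usd(n, rng), n))
```

With only 10 iterations the Monte Carlo sample usually leaves some intervals empty and doubles up in others, while the stratified sample covers every interval by construction. The shuffle keeps the order of the draws random even though their coverage is systematic.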

Latin Hypercube vs. Monte Carlo
- The CDF of a Uniform distribution defined as U(0,1) is a straight line at a 45° angle out of the origin
- A perfect sample would lie on the straight line
- Use the following USDs
  - Excel’s =RAND()
  - Simetar’s =UNIFORM()
- Simulate these two USDs
- Draw a CDF with the two random variables. Which one lies on the straight line between 0 and 1?

[Figure: CDF chart with X on the horizontal axis and F(x) on the vertical axis, scaled from 0.0 to 1.0]
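The visual comparison against the 45° line can also be expressed as a number: the largest gap between the sorted sample and the line. In this sketch Python's `random.random()` stands in for an unstratified generator such as Excel's =RAND(), and a stratified draw stands in for a Latin Hypercube sampler; neither is the actual Excel or Simetar function, and the gap measure is a hypothetical helper chosen for illustration.

```python
import random

def distance_from_45_degree_line(draws):
    """Largest gap between the sorted draws and the U(0,1) CDF line.

    The k-th smallest of n draws is compared with the midpoint
    probability (k + 0.5) / n on the line F(x) = x.
    """
    n = len(draws)
    return max(abs(x - (k + 0.5) / n) for k, x in enumerate(sorted(draws)))

rng = random.Random(31517)
n = 100
monte_carlo = [rng.random() for _ in range(n)]
latin_hypercube = [(i + rng.random()) / n for i in range(n)]

mc_gap = distance_from_45_degree_line(monte_carlo)
lhc_gap = distance_from_45_degree_line(latin_hypercube)
print("Monte Carlo gap:", mc_gap)
print("Latin Hypercube gap:", lhc_gap)
```

Because each stratified draw lies inside its own interval of width 1/n, its gap can never exceed 0.5/n, so the stratified sample hugs the line; the unstratified sample has no such bound at small n.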

Example of Latin Hypercube vs. Monte Carlo Simulation of USD
