Carey Williamson Department of Computer Science University of Calgary

Slides:



Advertisements
Similar presentations
Statistical Inference Statistical Inference is the process of making judgments about a population based on properties of the sample Statistical Inference.
Advertisements

Lecture (11,12) Parameter Estimation of PDF and Fitting a Distribution Function.
CHAPTER 21 Inferential Statistical Analysis. Understanding probability The idea of probability is central to inferential statistics. It means the chance.
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Sampling Distributions (§ )
1 Statistical Inference H Plan: –Discuss statistical methods in simulations –Define concepts and terminology –Traditional approaches: u Hypothesis testing.
Experimental Evaluation
Inferences About Process Quality
Hypothesis Testing. Distribution of Estimator To see the impact of the sample on estimates, try different samples Plot histogram of answers –Is it “normal”
Statistical inference: confidence intervals and hypothesis testing.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
Simulation Output Analysis
Topic 5 Statistical inference: point and interval estimate
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Biostatistics Class 6 Hypothesis Testing: One-Sample Inference 2/29/2000.
Statistical Hypotheses & Hypothesis Testing. Statistical Hypotheses There are two types of statistical hypotheses. Null Hypothesis The null hypothesis,
Statistical Inference for the Mean Objectives: (Chapter 9, DeCoursey) -To understand the terms: Null Hypothesis, Rejection Region, and Type I and II errors.
1 OUTPUT ANALYSIS FOR SIMULATIONS. 2 Introduction Analysis of One System Terminating vs. Steady-State Simulations Analysis of Terminating Simulations.
© Copyright McGraw-Hill 2004
Review of Statistics.  Estimation of the Population Mean  Hypothesis Testing  Confidence Intervals  Comparing Means from Different Populations  Scatterplots.
1 Estimation of Population Mean Dr. T. T. Kachwala.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Chapter 13 Understanding research results: statistical inference.
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Methods of Presenting and Interpreting Information Class 9.
Outline Sampling Measurement Descriptive Statistics:
Chapter 3 INTERVAL ESTIMATES
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Confidence Intervals and Sample Size
Chapter 9 Hypothesis Testing.
Chapter 5 STATISTICAL INFERENCE: ESTIMATION AND HYPOTHESES TESTING
Making inferences from collected data involve two possible tasks:
Confidence Interval Estimation
ECO 173 Chapter 10: Introduction to Estimation Lecture 5a
Point and interval estimations of parameters of the normally up-diffused sign. Concept of statistical evaluation.
LECTURE 33: STATISTICAL SIGNIFICANCE AND CONFIDENCE (CONT.)
Chapter 4. Inference about Process Quality
Unit 5: Hypothesis Testing
CHAPTER 6 Statistical Inference & Hypothesis Testing
Chapter 6: Sampling Distributions
PCB 3043L - General Ecology Data Analysis.
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
Chapter 2 Simple Comparative Experiments
CPSC 531: System Modeling and Simulation
Parameter, Statistic and Random Samples
ECO 173 Chapter 10: Introduction to Estimation Lecture 5a
Introduction to Inferential Statistics
Sampling and Sampling Distributions
Statistical Methods Carey Williamson Department of Computer Science
Hypothesis Tests for a Population Mean in Practice
Chapter 9 Hypothesis Testing.
CONCEPTS OF ESTIMATION
9 Tests of Hypotheses for a Single Sample CHAPTER OUTLINE
Chapter 9 Hypothesis Testing.
Discrete Event Simulation - 4
Measures of Dispersion (Spread)
CHAPTER 6 Statistical Inference & Hypothesis Testing
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Hypothesis Testing and Confidence Intervals
Chapter 8: Estimating with Confidence
Sampling Distributions (§ )
Chapter 8: Estimating with Confidence
STA 291 Spring 2008 Lecture 13 Dustin Lueker.
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Statistical Inference for the Mean: t-test
Presentation transcript:

Carey Williamson Department of Computer Science University of Calgary Statistical Methods Carey Williamson Department of Computer Science University of Calgary

Plan: Discuss statistical methods in simulations Outline Plan: Discuss statistical methods in simulations Define concepts and terminology Traditional approaches: Hypothesis testing Confidence intervals Batch means Analysis of Variance (ANOVA)

Motivation Simulations rely on pRNG to produce one or more “sample paths” in the stochastic evaluation of a system Results represent probabilistic answers to the initial perf eval questions of interest Simulation results must be interpreted accordingly, using the appropriate statistical approaches and methodology

Hypothesis Testing A technique used to determine whether or not to believe a certain statement (to what degree) Statement is usually regarding a statistic, and some postulated property of the statistic Formulate the “null hypothesis” H0 Alternative hypothesis H1 Decide on statistic to use, and significance level Collect sample data and calculate test statistic Decide whether to accept null hypothesis or not

Used for discrete distributions Chi-Squared Test A technique used to determine if sample data follows a certain known distribution Used for discrete distributions Requires large number of samples (at least 30) Compute D = Σ ------------------ Check value against Chi-Square quantiles k (observedi – expectedi)2 i=1 expectedi

Kolmogorov-Smirnov Test A technique used to determine if sample data follows a certain known distribution Used for continuous distributions Any number of samples is okay (small/large) Uses CDF (known distribution vs empirical distn) Compute max vertical deviation from CDF Check value(s) against K-S quantiles K+ = √n max ( Fobs(x) – Fexp(x) ) K- = √n max ( Fexp(x) – Fobs(x) )

A bit like Goldilocks + the “three bears” Simulation Run Length Choosing the right duration for a simulation is a bit of an art (inexact step) A bit like Goldilocks + the “three bears” Too short: results may not be “typical” Too long: excessive CPU time required Just right: good results, reasonable time Usual approach: guessing; bigger is better

Simulation Warmup One reason why simulation run-length matters is that simulation results might exhibit some temporal bias Example: the first few customers arrive to an empty system, and are never lost Need to determine “steady-state”, and discard (biased) transient results from either warmup or cooldown period

Simulation Replications One way to establish statistical confidence in simulation results is to repeat an experiment multiple times Multiple replications, with exact same config parameters, but different seeds Assumes independent results + normality Can compute the “mean of means” and the “variance of the global mean”

Statistical Inference Methods to estimate the characteristics of an entire population based on data collected from a (random) sample (subset) Many different statistics are possible Desirable properties: Consistent: convergence toward true value as the sample size is increased Unbiased: sample is representative of population Usually works best if samples are independent

True values: μ (mean), σ (std deviation) Random Sampling Different samples typically produce different estimates, since they themselves represent a random variable with some inherent sampling distribution (known/not) Statistics can be used to get point estimates (e.g., mean, variance) or interval estimates (e.g., confidence interval) True values: μ (mean), σ (std deviation)

Sample Mean and Variance Sample variance: Sample standard deviation: s = √s2 n x = 1/n Σ xi i=1 n s2 = 1/(n-1) Σ (xi – x)2 i=1

Chebyshev’s Inequality Expresses a general result about the “goodness” of a sample mean x as an estimate of the true mean μ (for any distn) Want to be within error ε of true mean μ Pr[ x - ε < μ < x + ε] ≥ 1 – Var(x) / ε2 The lower the variance, the better The tighter ε is, the harder it is to be sure!

Recall that Normal distribution is symmetric about the mean Central Limit Theorem The Central Limit Theorem states that the distribution of Z approaches the standard normal distribution as n approaches ∞ N(0,1) has mean 0, variance 1 Recall that Normal distribution is symmetric about the mean About 67% of obs within 1 standard dev About 95% of obs within 2 standard dev

Pr[|x – μ| < ε] ≥ k (confidence level) Confidence Intervals There is inherent error when estimating the true mean μ with the sample mean x How many samples n are needed so that the error is tolerable? (i.e., within some specified threshold value ε) Pr[|x – μ| < ε] ≥ k (confidence level) Depends on variance of sampled process Depends on size of interval ε

Computes a “p value” for a result F-tests and t-tests A statistical technique to assess the level of significance associated with a result Computes a “p value” for a result Loosely stated, this reflects the likelihood (or not) of the observed result occurring, relative to the initial hypothesis made F-tests: relies on the F distribution t-tests: relies on the student-t distribution

Can compute mean for each batch i Can compute mean of means Batch Means Analysis A lengthy simulation run can be split into N batches, each of which is (assumed to be) independent of the other batches Can compute mean for each batch i Can compute mean of means Can compute variance of means Can provide confidence intervals

Analysis of Variance (ANOVA) Often the results from a simulation or an experiment will depend on more than one factor (e.g., job size, service class, load) ANOVA is a technique to determine which factor has the most impact Focuses on variability (variance) of results Attributes a portion of variability to each of the factors involved, or their interaction

Summary Simulations use pRNG to produce probabilistic answers to the performance evaluation questions of interest It is important to interpret simulation results appropriately, using the correct statistical approaches and methodology Basic techniques include confidence intervals, significance tests, and ANOVA