
1 Inferential Statistics: A Closer Look

2 Nature of Inference (Analyze Phase)
in·fer·ence (n.) "The act or process of deriving logical conclusions from premises known or assumed to be true. The act of reasoning from factual knowledge or evidence." (Source: Dictionary.com)
Inferential Statistics: To draw inferences about the process or population being studied by modeling patterns of data in a way that accounts for randomness and uncertainty in the observations. (Source: Wikipedia.com)

3 Five-Step Approach to Inferential Statistics
So many questions...
1. What do you want to know?
2. What tool will give you that information?
3. What kind of data does that tool require?
4. How will you collect the data?
5. How confident are you with your data summaries?

4 Types of Error
1. Error in sampling: error due to differences among samples drawn at random from the population (luck of the draw). This is the only source of error that statistics can accommodate.
2. Bias in sampling: error due to lack of independence among random samples or due to systematic sampling procedures (for example, measuring the height of horse jockeys only).
3. Error in measurement: error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity: the measurement does not actually measure what it is intended to measure (for example, placing a probe in the wrong slot, or measuring temperature with a thermometer that sits right next to a furnace).

5 Population, Sample, Observation
Population: EVERY data point that has ever been or ever will be generated from a given characteristic.
Sample: A portion (or subset) of the population, either at one time or over time.
Observation: An individual measurement.

6 Significance
Significance is all about differences. In general, larger differences (or deltas) are considered to be "more significant."
Practical difference and significance: The amount of difference, change, or improvement that will be of practical, economic, or technical value to you. The amount of improvement required to pay for the cost of making the improvement.
Statistical difference and significance: The magnitude of difference or change required to distinguish a true difference, change, or improvement from one that could have occurred by chance.
Six Sigma decisions will ultimately have a return on resource investment (RORI)* element associated with them. The key question for our decisions is: "Is the benefit of making a change worth the cost and risk of making it?"
* RORI includes not only dollars and assets but also the time and participation of your teams.

7 The Mission
Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate, reduce costs, reduce investment, improve service level, improve throughput, reduce lead time, increase productivity: in short, to change the output metric of some process.
In statistical terms, this translates to the need to move the process mean and/or reduce the process standard deviation.
You'll be making decisions about how to adjust key process input variables based on sample data, not population data, which means you are taking some risks.
How will you know that your key process output variable really changed, and that the result is not just an unlikely sample?
The Central Limit Theorem helps us understand the risk we are taking and is the basis for using sampling to estimate population parameters.
(Figure panels: Mean Shift, Variation Reduction, Both.)

8 A Distribution of Sample Means
Imagine you have some population. The individual values of this population form some distribution. Take a sample of the individual values and calculate the sample mean. Keep taking samples and calculating sample means, then plot a new distribution of these sample means. The central limit theorem says that as the sample size becomes large, this new distribution (the sample mean distribution) will approach a normal distribution, no matter what the shape of the population distribution of individuals.
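The procedure on this slide can be tried out numerically. The following is a minimal sketch, not part of the original deck: the exponential population, the sample size of 30, the number of samples, and the helper name `sample_means` are all assumptions chosen for illustration. It draws repeated samples from a deliberately skewed population and summarizes the resulting sample means.

```python
import math
import random
import statistics


def sample_means(population, sample_size, num_samples, rng):
    """Draw num_samples random samples (with replacement) and return each sample's mean."""
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Assumption: a skewed (exponential) population, clearly not normal.
population = [rng.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

means = sample_means(population, sample_size=30, num_samples=2_000, rng=rng)
mean_of_means = statistics.mean(means)
sd_of_means = statistics.stdev(means)

# For a roughly normal distribution, about 68% of values fall within 1 SD of the center.
within_one_sd = sum(abs(m - mean_of_means) <= sd_of_means for m in means) / len(means)

print(f"population mean = {pop_mean:.3f}, mean of sample means = {mean_of_means:.3f}")
print(f"population SD / sqrt(30) = {pop_sd / math.sqrt(30):.3f}, SD of sample means = {sd_of_means:.3f}")
print(f"share of sample means within 1 SD of their center = {within_one_sd:.2%}")
```

Even though the individual values are heavily skewed, the sample means cluster symmetrically around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.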

9 Sampling Distributions: The Foundation of Statistics
Population (values shown on the slide): 3 5 2 12 10 1 6 12 5 6 12 14 3 6 11 9 10 12
Samples from the population, each with five observations:
        Sample 1   Sample 2   Sample 3
        1          9          2
        12         8          3
        9          5          6
        7          14         11
        8          10         10
Mean    7.4        9.2        6.4
In this example, we have taken three samples out of the population, each with five observations in it. We computed a mean for each sample. Note that the means are not the same! Why not? What would happen if we kept taking more samples?
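As a quick check of the three means quoted on this slide, the sketch below recomputes them. It assumes the slide's sample values read column-wise as (1, 12, 9, 7, 8), (9, 8, 5, 14, 10), and (2, 3, 6, 11, 10); that reading is an interpretation of the transcript, though it does reproduce the stated means.

```python
import statistics

# Assumed reading of the slide's three five-observation samples.
samples = {
    "Sample 1": [1, 12, 9, 7, 8],
    "Sample 2": [9, 8, 5, 14, 10],
    "Sample 3": [2, 3, 6, 11, 10],
}

# Compute the mean of each sample.
computed_means = {name: statistics.mean(values) for name, values in samples.items()}

for name, mean in computed_means.items():
    print(f"{name}: mean = {mean:.1f}")
```

The three means differ only because each sample happened to pick up different observations, which is the sampling error described earlier in the deck.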

10 Central Limit Theorem
If all possible random samples, each of size n, are taken from any population with a mean μ and standard deviation σ, the distribution of sample means will:
- have a mean $\mu_{\bar{x}} = \mu$,
- have a standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, and
- be normally distributed when the parent population is normally distributed, or be approximately normal for samples of size 30 or more when the parent population is not normally distributed. The approximation improves with samples of larger size.
Bigger is Better!

11 So What?
So how does this theorem help me understand the risk I am taking when I use sample data instead of population data? Recall that about 95% of normally distributed data lies within ± 2 standard deviations of the mean. Therefore, the probability is about 95% that my sample mean is within 2 standard errors of the true population mean.
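The 95% claim can be checked by simulation. This is a rough sketch under stated assumptions: a normal population with mean 50 and standard deviation 10, samples of size 25, and 5,000 trials are arbitrary illustrative choices, and `coverage_within_two_se` is a hypothetical helper name.

```python
import math
import random
import statistics


def coverage_within_two_se(mu, sigma, n, trials, rng):
    """Fraction of simulated sample means that land within 2 standard errors of mu."""
    standard_error = sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - mu) <= 2 * standard_error:
            hits += 1
    return hits / trials


rng = random.Random(0)  # fixed seed so the sketch is repeatable
coverage = coverage_within_two_se(mu=50.0, sigma=10.0, n=25, trials=5_000, rng=rng)
print(f"share of sample means within 2 standard errors: {coverage:.1%}")
```

The printed share should come out close to 95%, with some run-to-run variation if the seed is changed.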

12 A Practical Example
Let's say your project is to reduce the setup time for a large casting. Based on a sample of 20 setups, you learn that your baseline average is 45 minutes, with a standard deviation of 10 minutes. Because this is just a sample, the 45-minute average is just an estimate of the true average. Using the central limit theorem, there is about a 95% probability that the true average is somewhere between 40.5 and 49.5 minutes. Therefore, don't get too excited if you made a process change that resulted in a reduction of only 2 minutes.
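The interval in this example follows from the "2 standard errors" rule of thumb. A minimal sketch of the arithmetic, using the slide's numbers (the variable names are only illustrative):

```python
import math

n = 20              # number of setups sampled
sample_mean = 45.0  # baseline average, in minutes
sample_sd = 10.0    # sample standard deviation, in minutes

# Standard error of the mean: SD divided by the square root of the sample size.
standard_error = sample_sd / math.sqrt(n)

# Rule of thumb from the deck: about 95% of sample means lie within 2 standard errors.
margin = 2 * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"standard error = {standard_error:.2f} minutes")          # about 2.24
print(f"approximate 95% interval: {lower:.1f} to {upper:.1f}")   # 40.5 to 49.5
```

With only 20 observations, a t-based multiplier (about 2.09 for 19 degrees of freedom) would give a slightly wider interval than the factor of 2 used here.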

13 Sample Size and the Mean
When taking a sample, we have only estimated the true mean. All we know is that the true mean lies somewhere within the theoretical distribution of sample means, or the t-distribution, which is analyzed using t-tests. T-tests measure the significance of differences between means.
(Figure curves: distribution of individuals in the population; theoretical distribution of sample means for n = 10; theoretical distribution of sample means for n = 2.)

14 Standard Error of the Mean
The standard deviation of the distribution of means is called the standard error of the mean and is defined as:
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
where σ is the standard deviation of the individual observations and n is the sample size. This formula shows that the mean is more stable than a single observation by a factor of the square root of the sample size.

15 Standard Error
The rate of change in the standard error approaches zero at a sample size of about 30. This is why a sample size of 30 is often recommended when generating summary statistics such as the mean and standard deviation. This is also the point at which the t and Z distributions become nearly equivalent.
(Figure: standard error plotted against sample size, for sample sizes up to 30.)
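The flattening of the standard error curve can be seen by tabulating it for a few sample sizes. This sketch assumes a population standard deviation of 1 so that the values read as fractions of σ; the list of sample sizes and the helper name `standard_error` are illustrative choices.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for standard deviation sigma and sample size n."""
    return sigma / math.sqrt(n)


sigma = 1.0  # assumption: express the standard error in units of sigma
sizes = [1, 2, 5, 10, 20, 30, 50, 100]

previous = None
for n in sizes:
    se = standard_error(sigma, n)
    # Show how much the standard error dropped since the previous sample size.
    change = "" if previous is None else f"  (drop of {previous - se:.3f})"
    print(f"n = {n:>3}: standard error = {se:.3f}{change}")
    previous = se
```

Most of the reduction happens in the first few observations; beyond a sample size of roughly 30, each additional observation buys only a small further decrease.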

16 End of Presentation SSD Global University

