Analyze Phase Hypothesis Testing Normal Data Part 2


1 Analyze Phase Hypothesis Testing Normal Data Part 2
Now we will continue in the Analyze Phase with “Hypothesis Testing Normal Data Part 2”.

2 Hypothesis Testing Normal Data Part 2
Analyze Phase modules: Welcome to Analyze; “X” Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2 (Calculate Sample Size, Variance Testing, Analyze Results); Hypothesis Testing NND P1; Hypothesis Testing NND P2; Wrap Up & Action Items. We are now moving into Hypothesis Testing Normal Data Part 2 where we will address Calculating Sample Size, Variance Testing and Analyzing Results.

3 Tests of Variance are used for both Normal and Non-normal Data.
Normal Data: 1 Sample to a target; 2 Samples: F-Test; 3 or More Samples: Bartlett’s Test. Non-Normal Data: 2 or more samples: Levene’s Test. The null hypothesis states there is no difference between the Standard Deviations or variances. Ho: σ1 = σ2 = σ3 … Ha: at least one is different. Please read the slide.

4 Use the sample size calculations for a 1 sample t-test.
1-Sample Variance: A 1-sample variance test is used to compare an expected population variance to a target. If the target variance lies inside the confidence interval then we fail to reject the null hypothesis. Ho: σ²Sample = σ²Target Ha: σ²Sample ≠ σ²Target. Use the sample size calculations for a 1 sample t-test. Stat > Basic Statistics > Graphical Summary. Please read the slide.

5 1-Sample Variance
1. Practical Problem: We are considering changing suppliers for a part we currently purchase from a supplier that charges a premium for the hardening process and has a large variance in their process. The proposed new supplier has provided us with a sample of their product. They have stated they can maintain a variance of 0.10. 2. Statistical Problem: Ho: σ² = 0.10 or Ho: σ = 0.31; Ha: σ² ≠ 0.10 or Ha: σ ≠ 0.31. 3. 1-sample variance: α = β = 0.10. The Statistical Problem can be stated two ways: The null hypothesis: The variance is equal to 0.10 and the alternative hypothesis: The variance is not equal to 0.10 OR The null hypothesis: The Standard Deviation is equal to 0.31 and the alternative hypothesis: The Standard Deviation is not equal to 0.31.

6 Open the MINITAB™ worksheet: “Exh_Stat.MTW”
1-Sample Variance Stat > Basic Statistics > Graphical Summary 4. Sample Size: Open the MINITAB™ worksheet: “Exh_Stat.MTW”. This is the same file used for the 1 Sample t example. We will assume the sample size is adequate. 5. State Statistical Solution. Please read the slide.

7 Recall the target Standard Deviation is 0.31.
1-Sample Variance: Recall the target Standard Deviation is 0.31. Notice the target Standard Deviation of 0.31 falls within the 95% confidence interval for the sample Standard Deviation. Based on this data the Statistical Solution is “fail to reject the null”. What does this mean from a practical standpoint? The data do not contradict the supplier’s claim that they can maintain a variance of 0.10. Typically shifting a Mean is easier to accomplish in a process than reducing variance. It would be worth continuing the relationship with the new supplier to see if they can increase the Mean slightly while maintaining the reduced variance.
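The confidence-interval check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are hypothetical (they are not the values in “Exh_Stat.MTW”), the helper names are invented for this sketch, and the chi-square quantile is found by simple bisection rather than by a statistics package. The interval uses the standard chi-square result for a Normal population with n - 1 degrees of freedom.

```python
import math
import statistics


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_ppf(p, df):
    """Chi-square quantile found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def sigma_confidence_interval(sample, confidence=0.95):
    """Two-sided confidence interval for the Standard Deviation of Normal data."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance with n - 1 in the denominator
    alpha = 1.0 - confidence
    lower = math.sqrt((n - 1) * s2 / chi2_ppf(1 - alpha / 2, n - 1))
    upper = math.sqrt((n - 1) * s2 / chi2_ppf(alpha / 2, n - 1))
    return lower, upper


# Hypothetical measurements, NOT the data in Exh_Stat.MTW.
sample = [4.9, 5.3, 4.6, 5.1, 5.4, 4.7, 5.0, 5.2, 4.8, 5.0]
target_sigma = math.sqrt(0.10)  # a variance of 0.10 is a Standard Deviation of about 0.31
lower, upper = sigma_confidence_interval(sample)
decision = "fail to reject Ho" if lower <= target_sigma <= upper else "reject Ho"
print(f"95% CI for sigma: ({lower:.3f}, {upper:.3f}); target {target_sigma:.3f}: {decision}")
```

If the target Standard Deviation lies inside the interval, the decision is “fail to reject”, which mirrors how the Graphical Summary is read on the slide.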

8 Test of Variance Example
1. Practical Problem: We want to determine the effect of two different storage methods on the rotting of potatoes. You study conditions conducive to potato rot by injecting potatoes with bacteria that cause rotting and subjecting them to different temperature and oxygen regimes. We can test the data to determine if there is a difference in the Standard Deviation of the rot time between the two different methods. 2. Statistical Problem: Ho: σ1 = σ2 Ha: σ1 ≠ σ2 3. Equal Variance test (F-test since there are only 2 samples). The Statistical Problem is: The null hypothesis: The Standard Deviation of the first method is equal to the Standard Deviation of the second method. The alternative hypothesis: The Standard Deviation of the first method is not equal to the Standard Deviation of the second method. These hypotheses can also be stated in terms of variance.

9 Test of Variance Example
4. Sample Size: α = β = 0.10. Open Minitab Worksheet “EXH_AOV.MTW”. Stat > Power and Sample Size > 2 Variances…
[MINITAB™ Session Window: Power and Sample Size, Test for Two Standard Deviations. Testing (StDev 1 / StDev 2) = 1 (versus not =). Calculating power for (StDev 1 / StDev 2) = ratio. Alpha = 0.05. Method: Levene's Test. Columns: Ratio, Sample Size, Target Power, Actual Power. The sample size is for each group.]
Minimum sample size of 89 required. Now open the data set “EXH_AOV.MTW”. Then follow along in MINITAB™. From the “Stat” menu click “Power and Sample Size” then click “2 Variances”. This will open the window shown above. Insert the given values for “Ratios” and for “Power values:”. Hit “OK” to yield the output shown as the Session Window. Here we learn, for the parameters we have selected, we need a minimum sample size of 89 per group. Since the data set provided has over one hundred data points we have the information we need.

10 Normality Test – Follow the Roadmap
5. Statistical Solution: Stat>Basic Statistics>Normality Test Check Normality.

11 Normality Test – Follow the Roadmap
Ho: Data is Normal Ha: Data is NOT Normal Stat>Basic Stats> Normality Test (Use Anderson Darling) According to the graph we have Normal data.
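For readers without MINITAB, the Anderson-Darling check can be sketched in plain Python. This is a minimal sketch, not the software's implementation: MINITAB reports a P-value, whereas this version compares the small-sample-adjusted statistic to 0.752, the commonly tabulated 5% critical value when the Mean and Standard Deviation are estimated from the data. The function name and both data sets are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

# Tabulated 5% critical value for the adjusted statistic when the Mean and
# Standard Deviation are both estimated from the sample (an assumption of this sketch).
CRITICAL_5PCT = 0.752


def anderson_darling_normal(sample):
    """Return the small-sample-adjusted Anderson-Darling statistic for Normality."""
    y = sorted(sample)
    n = len(y)
    fitted = NormalDist(mean(y), stdev(y))
    cdf = [fitted.cdf(v) for v in y]
    s = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - s / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


# Hypothetical data: evenly spaced Normal quantiles, and a sample with one extreme value.
near_normal = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 50]

for name, data in (("near_normal", near_normal), ("skewed", skewed)):
    stat = anderson_darling_normal(data)
    verdict = "fail to reject Ho (Normal)" if stat < CRITICAL_5PCT else "reject Ho (NOT Normal)"
    print(f"{name}: adjusted A-squared = {stat:.3f} -> {verdict}")
```

A statistic below the critical value corresponds to a P-value above 0.05, i.e. we fail to reject the null hypothesis that the data is Normal.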

12 Test of Equal Variance – Normal Data
Stat>ANOVA>Test for Equal Variance Now conduct the test for Equal Variance.

13 Test of Equal Variance – Normal Data
6. Practical Solution: The difference between the Standard Deviations from the two samples is not significant. Use F-Test for 2 samples Normally distributed data. P-value < 0.05 (.002) What is the Statistical Solution? Fail to reject.

14 We can assume our data is Normally Distributed.
Normality Test Perform another test using the column Rot. First run the Normality Test… The p-value is > 0.05 We can assume our data is Normally Distributed. Please read the slide.

15 Test for Equal Variance (Normal Data)
Then run a Test for Equal Variance using Rot as the “Response:” and Temp as the “Factors:”. Please read the slide.

16 P-value > 0.05; there is no statistically significant difference.
Test of Equal Variance Ho: σ1 = σ2 Ha: σ1≠ σ2 P-value > 0.05; there is no statistically significant difference. You can see there is no statistical difference for variance in Rot based on temperature as a factor. Since the data is Normally Distributed and we have 2 samples use F-Test statistic.

17 Use F- Test for 2 samples of Normally Distributed Data.
Test of Equal Variance Use F- Test for 2 samples of Normally Distributed Data. As you can see this Session Window shows the same P-value.

18 Continuous Data - Normal
Another method for testing for Equal Variance will allow more than one factor. The Labels are the factors. The data is the Output.

19 Test For Equal Variances
Stat>ANOVA>Test for Equal Variance This time we have Rot as the response and Temp and Oxygen as the factors.

20 Test For Equal Variances Graphical Analysis
P-value > 0.05 shows no significant difference between the variances. This graph shows a test of Equal Variance displaying Bonferroni 95% confidence intervals for the response Standard Deviation at each level. As you will see, the Bartlett’s and Levene’s test results are displayed in the same Session Window. The asymmetry of the intervals is due to the Skewness of the chi-square distribution. For the potato rot example you fail to reject the null hypothesis of the variances being equal.

21 Test For Equal Variances Statistical Analysis
Test for Equal Variances: Rot versus Temp, Oxygen
[Session Window: 95% Bonferroni confidence intervals for Standard Deviations; columns Temp, Oxygen, N, Lower, StDev, Upper.]
Bartlett's Test (Normal distribution): P-value = 0.744. Use this if data is Normal and for Factors ≥ 2.
Levene's Test (any continuous distribution): P-value = 0.858. Use this if data is Non-normal and for Factors ≥ 2.
Does the Session Window have the same P-values as the Graphical Analysis?
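To show what sits behind the Bartlett's Test line of the Session Window, here is a minimal Python sketch of the textbook Bartlett statistic, which is compared against a chi-square distribution with (number of groups - 1) degrees of freedom. The function names and the rot values are hypothetical; they are not taken from “EXH_AOV.MTW” and the sketch is not MINITAB's implementation.

```python
import math
from statistics import variance


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    cdf = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - cdf)


def bartlett_test(groups):
    """Bartlett's test of Ho: all group variances are equal (assumes Normal data)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1.0 + (
        sum(1.0 / (n - 1) for n in sizes) - 1.0 / (n_total - k)
    ) / (3.0 * (k - 1))
    statistic = numerator / correction
    return statistic, chi2_sf(statistic, k - 1)


# Hypothetical rot measurements for three factor-level combinations.
rot_groups = [
    [13, 11, 3, 10, 4, 7],
    [26, 19, 24, 15, 22, 18],
    [15, 2, 7, 20, 10, 9],
]
statistic, p_value = bartlett_test(rot_groups)
decision = "fail to reject Ho" if p_value > 0.05 else "reject Ho"
print(f"Bartlett statistic = {statistic:.3f}, P-value = {p_value:.3f}: {decision}")
```

As on the slide, a P-value above 0.05 means we fail to reject the null hypothesis of Equal Variances.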

22 Tests for Variance Exercise
Exercise objective: Utilize what you have learned to conduct and analyze a test for Equal Variance using MINITAB™. 1. The quality manager was challenged by the plant director as to why the VOC levels in the product varied so much. After using a Process Map some potential sources of variation were identified. These sources included operating shifts and the raw material supplier. Of course the quality manager has already clarified the Gage R&R results were less than 17% study variation so the gage was acceptable. The quality manager decided to investigate the effect of the raw material supplier. He wants to see if the variation of the product quality is different when using supplier A or supplier B. He wants to be at least 95% confident the variances are similar when using the two suppliers. Use data ppm VOC and RM Supplier to determine if there is a difference between suppliers. Exercise.

23 Tests for Variance Exercise: Solution
First we want to do a graphical summary of the two samples from the two suppliers. Please read the slide.

24 Tests for Variance Exercise: Solution
In “Variables:” enter ‘ppm VOC’ In “By variables:” enter ‘RM Supplier’ We want to see if the two samples are from Normal populations. Please read the slide.

25 Tests for Variance Exercise: Solution
The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the samples are from Normally Distributed populations because we “failed to reject” the null hypothesis that the data sets are from Normal Distributions. Are both Data Sets Normal? Please read the slide.

26 Tests for Variance Exercise: Solution
Continue to determine if they are of Equal Variance.

27 Tests for Variance Exercise: Solution
For “Response:” enter ‘ppm VOC’. For “Factors:” enter ‘RM Supplier’. Note MINITAB™ defaults to a 95% confidence level, which is exactly the level we want to test for this problem. Please read the slide.

28 Tests for Variance Exercise: Solution
Because both populations were considered to be Normally Distributed the F-test is used to evaluate whether the variances (Standard Deviation squared) are equal. The P-value of the F-test was greater than 0.05 so we “fail to reject” the null hypothesis. So once again in English: The variances are equal between the results from the two suppliers on our product’s ppm VOC level. Please read the slide.

29 Normal Continuous Data
[Hypothesis Testing Roadmap diagram, Continuous Data, Normal: Test of Equal Variance (Variance Equal / Variance Not Equal), 1 Sample Variance, 1 Sample t-test, 2 Sample T (Two Samples), One Way ANOVA.] Now we will explore ANOVA.

30 Purpose of ANOVA
Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables. Analysis of Variance extends the two sample t-test for testing the equality of two population Means to a more general null hypothesis of comparing the equality of more than two Means versus them not all being equal. The classification variable, or factor, usually has three or more levels (if there are only two levels, a t-test can be used). ANOVA allows you to examine differences among Means using multiple comparisons. The ANOVA test statistic is the ratio of the between-group variance to the within-group variance: $F = MS_{Between} / MS_{Within}$. Please read the slide.

31 What do we want to know?
Is the “between group” variation large enough to be distinguished from the “within group” variation? [Figure: two group distributions with Means μ1 and μ2, showing delta (δ) (Between Group Variation), Within Group Variation (level of supplier 1) and Total (Overall) Variation.] Please read the slide.

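32 Calculating ANOVA
Total (Overall) Variation: $SS_{Total} = \sum_{j=1}^{G} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{X}})^2$
Between Group Variation, delta (δ): $SS_{Between} = \sum_{j=1}^{G} n_j (\bar{X}_j - \bar{\bar{X}})^2$
Within Group Variation: $SS_{Within} = \sum_{j=1}^{G} \sum_{i=1}^{n_j} (x_{ij} - \bar{X}_j)^2$
Where: G = the number of groups (levels in the study); $x_{ij}$ = the ith individual in the jth group; $n_j$ = the number of individuals in the jth group or level; $\bar{\bar{X}}$ = the grand Mean; $\bar{X}_j$ = the Mean of the jth group or level. Take a moment to review the formulas for an ANOVA.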
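The between-group, within-group and total variation can be computed directly from their definitions. The sketch below is illustrative only: the function name and the three small groups are hypothetical, and the sums of squares follow the standard one-way ANOVA definitions (total = between + within).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS between, SS within, SS total, F) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between Group Variation: group Means around the grand Mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within Group Variation: individuals around their own group Mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Total (Overall) Variation: individuals around the grand Mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, ss_total, f_stat


# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
ss_between, ss_within, ss_total, f_stat = one_way_anova(groups)
print(f"SS between = {ss_between}, SS within = {ss_within}, SS total = {ss_total}, F = {f_stat}")
```

A large F means the between-group variation is large relative to the within-group variation, which is the question the previous slide asks.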

33 Alpha Risk and Pair-Wise t-tests
The alpha risk increases as the number of Means increases with a pair-wise t-test scheme. The formula for the overall alpha risk when testing more than one pair of Means using a t-test is: $\alpha_{overall} = 1 - (1 - \alpha)^k$, where k is the number of pair-wise tests. The reason we do not use a t-test to evaluate a series of Means is that the alpha risk increases as the number of Means increases. If we had 7 pairs of Means and an alpha of 0.05 our actual alpha risk could be as high as $1 - 0.95^7 \approx 0.30$, or 30%. Notice we did not say it was 30%, only that it could be as high as 30%, which is quite unacceptable.
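The growth of the alpha risk is easy to tabulate. The short sketch below assumes the pair-wise tests are independent, which is why the result is an upper-end figure rather than an exact risk; the function names are invented for this illustration.

```python
import math


def familywise_alpha(alpha, comparisons):
    """Overall alpha risk for a number of independent tests, each run at `alpha`."""
    return 1.0 - (1.0 - alpha) ** comparisons


def pairwise_comparisons(means):
    """Number of distinct pairs among a given number of Means."""
    return math.comb(means, 2)


for pairs in (1, 3, 7, 10):
    print(f"{pairs:2d} pairs at alpha 0.05 -> overall alpha up to {familywise_alpha(0.05, pairs):.3f}")

# Comparing every pair among 4 Means already requires 6 t-tests.
print(pairwise_comparisons(4), "pair-wise tests for 4 Means")
```

With 7 pairs the figure is about 0.30, matching the 30% quoted on the slide.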

34 Three Samples
We have three potential suppliers claiming to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier but we must ensure we do not affect the quality of our raw material. We would like to test the data to determine if there is a difference between the three suppliers. File>Open Worksheet > ANOVA.MTW Please read the slide.

35 Follow the Roadmap…Test for Normality
The samples of all three suppliers are Normally Distributed. Supplier A P-value = 0.568 Supplier B P-value = 0.385 Supplier C P-value = 0.910 Compare P-values.

36 Test for Equal Variance…
Test for Equal Variance (must stack data to create “Response” & “ Factors”): Before testing for Equal Variance you must first stack the worksheet. According to the data there is no significant difference in the variance of the three suppliers.

37 Enter Stacked Supplier data in “Responses:” Check “Boxplots of data”
ANOVA in MINITAB™: Stat>ANOVA>One-Way Unstacked. Enter Stacked Supplier data in “Responses:”. Click on “Graphs…” and check “Boxplots of data”. Follow along in MINITAB™.

38 What does this graph tell us?
ANOVA What does this graph tell us? There does not seem to be a huge difference here.

39 No Difference between suppliers
ANOVA Session Window. Stat>ANOVA>One Way (unstacked)
[Session Window: One-way ANOVA: Supplier A, Supplier B, Supplier C. ANOVA table with columns Source, DF, SS, MS, F, P and rows Factor, Error, Total; R-Sq = 18.95%; R-Sq(adj) = 5.44%; N, Mean and StDev for each supplier; individual 95% CIs for each Mean based on Pooled StDev.]
P-value > .05: No Difference between suppliers. Looking at the P-value the conclusion is we fail to reject the null hypothesis. According to the data there is no significant difference between the Means of the three suppliers.

40 ANOVA F-Calc F-Critical
[Session Window: One-way ANOVA: Supplier A, Supplier B, Supplier C, the same output as on the previous slide; R-Sq = 18.95%; R-Sq(adj) = 5.44%.]
Before looking up the F-Critical value you must first know the degrees of freedom. The ANOVA test statistic is the variance between the Means divided by the variance within the groups. Therefore the numerator degrees of freedom are three suppliers minus 1, or 2 degrees of freedom. The denominator degrees of freedom are 5 samples minus 1 (for each supplier) multiplied by 3 suppliers, or 12 degrees of freedom. As you can see the F-Critical value is 3.89; since the F-Calc of 1.40 is less than the critical value, we fail to reject the null hypothesis.
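The degrees of freedom and the F-Calc value can be cross-checked from the numbers that survive on the slide. This sketch relies on the standard identity R-Sq = SS(Factor) / SS(Total) to recover F from R-Sq; the function names are hypothetical, and the F-Critical value of 3.89 is simply taken from the slide rather than computed.

```python
def anova_degrees_of_freedom(levels, samples_per_level):
    """Numerator and denominator degrees of freedom for a balanced one-way ANOVA."""
    return levels - 1, levels * (samples_per_level - 1)


def f_from_r_squared(r_sq, df_between, df_within):
    """Recover F-Calc from R-Sq, using R-Sq = SS(Factor) / SS(Total)."""
    return (r_sq / df_between) / ((1.0 - r_sq) / df_within)


df_between, df_within = anova_degrees_of_freedom(levels=3, samples_per_level=5)
f_calc = f_from_r_squared(0.1895, df_between, df_within)  # R-Sq = 18.95% from the Session Window
F_CRITICAL = 3.89  # value quoted on the slide for (2, 12) degrees of freedom

decision = "reject Ho" if f_calc > F_CRITICAL else "fail to reject Ho"
print(f"df = ({df_between}, {df_within}), F-Calc = {f_calc:.2f}, F-Critical = {F_CRITICAL}: {decision}")
```

The recovered F-Calc agrees with the 1.40 quoted on the slide, so the R-Sq, the degrees of freedom and the conclusion are mutually consistent.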

41 Let’s check how much difference we can see with a sample of 5.
Sample Size: Let’s check how much difference we can see with a sample of 5.
[Session Window: Power and Sample Size, One-way ANOVA. Assumed Standard Deviation = 1; Number of Levels = 3. Columns: SS Means, Sample Size, Power, Maximum Difference. The sample size is for each level.]
Will having a sample of 5 show a difference? After crunching the numbers we conclude a sample of 5 can only detect a difference of 2.56 Standard Deviations. This means the Means would have to differ by at least 2.56 Standard Deviations before we could see a difference. To help alleviate this problem a larger sample should be used. A larger sample gives a more sensitive result for both the Means and the variance.

42 If the model is adequate the residual plots will be structureless.
ANOVA Assumptions:
1. Observations are adequately described by the model.
2. Errors are Normally and independently distributed.
3. Homogeneity of variance among factor levels.
In one-way ANOVA model adequacy can be checked by either of the following:
1. Check the data for Normality at each level and for homogeneity of variance across all levels.
2. Examine the residuals (a residual is the difference between the true observation and what the model predicts): Normal plot of the residuals, Residuals versus fits, Residuals versus order.
If the model is adequate the residual plots will be structureless. Please read the slide.
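In a one-way ANOVA the model's prediction (the fit) for each observation is simply the Mean of its group, so the residuals are easy to compute by hand. The sketch below uses hypothetical supplier data and an invented helper name; it is meant only to show what the residual plots are built from.

```python
from statistics import mean


def fits_and_residuals(groups):
    """Return (label, fit, residual) rows; the fit is the Mean of the observation's group."""
    rows = []
    for label, values in groups.items():
        fit = mean(values)
        rows.extend((label, fit, x - fit) for x in values)
    return rows


# Hypothetical measurements for three suppliers, NOT the data in ANOVA.MTW.
suppliers = {
    "Supplier A": [3.2, 4.1, 2.9, 3.8, 3.5],
    "Supplier B": [4.4, 3.9, 4.8, 4.1, 4.3],
    "Supplier C": [3.6, 3.1, 4.0, 3.3, 3.5],
}
rows = fits_and_residuals(suppliers)
for label, fit, residual in rows:
    print(f"{label}: fit = {fit:.2f}, residual = {residual:+.2f}")
```

Plotting the residuals against the fits, against run order, and on a Normal probability plot gives the three checks listed above; within each group the residuals always sum to zero.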

43 Residual Plots Stat>ANOVA>One-Way Unstacked>Graphs
To generate the residual plots in MINITAB™ select “Stat>ANOVA>One-way Unstacked>Graphs”, then select “Individual plots” and check all three types of plots.

44 Histogram of Residuals
The Histogram of Residuals should show a bell-shaped curve.

45 Normal Probability Plot of Residuals
The Normality plot of the Residuals should follow a straight line on the probability plot. (Does a pencil cover all the dots?) Results of our example look good. The Normality assumption is satisfied.

46 Residuals versus Fitted Values
The plot of Residuals versus fits examines constant variance. The plot should be structureless with no outliers present. Our example does not indicate a problem.

47 ANOVA Exercise
Exercise objective: Utilize what you have learned to conduct an analysis of a one way ANOVA using MINITAB™. The quality manager was challenged by the plant director as to why the VOC levels in the product varied so much. The quality manager now wants to find if the product quality is different because of how the shifts work with the product. The quality manager wants to know if the average is different for the ppm VOC of the product among the production shifts. Use Data in columns “ppm VOC” and “Shift” in “hypotest stud.mtw” to determine the answer for the quality manager at a 95% confidence level. Exercise.

48 ANOVA Exercise: Solution
First we need to do a graphical summary of the samples from the 3 shifts. Stat>Basic Stat>Graphical Summary Please read the slide.

49 ANOVA Exercise: Solution
We want to see if the 3 samples are from Normal populations. In “Variables:” enter ‘ppm VOC’ In “By variables:” enter ‘Shift’ Please read the slide.

50 ANOVA Exercise: Solution
The P-value is greater than 0.05 for all three Anderson-Darling Normality Tests so we conclude the samples are from Normally Distributed populations because we “failed to reject” the null hypothesis that the data sets are from Normal Distributions. Please read the slide.

51 ANOVA Exercise: Solution
Now we need to test the variances. For “Response:” enter ‘ppm VOC’ For “Factors:” enter ‘Shift’ First we need to determine if our data has Equal Variances. Stat > ANOVA > Test for Equal Variances… Please read the slide.

52 ANOVA Exercise: Solution
The P-value of the test for Equal Variances (Bartlett’s Test, since there are three shifts) was greater than 0.05 so we “fail to reject” the null hypothesis. Are the variances equal? Yes! Please read the slide.

53 ANOVA Exercise: Solution
Stat > ANOVA > One-Way… We need to use the One-Way ANOVA to determine if the Mean product quality is equal across the 3 shifts. For “Response:” enter ‘ppm VOC’. For “Factor:” enter ‘Shift’. Again we want to put 95.0 for the confidence level. Remember to click “Assume equal variances” because we determined the variances were equal among the 3 samples. Also be sure to click “Graphs…” to select “Four in one” under residual plots. Please read the slide.

54 ANOVA Exercise: Solution
We must look at the Residual Plots to be sure our ANOVA analysis is valid. Since our residuals look Normally Distributed and randomly patterned we will assume our analysis is correct. Please read the slide.

55 ANOVA Exercise: Solution
Since the P-value of the ANOVA test is less than 0.05 we “reject” the null hypothesis that the Mean product quality as measured in ppm VOC is the same for all shifts. We “accept” the alternate hypothesis that the Mean product quality differs for at least one shift. Since the confidence intervals of the Means do not overlap between Shift 1 and Shift 3 we see one of the shifts is delivering a product quality with a higher level of ppm VOC. Don’t miss that shift! Please read the slide.

56 At this point you should be able to:
Summary: At this point you should be able to: conduct Hypothesis Testing of Variances; understand how to Analyze Hypothesis Testing Results. Please read the slide.

57 Learn about IASSC Certifications and Exam options at…
IASSC Certified Lean Six Sigma Green Belt (ICGB): The International Association for Six Sigma Certification (IASSC) is a Professional Association dedicated to growing and enhancing the standards within the Lean Six Sigma Community. IASSC is the only independent third-party certification body within the Lean Six Sigma Industry that does not provide training, mentoring and coaching or consulting services. IASSC exclusively facilitates and delivers centralized universal Lean Six Sigma Certification Standards testing and organizational Accreditations. The IASSC Certified Lean Six Sigma Green Belt (ICGB) is an internationally recognized professional who is well versed in the Lean Six Sigma Methodology and who leads or supports improvement projects. The Certified Green Belt Exam is a 3 hour, 100 question proctored exam.

