McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Experimental Design and Analysis of Variance Chapter 11.

Presentation transcript:


11-2 Experimental Design and Analysis of Variance
11.1 Basic Concepts of Experimental Design
11.2 One-Way Analysis of Variance
11.3 The Randomized Block Design
11.4 Two-Way Analysis of Variance

11-3 Basic Concepts of Experimental Design Up until now, we have considered only two ways of collecting and comparing data: –Using independent random samples –Using paired (or matched) samples Often data is collected as the result of an experiment –To systematically study how one or more factors (variables) influence the variable that is being studied

11-4 Experimental Design #2 In an experiment, there is strict control over the factors contributing to the experiment –The values or levels of the factors are called treatments For example, in testing a medical drug, the experimenters decide which participants in the test get the drug and which ones get the placebo, instead of leaving the choice to the subjects The object is to compare and estimate the effects of different treatments on the response variable

11-5 Experimental Design #3 The different treatments are assigned to objects (the test subjects) called experimental units –When a treatment is applied to more than one experimental unit, the treatment is being “replicated” A designed experiment is an experiment where the analyst controls which treatments are used and how they are applied to the experimental units

11-6 Experimental Design #4 In a completely randomized experimental design, independent random samples of experimental units are assigned to the treatments –For example, suppose three experimental units are to be assigned to each of five treatments –For a completely randomized experimental design, randomly pick three experimental units for one treatment, randomly pick three different experimental units from those remaining for the next treatment, and so on

11-7 Experimental Design #5 Once the experimental units are assigned and the experiment is performed, a value of the response variable is observed for each experimental unit –Obtain a sample of values for the response variable for each treatment

11-8 Experimental Design #6 In a completely randomized experimental design, it is presumed that each sample is a random sample from the population of all possible values of the response variable that could be observed when using the specific treatment –The samples are independent of each other –This is reasonable because the completely randomized design ensures that each sample results from different measurements being taken on different experimental units –We can also say that an independent samples experiment is being performed

11-9 Example 11.1: Gasoline Mileage Case Compare the effects of three types of gasoline (Types A, B, and C) on the gasoline mileage of a particular make and model midsized automobile –The response variable is gasoline mileage, in miles per gallon (mpg) –The gasoline types (A, B, or C) are the treatments

11-10 Example 11.1: Gasoline Mileage Case #2 Use a completely randomized experimental design –Have available 1,000 cars for testing –Need samples of size five for each gasoline type –Randomly select five cars from the 1,000 cars; assign these five to get gasoline type A –Randomly select five cars from the 995 remaining cars; these five are assigned to get gasoline type B –Randomly select five cars from the 990 remaining cars; these five are assigned to get gasoline type C Each randomly selected car is test driven using the appropriate gasoline type and driving conditions
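The assignment procedure on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `assign_completely_randomized`, the car IDs 1 to 1000, and the fixed seed are assumptions made for the sketch, not part of the case.

```python
import random


def assign_completely_randomized(units, treatments, per_treatment, seed=None):
    """Randomly assign `per_treatment` distinct units to each treatment."""
    rng = random.Random(seed)
    needed = per_treatment * len(treatments)
    if needed > len(units):
        raise ValueError("not enough experimental units")
    # Sampling without replacement and then slicing is equivalent to drawing
    # each group in turn from the units that remain.
    chosen = rng.sample(units, needed)
    return {
        t: chosen[i * per_treatment:(i + 1) * per_treatment]
        for i, t in enumerate(treatments)
    }


cars = list(range(1, 1001))  # hypothetical car IDs 1..1000
assignment = assign_completely_randomized(cars, ["A", "B", "C"], 5, seed=0)
for gas_type, group in assignment.items():
    print(gas_type, sorted(group))
```

Because no car appears in more than one group, the three samples come from different experimental units, which is what justifies treating them as independent.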

11-11 Example 11.1: Gasoline Mileage Case #3 The mileage data is listed on the next slide (Table 11.1) –Let $x_{ij}$ denote the mileage of the jth car (j = 1, 2, …, 5) using gasoline type i (i = A, B, or C) –Assume that the mileage data for a particular gasoline type is a random sample of all possible mileages using that type

11-12 Example 11.1: Gasoline Mileage Case #4
Table 11.1: Gasoline mileage (mpg) by gasoline type
Type A          Type B          Type C
x_A1 = 34.0     x_B1 = 35.3     x_C1 = 33.3
x_A2 = 35.0     x_B2 = 36.5     x_C2 = 34.0
x_A3 = 34.3     x_B3 = 36.4     x_C3 = 34.7
x_A4 = 35.5     x_B4 = 37.0     x_C4 = 33.0
x_A5 = 35.8     x_B5 = 37.6     x_C5 = 34.9

11-13 Example 11.1: Gasoline Mileage Case #5 Looking at the box plots below, we could get the idea that type B gives the highest gasoline mileage

11-14 One-Way Analysis of Variance Want to study the effects of all p treatments on a response variable –For each treatment, find the mean and standard deviation of all possible values of the response variable when using that treatment –For treatment i, find treatment mean $\mu_i$ One-way analysis of variance estimates and compares the effects of the different treatments on the response variable –By estimating and comparing the treatment means $\mu_1, \mu_2, \dots, \mu_p$ –One-way analysis of variance, or one-way ANOVA

11-15 Example 11.4: Gasoline Mileage Case The mean of a sample is the point estimate for the corresponding treatment mean
$\bar{x}_A = 34.92$ mpg estimates $\mu_A$
$\bar{x}_B = 36.56$ mpg estimates $\mu_B$
$\bar{x}_C = 33.98$ mpg estimates $\mu_C$

11-16 Example 11.4: Gasoline Mileage Case Continued The standard deviation of a sample is the point estimate for the corresponding treatment standard deviation
$s_A = 0.7662$ mpg estimates $\sigma_A$
$s_B = 0.8503$ mpg estimates $\sigma_B$
$s_C = 0.8349$ mpg estimates $\sigma_C$
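As a quick check, the point estimates for the three gasoline types can be recomputed from the Table 11.1 data with the Python standard library. The dictionary name `mileage` is just a label chosen for this sketch.

```python
from statistics import mean, stdev

# Mileage data (mpg) from Table 11.1
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

# Sample mean and sample standard deviation (n - 1 divisor) per treatment
summary = {t: (mean(x), stdev(x)) for t, x in mileage.items()}
for t, (xbar, s) in summary.items():
    print(f"Type {t}: mean = {xbar:.2f} mpg, s = {s:.4f} mpg")
```

The three standard deviations are close to one another, which is the informal check behind the constant-variance assumption discussed later.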

11-17 ANOVA Notation
$n_i$ denotes the size of the sample randomly selected for treatment i
$x_{ij}$ is the jth value of the response variable using treatment i
$\bar{x}_i$ is the average of the sample of $n_i$ values for treatment i
–$\bar{x}_i$ is the point estimate of the treatment mean $\mu_i$
$s_i$ is the standard deviation of the sample of $n_i$ values for treatment i
–$s_i$ is the point estimate for the treatment (population) standard deviation $\sigma_i$

11-18 One-Way ANOVA Assumptions Completely randomized experimental design –Assume that a sample has been selected randomly for each of the p treatments on the response variable using a completely randomized experimental design Constant variance –The p populations of values of the response variable (associated with the p treatments) all have the same variance

11-19 One-Way ANOVA Assumptions Continued Normality –The p populations of values of the response variable all have normal distributions Independence –The samples of experimental units are randomly selected, independent samples

11-20 Notes on Assumptions One-way ANOVA is not very sensitive to violations of the equal variances assumption –Especially when all the samples are about the same size –All of the sample standard deviations should be reasonably equal to each other

11-21 Notes on Assumptions Continued Normality is not crucial –ANOVA results are approximately valid for mound-shaped distributions If the sample distributions are reasonably symmetric and if there are no outliers, then ANOVA results are valid for even small samples For gasoline mileages, the assumptions are roughly satisfied

11-22 Testing for Significant Differences Between Treatment Means Are there any statistically significant differences between the sample (treatment) means? The null hypothesis is that the means of all p treatments are the same –$H_0: \mu_1 = \mu_2 = \dots = \mu_p$ The alternative is that at least two (possibly all) of the p treatments have different effects on the mean response –$H_a$: at least two of $\mu_1, \mu_2, \dots, \mu_p$ differ

11-23 Testing for Significant Differences Between Treatment Means Continued Compare the between-treatment variability to the within-treatment variability –Between-treatment variability is the variability of the sample means from sample to sample –Within-treatment variability is the variability of the observed values within each sample

11-24 Example 11.5: The Gasoline Mileage Case In Figure 11.1(a) (next slide), the between-treatment variability is not large compared to the within-treatment variability –The between-treatment variability could be the result of sampling variability –Do not have enough evidence to reject $H_0: \mu_A = \mu_B = \mu_C$ In Figure 11.1(b), between-treatment variability is large compared to the within-treatment variability –May have enough evidence to reject $H_0$ in favor of $H_a$: at least two of $\mu_A, \mu_B, \mu_C$ differ

11-25 Example 11.5: The Gasoline Mileage Case #2

11-26 MINITAB and Excel Output of an ANOVA of Gasoline Mileage Data in Table 11.1

11-27 Partitioning the Total Variability in the Response
Total Variability = Between-Treatment Variability + Within-Treatment Variability
Total Sum of Squares = Treatment Sum of Squares + Error Sum of Squares
SSTO = SST + SSE

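11-28 Note The overall mean $\bar{x}$ is
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{p}\sum_{j=1}^{n_i} x_{ij}$$
where $n = n_1 + n_2 + \dots + n_i + \dots + n_p$
Also
$$SST = \sum_{i=1}^{p} n_i(\bar{x}_i - \bar{x})^2, \qquad SSE = \sum_{i=1}^{p}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2, \qquad SSTO = \sum_{i=1}^{p}\sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2$$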

11-29 Mean Squares
The treatment mean square is $MST = SST/(p-1)$
The error mean square is $MSE = SSE/(n-p)$

11-30 F Test for Difference Between Treatment Means Suppose that we want to compare p treatment means The null hypothesis is that all treatment means are the same: –$H_0: \mu_1 = \mu_2 = \dots = \mu_p$ The alternative hypothesis is that they are not all the same: –$H_a$: at least two of $\mu_1, \mu_2, \dots, \mu_p$ differ

11-31 F Test for Difference Between Treatment Means #2 Define the F statistic: $F = MST/MSE$ The p-value is the area under the F curve to the right of F, where the F curve has p – 1 numerator and n – p denominator degrees of freedom

11-32 F Test for Difference Between Treatment Means #3 Reject $H_0$ in favor of $H_a$ at the $\alpha$ level of significance if $F > F_\alpha$ or if p-value < $\alpha$ $F_\alpha$ is based on p – 1 numerator and n – p denominator degrees of freedom

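11-33 Gasoline Mileages Data For the p = 3 gasoline types and n = 15 observations (5 observations per type):
The overall mean is $\bar{x} = (34.0 + 35.0 + \dots + 34.9)/15 = 527.3/15 = 35.153$
The treatment sum of squares is $SST = 5(34.92 - 35.153)^2 + 5(36.56 - 35.153)^2 + 5(33.98 - 35.153)^2 = 17.0493$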

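11-34 Gasoline Mileages Data Continued
The error sum of squares is $SSE = \sum_{j}(x_{Aj} - 34.92)^2 + \sum_{j}(x_{Bj} - 36.56)^2 + \sum_{j}(x_{Cj} - 33.98)^2 = 2.348 + 2.892 + 2.788 = 8.028$
The total sum of squares is SSTO = SST + SSE = 17.0493 + 8.028 = 25.0773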
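The partition SSTO = SST + SSE can be verified numerically from the Table 11.1 data. The following is a minimal sketch; the helper name `one_way_sums_of_squares` is an assumption of this example.

```python
# Mileage data (mpg) from Table 11.1
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}


def one_way_sums_of_squares(samples):
    """Return (SST, SSE, SSTO) for a dict of treatment -> list of values."""
    all_values = [x for s in samples.values() for x in s]
    grand_mean = sum(all_values) / len(all_values)
    # Between-treatment variability: sample means around the overall mean
    sst = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2
              for s in samples.values())
    # Within-treatment variability: values around their own sample mean
    sse = sum((x - sum(s) / len(s)) ** 2
              for s in samples.values() for x in s)
    # Total variability: values around the overall mean
    ssto = sum((x - grand_mean) ** 2 for x in all_values)
    return sst, sse, ssto


sst, sse, ssto = one_way_sums_of_squares(mileage)
print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SSTO = {ssto:.4f}")
```

Computing SSTO directly, rather than as SST + SSE, makes the identity a genuine check on the other two quantities.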

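11-35 Example 11.5: Gasoline Mileages Case
The treatment mean square is MST = SST/(p – 1) = 17.0493/2 = 8.525
The error mean square is MSE = SSE/(n – p) = 8.028/12 = 0.669
The F statistic is F = MST/MSE = 8.525/0.669 = 12.74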

11-36 Example 11.5: Gasoline Mileages Case #2 At the $\alpha$ = 0.05 significance level, use $F_{0.05}$ with p – 1 = 3 – 1 = 2 numerator and n – p = 15 – 3 = 12 denominator degrees of freedom From Table A.6, $F_{0.05}$ = 3.89 F = 12.74 > $F_{0.05}$ = 3.89 Therefore reject $H_0$ at the 0.05 significance level –There is strong evidence that at least two of the treatment means differ –So at least two of the three gasoline types have different effects on gasoline mileage But which ones? Do pairwise comparisons (next topic)
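The test on this slide can be reproduced from the Table 11.1 data as sketched below. The critical value 3.89 is the Table A.6 value quoted on the slide. The p-value uses a closed form that holds only when the numerator has exactly 2 degrees of freedom, $P(F > f) = (1 + 2f/d_2)^{-d_2/2}$; for other designs a statistics library would be needed. Function names are assumptions of this sketch.

```python
# Mileage data (mpg) from Table 11.1
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}


def one_way_f(samples):
    """Return (MST, MSE, F) for a one-way layout."""
    p = len(samples)
    values = [x for s in samples.values() for x in s]
    n = len(values)
    grand = sum(values) / n
    sst = sum(len(s) * (sum(s) / len(s) - grand) ** 2 for s in samples.values())
    sse = sum((x - sum(s) / len(s)) ** 2 for s in samples.values() for x in s)
    mst, mse = sst / (p - 1), sse / (n - p)
    return mst, mse, mst / mse


def f_upper_tail_2_numerator_df(f, den_df):
    """P(F > f) when the numerator has exactly 2 degrees of freedom."""
    return (1 + 2 * f / den_df) ** (-den_df / 2)


mst, mse, f_stat = one_way_f(mileage)
F_CRIT_05 = 3.89                      # Table A.6 value quoted on the slide
reject_h0 = f_stat > F_CRIT_05
p_value = f_upper_tail_2_numerator_df(f_stat, den_df=12)
print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}")
print(f"reject H0 at 0.05: {reject_h0}, p-value = {p_value:.4f}")
```

Both decision rules agree here: F exceeds the critical value and the p-value is well below 0.05.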

11-37 Example 11.5: Gasoline Mileages Case #3
Source       Degrees of Freedom   Sum of Squares   Mean Square         F Statistic
Treatments   p-1                  SST              MST = SST/(p-1)     F = MST/MSE
Error        n-p                  SSE              MSE = SSE/(n-p)
Total        n-1                  SSTO
Example 11.5 The Gasoline Mileage Case (Excel Output)

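11-38 Pairwise Comparisons, Individual Intervals Individual $100(1-\alpha)\%$ confidence interval for $\mu_i - \mu_h$:
$$(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_h}\right)}$$
$t_{\alpha/2}$ is based on n – p degrees of freedom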
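A minimal sketch of an individual interval for the gasoline data follows. The critical value $t_{0.025} = 2.179$ for 12 degrees of freedom is taken from a standard t table and is an assumption of this example (it does not appear in the transcript), as is the helper name `individual_ci`.

```python
from math import sqrt
from statistics import mean

# Mileage data (mpg) from Table 11.1
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

p = len(mileage)
n = sum(len(x) for x in mileage.values())
means = {t: mean(x) for t, x in mileage.items()}
sse = sum((v - means[t]) ** 2 for t, x in mileage.items() for v in x)
mse = sse / (n - p)                   # error mean square, n - p = 12 df


def individual_ci(t_i, t_h, t_crit):
    """Individual confidence interval for mu_i - mu_h."""
    diff = means[t_i] - means[t_h]
    margin = t_crit * sqrt(mse * (1 / len(mileage[t_i]) + 1 / len(mileage[t_h])))
    return diff - margin, diff + margin


T_025_12DF = 2.179                    # assumed: standard t table, 12 df
low, high = individual_ci("A", "B", T_025_12DF)
print(f"95% individual CI for mu_A - mu_B: [{low:.4f}, {high:.4f}]")
```

The interval lies entirely below zero, which points to type B giving higher mean mileage than type A.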

11-39 Example 11.6: The Gasoline Mileage Case Comparing three treatments Each sample size is five MSE is 0.669 $q_{0.05}$ = 3.77 for p = 3 and n – p = 12 A Tukey simultaneous 95 percent confidence interval for $\mu_A - \mu_B$ is
$$(34.92 - 36.56) \pm 3.77\sqrt{0.669/5} = -1.64 \pm 1.379 = [-3.019, -0.261]$$

11-40 Pairwise Comparisons, Simultaneous Intervals Tukey simultaneous $100(1-\alpha)\%$ confidence interval for $\mu_i - \mu_h$:
$$(\bar{x}_i - \bar{x}_h) \pm q_\alpha\sqrt{\frac{MSE}{m}}$$
$q_\alpha$ is the upper $\alpha$ percentage point of the studentized range for p and (n – p) from Table A.9 m denotes the common sample size
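The Tukey intervals for all three pairs of gasoline types can be sketched as below, using $q_{0.05}$ = 3.77 as quoted in Example 11.6 and the Table 11.1 data. Variable names are assumptions of this sketch.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Mileage data (mpg) from Table 11.1
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

p = len(mileage)
n = sum(len(x) for x in mileage.values())
m = 5                                 # common sample size
means = {t: mean(x) for t, x in mileage.items()}
sse = sum((v - means[t]) ** 2 for t, x in mileage.items() for v in x)
mse = sse / (n - p)

Q_05 = 3.77                           # studentized range value from Example 11.6
margin = Q_05 * sqrt(mse / m)

# One simultaneous interval per pair of treatments
tukey = {}
for i, h in combinations(sorted(mileage), 2):
    diff = means[i] - means[h]
    tukey[(i, h)] = (diff - margin, diff + margin)
    print(f"mu_{i} - mu_{h}: [{diff - margin:.3f}, {diff + margin:.3f}]")
```

An interval that excludes zero indicates a difference between that pair of treatment means at the simultaneous 95 percent level; with these data the A versus C interval contains zero while the other two do not.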

11-41 Example 11.6: The Gasoline Mileage Case

11-42 The Randomized Block Design A randomized block design compares p treatments (for example, production methods) on each of b blocks (or experimental units or sets of units; for example, machine operators) –Each block is used exactly once to measure the effect of each and every treatment –The order in which each treatment is assigned to a block should be random

11-43 The Randomized Block Design Continued A generalization of the paired difference design; this design controls for variability in experimental units by comparing each treatment on the same (not independent) experimental units –Differences in the treatments are not hidden by differences in the experimental units (the blocks)

11-44 Randomized Block Design
$x_{ij}$: the value of the response variable when block j uses treatment i
$\bar{x}_{i\bullet}$: the mean of the b values of the response variable observed when using treatment i (the treatment i mean)
$\bar{x}_{\bullet j}$: the mean of the p values of the response variable observed when using block j (the block j mean)
$\bar{x}$: the mean of all the bp values of the response variable observed in the experiment (the overall mean)

11-45 Randomized Block Design Continued

11-46 Example 11.7: Defective Cardboard Box Case p = 4 treatments (production methods) b = 3 blocks (machine operators) n = 12 observations

11-47 The ANOVA Table, Randomized Blocks
Source       Degrees of Freedom   Sum of Squares   Mean Square                F Statistic
Treatments   p-1                  SST              MST = SST/(p-1)            F(trt) = MST/MSE
Blocks       b-1                  SSB              MSB = SSB/(b-1)            F(blk) = MSB/MSE
Error        (p-1)(b-1)           SSE              MSE = SSE/[(p-1)(b-1)]
Total        pb-1                 SSTO
where SSTO = SST + SSB + SSE

11-48 Sum of Squares SST measures the amount of between-treatment variability SSB measures the amount of variability due to the blocks SSTO measures the total amount of variability SSE measures the amount of the variability due to error (SSE = SSTO – SST – SSB)
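The sums of squares and F statistics for a randomized block design can be computed as sketched below. The transcript does not reproduce the defective box data, so the observations in `defects` are illustrative values assumed for this sketch (4 methods by 3 operators); they were chosen so that MSB and MSE come out close to the 9.083 and 0.639 quoted in Example 11.7. The function name is also an assumption.

```python
def randomized_block_anova(table):
    """table[i][j] = response for treatment i on block j."""
    p, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (p * b)
    trt_means = [sum(row) / b for row in table]
    blk_means = [sum(row[j] for row in table) / p for j in range(b)]
    sst = b * sum((mu - grand) ** 2 for mu in trt_means)
    ssb = p * sum((mu - grand) ** 2 for mu in blk_means)
    ssto = sum((x - grand) ** 2 for row in table for x in row)
    sse = ssto - sst - ssb            # SSE = SSTO - SST - SSB
    mst = sst / (p - 1)
    msb = ssb / (b - 1)
    mse = sse / ((p - 1) * (b - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse, "SSTO": ssto,
            "MST": mst, "MSB": msb, "MSE": mse,
            "F_trt": mst / mse, "F_blk": msb / mse}


# Illustrative data (assumed, not from the transcript):
# rows = production methods 1..4, columns = machine operators 1..3
defects = [
    [9, 10, 12],
    [8, 11, 12],
    [3, 5, 7],
    [4, 5, 5],
]

result = randomized_block_anova(defects)
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

Removing the block sum of squares from the error term is what lets treatment differences show up even when the blocks themselves differ substantially.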

11-49 F Test for Treatment Effects $H_0$: No difference between treatment effects $H_a$: At least two treatment effects differ Test statistic: F(trt) = MST/MSE Reject $H_0$ if –$F > F_\alpha$ or –p-value < $\alpha$ $F_\alpha$ is based on p-1 numerator and (p-1)(b-1) denominator degrees of freedom

11-50 F Test for Block Effects $H_0$: No difference between block effects $H_a$: At least two block effects differ Test statistic: F(blk) = MSB/MSE Reject $H_0$ if –$F > F_\alpha$ or –p-value < $\alpha$ $F_\alpha$ is based on b-1 numerator and (p-1)(b-1) denominator degrees of freedom

11-51 Example 11.7: Sum of Squares For p = 4 treatments (production methods), b = 3 blocks (machine operators), and n = 12 observations, SST, SSB, SSTO, and SSE are computed from the data –See the textbook for details of the calculations MST = SST/(p-1) = SST/3 MSB = SSB/(b-1) = SSB/2 = 9.083 MSE = SSE/[(p-1)(b-1)] = SSE/6 = 0.639

11-52 Example 11.7: Treatment Effects $H_0$: no differences between the treatment effects vs $H_a$: at least two treatment effects differ Test at the $\alpha$ = 0.05 level of significance –Reject $H_0$ if F(treatments) > $F_{0.05}$ (based on p-1 numerator and (p-1)(b-1) denominator degrees of freedom) F(treatments) = MST/MSE = MST/0.639 $F_{0.05}$ based on p-1 = 3 numerator and (p-1)(b-1) = 6 denominator degrees of freedom is 4.76 (Table A.6)

11-53 Example 11.7: Treatment Effects Continued F(treatments) > $F_{0.05}$ = 4.76 So reject $H_0$ at the 5% significance level Therefore, we have strong evidence that at least two production methods (the treatments) have different effects on the mean hourly production of defective boxes

11-54 Example 11.7: Block Effects $H_0$: no differences between block effects vs $H_a$: at least two block effects differ Test at the $\alpha$ = 0.05 level of significance –Reject $H_0$ if F(blocks) > $F_{0.05}$ (based on b-1 numerator and (p-1)(b-1) denominator degrees of freedom) F(blocks) = MSB/MSE = 9.083/0.639 ≈ 14.2 $F_{0.05}$ based on b-1 = 2 numerator and (p-1)(b-1) = 6 denominator degrees of freedom is 5.14 (Table A.6)

11-55 Example 11.7: Block Effects Continued F(blocks) ≈ 14.2 > $F_{0.05}$ = 5.14 So reject $H_0$ at the 5% significance level Therefore, we have strong evidence that at least two machine operators (the blocks) have different effects on the mean hourly production of defective boxes

11-56 Example 11.7: MINITAB Output of a Randomized Block ANOVA

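11-57 Estimation of Treatment Differences Under Randomized Blocks, Individual Intervals Individual $100(1-\alpha)\%$ confidence interval for $\mu_i - \mu_h$:
$$(\bar{x}_{i\bullet} - \bar{x}_{h\bullet}) \pm t_{\alpha/2}\, s\sqrt{\frac{2}{b}}, \qquad s = \sqrt{MSE}$$
$t_{\alpha/2}$ is based on (p-1)(b-1) degrees of freedom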

11-58 Example 11.8 The Defective Cardboard Box Case $t_{\alpha/2}$ with (4-1)(3-1) = 6 degrees of freedom

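11-59 Estimation of Treatment Differences Under Randomized Blocks, Simultaneous Intervals Tukey simultaneous $100(1-\alpha)\%$ confidence interval for $\mu_i - \mu_h$:
$$(\bar{x}_{i\bullet} - \bar{x}_{h\bullet}) \pm q_\alpha \frac{s}{\sqrt{b}}, \qquad s = \sqrt{MSE}$$
$q_\alpha$ is the upper $\alpha$ percentage point of the studentized range for p and (p-1)(b-1) from Table A.9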

11-60 Example 11.8 The Defective Cardboard Box Case $q_\alpha$ for p = 4 and (p-1)(b-1) = 6
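The half-widths of the individual and Tukey intervals under randomized blocks depend only on MSE, b, and the table values, so they can be sketched without the treatment means. MSE = 0.639 and b = 3 come from Example 11.7; the critical values $t_{0.025}$ = 2.447 (6 degrees of freedom) and $q_{0.05}$ = 4.90 (for 4 and 6) are taken from standard tables and are assumptions of this sketch, as is the function name.

```python
from math import sqrt


def block_design_margins(mse, b, t_crit, q_crit):
    """Half-widths of individual and Tukey intervals under randomized blocks."""
    s = sqrt(mse)
    individual = t_crit * s * sqrt(2 / b)
    simultaneous = q_crit * s / sqrt(b)
    return individual, simultaneous


MSE = 0.639          # from Example 11.7
B = 3                # blocks (machine operators)
T_025_6DF = 2.447    # assumed: standard t table, 6 df
Q_05_4_6 = 4.90      # assumed: studentized range table, p = 4 and 6 df

individual, simultaneous = block_design_margins(MSE, B, T_025_6DF, Q_05_4_6)
print(f"individual margin = {individual:.3f}")
print(f"Tukey margin      = {simultaneous:.3f}")
```

The Tukey margin is wider because it protects the confidence level across all pairwise comparisons at once rather than one comparison at a time.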

11-61 Two-Way Analysis of Variance A two-factor factorial design compares the mean response for each of a levels of factor 1 (for example, display height) and each of b levels of factor 2 (for example, display width) A treatment is a combination of a level of factor 1 and a level of factor 2

11-62 Example 11.9 The Shelf Display Case Tastee Bakery wishes to study the effect of two factors 1. Shelf display height 2. Shelf display width Three settings are used for height and two for width A sample of size three is used for each combination

11-63 Example 11.9 The Shelf Display Case Continued

11-64 Example 11.9: Plotting the Treatment Means

11-65 Example 11.9: A MINITAB Output of the Graphical Analysis

11-66 Possible Treatment Effects in Two-Way ANOVA

11-67 Two-Way ANOVA Table
Source        Degrees of Freedom   Sum of Squares   Mean Square                        F Statistic
Factor 1      a-1                  SS(1)            MS(1) = SS(1)/(a-1)                F(1) = MS(1)/MSE
Factor 2      b-1                  SS(2)            MS(2) = SS(2)/(b-1)                F(2) = MS(2)/MSE
Interaction   (a-1)(b-1)           SS(int)          MS(int) = SS(int)/[(a-1)(b-1)]     F(int) = MS(int)/MSE
Error         ab(m-1)              SSE              MSE = SSE/[ab(m-1)]
Total         abm-1                SSTO
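The entries of the two-way table can be computed as in the sketch below. The transcript does not include the shelf display data, so the observations in `sales` are illustrative values assumed for this example (a = 3 heights, b = 2 widths, m = 3 per cell); the function and variable names are likewise assumptions.

```python
def two_way_anova(cells):
    """cells[i][j] = list of m observations at level i of factor 1, level j of factor 2."""
    a, b, m = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * m
    grand = sum(x for row in cells for cell in row for x in cell) / n
    f1_means = [sum(x for cell in row for x in cell) / (b * m) for row in cells]
    f2_means = [sum(x for row in cells for x in row[j]) / (a * m) for j in range(b)]
    cell_means = [[sum(cell) / m for cell in row] for row in cells]

    ss1 = b * m * sum((v - grand) ** 2 for v in f1_means)
    ss2 = a * m * sum((v - grand) ** 2 for v in f2_means)
    ss_int = m * sum((cell_means[i][j] - f1_means[i] - f2_means[j] + grand) ** 2
                     for i in range(a) for j in range(b))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])
    ssto = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)

    mse = sse / (a * b * (m - 1))
    return {"SS(1)": ss1, "SS(2)": ss2, "SS(int)": ss_int, "SSE": sse, "SSTO": ssto,
            "F(1)": (ss1 / (a - 1)) / mse,
            "F(2)": (ss2 / (b - 1)) / mse,
            "F(int)": (ss_int / ((a - 1) * (b - 1))) / mse}


# Illustrative data (assumed, not from the transcript):
# rows = display height (bottom, middle, top), columns = width (regular, wide)
sales = [
    [[58.2, 53.7, 55.8], [55.7, 52.5, 58.9]],
    [[73.0, 78.1, 75.4], [76.2, 78.4, 82.1]],
    [[52.4, 49.7, 50.9], [54.0, 52.1, 49.9]],
]

table = two_way_anova(sales)
for name, value in table.items():
    print(f"{name} = {value:.2f}")
```

With these illustrative numbers the factor 1 (height) effect dominates while the width and interaction F statistics are small, which is the pattern a plot of treatment means with roughly parallel lines would suggest.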

11-68 Example 11.9 The Shelf Display Case

11-69 F Tests for Treatment Effects
$H_0$: No difference between treatment effects
$H_a$: At least two treatment effects differ
Test statistics:
–Main effects: F(1) = MS(1)/MSE and F(2) = MS(2)/MSE
–Interaction: F(int) = MS(int)/MSE
Reject $H_0$ if $F > F_\alpha$ or p-value < $\alpha$
–For F(1), $F_\alpha$ is based on a-1 and ab(m-1) degrees of freedom
–For F(2), $F_\alpha$ is based on b-1 and ab(m-1) degrees of freedom
–For F(int), $F_\alpha$ is based on (a-1)(b-1) and ab(m-1) degrees of freedom

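11-70 Estimation of Treatment Differences Under Two-Way ANOVA, Factor 1
Individual $100(1-\alpha)\%$ confidence interval for $\mu_{i\bullet} - \mu_{i'\bullet}$:
$$(\bar{x}_{i\bullet} - \bar{x}_{i'\bullet}) \pm t_{\alpha/2}\sqrt{MSE\left(\frac{2}{bm}\right)}$$
–$t_{\alpha/2}$ is based on ab(m-1) degrees of freedom
Tukey simultaneous $100(1-\alpha)\%$ confidence interval for $\mu_{i\bullet} - \mu_{i'\bullet}$:
$$(\bar{x}_{i\bullet} - \bar{x}_{i'\bullet}) \pm q_\alpha\sqrt{MSE\left(\frac{1}{bm}\right)}$$
–$q_\alpha$ is the upper $\alpha$ percentage point of the studentized range for a and ab(m-1) from Table A.9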

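11-71 Estimation of Treatment Differences Under Two-Way ANOVA, Factor 2
Individual $100(1-\alpha)\%$ confidence interval for $\mu_{\bullet j} - \mu_{\bullet j'}$:
$$(\bar{x}_{\bullet j} - \bar{x}_{\bullet j'}) \pm t_{\alpha/2}\sqrt{MSE\left(\frac{2}{am}\right)}$$
–$t_{\alpha/2}$ is based on ab(m-1) degrees of freedom
Tukey simultaneous $100(1-\alpha)\%$ confidence interval for $\mu_{\bullet j} - \mu_{\bullet j'}$:
$$(\bar{x}_{\bullet j} - \bar{x}_{\bullet j'}) \pm q_\alpha\sqrt{MSE\left(\frac{1}{am}\right)}$$
–$q_\alpha$ is the upper $\alpha$ percentage point of the studentized range for b and ab(m-1) from Table A.9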