Topic 22: Inference

Outline
– Review of one-way ANOVA
– Inference for means
– Differences in cell means
– Contrasts

Cell Means Model
Y_ij = μ_i + ε_ij, where μ_i is the theoretical mean (expected value) of all observations at level i and the ε_ij are iid N(0, σ²).
Equivalently, Y_ij ~ N(μ_i, σ²), independent.

Parameters
The parameters of the model are μ_1, μ_2, …, μ_r and σ².
Estimate μ_i by the mean of the observations at level i:
– μ̂_i = Ȳ_i. = (Σ_j Y_ij)/n_i (level i sample mean)
– s_i² = Σ_j (Y_ij − Ȳ_i.)²/(n_i − 1) (level i sample variance)
– s² = Σ_i (n_i − 1)s_i²/(n_T − r) (pooled variance)

F test
F = MSM/MSE
H_0: μ_1 = μ_2 = … = μ_r (= μ); H_1: not all of the μ_i are equal.
Under H_0, F ~ F(r−1, n_T − r). Reject H_0 when F is large, and report the P-value.
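In the cell means notation above, the mean squares are the usual ones:
\[
\mathrm{SSM} = \sum_{i} n_i\,(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, \qquad
\mathrm{SSE} = \sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_{i\cdot})^2,
\]
\[
\mathrm{MSM} = \frac{\mathrm{SSM}}{r-1}, \qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{n_T - r}, \qquad
F = \frac{\mathrm{MSM}}{\mathrm{MSE}} .
\]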

KNNL Example
KNNL p. 676 and p. 685. Y is the number of cases of cereal sold; X is the design of the cereal package. There are r = 4 levels for X because there are 4 different package designs, so i = 1 to 4 indexes designs and j = 1 to n_i indexes the stores using design i.

Plot the means
proc means data=a1;
  var cases;
  by design;
  output out=a2 mean=avcases;
symbol1 v=circle i=join;
proc gplot data=a2;
  plot avcases*design;
run;

The means

Confidence intervals
Ȳ_i. ~ N(μ_i, σ²/n_i), so a CI for μ_i is Ȳ_i. ± t_c s/√n_i, where t_c is computed from the t(n_T − r) distribution.
The degrees of freedom are larger than n_i − 1 because we pool the level variances into one single estimate. This is an advantage of ANOVA, when pooling is appropriate.
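As a quick worked step, using the error degrees of freedom n_T − r = 15 that appear in the GLM output below:
\[
t_c = t(0.975;\,15) \approx 2.131, \qquad \text{so the CI is } \bar{Y}_{i\cdot} \pm 2.131\,\frac{s}{\sqrt{n_i}} .
\]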

Using Proc Means
proc means data=a1 mean std stderr clm maxdec=2;
  class design;
  var cases;
run;
This does not use the pooled estimate of the variance.

Output
For each design, PROC MEANS reports N Obs, Mean, Std Dev, and Std Error (numeric output omitted here). We can use this information to calculate SSE:
SSE = 4(2.30)² + 4(3.65)² + 3(2.65)² + 4(3.96)² ≈ 158.2
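A minimal SAS sketch of this arithmetic (not from the original topic22.sas; the group sizes 5, 5, 4, 5 are read off the multipliers n_i − 1 = 4, 4, 3, 4 above):
data pooled;
  input n s;
  ss = (n - 1)*s**2;   * each term is (n_i - 1) times s_i squared ;
datalines;
5 2.30
5 3.65
4 2.65
5 3.96
;
run;
proc means data=pooled sum;
  var ss;              * the sum of ss is SSE, and MSE = SSE / 15 ;
run;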

Confidence Intervals
PROC MEANS lists the lower and upper 95% CL for the mean of each design (numeric output omitted). There is no pooling of error in computing these CIs; each interval uses its own variance estimate.

CIs using PROC GLM
proc glm data=a1;
  class design;
  model cases=design;
  means design / t clm;
run;

Output
The GLM Procedure: t Confidence Intervals for cases
Alpha 0.05; Error Degrees of Freedom 15; Error Mean Square ≈ 10.55; Critical Value of t ≈ 2.131

CI Output
For each design, GLM reports N, the mean, and 95% confidence limits (numeric output omitted). These CIs are often narrower because they use more degrees of freedom (the common pooled variance).

Multiplicity Problem
We have constructed 4 (in general, r) 95% confidence intervals. The overall confidence level (the probability that all intervals contain their means) is less than 95%. Many different kinds of adjustments have been proposed; we have previously discussed the Bonferroni adjustment (i.e., use α/r).
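A minimal sketch of the Bonferroni adjustment for this example (r = 4 intervals, 15 error degrees of freedom); this snippet is illustrative and not taken from topic22.sas:
data bon;
  t_unadj = tinv(1 - 0.05/2, 15);       * ordinary 95% critical value, about 2.13 ;
  t_bon   = tinv(1 - 0.05/(2*4), 15);   * alpha/r per interval gives a larger critical value ;
run;
proc print data=bon;
run;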

BON option
proc glm data=a1;
  class design;
  model cases=design;
  means design / bon clm;
run;

Output
Bonferroni t Confidence Intervals for cases
Alpha 0.05; Error Degrees of Freedom 15; Error Mean Square ≈ 10.55; Critical Value of t ≈ 2.84. Note the bigger t*.

Bonferroni CIs
For each design: N, the mean, and simultaneous 95% confidence limits (numeric output omitted).

Using LSMEANS
The LSMEANS statement provides proper SEs for each individual treatment estimate, and confidence intervals as well:
lsmeans design / cl;
However, the statement does not provide a way to adjust these single-mean CIs for multiplicity.

Hypothesis tests on individual means
These are not usually done. Use PROC MEANS options T and PROBT for a test of the null hypothesis H_0: μ_i = 0. To test H_0: μ_i = c, where c is an arbitrary constant, first use a DATA step to subtract c from all observations and then run PROC MEANS with options T and PROBT (see the sketch below). You can also use GLM's MEANS statement with the CLM option (more df).
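A minimal sketch of that DATA-step approach, assuming a hypothetical constant c = 20 (the value 20 and the data set name a1c are illustrative only, not from topic22.sas):
data a1c;
  set a1;
  cases_c = cases - 20;    * subtract the hypothesized constant c = 20 ;
run;
proc means data=a1c t probt;
  class design;
  var cases_c;             * t and its p-value now test H0: mu_i = 20 for each design ;
run;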

Differences in means
Ȳ_i. − Ȳ_k. ~ N(μ_i − μ_k, σ²/n_i + σ²/n_k)
A CI for μ_i − μ_k is (Ȳ_i. − Ȳ_k.) ± t_c s(Ȳ_i. − Ȳ_k.), where s(Ȳ_i. − Ȳ_k.) = s √(1/n_i + 1/n_k).
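For example, for two designs with 5 stores each, and using the pooled MSE of about 10.55 implied by the SSE calculation above:
\[
s(\bar{Y}_{i\cdot} - \bar{Y}_{k\cdot}) = \sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}
\approx \sqrt{10.55\left(\frac{1}{5} + \frac{1}{5}\right)} \approx 2.05 .
\]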

Determining t_c
We deal with the multiplicity problem by adjusting t_c. Many different choices are available:
– Change the α level (e.g., Bonferroni)
– Use a different distribution

LSD
Least Significant Difference (LSD): simply ignores the multiplicity issue. Uses t(n_T − r) to determine the critical value. Called T or LSD in SAS.

Bonferroni
Uses the error-budget idea: there are r(r−1)/2 pairwise comparisons among r means, so replace α by α/(r(r−1)/2) and use t(n_T − r) to determine the critical value. For the cereal example, r = 4 gives 6 comparisons and a per-comparison level of 0.05/6 ≈ 0.0083. Called BON in SAS.

Tukey
Based on the studentized range distribution (the maximum minus the minimum, divided by the standard deviation). t_c = q_c/√2, where q_c is determined from the studentized range distribution. Details are in KNNL Section 17.5 (p. 746). Called TUKEY in SAS.

Scheffe
Based on the F distribution: t_c = √((r−1) F(1−α; r−1, n_T − r)). Takes care of multiplicity for all linear combinations of the means, and so protects against data snooping. Called SCHEFFE in SAS. See KNNL Section 17.6 (p. 753).

Multiple Comparisons
– LSD is too liberal (too many Type I errors)
– Scheffe is too conservative (very low power)
– Bonferroni is OK for small r
– Tukey (HSD) is recommended
A sketch comparing the four critical values for this example follows.
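A minimal SAS sketch (not from topic22.sas) that computes the four critical values for this example, with r = 4 designs and 15 error degrees of freedom; the PROBMC call for the studentized range quantile should be checked against the SAS documentation:
data crit;
  r = 4;  dfe = 15;  alpha = 0.05;
  t_lsd = tinv(1 - alpha/2, dfe);                       * LSD: about 2.13 ;
  t_bon = tinv(1 - alpha/(2*(r*(r-1)/2)), dfe);         * Bonferroni over the 6 pairs ;
  q     = probmc('RANGE', ., 1 - alpha, dfe, r);        * studentized range quantile ;
  t_tuk = q / sqrt(2);                                  * Tukey: about 2.88 ;
  t_sch = sqrt((r - 1)*finv(1 - alpha, r - 1, dfe));    * Scheffe: about 3.14 ;
run;
proc print data=crit;
run;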

Example
proc glm data=a1;
  class design;
  model cases=design;
  means design / lsd tukey bon scheffe;
run;

LSD
t Tests (LSD) for cases. NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05; Error Degrees of Freedom 15; Error Mean Square ≈ 10.55; Critical Value of t ≈ 2.131

LSD Intervals
For each pairwise comparison of designs, SAS reports the difference between means and the 95% confidence limits, flagging significant comparisons with *** (numeric output omitted).

Tukey
Tukey's Studentized Range (HSD) Test for cases. NOTE: This test controls the Type I experimentwise error rate.
Alpha 0.05; Error Degrees of Freedom 15; Error Mean Square ≈ 10.55; Critical Value of Studentized Range ≈ 4.08, and 4.08/sqrt(2) ≈ 2.88

Tukey Intervals
For each pairwise comparison of designs, the difference between means and simultaneous 95% confidence limits, with significant comparisons flagged *** (numeric output omitted).

Scheffe
Scheffe's Test for cases. NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than Tukey's for all pairwise comparisons.
Alpha 0.05; Error Degrees of Freedom 15; Error Mean Square ≈ 10.55; Critical Value of F ≈ 3.29, and sqrt(3 × 3.29) ≈ 3.14

Scheffe Intervals
For each pairwise comparison of designs, the difference between means and simultaneous 95% confidence limits, with significant comparisons flagged *** (numeric output omitted).

Output (LINES option)
The LINES display groups the design means by letters: means sharing a letter are not significantly different under the Scheffe procedure (numeric output omitted).

Using LSMEANS
The LSMEANS statement provides proper SEs and confidence intervals based on a multiplicity adjustment:
lsmeans design / cl adjust=bon;
However, the statement does not have a LINES option. PROC GLIMMIX can be used to do this; the approach is shown in the SAS file.
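The slides refer to an approach in the course SAS file; a minimal GLIMMIX sketch of what that might look like (my guess, not copied from that file):
proc glimmix data=a1;
  class design;
  model cases = design;
  lsmeans design / cl adjust=tukey lines;   * adjusted intervals plus a letter-group display ;
run;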

Linear Combinations of Means
These combinations should come from research questions, not from an examination of the data.
L = Σ_i c_i μ_i, estimated by L̂ = Σ_i c_i Ȳ_i. ~ N(L, Var(L̂)), where Var(L̂) = Σ_i c_i² Var(Ȳ_i.) = σ² Σ_i (c_i²/n_i), estimated by s² Σ_i (c_i²/n_i).
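As a worked illustration with the contrast in Example 3 below, (μ_1 + μ_2)/2 − (μ_3 + μ_4)/2, and the group sizes 5, 5, 4, 5 and pooled MSE of about 10.55 implied by the earlier output:
\[
\widehat{\mathrm{Var}}(\hat{L}) = s^2 \sum_i \frac{c_i^2}{n_i}
\approx 10.55\left(\frac{0.25}{5} + \frac{0.25}{5} + \frac{0.25}{4} + \frac{0.25}{5}\right) \approx 2.24,
\qquad s(\hat{L}) \approx 1.50 .
\]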

Contrasts
A contrast is a special case of a linear combination that requires Σ_i c_i = 0.
Example 1: μ_1 − μ_2
Example 2: μ_1 − (μ_2 + μ_3)/2
Example 3: (μ_1 + μ_2)/2 − (μ_3 + μ_4)/2

Contrast and Estimate
proc glm data=a1;
  class design;
  model cases=design;
  contrast '1&2 v 3&4' design 1 1 -1 -1;
  estimate '1&2 v 3&4' design 1 1 -1 -1;
run;

Output
The CONTRAST statement performs the F test: for '1&2 v 3&4' it reports DF, SS, MS, F, and P < .0001. The ESTIMATE statement performs the t test and gives the estimate, its standard error, t, and P < .0001 (numeric values omitted).

Multiple contrasts
We can simultaneously test a collection of contrasts (1 df per contrast).
Example 1: H_0: μ_2 = μ_3 = μ_4. This can be written as two contrasts (e.g., μ_2 − μ_3 and μ_2 − μ_4), so the F statistic for this test has an F(2, n_T − r) distribution.
Example 2: H_0: μ_1 = (μ_2 + μ_3 + μ_4)/3. This is a single contrast, so the F statistic has an F(1, n_T − r) distribution.

Example
proc glm data=a1;
  class design;
  model cases=design;
  contrast '1 v 2&3&4' design 3 -1 -1 -1;
  estimate '1 v 2&3&4' design 3 -1 -1 -1 / divisor=3;
  contrast '2 v 3 v 4' design 0 1 -1 0,
                       design 0 1 0 -1;
run;

Output
The contrast F tests: '1 v 2&3&4' (1 df) and '2 v 3 v 4' (2 df, P < .0001), followed by the estimate, standard error, and t test for '1 v 2&3&4' (numeric values omitted).

Last slide
We covered a large part of Chapter 17. We used the program topic22.sas to generate the output for today.