Topic 26: Analysis of Covariance

Outline
One-way analysis of covariance
–Data
–Model
–Inference
–Diagnostics and remedies
Multifactor analysis of covariance

Data for One-Way ANCOVA
Y_ij is the jth observation on the response variable in the ith group
X_ij is the jth observation on the covariate in the ith group
i = 1, ..., r levels (groups) of the factor
j = 1, ..., n_i observations for level i

KNNL Example (pg 927)
Y is cases of crackers sold during the promotion period
Factor is the type of promotion (r = 3)
–Customers sample crackers in store
–Additional shelf space
–Special display shelves
n_i = 5 different stores per promotion type
The covariate X is the number of cases of crackers sold at the store in the preceding period

Data

data a1;
  infile 'c:\...\CH22TA01.DAT';
  input cases last trt store;
proc print data=a1;
run;

Output
proc print listing with columns Obs, cases, last, trt, store (the data values were not captured in this transcript)

Plot the data

title1 'Plot of the data';
symbol1 v='1' i=none c=black;
symbol2 v='2' i=none c=black;
symbol3 v='3' i=none c=black;
proc gplot data=a1;
  plot cases*last=trt/frame;
run;

Background
Covariates are sometimes called concomitant variables
Covariates should be related to the response variable
Covariates should not be affected by the treatment variable (factor)
Often they are some kind of baseline or pretest value

Basic ideas
A covariate can reduce the MSE, thereby increasing power
A covariate can adjust for differences in characteristics of subjects across the treatment groups
We assume the covariate is linearly related to the response and that this relationship is the same for all levels of the factor; the analysis is similar to comparing regression lines
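Outside the slides' SAS, the MSE-reduction idea can be illustrated with a small simulation. The sketch below uses numpy with invented data (the group effects, slope of 0.9, and sample sizes are all hypothetical, chosen only to mimic the cracker-sales setup):

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = 3, 50                        # hypothetical: 3 treatments, 50 stores each
g = np.repeat(np.arange(r), n)      # treatment labels
x = rng.normal(10.0, 2.0, r * n)    # covariate (e.g., prior-period sales)
y = np.array([0.0, 1.0, 2.0])[g] + 0.9 * x + rng.normal(0.0, 1.0, r * n)

def mse(X, y):
    """Residual mean square (SSE / error df) from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid / (len(y) - X.shape[1])

D = np.eye(r)[g]                                  # treatment indicator columns
mse_anova = mse(D, y)                             # one-way ANOVA, no covariate
mse_ancova = mse(np.column_stack([D, x]), y)      # ANCOVA with the covariate
# mse_ancova is far smaller, so F tests for the treatment effect gain power
```

With these settings the covariate accounts for most of the error variance that the plain ANOVA model leaves unexplained, which is exactly why adding it sharpens the treatment comparisons.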

Cell Means Model for one-way ANCOVA
Y_ij = μ_i + β(X_ij − X̄..) + ε_ij
–the ε_ij are iid N(0, σ²)
–Y_ij ~ N(μ_i + β(X_ij − X̄..), σ²) and indep
For each i, we have a simple linear regression
The slopes are the same
The intercepts can be different

Plot of the data with lines

title1 'Plot of the data with lines';
symbol1 v='1' i=rl c=black;
symbol2 v='2' i=rl c=black;
symbol3 v='3' i=rl c=black;
proc gplot data=a1;
  plot cases*last=trt/frame;
run;

Parameters
The parameters of the model are
–μ_i for i = 1, ..., r
–β
–σ²

Estimates
We use multiple regression methods to estimate the μ_i and β
We use the residuals from the model to estimate σ²
The estimate is s² (equal to the MSE)
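As a sketch of what "multiple regression methods" means here: with cell-means coding (one indicator column per treatment plus the centered covariate), ordinary least squares yields the μ̂_i, β̂, and s² directly. A hypothetical numpy example (all data values invented):

```python
import numpy as np

rng = np.random.default_rng(3)
g = np.repeat([0, 1, 2], 5)                   # r = 3 groups, n_i = 5
x = rng.normal(25.0, 5.0, 15)                 # covariate
y = np.array([4.0, 12.0, 8.0])[g] + 0.9 * x + rng.normal(0.0, 1.0, 15)

# Design matrix: 3 group indicators plus the centered covariate,
# so each mu_hat_i estimates the group mean response at x = x-bar
X = np.column_stack([np.eye(3)[g], x - x.mean()])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mu_hat, beta_hat = coef[:3], coef[3]

resid = y - X @ coef
s2 = resid @ resid / (len(y) - X.shape[1])    # s^2 = MSE = SSE / (n - r - 1)
```

The residual degrees of freedom are n − (r + 1) = 11 here, matching the error df an ANCOVA table would report for this layout.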

Factor Effects Model for one-way ANCOVA
Y_ij = μ. + α_i + β(X_ij − X̄..) + ε_ij
–the ε_ij are iid N(0, σ²)
The usual constraint is Σα_i = 0
Proc glm sets α_r = 0

Interpretation of model
Expected value of a Y with level i and X_ij = x is μ. + α_i + β(x − X̄..)
Expected value of a Y with level i′ and X_ij = x is μ. + α_i′ + β(x − X̄..)
The difference is α_i − α_i′
Note that this difference does not depend on the value of x (due to the assumption of constant slopes)

Proc glm

proc glm data=a1;
  class trt;
  model cases=last trt;
run;

Model output
ANOVA table (Source: Model, Error, Corrected Total; columns DF, MS, F, P) with Total DF = 14, i.e., n = 15; the numeric entries were not captured in this transcript

Anova table output
Type I and Type III tables (Source, DF, SS, MS, F, P), each with rows for last and trt; in both tables last and trt have P < .0001 (the SS, MS, and F values were not captured)

Type I vs Type III
Because the X column for the covariate is not orthogonal to the X columns for the factor (recall the earlier independence argument), the order in which the X's are fit makes a difference
We want to compare means after adjusting for the covariate
General rule: use Type III SS when the Type I and Type III SS differ
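The order dependence is easy to see numerically: when the covariate is correlated with treatment, the sequential (Type I) sum of squares for treatment depends on whether the covariate was fit first. A hypothetical numpy sketch (all data invented; the covariate is deliberately shifted across groups):

```python
import numpy as np

rng = np.random.default_rng(1)
g = np.repeat([0, 1, 2], 5)
x = rng.normal(25.0, 5.0, 15) + 3.0 * g       # covariate correlated with treatment
y = 5.0 + 0.9 * x + np.array([0.0, 5.0, 8.0])[g] + rng.normal(0.0, 1.0, 15)

def sse(cols):
    """SSE from regressing y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + cols)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r_ = y - X @ b
    return r_ @ r_

D = np.eye(3)[g][:, 1:]                       # two treatment indicator columns
ss_trt_after_x = sse([x]) - sse([x, D])       # Type I SS for trt, covariate first
ss_trt_first   = sse([])  - sse([D])          # Type I SS for trt, trt fit first
# The two differ; the Type III SS for trt equals ss_trt_after_x,
# since Type III adjusts each term for everything else in the model
```

This is why the slide's rule of thumb matters: only the covariate-adjusted (Type III) comparison answers the question ANCOVA is asking.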

Parameter Estimates

proc glm data=a1;
  class trt;
  model cases=last trt /solution;
run;

Output
Parameter estimates (Par, Est, SE, t, P):
Int 4.37 B
last 0.89, P < .0001
trt1 B, P < .0001 (estimate not captured)
trt2 7.9 B, P < .0001
trt3 0.0 B
The common slope estimate is 0.89

Interpretation
Expected value of Y with level i of the factor and X = x is μ_i + β(x − X̄..)
So μ_i is the expected value of Y when X is equal to the average covariate value
This is usually the level of X where the trt means are calculated and compared
Need to make sure this level of X is reasonable for each level of the factor
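A hypothetical numpy sketch of the adjusted-mean idea (invented data; the groups deliberately differ on the covariate, so raw means and adjusted means disagree):

```python
import numpy as np

rng = np.random.default_rng(2)
g = np.repeat([0, 1, 2], 5)
x = rng.normal(25.0, 5.0, 15) + 4.0 * g            # groups differ on the covariate
y = np.array([10.0, 15.0, 18.0])[g] + 0.9 * x + rng.normal(0.0, 1.0, 15)

X = np.column_stack([np.eye(3)[g], x])             # cell-means coding + covariate
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mu_int, beta_hat = coef[:3], coef[3]               # per-group intercepts, slope

raw_means = np.array([y[g == i].mean() for i in range(3)])
adj_means = mu_int + beta_hat * x.mean()           # fitted value at x = x-bar
# Each raw mean sits on its fitted line at that group's own covariate mean:
xbar_i = np.array([x[g == i].mean() for i in range(3)])
```

The adjusted means evaluate every group's fitted line at the same covariate value (the overall mean), which is the comparison the slide describes; the raw means mix treatment effects with covariate differences.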

LSMEANS
The L(east)S(quares) means can be used to obtain these estimates
–All other categorical predictors are set at an equal mix of their levels (i.e., averaged over the other factors)
–All continuous predictors are set at their overall means
These are similar to subpopulation mean estimates

Interpretation for KNNL example
Y is cases of crackers sold under a particular promotion scenario
X is the cases of crackers sold during the last period
The LSMEANS are the estimated cases of crackers that would be sold by a store with the average number of crackers sold during the last period

LSMEANS Statement

proc glm data=a1;
  class trt;
  model cases=last trt;
  lsmeans trt/ stderr tdiff pdiff cl;
run;

Output
LSMEANS output: for each treatment, the LSMEAN, its standard error, and P (all P < .0001); pairwise t tests for H0: LSMean(i) = LSMean(j) with Pr > |t| (all < .0001; dependent variable cases); 95% confidence limits for each LSMEAN and for each difference LSM(i) − LSM(j). The numeric values were not captured in this transcript.

Prep data for plot

title1 'Plot of the data with the model';
proc glm data=a1;
  class trt;
  model cases=last trt;
  output out=a2 p=pred;
run;

Prep data for plot

data a3;
  set a2;
  drop cases pred;
  if trt eq 1 then do;
    cases1=cases; pred1=pred; output;
  end;

Prep data for plot

  if trt eq 2 then do;
    cases2=cases; pred2=pred; output;
  end;
  if trt eq 3 then do;
    cases3=cases; pred3=pred; output;
  end;
run;
proc print data=a3;
run;

Code for plot

symbol1 v='1' i=none c=black;
symbol2 v='2' i=none c=black;
symbol3 v='3' i=none c=black;
symbol4 v=none i=rl c=black;
symbol5 v=none i=rl c=black;
symbol6 v=none i=rl c=black;
proc gplot data=a3;
  plot (cases1 cases2 cases3 pred1 pred2 pred3)*last/frame overlay;
run;

Check for equality of slopes

title1 'Check for equal slopes';
proc glm data=a1;
  class trt;
  model cases=last trt last*trt;
run;

Output
Type I table (Source, DF, F, P): last P < .0001; trt P < .0001; last*trt (values not captured)
Type III table: last P < .0001; the trt and last*trt values were not captured
The last*trt row is the test of the equal-slopes assumption
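The equal-slopes check is an extra-sum-of-squares F test: compare the common-slope ANCOVA model against a full model with a separate slope per treatment. A numpy sketch with invented data generated under a common slope (so the test should not reject):

```python
import numpy as np

rng = np.random.default_rng(4)
g = np.repeat([0, 1, 2], 5)
x = rng.normal(25.0, 5.0, 15)
y = np.array([4.0, 12.0, 8.0])[g] + 0.9 * x + rng.normal(0.0, 1.0, 15)

def sse(X):
    """SSE from a least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r_ = y - X @ b
    return r_ @ r_

D = np.eye(3)[g]
reduced = np.column_stack([D, x])                 # common slope (ANCOVA model)
full = np.column_stack([D, D * x[:, None]])       # separate slope per group
df_full = len(y) - full.shape[1]                  # 15 - 6 = 9 error df
F = ((sse(reduced) - sse(full)) / 2.0) / (sse(full) / df_full)
# Compare F to the F(2, df_full) distribution; a small F supports
# the common-slope model, i.e., it plays the role of last*trt above
```

The numerator df is 2 because the full model adds two extra slope parameters, matching the last*trt df in the GLM output.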

Diagnostics and remedies
Examine the data and residuals
Look for outliers that are influential
Transform if needed; consider Box-Cox
Examine variances (standard deviations)
–Use a BY statement and look at the MSE

Last slide Read KNNL Ch 22 We used topic26.sas to generate the output for today