1 Group 4 AMS 572

2 Table of Contents
1. Introduction and History
   1.1 Part 1: Ahram Woo
   1.2 Part 2: Jingwen Zhu
2. Theoretical Background
   2.1 Part 1: Xin Yu
   2.2 Part 2: Unjung Lee
3. Application of ANCOVA and Summary
   3.1 Part 1: Xiaojuan Shang
   3.2 Part 2: Younga Choi
   3.3 Part 3: Qiao Zhang

3 1. Introduction and History Group 4 by Ahram Woo

4 1. Introduction and History by Ahram Woo
Individual members: Ahram Woo, Jingwen Zhu, Xiaojuan Shang, Younga Choi, Qiao Zhang, Unjung Lee, Xin Yu

5 1.1 Introduction to ANCOVA by Ahram Woo
Analysis of covariance (ANCOVA): an extension of ANOVA in which main effects and interactions are assessed on dependent variable (DV) scores after the DV has been adjusted for its relationship with one or more covariates (CVs).
ANCOVA = ANOVA + linear regression

6 1.1 Introduction to ANCOVA by Ahram Woo
R. A. Fisher is credited with the introduction of ANCOVA in "Studies in crop variation. IV. The experimental determination of the value of top dressings with cereals", published in the Journal of Agricultural Science, vol. 17.

7 1.1 Introduction to ANCOVA by Ahram Woo
ANOVA was described by R. A. Fisher to assist in the analysis of data from agricultural experiments.
ANOVA compares the means of any number of experimental conditions without inflating the Type I error rate.
ANOVA is a way of determining whether the average scores of groups differ significantly.

8 1.2 Introduction to Linear Regression by Jingwen Zhu
Linear regression models the relationship between explanatory and dependent variables by fitting a linear equation to observed data (i.e., Y = a + bX).
Is there a relationship or not? Does one variable cause the other?
Scatter plot and correlation coefficient

9 1.2 Introduction to Linear Regression by Jingwen Zhu

10 1.2 Introduction to Linear Regression by Jingwen Zhu
Galton studied data on the relative heights of fathers and their sons.
Conclusions:
- A taller-than-average father tends to produce a taller-than-average son.
- The son is likely to be less tall than the father in terms of his relative position within his own population.

11 1.2 Introduction to Linear Regression by Jingwen Zhu
ANCOVA is a merger of ANOVA and regression.
ANCOVA allows one variable to be compared across two or more groups while taking into account (correcting for) the variability of other variables, called covariates.
Including covariates can increase statistical power because the covariates account for some of the variability.

12 1.2 Introduction to Linear Regression by Jingwen Zhu
Example: are MCAT scores significantly different among medical students who had different types of undergraduate majors, when adjusted for year of matriculation?
- Dependent variable (continuous): MCAT total (most recent)
- Fixed factor (categorical): undergraduate major (1 = Biology/Chemistry, 2 = Other science/health, 3 = Other)
- Covariate: year of matriculation

13 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
One factor with k levels or groups, e.g., 3 treatment groups in a drug study.
The main objective is to examine the equality of the means of the different groups.
The total variation of the observations (SST) can be split into two components: variation between groups (SSA) and variation within groups (SSE).

14 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
Consider the layout of a study with 16 subjects intended to compare 4 treatment groups (G1-G4). Each group contains four subjects.
      S1    S2    S3    S4
G1    Y11   Y12   Y13   Y14
G2    Y21   Y22   Y23   Y24
G3    Y31   Y32   Y33   Y34
G4    Y41   Y42   Y43   Y44

15 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
Model: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
Assumptions:
- Observations $y_{ij}$ are independent.
- Errors $\varepsilon_{ij}$ are normally distributed with mean zero and constant standard deviation.

16 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
Hypotheses:
H0: The means of all groups are equal.
Ha: At least one mean differs from the others.
ANOVA table:
Source of Variance   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatment            SSA              a-1                  SSA/(a-1)     MSA/MSE
Error                SSE              N-a                  SSE/(N-a)
Total                SST              N-1
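
The entries of the ANOVA table can be computed directly from grouped data. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the sample data are hypothetical and chosen for illustration, not taken from the slides.

```python
from statistics import mean

def one_way_anova(groups):
    """Return one-way ANOVA table entries for a list of groups of observations."""
    a = len(groups)                          # number of treatment groups
    n_total = sum(len(g) for g in groups)    # total number of observations N
    grand_mean = mean([y for g in groups for y in g])
    # SSA: variation between groups, weighted by group size
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: variation within groups, around each group's own mean
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # SST: total variation, around the grand mean
    sst = sum((y - grand_mean) ** 2 for g in groups for y in g)
    msa = ssa / (a - 1)
    mse = sse / (n_total - a)
    return {"SSA": ssa, "SSE": sse, "SST": sst,
            "df_treatment": a - 1, "df_error": n_total - a,
            "MSA": msa, "MSE": mse, "F": msa / mse}

# Hypothetical data: 3 treatment groups with 3 observations each
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

The F value is then compared with the F distribution on (a-1, N-a) degrees of freedom; looking up that critical value is left out of this sketch.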

17 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
SSA (variation between groups) is due to the differences between groups, e.g., different treatment groups or different doses of the same treatment.

18 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
[Table: treatments 1, 2, …, a with their sample means.]

19 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
SSE (variation within groups) is the inherent variation among the observations within each group.

20 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
[Table: treatments 1, 2, …, a with the observations in each group and their sample means.]

21 1.2 Introduction to One-way Analysis of Variance by Jingwen Zhu
SST (total sum of squares) is the sum of SSA and SSE.
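
For reference, the three sums of squares described on slides 17 to 21 are usually written as follows. This is the standard textbook form, not a transcription of the slides; the group size $n_i$ and the dot notation for means are assumptions about notation.

```latex
\begin{align*}
\mathrm{SSA} &= \sum_{i=1}^{a} n_i\,(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
  && \text{(between groups)}\\
\mathrm{SSE} &= \sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
  && \text{(within groups)}\\
\mathrm{SST} &= \sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
  = \mathrm{SSA} + \mathrm{SSE}
  && \text{(total)}
\end{align*}
```

Here $\bar{y}_{i\cdot}$ is the sample mean of group $i$ and $\bar{y}_{\cdot\cdot}$ is the grand mean of all $N = \sum_i n_i$ observations.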

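22 2. Theoretical Background: 2.1 Model of ANOVA by Xin Yu
$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
- $Y_{ij}$: data, the jth observation of the ith group
- $\mu$: grand mean of Y
- $\alpha_i$: effect of the ith group (we mainly focus on whether $\alpha_i = 0$, $i = 1, \dots, a$)
- $\varepsilon_{ij}$: error, $N(0, \sigma^2)$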

23 2.1 Model of Linear Regression by Xin Yu
$Y_{ij} = \beta_0 + \beta X_{ij} + \varepsilon_{ij}$
- $Y_{ij}$: data, the (ij)th observation
- $X_{ij}$: predictor
- $\varepsilon_{ij}$: error
- $\beta$ and $\beta_0$: slope and intercept (we mainly focus on the estimate)

24 2.1 ANCOVA: ANOVA Merged With Linear Regression by Xin Yu
Effects of the ith group (we still focus on whether $\alpha_i = 0$, $i = 1, \dots, a$)
Known covariate

25 2.1 How to Perform ANCOVA by Xin Yu
ANOVA model!
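
The idea on slides 24 and 25 can be sketched with the one-way ANCOVA model in its common textbook form. Centering the covariate at its overall mean $\bar{X}_{\cdot\cdot}$ is an assumption here, and the slides may have used a different parameterization.

```latex
\begin{align*}
Y_{ij} &= \mu + \alpha_i + \beta\,(X_{ij} - \bar{X}_{\cdot\cdot}) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2),\\[4pt]
\underbrace{Y_{ij} - \beta\,(X_{ij} - \bar{X}_{\cdot\cdot})}_{\text{adjusted response}}
  &= \mu + \alpha_i + \varepsilon_{ij}.
\end{align*}
```

Once the covariate term is moved to the left-hand side, the adjusted response follows an ordinary one-way ANOVA model, which is why the remaining work is to estimate the slope $\beta$.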

26 2.1 How do we get the estimate of β? by Xin Yu
Within each group, consider $\alpha_i$ as a constant, and notice that we actually only need the estimate of the slope β, not the intercept.

27 2.1 How do we get the estimate of β? (continued) by Xin Yu
- Within each group, do least squares.
- Assume that $\beta_1 = \dots = \beta_i = \dots = \beta_a$.
- This means that $\alpha_i$ and β are independent; in other words, the covariate has nothing to do with the group effect.
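
As a sketch of the within-group step, the least squares slope for group $i$ takes the usual simple-regression form. The symbols $\hat{\beta}_i$, $\bar{X}_{i\cdot}$ and $\bar{Y}_{i\cdot}$ are notation assumed here for the group-specific slope estimate and group means.

```latex
\[
\hat{\beta}_i
  = \frac{\sum_{j} (X_{ij} - \bar{X}_{i\cdot})(Y_{ij} - \bar{Y}_{i\cdot})}
         {\sum_{j} (X_{ij} - \bar{X}_{i\cdot})^2},
  \qquad i = 1, \dots, a .
\]
```

Because the group effect $\alpha_i$ is constant within a group, it is absorbed into the group mean $\bar{Y}_{i\cdot}$ and does not affect the slope estimate.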

28 2.1 How do we get the estimate of β? (continued) by Xin Yu
We use the pooled estimate of β.
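
A minimal sketch of a pooled slope estimate, assuming the usual pooled within-group form: the within-group cross-products of X and Y are summed over all groups and divided by the summed within-group squares of X. The function name `pooled_slope` and the data are hypothetical.

```python
from statistics import mean

def pooled_slope(groups):
    """Pooled within-group slope.

    groups: list of (x_values, y_values) pairs, one pair per group.
    """
    sxy = 0.0  # within-group sum of cross-products, summed over groups
    sxx = 0.0  # within-group sum of squares of x, summed over groups
    for xs, ys in groups:
        x_bar, y_bar = mean(xs), mean(ys)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        sxx += sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: two groups sharing slope 2 but with different intercepts
groups = [([1, 2, 3], [3, 5, 7]), ([2, 4, 6], [14, 18, 22])]
beta_hat = pooled_slope(groups)
print(beta_hat)
```

Deviations are taken from each group's own means, so differences in group level (the $\alpha_i$) do not leak into the slope estimate.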

29 2.1 Model of ANOVA by Xin Yu

30 2.2.A The Simple Linear Regression Model by Unjung Lee
$Y = \beta_0 + \beta_1 X + \varepsilon$
- Y: dependent (response) variable
- X: independent (predictor) variable
- $\beta_0$: the intercept
- $\beta_1$: the slope
- $\varepsilon$: error term, $\varepsilon \sim N(0, \sigma^2)$
$E(Y) = \beta_0 + \beta_1 X$

31 2.2.A The Simple Linear Regression Model by Unjung Lee
[Figure: the line $Y = \beta_0 + \beta_1 x$ in the X-Y plane, annotated with the intercept $\beta_0$, the slope $\beta_1$, and the error of an observation y.]

32 2.2.A The Simple Linear Regression Model by Unjung Lee
[Figure: identical normal distributions of errors, all centered on the regression line $Y = \beta_0 + \beta_1 x$; at each x, $y \sim N(\mu_{y|x}, \sigma_{y|x}^2)$.]

33 2.2.A Assumptions of the Simple Linear Regression Model by Unjung Lee
- The relationship between X and Y is a straight line.
- The errors have a common variance $\sigma^2$.
- The errors are normally distributed.
- The errors are independent.

34 2.2.A The Least Squares (LS) Method by Unjung Lee

35 2.2.A The Least Squares (LS) Method by Unjung Lee
The fitted values and residuals: we can get these from the normal equations.
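
For the simple linear regression model, the normal equations and the quantities they yield can be written out as below. This is the standard derivation for $n$ observations $(x_i, y_i)$; the hat notation and $e_i$ for residuals are assumed.

```latex
\begin{align*}
\text{Normal equations:}\quad
  &\sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) = 0,
  \qquad
  \sum_{i=1}^{n} x_i \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) = 0,\\[4pt]
\text{LS estimates:}\quad
  &\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
                        {\sum_{i=1}^{n} (x_i - \bar{x})^2},
  \qquad
  \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},\\[4pt]
\text{Fitted values and residuals:}\quad
  &\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i,
  \qquad
  e_i = y_i - \hat{y}_i .
\end{align*}
```

The normal equations come from setting the partial derivatives of $\sum_i (y_i - \beta_0 - \beta_1 x_i)^2$ with respect to $\beta_0$ and $\beta_1$ to zero.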

36 2.2.A Fitting a Regression Line by Unjung Lee
[Figure, three panels: the data; three errors from a fitted line; three errors from the least squares regression line.]
The least squares regression line minimizes the sum of squared errors.

37 2.2.A Errors in Regression by Unjung Lee
[Figure: a data point $(x_i, y_i)$ in the X-Y plane and its error relative to the regression line.]

38 2.2.A Multiple Linear Regression by Unjung Lee
A statistical model that uses two or more quantitative and qualitative explanatory variables $(x_1, \dots, x_p)$ to predict a quantitative dependent variable Y.
Caution: have at least two quantitative explanatory variables (rule of thumb).

39 2.2.A Dummy-Variable Regression Model by Unjung Lee
- Involves a categorical X variable with two levels, e.g., female-male, employed-not employed.
- Variable levels are coded 0 and 1.
- Assumes only the intercept is different; slopes are constant across categories.

40 2.2.A Dummy-Variable Model Relationships by Unjung Lee
[Figure: Y against X1 as two parallel lines (same slope b1) with intercepts b0 and b0 + b2, labelled Females and Males.]
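
The two parallel lines on slide 40 correspond to a model with one quantitative predictor and one 0/1 dummy variable. A sketch using the slide's coefficients $b_0$, $b_1$, $b_2$; which category is coded 1 is not recoverable from the transcript, so $X_2$ is left generic.

```latex
\begin{align*}
Y &= b_0 + b_1 X_1 + b_2 X_2, \qquad X_2 \in \{0, 1\},\\
X_2 = 0:\quad Y &= b_0 + b_1 X_1,\\
X_2 = 1:\quad Y &= (b_0 + b_2) + b_1 X_1 .
\end{align*}
```

Both lines share the slope $b_1$, and $b_2$ is the vertical distance between them.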

41 2.2.A Dummy Variables by Unjung Lee
- Permit the use of qualitative data (e.g., seasonal, class standing, location, gender).
- 0, 1 coding (nominal data).
- As part of diagnostic checking, incorporate outliers (i.e., large residuals) and influence measures.

42 2.2.A Interaction Regression Model by Unjung Lee
- Hypothesizes interaction between pairs of X variables: the response to one X variable varies at different levels of another X variable.
- Contains two-way cross-product terms: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
- Can be combined with other models, e.g., dummy-variable models.

43 2.2.A Effect of Interaction by Unjung Lee
Given: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$
- Without the interaction term, the effect of $X_1$ on Y is measured by $\beta_1$.
- With the interaction term, the effect of $X_1$ on Y is measured by $\beta_1 + \beta_3 X_2$; the effect increases as $X_{2i}$ increases (when $\beta_3 > 0$).

44 2.2.A Interaction Example by Unjung Lee
The effect (slope) of $X_1$ on Y depends on the value of $X_2$.
$Y = \ldots + \ldots X_1 + 3X_2 + 4X_1X_2$
$X_2 = 1$: $Y = \ldots + \ldots X_1 + 3(1) + 4X_1(1) = \ldots + \ldots X_1$
$X_2 = 0$: $Y = \ldots + \ldots X_1 + 3(0) + 4X_1(0) = \ldots + \ldots X_1$

45 2.2.A The Two-way ANOVA by Unjung Lee

46 2.2.A The Two-way ANOVA Table by Unjung Lee
Source           df           SS          MS
Factor A         a-1          SS(A)       MS(A) = SS(A)/(a-1)
Factor B         b-1          SS(B)       MS(B) = SS(B)/(b-1)
Interaction AB   (a-1)(b-1)   SS(AB)      MS(AB) = SS(AB)/[(a-1)(b-1)]
Error            ab(r-1)      SSE         SSE/[ab(r-1)]
Total            abr-1        SS(Total)

47 2.2.A Test Homogeneity of Variance by Unjung Lee

48 2.2.B Test Whether H0: … by Xin Yu

49 2.2.B Test Whether H0: … by Xin Yu
(1) Define the sum of squares of errors within groups. It is calculated based on … and is generated by the random error ε.

50 2.2.B Test Whether H0: … by Xin Yu

51 2.2.B Test Whether H0: … by Xin Yu
MSA: mean square between groups.
Do an F test on MSA and the mean square within groups to see whether we can reject H0: F = MSA / (mean square within groups).

52 2.2.C Test Linear Relationship by Xin Yu
Assumption 3: test for a linear relationship between the dependent variable and the covariate.
H0: β = 0
How do we do it? Use an F test on SSR and SSE.

53 2.2.C Test Linear Relationship by Xin Yu
How to calculate SSR and MSR?
From each …, SST is obtained by summing the squares of the differences between … and ….

54 2.2.C Test Linear Relationship by Xin Yu
How to calculate SSE and MSE?
From each …, SSE is the error obtained by summing the squares of the differences between … and ….
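
As a point of reference, the F test of $H_0\colon \beta = 0$ for a single fitted regression line with $n$ observations uses the standard quantities below. The slides' own formulas did not survive, and in the ANCOVA setting with a pooled slope the error degrees of freedom would differ, so this is a sketch of the simple-regression case only.

```latex
\begin{align*}
\mathrm{SSR} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, &
\mathrm{MSR} &= \frac{\mathrm{SSR}}{1},\\
\mathrm{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, &
\mathrm{MSE} &= \frac{\mathrm{SSE}}{n-2},\\
F &= \frac{\mathrm{MSR}}{\mathrm{MSE}} \sim F_{1,\,n-2}
  \quad \text{under } H_0\colon \beta = 0 .
\end{align*}
```

A large F indicates that the fitted line explains much more variation than the residual noise, which is evidence against $\beta = 0$.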

55 2.2.C Test Linear Relationship by Xin Yu
Based on the test statistic, we decide whether to reject H0 (β = 0). Assume Assumptions 1 and 2 have already been checked.
- If H0 is not rejected (β = 0), we do ANOVA.
- Otherwise, we do ANCOVA.
So, any time we want to use ANCOVA, we need to test the three assumptions first!

56 3. Application of ANCOVA: 3.1 Case Introduction by Xiaojuan Shang
Analysis of covariance (ANCOVA) is a statistical procedure that allows you to include both categorical and continuous variables in a single model. ANCOVA assumes that the regression coefficients are homogeneous (the same) across the levels of the categorical variable. Violation of this assumption can lead to incorrect conclusions.

57 3.1 Case Introduction by Xiaojuan Shang
Here is an example data file we will use. It contains 30 subjects who used one of three diets: diet 1 (diet=1), diet 2 (diet=2), and a control group (diet=3). Before the start of the study, the height of each subject was measured, and after the study the weight of each subject was measured.

58 3.1 Data Structure by Xiaojuan Shang

59 3.1 Case Concerns by Xiaojuan Shang
- Difference between the three diet groups
- Correlation between height and weight
- Difference between the control group and the other two groups

60 3.1 Case Data: Compare with ANOVA by Xiaojuan Shang
PROC GLM DATA=htwt;
  CLASS diet;
  MODEL weight = diet;
  MEANS diet / DEPONLY;
  /* Contrast coefficients are assumed from the labels, with diet=3 as the control group */
  CONTRAST 'compare 1&2 with control' diet 1 1 -2;
  CONTRAST 'compare diet 1 with 2 ' diet 1 -1 0;
RUN;
QUIT;

61 3.1 Case Data: Compare with ANOVA by Xiaojuan Shang

62 3.1 Case Data: Compare with ANOVA by Xiaojuan Shang

63 3.2 SAS Codes for ANCOVA Model: Outline by Younga Choi
1. Description of the data
2. Investigation of the equality of slopes across the groups through the traditional ANOVA model (homogeneity of regression assumption)
3. When the homogeneity of regression assumption is violated: examination of the effect of the group variable (diet group) at different levels of the covariate (height levels)

64 3.2 Data Description by Younga Choi
N = 30
IV:
(1) Diet (three levels)
  - diet 1 (diet=1, n=10)
  - diet 2 (diet=2, n=10)
  - diet 3, control group (diet=3, n=10)
(2) Height
DV: weight of the subject, measured after the study

65 3.2 Reading the Data & Traditional ANCOVA Model by Younga Choi
Comparing means of diet groups

66 3.2 Homogeneity of Regression Assumption by Younga Choi
Checking the homogeneity of regression assumption

67 3.2 Homogeneity of Regression Assumption by Younga Choi
Checking the homogeneity of regression assumption: pairwise comparisons

68 3.2 Homogeneity of Regression Assumption by Younga Choi
When the homogeneity of regression assumption is violated

69 3.2 Homogeneity of Regression Assumption by Younga Choi
Comparing the slope of diet 1 with the slope of diets 2 and 3 combined

70 3.2 Homogeneity of Regression Assumption by Younga Choi

71 3.2 Homogeneity of Regression Assumption by Younga Choi
Overall mean value of height

72 3.3 SAS Output: One-Way ANOVA Model by Qiao Zhang

73 3.3 Standard ANCOVA Model by Qiao Zhang
The results are consistent with those of the ANOVA.

74 3.3 Assumptions (Homogeneity of Regression) by Qiao Zhang

75 3.3 Assumptions (Homogeneity of Regression) by Qiao Zhang
[SAS output: three regressions, for Diet=1, Diet=2 and Diet=3, each with dependent variable weight.]
There is a significant linear relationship between weight and height in both the diet 2 and diet 3 groups, but not in the diet 1 group.

76 3.3 Assumptions (Homogeneity of Regression) by Qiao Zhang
The diet*height effect is indeed significant, indicating that the slopes do differ across the three diet groups.

77 3.3 Tests: Comparing diet 1 with diet 2 by Qiao Zhang
These results indicate a significant difference between diet 1 and diet 2 for those 59 inches tall, and also a significant difference for those 64 inches tall. For those who are tall (i.e., 68 inches), diet 1 and diet 2 are about equally effective.
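
When slopes differ across groups, a comparison between two diets has to be made at a chosen height. A sketch of why, assuming a separate-slopes model with hypothetical group intercepts $\alpha_i$ and slopes $\beta_i$ (these symbols are not taken from the slides):

```latex
\begin{align*}
E[\text{weight} \mid \text{diet} = i,\ \text{height} = h] &= \alpha_i + \beta_i\,h,\\
\Delta_{12}(h) &= (\alpha_1 - \alpha_2) + (\beta_1 - \beta_2)\,h .
\end{align*}
```

Because $\beta_1 \neq \beta_2$, the expected difference $\Delta_{12}(h)$ between diet 1 and diet 2 changes with $h$, which is why it is tested separately at 59, 64 and 68 inches.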

78 3.3 Comparing diets 1 and 2 to the control group by Qiao Zhang
The difference in weight between diet groups 1 and 2 combined and the control group is significant at the different heights.

79 3.3 Testing to pool slopes by Qiao Zhang
The test comparing the slope of diet group 1 with the slopes of groups 2 and 3 was significant, and the test comparing the slopes of diet groups 2 and 3 was not significant. We can therefore combine the slopes for diet groups 2 and 3.

80 3.3 Overall analysis: diet groups 2 and 3 by Qiao Zhang
[SAS output: pooled slopes model; unpooled slopes model.]

81 3.3 Overall analysis by Qiao Zhang

82 3.3 Summary of Outputs by Qiao Zhang
- The homogeneity of regression assumption is violated in this data set.
- We then estimated models that have separate slopes across groups.
- When comparing the control group to diets 1 and 2, we found the control group weighed more at 3 different levels of height (59 inches, 64 inches and 68 inches).
- When comparing diets 1 and 2, we found diet 2 to be more effective at 59 and 64 inches, but there was no difference at 68 inches.

