A (second-order) multiple regression model with interaction terms.


1 A (second-order) multiple regression model with interaction terms

2 An example in which the predictors do not interact

3 Is baby’s birth weight related to smoking during pregnancy?
Sample of n = 32 births
Response ($y$): birth weight of baby, in grams
Potential predictor ($x_1$): length of gestation, in weeks
Potential predictor ($x_2$): smoking status of mother (yes or no)

4 A first order model with one binary predictor
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$
where …
$y_i$ is birth weight of baby $i$
$x_{i1}$ is length of gestation of baby $i$
$x_{i2} = 1$ if the mother smokes and $x_{i2} = 0$ if not
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

5 Estimated first order model with one binary predictor
The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

6 In what way do the predictors have an “additive effect” on the response? The effect of smoking on the mean birth weight is the same for all gestation lengths. (Exhibited by parallel lines.) The effect of gestation length on the mean birth weight is the same for smokers and non-smokers. (Exhibited by parallel lines.)
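
The parallel-lines claim can be checked directly from the estimated equation Weight = - 2390 + 143 Gest - 245 Smoking. The sketch below is only an illustration: the function name `predicted_weight` and the gestation lengths are chosen here and are not part of the presentation.

```python
def predicted_weight(gest_weeks, smokes):
    """Fitted mean birth weight (grams) from Weight = -2390 + 143 Gest - 245 Smoking."""
    return -2390 + 143 * gest_weeks - 245 * (1 if smokes else 0)

# Gestation lengths picked arbitrarily for illustration.
gaps = []
for weeks in (34, 38, 42):
    gap = predicted_weight(weeks, smokes=True) - predicted_weight(weeks, smokes=False)
    gaps.append(gap)
    print(weeks, gap)
```

The smoker versus non-smoker gap is the same at every gestation length (the Smoking coefficient), and one extra week of gestation adds the same amount for both groups (the Gest coefficient). That constancy is what the parallel lines show.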

7 What are “additive effects”? A regression model contains additive effects if the response function can be written as a sum of functions of the predictor variables: For example:
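
As a sketch of this definition in symbols, with notation assumed for illustration ($\mu_Y$ for the mean response and $f_1, \dots, f_{p-1}$ for the component functions), an additive response function has the form below. The second line shows that a first order model with two predictors fits the pattern, taking $f_1(x_1) = \beta_0 + \beta_1 x_1$ and $f_2(x_2) = \beta_2 x_2$.

```latex
\mu_Y = f_1(x_1) + f_2(x_2) + \cdots + f_{p-1}(x_{p-1})

\mu_Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2
```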

8 An example where including “interaction terms” is appropriate

9 Compare three treatments (A, B, C) for severe depression
Random sample of n = 36 severely depressed individuals.
$y$ = measure of treatment effectiveness
$x_1$ = age (in years)
$x_2$ = 1 if patient received A and 0 if not
$x_3$ = 1 if patient received B and 0 if not

10 Compare three treatments (A, B, C) for severe depression

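11 A model with interaction terms
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \epsilon_i$$
where …
$y_i$ is treatment effectiveness for patient $i$
$x_{i1}$ is age of patient $i$
$x_{i2} = 1$ if treatment A and $x_{i2} = 0$ if not
$x_{i3} = 1$ if treatment B and $x_{i3} = 0$ if not

12 In what way do the predictors have an “interaction effect” on the response?
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$): $\mu_Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_{12}) x_{i1}$
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1}$
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1}$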

13 The effect of treatment on the treatment’s effectiveness depends on the individual’s age. (Exhibited by non-parallel lines.) The effect of the individual’s age on the treatment’s effectiveness depends on the treatment. (Exhibited by non-parallel lines.)

14 What does it mean for two predictors “to interact”? In general, two predictors interact if the effect on the response variable of one predictor depends on the value of the other. A slope parameter can no longer be interpreted as the change in the mean response for each unit increase in the predictor, while the other predictors are held constant.

15 What are “interaction effects”? A regression model contains interaction effects if the response function cannot be written as a sum of functions of the predictor variables: For example:
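
A sketch in symbols, using assumed notation ($\mu_Y$ for the mean response), of a two-predictor response function with a cross-product term:

```latex
\mu_Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
```

The product $x_1 x_2$ cannot be split into a function of $x_1$ alone plus a function of $x_2$ alone. The change in $\mu_Y$ per unit increase in $x_1$ is $\beta_1 + \beta_{12} x_2$, which depends on the value of $x_2$.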

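16 The estimated regression function
The regression equation is
y = 6.21 + 1.03 age + 41.3 x2 + 22.7 x3 - 0.703 agex2 - 0.510 agex3
Substituting the indicator values into the regression equation gives one fitted line per treatment:
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$): y = 47.5 + 0.33 age
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$): y = 28.9 + 0.52 age
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$): y = 6.21 + 1.03 age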

17 The estimated regression function
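
To see the non-parallel lines numerically, the fitted equation y = 6.21 + 1.03age + 41.3x2 + 22.7x3 - 0.703agex2 - 0.510agex3 can be evaluated for each treatment at a couple of ages. This is an illustrative sketch: the helper `fitted_effectiveness` and the ages 30 and 60 are assumptions made here, not values from the presentation.

```python
# Coefficients copied from the reported regression equation.
B0, B_AGE, B_X2, B_X3, B_AGEX2, B_AGEX3 = 6.21, 1.03, 41.3, 22.7, -0.703, -0.510

def fitted_effectiveness(age, treatment):
    """Fitted effectiveness for treatment 'A', 'B', or 'C' at a given age."""
    x2 = 1 if treatment == "A" else 0   # indicator for treatment A
    x3 = 1 if treatment == "B" else 0   # indicator for treatment B
    return (B0 + B_AGE * age + B_X2 * x2 + B_X3 * x3
            + B_AGEX2 * age * x2 + B_AGEX3 * age * x3)

gaps_a_vs_c = {}
for age in (30, 60):
    gaps_a_vs_c[age] = fitted_effectiveness(age, "A") - fitted_effectiveness(age, "C")
    print(age, round(gaps_a_vs_c[age], 2))
```

The fitted gap between treatments A and C is 41.3 - 0.703 × age, roughly 20.2 at age 30 and about -0.9 at age 60. The size of the treatment effect therefore depends on age, which is what the interaction terms capture.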

18 Recall the appropriate regression analysis steps
Model building
– Model formulation
– Model estimation
– Model evaluation
Model use

19 Residuals versus fits plot

20 Normal probability plot

21 Is there a difference in the mean effectiveness for the three treatments?
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$):
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$):
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$):

22 Test for identical regression functions
Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  4932.85  986.57  64.04  0.000
Residual Error  30   462.15   15.40
Total           35  5395.00

Source  DF   Seq SS
age      1  3424.43
x2       1   803.80
x3       1     1.19
agex2    1   375.00
agex3    1   328.42

F distribution with 4 DF in numerator and 30 DF in denominator
      x  P( X <= x )
24.4900       1.0000
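
The F value of 24.49 on this slide can be reproduced from the sequential sums of squares. Testing whether the three treatments share one regression function amounts to testing whether the four terms x2, x3, agex2, and agex3 can all be dropped once age is in the model, so their sequential sums of squares are pooled and compared with the mean squared error of the full model. The sketch below assumes that reading of the output; the function name `partial_f` is made up for illustration.

```python
def partial_f(extra_seq_ss, mse_full):
    """Partial F statistic: (pooled extra SS / number of terms) / MSE of the full model."""
    return (sum(extra_seq_ss) / len(extra_seq_ss)) / mse_full

# Seq SS for x2, x3, agex2, agex3 (entered after age); MSE from the Residual Error row.
f_stat = partial_f([803.80, 1.19, 375.00, 328.42], mse_full=15.40)
print(round(f_stat, 2))
```

With 4 and 30 degrees of freedom, the slide reports P( X <= 24.49 ) = 1.0000, so the P-value rounds to 0.000 and the data do not support identical regression functions for the three treatments.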

23 Does the effect of age on the treatment’s effectiveness depend on treatment?
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$):
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$):
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$):

24 Test for significant interaction
Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  4932.85  986.57  64.04  0.000
Residual Error  30   462.15   15.40
Total           35  5395.00

Source  DF   Seq SS
age      1  3424.43
x2       1   803.80
x3       1     1.19
agex2    1   375.00
agex3    1   328.42

F distribution with 2 DF in numerator and 30 DF in denominator
      x  P( X <= x )
22.8400       1.0000
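
The value 22.84 looked up on this slide appears to be the partial F statistic for dropping the two interaction terms, agex2 and agex3, after age, x2, and x3 are already in the model. As a worked check under that assumption (the symbol $F^*$ is notation chosen here):

```latex
F^* = \frac{(375.00 + 328.42)/2}{15.40} = \frac{351.71}{15.40} \approx 22.84
```

Since P( X <= 22.84 ) = 1.0000 for an F distribution with 2 and 30 degrees of freedom, the P-value is essentially 0, which supports keeping the interaction terms.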

25 Another example A model with one qualitative predictor and two quantitative predictors

26 Bird breathing habits in burrows?
Experiment with n = 120 nestling bank swallows and n = 120 adult bank swallows
Response ($y$): % increase in “minute ventilation”, Vent, i.e., total volume of air breathed per minute
Potential predictor ($x_1$): percentage of oxygen, O2, in the air the baby birds breathe
Potential predictor ($x_2$): percentage of carbon dioxide, CO2, in the air the baby birds breathe
Potential predictor ($x_3$): 1 if adult, 0 if baby

27 Primary research question Is there any evidence that the adult birds differ from the baby birds in terms of their minute ventilation as a function of oxygen and carbon dioxide?

28 A formulated model
where …
$y_i$ is percentage of minute ventilation
$x_{i1}$ is percentage of oxygen
$x_{i2}$ is percentage of carbon dioxide
$x_{i3}$ is type of bird (0 if baby and 1 if adult)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

29 An aside An example that illustrates the impact of leaving a necessary interaction term out of the model

30 Suggests x is related to y? Suggests there is a treatment effect?

31 A formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$
where …
$y_i$ is the response
$x_{i1}$ is the variable you want to “adjust for”
$x_{i2}$ is treatment (0 or 1)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

32 Is x related to y? Is there a treatment effect?
The regression equation is
y = 4.55 - 0.028 x + 1.10 group

Predictor     Coef  SE Coef      T      P
Constant    4.5492   0.8665   5.25  0.000
x          -0.0276   0.1288  -0.21  0.831
group       1.0959   0.7056   1.55  0.125
...

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       2   23.255  11.628  1.23  0.298
Residual Error  73  690.453   9.458
Total           75  713.709

Source  DF  Seq SS
x*       1   0.435
group    1  22.820

33 The estimated regression functions

34 The residuals versus fits plot

35 A more appropriately formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12} x_{i1} x_{i2} + \epsilon_i$$
where …
$y_i$ is the response
$x_{i1}$ is the variable you want to “adjust for”
$x_{i2}$ is treatment (0 or 1)
$x_{i1} x_{i2}$ is the “missing” interaction term
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

36 The estimated regression function
The regression equation is
y = 10.1 - 1.04 x - 10.1 group + 2.03 groupx

37 The residuals versus fits plot

38 Is x related to y? Is there a treatment effect?
The regression equation is
y = 10.1 - 1.04 x - 10.1 group + 2.03 groupx

Predictor      Coef  SE Coef       T      P
Constant    10.1401   0.4320   23.47  0.000
x          -1.04416  0.07031  -14.85  0.000
group      -10.0859   0.6110  -16.51  0.000
groupx      2.03307  0.09944   20.45  0.000

S = 1.187   R-Sq = 85.8%   R-Sq(adj) = 85.2%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       3  612.26  204.09  144.84  0.000
Residual Error  72  101.45    1.41
Total           75  713.71
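
The coefficients reported on this slide suggest why the model without the interaction term found almost nothing: the two groups have slopes of opposite sign. A small sketch, assuming group is coded 0/1 and groupx is the product of group and x (the variable names are illustrative):

```python
# Coefficients from the Predictor table on this slide.
b0, b_x, b_group, b_groupx = 10.1401, -1.04416, -10.0859, 2.03307

# Substitute group = 0 and group = 1 into y = b0 + b_x*x + b_group*group + b_groupx*group*x.
line_group0 = (b0, b_x)                        # (intercept, slope)
line_group1 = (b0 + b_group, b_x + b_groupx)   # (intercept, slope)

# x value where the two fitted lines cross, i.e. where the fitted group difference is zero.
crossing_x = -b_group / b_groupx

print(line_group0, line_group1, round(crossing_x, 2))
```

Group 0 has a fitted slope of about -1.04 and group 1 a slope of about +0.99, and the fitted lines cross near x = 4.96. A single common slope and a single group shift cannot describe both lines, which is consistent with the near-zero coefficient for x in the model fitted without the interaction term.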

39 Back to the bird example

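40 A more appropriately formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \beta_{23} x_{i2} x_{i3} + \epsilon_i$$
where …
$y_i$ is percentage of minute ventilation
$x_{i1}$ is percentage of oxygen
$x_{i2}$ is percentage of carbon dioxide
$x_{i3}$ is type of bird (0 if baby and 1 if adult)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

41 The model yields two response functions
For baby birds ($x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12} x_{i1} x_{i2}$
For adult birds ($x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1} + (\beta_2 + \beta_{23}) x_{i2} + \beta_{12} x_{i1} x_{i2}$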

42 Is there a significant interaction between type and O2? between type and CO2? between O2 and CO2?
The regression equation is
Vent = - 18 + 1.19 O2 + 54.3 CO2 + 112 Type - 7.01 TypeO2 + 2.31 TypeCO2 - 1.45 CO2O2

Predictor    Coef  SE Coef      T      P
Constant    -18.4    160.0  -0.11  0.909
O2          1.189    9.854   0.12  0.904
CO2         54.28    25.99   2.09  0.038
Type        111.7    157.7   0.71  0.480
TypeO2     -7.008    9.560  -0.73  0.464
TypeCO2     2.311    7.126   0.32  0.746
CO2O2      -1.449    1.593  -0.91  0.364

S = 165.6   R-Sq = 27.2%   R-Sq(adj) = 25.3%

43 Is there a significant interaction between type and O2? between type and CO2? between O2 and CO2?
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2387540  397923  14.51  0.000
Residual Error  233  6388603   27419
Total           239  8776143

Source   DF   Seq SS
O2        1    93651
CO2       1  2247696
Type      1     5910
TypeO2    1    14735
TypeCO2   1     2884
CO2O2     1    22664
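
One way to answer all three questions at once is a single partial F test on the three interaction terms, pooling their sequential sums of squares because they were entered last. The sketch below assumes that approach; the variable names are illustrative and the numbers are taken from the output on this slide.

```python
# Seq SS for the interaction terms, entered after O2, CO2, and Type.
interaction_seq_ss = {"TypeO2": 14735, "TypeCO2": 2884, "CO2O2": 22664}
mse_full = 27419                      # Residual Error MS of the full model
df_num = len(interaction_seq_ss)      # numerator degrees of freedom

f_stat = (sum(interaction_seq_ss.values()) / df_num) / mse_full
print(df_num, round(f_stat, 2))
```

An F statistic below 1 means the three interaction terms together explain less variation per degree of freedom than the residual noise, so there is no evidence of interaction at conventional α levels. The comparison distribution would be F with 3 and 233 degrees of freedom.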

44 The residual versus fits plot

45 Plot for adult birds

46 Plot for baby birds

47 Is there any evidence that the adult birds differ from the baby birds?
The regression equation is
Vent = 137 - 8.83 O2 + 32.3 CO2 + 9.9 Type

Predictor     Coef  SE Coef      T      P
Constant    136.77    79.33   1.72  0.086
O2          -8.834    4.765  -1.85  0.065
CO2         32.258    3.551   9.08  0.000
Type          9.93    21.31   0.47  0.642

48 The residuals versus fits plot

49 The normal probability plot

50 Cost of including unnecessary terms in the model
For model with interaction terms:
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2387540  397923  14.51  0.000
Residual Error  233  6388603   27419
Total           239  8776143

For model with no interaction terms:
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        3  2347257  782419  28.72  0.000
Residual Error  236  6428886   27241
Total           239  8776143
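
One way to quantify the cost is to compare the adjusted R-squared of the two models, which penalizes each model for the error degrees of freedom it uses up. The sketch below uses only the sums of squares and degrees of freedom shown on this slide; the function name `adjusted_r2` is illustrative.

```python
def adjusted_r2(sse, df_error, sst, df_total):
    """Adjusted R-squared: 1 - (SSE / df_error) / (SST / df_total)."""
    return 1 - (sse / df_error) / (sst / df_total)

SST, DF_TOTAL = 8776143, 239

full = adjusted_r2(sse=6388603, df_error=233, sst=SST, df_total=DF_TOTAL)
reduced = adjusted_r2(sse=6428886, df_error=236, sst=SST, df_total=DF_TOTAL)
print(round(100 * full, 1), round(100 * reduced, 1))
```

Dropping the three interaction terms raises the error sum of squares only slightly while returning three error degrees of freedom, so the error mean square falls from 27419 to 27241 and adjusted R-squared rises from about 25.3% to about 25.8%.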

51 Another comment about multiple testing
Be aware of the impact of multiple testing, but don’t be so extreme about it that your hands are tied.
The serious danger is when you have many predictors, not a small number associated with specific research questions.
You can test multiple parameters simultaneously to reduce the number of tests you have to perform.
You can reduce each individual test’s α level.
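
As one concrete way to reduce an individual test's α level, a Bonferroni adjustment divides the overall level by the number of tests. The slides do not name a specific method, so this is an assumed illustration; the overall α of 0.05 is chosen arbitrarily, and the P-values are those of the three interaction T tests in the bird example.

```python
def bonferroni_alpha(overall_alpha, n_tests):
    """Per-test level that keeps the family-wise error rate at or below overall_alpha."""
    return overall_alpha / n_tests

# P-values of the TypeO2, TypeCO2, and CO2O2 T tests from the bird example.
p_values = {"TypeO2": 0.464, "TypeCO2": 0.746, "CO2O2": 0.364}

per_test = bonferroni_alpha(0.05, len(p_values))
significant = [name for name, p in p_values.items() if p < per_test]
print(round(per_test, 4), significant)
```

None of the three P-values comes close to the adjusted level, so the conclusion here does not depend on the adjustment. The alternative mentioned above, testing the three parameters simultaneously with one F test, avoids the adjustment altogether.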

