Published by Arlene Cook. Modified over 9 years ago.
1
A (second-order) multiple regression model with interaction terms
2
An example in which the predictors do not interact
3
Is baby’s birth weight related to smoking during pregnancy?
Sample of n = 32 births
Response ($y$): birth weight of baby, in grams
Potential predictor ($x_1$): length of gestation, in weeks
Potential predictor ($x_2$): smoking status of mother (yes or no)
4
A first-order model with one binary predictor
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$
where …
$y_i$ is birth weight of baby $i$
$x_{i1}$ is length of gestation of baby $i$
$x_{i2} = 1$ if mother smokes, and $x_{i2} = 0$ if not
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
5
Estimated first-order model with one binary predictor
The regression equation is
Weight = -2390 + 143 Gest - 245 Smoking
6
In what way do the predictors have an “additive effect” on the response?
The effect of smoking on the mean birth weight is the same for all gestation lengths. (Exhibited by parallel lines.)
The effect of gestation length on the mean birth weight is the same for smokers and non-smokers. (Exhibited by parallel lines.)
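The parallel-lines claim can be checked directly from the fitted equation on the previous slide. The sketch below is illustrative only: the function name and the gestation lengths (34, 38, 42 weeks) are assumptions chosen for the demonstration, not values taken from the data set.

```python
# Fitted additive model from the slide:
#   Weight = -2390 + 143 Gest - 245 Smoking
def predicted_weight(gest_weeks: int, smoker: bool) -> int:
    """Predicted birth weight in grams (hypothetical helper)."""
    return -2390 + 143 * gest_weeks - 245 * (1 if smoker else 0)

# Gestation lengths below are illustrative assumptions.
gaps = []
for weeks in (34, 38, 42):
    gap = predicted_weight(weeks, True) - predicted_weight(weeks, False)
    gaps.append(gap)
    print(f"{weeks} weeks: smoker minus non-smoker = {gap} g")
```

Because the model has no Gest × Smoking term, the smoker versus non-smoker gap equals the Smoking coefficient at every gestation length, which is what "parallel lines" means algebraically.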
7
What are “additive effects”?
A regression model contains additive effects if the response function can be written as a sum of functions of the predictor variables:
$$\mu_Y = f_1(x_1) + f_2(x_2) + \cdots + f_{p-1}(x_{p-1})$$
For example:
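The slide's own example was not preserved in this transcript. As an illustrative stand-in (an assumption, not the original), two response functions that satisfy the definition are shown below; each term involves only one predictor, even when that predictor enters nonlinearly.

```latex
% Illustrative additive response functions (assumed examples)
\mu_Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2
\qquad\text{and}\qquad
\mu_Y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2
```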
8
An example where including “interaction terms” is appropriate
9
Compare three treatments (A, B, C) for severe depression
Random sample of n = 36 severely depressed individuals.
$y$ = measure of treatment effectiveness
$x_1$ = age (in years)
$x_2$ = 1 if patient received A, and 0 if not
$x_3$ = 1 if patient received B, and 0 if not
10
Compare three treatments (A, B, C) for severe depression
11
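A model with interaction terms
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \epsilon_i$$
where …
$y_i$ is treatment effectiveness for patient $i$
$x_{i1}$ is age of patient $i$
$x_{i2} = 1$ if treatment A, and $x_{i2} = 0$ if not
$x_{i3} = 1$ if treatment B, and $x_{i3} = 0$ if not
12
In what way do the predictors have an “interaction effect” on the response?
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1}$
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$): $\mu_Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_{12}) x_{i1}$
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1}$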
13
The effect of treatment on the treatment’s effectiveness depends on the individual’s age. (Exhibited by non-parallel lines.) The effect of the individual’s age on the treatment’s effectiveness depends on the treatment. (Exhibited by non-parallel lines.)
14
What does it mean for two predictors “to interact”? In general, two predictors interact if the effect on the response variable of one predictor depends on the value of the other. A slope parameter can no longer be interpreted as the change in the mean response for each unit increase in the predictor, while the other predictors are held constant.
15
What are “interaction effects”?
A regression model contains interaction effects if the response function cannot be written as a sum of functions of the individual predictor variables.
For example:
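The slide's own example is missing from this transcript. A minimal illustrative case (an assumption, not the original) is a two-predictor model with a cross-product term: the term $\beta_{12} x_1 x_2$ depends on both predictors at once, so the response function cannot be split into a function of $x_1$ plus a function of $x_2$, and the slope in $x_1$ changes with $x_2$.

```latex
% Illustrative response function with an interaction term (assumed example)
\mu_Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
      = (\beta_0 + \beta_2 x_2) + (\beta_1 + \beta_{12} x_2)\, x_1
```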
16
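The estimated regression function
The regression equation is
y = 6.21 + 1.03 age + 41.3 x2 + 22.7 x3 - 0.703 agex2 - 0.510 agex3
Substituting the indicator values into this equation gives one fitted line per treatment:
If patient received B (x2 = 0, x3 = 1): y = (6.21 + 22.7) + (1.03 - 0.510) age = 28.91 + 0.52 age
If patient received A (x2 = 1, x3 = 0): y = (6.21 + 41.3) + (1.03 - 0.703) age = 47.51 + 0.327 age
If patient received C (x2 = 0, x3 = 0): y = 6.21 + 1.03 age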
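To see the interaction numerically, the sketch below evaluates the fitted equation and reports how far treatments A and B sit above treatment C at two ages. The function names and the ages 30 and 60 are assumptions chosen for illustration; the coefficients are the rounded values from the slide.

```python
# Rounded coefficients from the fitted equation on the slide.
COEF = {"const": 6.21, "age": 1.03, "x2": 41.3, "x3": 22.7,
        "age_x2": -0.703, "age_x3": -0.510}

def fitted(age: float, treatment: str) -> float:
    """Fitted effectiveness for treatment 'A', 'B', or 'C' (hypothetical helper)."""
    x2 = 1 if treatment == "A" else 0  # indicator for treatment A
    x3 = 1 if treatment == "B" else 0  # indicator for treatment B
    return (COEF["const"] + COEF["age"] * age
            + COEF["x2"] * x2 + COEF["x3"] * x3
            + COEF["age_x2"] * age * x2 + COEF["age_x3"] * age * x3)

def gap_vs_c(age: float, treatment: str) -> float:
    """Fitted difference between a treatment and treatment C at a given age."""
    return fitted(age, treatment) - fitted(age, "C")

# Ages are illustrative assumptions, not values taken from the sample.
for age in (30, 60):
    print(age, round(gap_vs_c(age, "A"), 2), round(gap_vs_c(age, "B"), 2))
```

With these rounded coefficients, A's fitted advantage over C is roughly 20 points at age 30 and close to zero at age 60, so the treatment effect depends on age, which is what the non-parallel lines show.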
17
The estimated regression function
18
Recall the appropriate regression analysis steps
Model building
– Model formulation
– Model estimation
– Model evaluation
Model use
19
Residuals versus fits plot
20
Normal probability plot
21
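Is there a difference in the mean effectiveness for the three treatments?
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1}$
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$): $\mu_Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_{12}) x_{i1}$
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1}$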
22
Test for identical regression functions

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  4932.85  986.57  64.04  0.000
Residual Error  30   462.15   15.40
Total           35  5395.00

Source  DF   Seq SS
age      1  3424.43
x2       1   803.80
x3       1     1.19
agex2    1   375.00
agex3    1   328.42

F distribution with 4 DF in numerator and 30 DF in denominator
      x  P( X <= x )
24.4900       1.0000
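The value x = 24.49 looked up in the F distribution can be reproduced from the output above. The sketch assumes the usual general linear F-test: the sequential sums of squares for the four terms entered after age (x2, x3, agex2, agex3) are added, divided by their count, and divided by the full-model MSE. The function name `partial_f` is hypothetical.

```python
def partial_f(seq_ss_terms: list[float], mse: float) -> float:
    """F statistic for testing a block of terms entered last in the model.

    seq_ss_terms: sequential sums of squares of the tested terms
    mse: mean squared error of the full model
    """
    numerator_ms = sum(seq_ss_terms) / len(seq_ss_terms)
    return numerator_ms / mse

# Seq SS for x2, x3, agex2, agex3 and the MSE, as reported on the slide.
f_identical = partial_f([803.80, 1.19, 375.00, 328.42], 15.40)
print(round(f_identical, 2))
```

The result agrees with the slide's 24.49 up to rounding. Since P(X <= 24.49) is reported as 1.0000, the P-value is below 0.0001, which points to regression functions that are not identical across treatments.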
23
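Does the effect of age on the treatment’s effectiveness depend on treatment?
If patient received B ($x_{i2} = 0$, $x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1}$
If patient received A ($x_{i2} = 1$, $x_{i3} = 0$): $\mu_Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_{12}) x_{i1}$
If patient received C ($x_{i2} = 0$, $x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1}$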
24
Test for significant interaction

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  4932.85  986.57  64.04  0.000
Residual Error  30   462.15   15.40
Total           35  5395.00

Source  DF   Seq SS
age      1  3424.43
x2       1   803.80
x3       1     1.19
agex2    1   375.00
agex3    1   328.42

F distribution with 2 DF in numerator and 30 DF in denominator
      x  P( X <= x )
22.8400       1.0000
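The looked-up value 22.84 follows from the same output, assuming the standard general linear F-test applied to the two interaction terms only (they are entered last, so their sequential sums of squares can be added):

```latex
F = \frac{\bigl(\mathrm{SSR}(\text{agex2}) + \mathrm{SSR}(\text{agex3})\bigr)/2}{\mathrm{MSE}}
  = \frac{(375.00 + 328.42)/2}{15.40}
  = \frac{351.71}{15.40}
  \approx 22.84
```

With 2 and 30 degrees of freedom, P(X <= 22.84) is reported as 1.0000, so the P-value is below 0.0001 and the data suggest that the effect of age depends on treatment.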
25
Another example
A model with one qualitative predictor and two quantitative predictors
26
Bird breathing habits in burrows?
Experiment with n = 120 nestling bank swallows and n = 120 adult bank swallows
Response ($y$): % increase in “minute ventilation”, Vent, i.e., total volume of air breathed per minute
Potential predictor ($x_1$): percentage of oxygen, O2, in the air the baby birds breathe
Potential predictor ($x_2$): percentage of carbon dioxide, CO2, in the air the baby birds breathe
Potential predictor ($x_3$): 1 if adult, 0 if baby
27
Primary research question Is there any evidence that the adult birds differ from the baby birds in terms of their minute ventilation as a function of oxygen and carbon dioxide?
28
A formulated model
where …
$y_i$ is percentage increase in minute ventilation
$x_{i1}$ is percentage of oxygen
$x_{i2}$ is percentage of carbon dioxide
$x_{i3}$ is type of bird (0 if baby, and 1 if adult)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
29
An aside
An example that illustrates the impact of leaving a necessary interaction term out of the model
30
Suggests x is related to y? Suggests there is a treatment effect?
31
A formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$
where …
$y_i$ is the response
$x_{i1}$ is the variable you want to “adjust for”
$x_{i2}$ is treatment (0 or 1)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
32
Is x related to y? Is there a treatment effect?

The regression equation is
y = 4.55 - 0.028 x + 1.10 group

Predictor     Coef  SE Coef      T      P
Constant    4.5492   0.8665   5.25  0.000
x          -0.0276   0.1288  -0.21  0.831
group       1.0959   0.7056   1.55  0.125
...

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       2   23.255  11.628  1.23  0.298
Residual Error  73  690.453   9.458
Total           75  713.709

Source  DF  Seq SS
x*       1   0.435
group    1  22.820
33
The estimated regression functions
34
The residuals versus fits plot
35
A more appropriately formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12} x_{i1} x_{i2} + \epsilon_i$$
where …
$y_i$ is the response
$x_{i1}$ is the variable you want to “adjust for”
$x_{i2}$ is treatment (0 or 1)
$x_{i1} x_{i2}$ is the “missing” interaction term
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
36
The estimated regression function The regression equation is y = 10.1 - 1.04 x - 10.1 group + 2.03 groupx
37
The residuals versus fits plot
38
Is x related to y? Is there a treatment effect?

The regression equation is
y = 10.1 - 1.04 x - 10.1 group + 2.03 groupx

Predictor      Coef  SE Coef       T      P
Constant    10.1401   0.4320   23.47  0.000
x          -1.04416  0.07031  -14.85  0.000
group      -10.0859   0.6110  -16.51  0.000
groupx      2.03307  0.09944   20.45  0.000

S = 1.187   R-Sq = 85.8%   R-Sq(adj) = 85.2%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       3  612.26  204.09  144.84  0.000
Residual Error  72  101.45    1.41
Total           75  713.71
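Substituting group = 0 and group = 1 into the fitted coefficients above gives one line per group. This worked step is derived from the reported output, not taken from the original slides.

```latex
\text{group} = 0:\quad \hat{y} = 10.1401 - 1.04416\,x
\\[4pt]
\text{group} = 1:\quad \hat{y} = (10.1401 - 10.0859) + (-1.04416 + 2.03307)\,x
                             = 0.0542 + 0.98891\,x
```

The two slopes are nearly equal in size and opposite in sign. A model without the interaction term forces a single common slope, so the two trends largely cancel, which plausibly explains why that model reported a slope near zero (-0.0276, P = 0.831) and no significant treatment effect.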
39
Back to the bird example
40
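A more appropriately formulated model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \beta_{23} x_{i2} x_{i3} + \epsilon_i$$
where …
$y_i$ is percentage increase in minute ventilation
$x_{i1}$ is percentage of oxygen
$x_{i2}$ is percentage of carbon dioxide
$x_{i3}$ is type of bird (0 if baby, and 1 if adult)
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
41
The model yields two response functions
For baby birds ($x_{i3} = 0$): $\mu_Y = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12} x_{i1} x_{i2}$
For adult birds ($x_{i3} = 1$): $\mu_Y = (\beta_0 + \beta_3) + (\beta_1 + \beta_{13}) x_{i1} + (\beta_2 + \beta_{23}) x_{i2} + \beta_{12} x_{i1} x_{i2}$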
42
Is there a significant interaction between type and O2? between type and CO2? between O2 and CO2?

The regression equation is
Vent = -18 + 1.19 O2 + 54.3 CO2 + 112 Type - 7.01 TypeO2 + 2.31 TypeCO2 - 1.45 CO2O2

Predictor    Coef  SE Coef      T      P
Constant    -18.4    160.0  -0.11  0.909
O2          1.189    9.854   0.12  0.904
CO2         54.28    25.99   2.09  0.038
Type        111.7    157.7   0.71  0.480
TypeO2     -7.008    9.560  -0.73  0.464
TypeCO2     2.311    7.126   0.32  0.746
CO2O2      -1.449    1.593  -0.91  0.364

S = 165.6   R-Sq = 27.2%   R-Sq(adj) = 25.3%
43
Is there a significant interaction between type and O2? between type and CO2? between O2 and CO2?

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2387540  397923  14.51  0.000
Residual Error  233  6388603   27419
Total           239  8776143

Source   DF   Seq SS
O2        1    93651
CO2       1  2247696
Type      1     5910
TypeO2    1    14735
TypeCO2   1     2884
CO2O2     1    22664
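One way to answer all three questions with a single test, assuming the standard general linear F-test, is to test the three interaction terms together. They are entered last, so their sequential sums of squares can be added. The calculation below is derived from the output above and is not copied from the original slides.

```latex
F = \frac{(14735 + 2884 + 22664)/3}{27419}
  = \frac{13427.67}{27419}
  \approx 0.49
```

This statistic is compared with an F distribution with 3 numerator and 233 denominator degrees of freedom. A value below 1 is well under the conventional 5% critical value (roughly 2.6 for these degrees of freedom), so there is no evidence that any of the three interaction terms is needed.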
44
The residuals versus fits plot
45
Plot for adult birds
46
Plot for baby birds
47
Is there any evidence that the adult birds differ from the baby birds?

The regression equation is
Vent = 137 - 8.83 O2 + 32.3 CO2 + 9.9 Type

Predictor    Coef  SE Coef      T      P
Constant   136.77    79.33   1.72  0.086
O2         -8.834    4.765  -1.85  0.065
CO2        32.258    3.551   9.08  0.000
Type         9.93    21.31   0.47  0.642
48
The residuals versus fits plot
49
The normal probability plot
50
Cost of including unnecessary terms in the model

For model with interaction terms:
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2387540  397923  14.51  0.000
Residual Error  233  6388603   27419
Total           239  8776143

For model with no interaction terms:
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        3  2347257  782419  28.72  0.000
Residual Error  236  6428886   27241
Total           239  8776143
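The cost can be quantified from the two ANOVA tables. The sketch below computes MSE, R-squared, and adjusted R-squared for each model from the reported sums of squares. The dictionary layout and function names are assumptions made for illustration; n = 240 follows from the total of 239 degrees of freedom.

```python
SSTO = 8776143   # total sum of squares (same for both models)
N = 240          # sample size: total DF (239) + 1

# Sums of squares and error DF as reported in the two ANOVA tables.
models = {
    "with interactions": {"ssr": 2387540, "sse": 6388603, "df_error": 233},
    "no interactions": {"ssr": 2347257, "sse": 6428886, "df_error": 236},
}

def mse(m: dict) -> float:
    return m["sse"] / m["df_error"]

def r_squared(m: dict) -> float:
    return m["ssr"] / SSTO

def adj_r_squared(m: dict) -> float:
    # 1 - MSE / (SSTO / (n - 1)): penalizes terms that add little fit
    return 1 - mse(m) / (SSTO / (N - 1))

for name, m in models.items():
    print(f"{name}: MSE={mse(m):.0f}  R-sq={r_squared(m):.3f}  "
          f"R-sq(adj)={adj_r_squared(m):.3f}")
```

Dropping the three interaction terms lowers R-squared only slightly, while MSE falls and adjusted R-squared rises. The extra terms cost three error degrees of freedom without reducing the error sum of squares enough to pay for them.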
51
Another comment about multiple testing
Be aware of the impact of multiple testing, but don’t be so extreme about it that your hands are tied.
The serious danger arises when you have many predictors, not when you have a small number associated with specific research questions.
You can test multiple parameters simultaneously to reduce the number of tests you have to perform.
You can reduce each individual test’s α level.
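The slides do not name a specific adjustment for lowering the per-test α level. One common choice is the Bonferroni correction, sketched below as an assumption: the family-wise α is divided by the number of tests. The function name and the choice of three tests are hypothetical.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction.

    Running each of n_tests tests at family_alpha / n_tests keeps the
    chance of at least one false rejection at or below family_alpha.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests

# Illustrative assumption: three separate tests at a family-wise level of 0.05.
per_test_alpha = bonferroni_alpha(0.05, 3)
print(f"per-test alpha = {per_test_alpha:.4f}")
```

The correction is conservative, which is one reason a single simultaneous F-test on several parameters can be preferable when it answers the research question.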