Confounding, Effect Modification, and Stratification
Adding a Third Dimension to the RxC picture ExposureDisease ? Mediator Confounder Effect modifier
1. Confounding A confounding variable is associated with the exposure and it affects the outcome, but it is not an intermediate link in the chain of causation between exposure and outcome.
Confounding: Example
Confounding: example Drinker Non-drinker Lung cancerNo lung cancer % of cases are drinkers, but only 25% of controls are drinkers. Therefore, it appears that drinking is strongly associated with lung cancer.
Confounding: example Drinker Non-drinker Lung cancerNo lung cancer Drinker Non-drinker Lung cancerNo lung cancer Smoker Non-smoker Among smokers, 45/75=60% of lung cancer cases drink and 15/25=60% of controls drink. Among non-smokers 5/25=20% of lung cancer cases drink and 35/175=20% of controls drink
Confusion over postmenopausal hormones: Postmenopausal HRT Heart attacks (MI) ? High SES, high education, other confounders?
Mixture May Rival Estrogen in Preventing Heart Disease August 15, 1996, Thursday “Widely prescribed hormone pills that combine estrogen and progestin appear to be just as effective as estrogen alone in preventing heart disease in women after menopause, a study has concluded.” “Many women take hormones … to reduce the risk of heart disease and broken bones.” “More than 30 studies have found that estrogen after menopause is good for the heart.”
Example: Nurse’s Health Study protective relative risks
Nurse’s Health Study
No apparent Confounding… e.g., the effect is the same among smokers and non-smokers, so the association couldn’t be due to confounding by smokers (who may take less hormones and certainly get more heart disease).
RCT: Women’s Health Initiative (2002) On hormones On placebo
Controlling for confounders in medical studies 1. Confounders can be controlled for in the design phase of a study (randomization or restriction or matching). 2. Confounders can be controlled for in the analysis phase of a study (stratification or multivariate regression).
Analytical identification of confounders through stratification
Mantel-Haenszel Procedure: Non-regression technique used to identify confounders and to control for confounding in the statistical analysis phase rather than the design phase of a study.
Stratification: “Series of 2x2 tables” Idea: Take a 2x2 table and break it into a series of smaller 2x2 tables (one table at each of J levels of the confounder yields J tables). Example: in testing for an association between lung cancer and alcohol drinking (yes/no), separate smokers and non-smokers.
Stratification:“Series of 2xK tables” Idea: Take a 2xK table and break it into a series of smaller 2xK tables (one table at each of J levels of the confounder yields J tables). Example: In evaluating the association between lung cancer and being either a teetotaler, light drinker, moderate drinker, or heavy drinker (2x4 table), separate into smokers and non-smokers (two 2x4 tables).
Road Map 1.Test for Conditional Independence (Mantel-Haenszel, or “Cochran-Mantel- Haenszel”, Test). Null hypothesis: when conditioned on the confounder, exposure and disease are independent. Mathematically, (for dichotomous confounder): P(E&D/~C) = P(E/~C)*P(D/~C) and P(E&D/C)=P(E/C)*P(D/C) Example: once you condition on smoking, alcohol and lung cancer are independent; M-H test comes out NS. 2. Test for homogeneity. Breslow-Day. Null hypothesis: the relationship (or lack of relationship) between exposure and disease is the same in each stratum (homogeneity). Example: B-D test would come out significant if alcohol aggravated the risk of cigarettes on lung cancer but did not increase lung cancer risk in non-smokers. Homogeneity does NOT require independence!! 3. If homogenous, for series of 2x2 tables, you can take a weighted average of OR’s or RR’s (which should be similar in each stratum !) from the strata to get an overall OR or RR that has been controlled for confounding by C.
Controlling for confounding by stratification Example: Gender Bias at Berkeley? (From: Sex Bias in Graduate Admissions: Data from Berkeley, Science 187: ; 1975.) Crude RR = (1276/1835)/(1486/2681) =1.25 (1.20 – 1.32) Denied Admitted FemaleMale
Program A Stratum 1 = only those who applied to program A Stratum-specific RR =.46 ( ) Denied Admitted FemaleMale
Program B Stratum 2 = only those who applied to program B Stratum-specific RR = 0.86 ( ) Denied Admitted FemaleMale
Program C Stratum 3 = only those who applied to program C Stratum-specific RR = 1.05 ( ) Denied Admitted FemaleMale
Program D Stratum 4 = only those who applied to program D Stratum-specific RR = 1.02 ( ) Denied Admitted FemaleMale
Program E Stratum 5 = only those who applied to program E Stratum-specific RR = 0.96 ( ) Denied Admitted FemaleMale
Program F Stratum 6 = only those who applied to program F Stratum-specific RR = 1.01 ( ) Denied Admitted FemaleMale
Summary Crude RR = 1.25 (1.20 – 1.32) Stratum specific RR’s:.46 ( ) 0.86 ( ) 1.05 ( ) 1.02 ( ) 0.96 ( ) 1.01 ( ) Maentel-Haenszel Summary RR:.97 Cochran-Mantel-Haenszel Test is NS. Gender and denial of admissions are conditionally independent given program. The apparent association (RR=1.25) was due to confounding.
Cochran-Mantel-Haenszel Test of Conditional Independence The (Cochran)-Mantel-Haenszel statistic tests the null hypothesis that exposure and disease are independent when conditioned on the confounder.
CMH test of conditional independence Exposed Unexposed DiseaseNo Disease ab cd Strata k NkNk
CMH test of conditional independence Exposed Unexposed DiseaseNo Disease ab cd Strata k NkNk
E.g., for Berkeley… Result is NS
Summary Crude RR = 1.25 (1.20 – 1.32) Stratum specific RR’s:.46 ( ) 0.86 ( ) 1.05 ( ) 1.02 ( ) 0.96 ( ) 1.01 ( ) Breslow-Day test rejects (p=.0023) because of the protective effect for women in program A. We will still combine them—but that may obscure a potentially interesting pro-female bias in program A!
The Mantel-Haenszel Summary Risk Ratio Disease Not Disease ExposureNot Exposed ac b d k strata
The Mantel-Haenszel Summary Risk Ratio Disease Not Disease ExposureNot Exposed ac b d total unexposed total exposed Exposed &disease Unexposed&disease
The Mantel-Haenszel Summary Risk Ratio Disease Not Disease ExposureNot Exposed ac b d
E.g., for Berkeley… Use computer to get confidence limits…
The Mantel-Haenszel Summary Odds Ratio Exposed Not Exposed CaseControl ab c d
Country OR = 1.32 Spouse smokes Spouse does not smoke US Spouse smokes Spouse does not smoke Great Britain OR = 1.6 Spouse smokes Spouse does not smoke Lung CancerControl Japan OR = 1.52 Source: Agresti. Introduction to Categorical Data Analysis Chapter 3. Example…
In SAS… proc freq data=secondhand; weight number; * specifies the size of each 2x2 cell; tables country*NoSpouse*NotCase/ cmh ; run;
CMH test of conditional independence: p=.0196 Summary Statistics for Spouse by Case Controlling for Country Cochran-Mantel-Haenszel Statistics (Based on Table Scores) Statistic Alternative Hypothesis DF Value Prob ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ 1 Nonzero Correlation Significant CMH test means that there does appear to be an association between spousal smoking and cancer, after controlling for country.
Breslow-Day test of homogeneity: NS Controlling for Country Breslow-Day Test for Homogeneity of the Odds Ratios ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Chi-Square DF Pr > ChiSq Total Sample Size = 1262 NS means there’s no evidence that OR’s differ across strata (OK to combine them into summary OR)
Summary Statistics for Spouse by Case Controlling for Country Estimates of the Common Relative Risk (Row1/Row2) Type of Study Method Value ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Case-Control Mantel-Haenszel (Odds Ratio) Logit Cohort Mantel-Haenszel (Col1 Risk) Logit Cohort Mantel-Haenszel (Col2 Risk) Logit Type of Study Method 95% Confidence Limits ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Case-Control Mantel-Haenszel (Odds Ratio) Logit MH OR and confidence limits…
Country OR = 1.32 Spouse smokes Spouse does not smoke US Spouse smokes Spouse does not smoke Great Britain OR = 1.6 Spouse smokes Spouse does not smoke Lung CancerControl Japan OR = 1.52 Example… Source: Agresti. Introduction to Categorical Data Analysis Chapter 3.
The Mantel-Haenszel Summary Odds Ratio Exposed Not Exposed CaseControl ab c d
Summary OR Not Surprising!
MH OR assumptions OR or RR doesn’t vary across strata. (Homogeneity!) If exposure/disease association does vary for different subgroups, then the summary OR or RR is not appropriate…
advantages and limitations advantages… Mantel-Haenszel summary statistic is easy to interpret and calculate Gives you a hands-on feel for the data disadvantages… Requires categorical confounders or continuous confounders that have been divided into intervals Cumbersome if more than a single confounder To control for 1 and/or continuous confounders, a multivariate technique (such as logistic regression) is preferable.
2. Effect Modification Effect modification occurs when the effect of an exposure is different among different subgroups.
Years of Life Lost Due to Obesity (JAMA. Jan ;289: ) Data from US Life Tables and the National Health and Nutrition Examination Surveys (I, II, III).
Conclusion Race and gender modify the effect of obesity on years-of-life-lost.
Among white women, stage of breast cancer at detection is associated with education. However, no clear pattern among black women.
Colon cancer and obesity in pre- and post-menopausal women Obesity appears to be a risk factor in pre- menopausal women But appears to be protective or unrelated in post-menopausal women