Winter Electives Molecular and Genetic Epidemiology

1 Winter Electives
Elective courses available in our program in the winter:
Molecular and Genetic Epidemiology
Decision and Cost-effectiveness Analysis
Grantwriting (workshop - not for credit hours)
Medical Informatics

2 Today:
Lecture: Confounding & Interaction III
Section: 3:30 to 5:00 (S-18, S-22, S-22)
Next Tuesday (12/6/05) - all at China Basin:
8:15 to 9:45: Journal Club
10:00 to 1:00: Mitch Katz (note chapters in his textbook; box lunches provided)
1:15 to 2:45: Last small-group section - web-based course evaluation (bring laptop); final exam distributed
Exam due 12/13 (in hands of Olivia by 4 pm), by e-mail or at China Basin 5700

3 Confounding and Interaction: Part III
When evaluating the association between an exposure and an outcome, the possible roles of a 3rd variable are:
Intermediary variable
Effect modifier
Confounder
No effect
Forming "adjusted" summary estimates to evaluate presence of confounding:
Concept of weighted average
Woolf's method
Mantel-Haenszel method
Clinical/biological decision rather than statistical
Handling more than one potential confounder
Limitations of stratification to adjust for confounding: the motivation for multivariable regression

Here is our roadmap for today. Two weeks ago, we defined and discussed confounding and, in particular, distinguished confounding variables from intermediary variables. Last week, on our way to discussing confounding, we came across interaction, also called effect modification or effect-measure modification, and distinguished it from confounding. Today, we will integrate all of this material by reviewing the possible effects of a third variable when evaluating the association between an exposure and a disease - in other words, a third variable can be an intermediary variable (depending upon how you conceptualize your research question), serve as an effect modifier, be a confounder, or have no effect. We will spend the majority of today talking about how stratification is the initial tool we use to form "adjusted" summary effect estimates in order to assess for and deal with confounding. Remember, we began to do this last week, but we ran into effect modification in the example looking at smoking, caffeine use, and delayed ability to conceive. To obtain summary adjusted estimates via stratification, we will review the concept of weighted averages and look at specific techniques known as Woolf's method and the Mantel-Haenszel method. In particular, we will see that, unlike what we did when evaluating for interaction, there is no role for statistical testing when evaluating for confounding. We will also talk about possible approaches for when there is more than one potential confounder to contend with. Finally, although stratification is a very useful technique for evaluating interaction and confounding, it has limitations, which we will discuss. These limitations were the driving impetus for the development of multivariable regression - a topic Mitch Katz will take up in the next two sessions, starting next week.

4 When Assessing the Association Between an Exposure and a Disease, What are the Possible Effects of a Third Variable?
Assumption: the third variable a priori is felt to be relevant
No Effect
Intermediary Variable: ON CAUSAL PATHWAY
Confounder: ANOTHER PATHWAY TO GET TO THE DISEASE
Effect Modifier (Interaction): MODIFIES THE EFFECT OF THE EXPOSURE
[Slide diagram: exposure (E) to disease (D), with the intermediary (I), confounder (C), and effect modifier (EM) drawn in their respective positions.]

Remember from last week: when we were assessing the association between an exposure and a disease, this is how we summarized the possible roles of a third variable. Again, the underlying assumption is that you are only looking at other variables which a priori are felt to have some relevance - you are not just picking out any old random variable that happened to be measured in your dataset. Depending upon your research question, a third variable could be conceived as an intermediary variable on the causal pathway, and if so, we would not want to adjust or control for it in any way. Hence, this is the first decision to make. In other words, the question is: are you interested in an effect of the exposure on the disease that operates through a pathway other than through the third variable? If you are not conceiving of the third variable as an intermediary variable, then you need to assess whether it is an effect modifier, and if it is not an effect modifier, whether it is a confounder. If it is none of these - neither an intermediary variable, nor an effect modifier, nor a confounder - then we simply say that it has no effect, and we report the crude estimate.

5 What are the Possible Effects of a 3rd Variable?
The four possibilities: intermediary variable, effect modifier (interaction), confounder, no effect.
The decision algorithm:
1. Intermediary variable? (conceptual decision) - yes: do not adjust; report crude estimate. No: continue.
2. Effect modifier? (numerically assess both magnitude and statistical differences) - yes: report stratum-specific estimates. No: continue.
3. Confounder? (numerically assess difference between adjusted and crude; not a statistical decision) - yes: report "adjusted" summary estimate. No: report crude estimate (3rd variable has no effect).

So, there are four possibilities: the third variable can be an intermediary variable, an effect modifier, a confounder, or have no effect. How are you going to determine which of these is occurring in your sample? This algorithm is a useful place to start. The first decision is a conceptual one, based upon your knowledge of the biological or behavioral system and the research question at hand. Remember the example regarding exercise and CAD, with the 3rd factor being HDL: depending upon your research question, sometimes you will want to consider HDL an intermediary variable, and sometimes you won't. Remember that if you are conceptualizing the third variable as an intermediary variable, you would not control for it - you would stop right here and just report the crude or unadjusted association. Assuming the third variable is not being conceptualized as an intermediary variable, the more typical situation is that you suspect the third variable could be a confounder because, for example, it is known to be associated with the disease in question. You are also interested in whether the third variable is an effect modifier, because this will give a richer understanding of the system. So, functionally, what you do is look for effect modification - remember, you do this by stratifying upon levels or categories of the third variable. The decision as to whether you will declare or ignore interaction is based on clinical and statistical issues. If you believe interaction may be present, report the association between the exposure and disease in terms of stratum-specific estimates based on the effect modifier. If there is no evidence of interaction, you then look for confounding. To do this, you form an "adjusted" summary estimate of the two or more stratum-specific estimates and compare the adjusted measure of association to the crude measure of association. If you deem that confounding is present, report the adjusted measure of association. If no confounding is present, you are left with the conclusion that the third variable had no effect; report the crude estimate as the final measure of association between exposure and disease. [Some context: if you are working in a system where the effects of a third variable are already well known - say, you are evaluating the relationship of diabetes to CAD via a mechanism apart from obesity - then there is no question of what the third variable is; you just need to crank through the numbers to see the particular magnitude of effect the third variable has in your sample.] A sketch of this algorithm follows.
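The decision algorithm above can be captured in a few lines of code. A minimal Python sketch, purely illustrative: each input flag stands for one of the judgment calls described above (the first conceptual, the second clinical plus statistical, the third a crude-versus-adjusted comparison) - nothing here is computed for you.

def role_of_third_variable(is_intermediary, is_effect_modifier, is_confounder):
    """Walk the decision algorithm for a third variable.
    Each flag is a judgment call made by the analyst, not a computation."""
    if is_intermediary:                 # conceptual decision: on the causal pathway?
        return "Do not adjust; report the crude estimate."
    if is_effect_modifier:              # magnitude + statistical assessment of strata
        return "Report stratum-specific estimates; confounding is moot."
    if is_confounder:                   # adjusted differs meaningfully from crude?
        return "Report the 'adjusted' summary estimate."
    return "No effect; report the crude estimate."

print(role_of_third_variable(False, False, True))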

6 Effect of a Third Variable: Statistical Interaction
Crude: RRcrude = 1.7
Stratified: Heavy Caffeine Use vs. No Caffeine Use
RRcaffeine use = 0.7
RRno caffeine use = 2.4

. cs delayed smoking, by(caffeine)
caffeine       | RR
no caffeine    | 2.4
heavy caffeine | 0.7
Crude          | 1.7
Test of homogeneity (M-H)  chi2(1) = 7.80  Pr>chi2 = 0.005

Declare interaction; confounding is not relevant.

Remember this example from a study of the effects of smoking and caffeine use on the occurrence of delayed pregnancies among women hoping to conceive. The principal exposure in question is smoking. We were interested in how the effects of a third variable, caffeine use, might be influencing this relationship. First of all, in the way we were conceptualizing the system, caffeine is not an intermediary variable. We could have considered caffeine use an intermediary variable if we thought that the only way smoking caused delayed conception was through caffeine use, but this is not our intent. Instead, we are interested in knowing whether smoking has any direct biological pathways in preventing conception. So, we stratified by caffeine use, and remember that we saw qualitative interaction, actually on both the multiplicative (i.e., the risk ratio) and additive (i.e., the risk difference) scales. When we performed a test of homogeneity in Stata, we saw the output above. Assuming the dataset is loaded, the command is "cs" followed by the name of the outcome variable (here "delayed"), the exposure variable ("smoking"), a comma, and then "by()" with the name of the 3rd variable in parentheses (here "caffeine"). Stata shows the stratum-specific risk ratios and then, at the bottom, the test of homogeneity. The test statistic is 7.8, and it follows a chi-square distribution with 1 degree of freedom, corresponding to a p value of 0.005. The null hypothesis is that the stratum-specific estimates are the same. The interpretation of the p value is that differences this big or bigger between strata would occur 5 times out of 1000 by chance alone if there were indeed no true difference. We decided that this is too rich a story to sweep under the carpet, and hence it would not be appropriate to summarize these two effects, 2.4 and 0.7, into one overall number. Instead we report the two stratum-specific estimates separately. In the presence of effect modification of this magnitude, effect modification trumps any further consideration of confounding. Stop here. End of story.

7 Statistical Tests of Interaction: Test of Homogeneity (heterogeneity)
Null hypothesis: the individual stratum-specific estimates of the measure of association differ only by random variation; i.e., the strength of association is homogeneous across all strata; i.e., there is no interaction. The test statistic will have a chi-square distribution with degrees of freedom equal to one less than the number of strata.

Remember also that we discussed the statistical tests available to assess the role of chance or random variation in causing apparent interaction. The null hypothesis of these tests is that there are no differences between the strata - any differences you do see are due only to random sampling error. In other words, the null hypothesis is that the strength of association is homogeneous across strata: there is no interaction. We won't delve into the mechanics of these tests except to say that they follow a chi-square distribution with degrees of freedom equal to the number of strata minus one.
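To connect the test statistic to the p value quoted for the smoking/caffeine example, here is a minimal sketch (assuming SciPy is available; Stata computes this internally):

from scipy.stats import chi2

# Smoking/caffeine example: chi2 = 7.8 on (2 strata - 1) = 1 degree of freedom.
n_strata = 2
p_value = chi2.sf(7.8, df=n_strata - 1)   # upper-tail area of the chi-square
print(f"Pr > chi2 = {p_value:.3f}")       # about 0.005: 5 in 1000 by chance alone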

8 Report vs Ignore Interaction? Some Guidelines
It is an art form: it requires consideration of both clinical and statistical significance.

And, finally, remember that last week we finished by going through several examples to get a feeling for when we should report, rather than ignore, interaction. We won't go through these again other than to say that the decision to report requires looking both at the magnitude of the difference in the stratum-specific estimates, putting these in context with the biological question, as well as the statistical test of interaction. If you always wait for p < 0.05, you may miss some interesting findings. On the other hand, with a p = 0.2, while it may be OK to report the presence of interaction, you certainly cannot be sure it is not simply from chance. Finally, the downside of reporting interaction is that it is just more complicated. In this simple example, there need to be two answers instead of one if you believe that interaction is present. What happens when you get into multiple-category third variables or the presence of many other "third" variables? You'd have to report separate measures of association for dozens of strata.

9 If Interaction is not Present, What Next?
Case-control study of post-exposure AZT use in preventing HIV seroconversion after needlestick (NEJM 1997)
Crude: ORcrude = 0.61 (95% CI crosses 1; not statistically significant)

What if you decide interaction is not present - what should you do next? To illustrate this, let's work through another neat example. This is a case-control study looking at the effectiveness of AZT in preventing HIV seroconversion after a needlestick in health care workers. The context is that needlesticks among health care workers from patients with HIV disease are unfortunately all too common. The question is whether taking AZT right after a needlestick can prevent the health care worker from becoming infected. Attempts to address this question with an RCT did not work, because no doctor or nurse wanted to be randomized to the placebo group. However, there was no evidence for many years whether taking antiretroviral therapy, replete with its toxicity, would do any good in preventing HIV acquisition. So, this had to be sorted out observationally, and luckily there were at least some doctors and nurses who did not elect to use AZT, such that we have some variability to work with. The exposure in question is the use of AZT and the outcome is the occurrence of HIV. Cases were health care workers who acquired HIV after a needlestick; controls were health care workers who did not. In the crude analysis, the OR was 0.61, which, by the way, was not statistically significant - i.e., no strong evidence of a benefit from AZT. Should we conclude that health care workers should not use AZT after a needlestick because it does not work? Are there any potential confounders we should be concerned about (without looking ahead in your handout)?

10 Post-exposure prophylaxis with AZT after a needlestick
[Slide diagram: the classic confounding triangle - Severity of Exposure is associated with both AZT Use and HIV.]

Right, severity of exposure. It is likely that the health care workers who took AZT were also the ones who had the most severe exposures (i.e., deep wound, big inoculum, end-stage AIDS patient source) - the same exposures that were associated with a greater probability of HIV transmission.

11 Evaluating for Interaction
Potential confounder: severity of exposure
Crude: ORcrude = 0.61
Stratified: Minor Severity OR = 0.0; Major Severity OR = 0.35

So, when the authors stratified by the severity of the needlestick, look at the stratum-specific estimates: they are much lower. The first thing to decide is whether interaction is present. What do you think? Well, this is a big difference in magnitude, but note how few cases there are in the minor-severity group. What would you want to do at this point?

12 Is there interaction? Is there confounding?
To stratify the subjects into women with maternal age less than 35 and those with maternal age >= 35, you add a "by(matage)" option. If you add a ", pool" option, the program will give you not only the default M-H summary but also the Woolf estimate. Finally, you are already familiar with this command, but for the sake of comparison, let's look at the summary estimate as obtained by logistic regression, which, as you know, uses the MLE approach. As you can see, the M-H estimate is essentially identical to the MLE in this problem. When we look at a statistical test of heterogeneity, we see a p value of 0.44, showing that chance could easily have caused this difference between strata. So, the authors did not declare any important interaction; it would also have been too complicated to report interaction. That said, is there confounding here? How do you know? To address this, we have to form a summary estimate of the two strata.

13 How do we decide on a weight?
Assuming interaction is not present, form a summary of the unconfounded stratum-specific estimates. Right - we need to assign a weight to each stratum and then perform a weighted average.
Construct a weighted average:
Assign weights to the individual strata
Summary estimate = weighted average of the stratum-specific estimates = the sum over strata of (weight x stratum-specific estimate), divided by the sum of the weights
A simple mean is a weighted average where the weights are all equal to 1
Which weights to use depends on the type of effect estimate desired (OR, RR, RD) and characteristics of the data, e.g., Woolf's method, Mantel-Haenszel method

Hopefully the concept of a weighted average is understood by everyone. A simple mean is in fact a weighted average where the weights all equal one: to get the average height of everyone in class, we add up everyone's height and divide by the number of persons contributing, so the weight is one. To form a summary estimate, we will assign weights to the various strata and then take an average of the strata using these weights: take the measure of association in each stratum, multiply by the weight, sum, and then divide by the total weight. For example, the average of 2, 4, 6, and 8 equals [1(2) + 1(4) + 1(6) + 1(8)] / (4 x 1), which equals 5. Which weight we should use depends upon several factors, like the measure of association being calculated and the nature of the data; the methods we will discuss now are Woolf's method and the Mantel-Haenszel method. (A second approach to getting a summary estimate - maximum likelihood - is the one used by multivariable modeling approaches, and we will touch on it briefly today.)
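As a concrete illustration of the formula, a short Python sketch (any stratum-specific estimates and weights could be plugged in):

def weighted_average(estimates, weights):
    """Sum of (weight x estimate) divided by the sum of the weights."""
    return sum(w * e for e, w in zip(estimates, weights)) / sum(weights)

# A simple mean is the special case where every weight is 1:
print(weighted_average([2, 4, 6, 8], [1, 1, 1, 1]))   # prints 5.0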

14 Forming a Summary Estimate for Stratified Data
Goal: create a summary "adjusted" estimate for the relationship in question while adjusting for the potential confounder.
Crude: ORcrude = 0.61
Stratified: Minor Severity OR = 0.0; Major Severity OR = 0.35

We know that the weighted average is going to be somewhere between 0.0 and 0.35, but where exactly? So, how are we going to summarize these two strata? How would you weight them? Would you give them equal weight? Weight according to sample size? Number of cases? Variance? The goal is to combine or average the results from the different strata into one summary estimate. Any thoughts on how to do this?

15 Summary Estimators: Woolf’s Method
One of the first approaches developed for forming summary adjusted estimates was Woolf's method, aka the directly pooled or precision estimator. Woolf's estimate for the adjusted odds ratio:
ORWoolf = exp[ (sum over strata of wi x ln(ORi)) / (sum of the wi) ]
where wi is the inverse of the variance of the stratum-specific log(odds ratio).

Well, remember that in all of our studies we are just taking a sample of the population, and we know there is always the threat of sampling error. The potential for sampling error is best seen in the variance of the measure of association for each stratum. Hence, one intuitive approach is to weight each stratum according to its sampling error - give the most weight to the strata with the smallest variance. This is the basis of one of the first techniques developed, Woolf's method. In this method, we initially work with the log of the odds ratio, and the formula is like the general formula we showed before: the weighted average is the sum of the products of each stratum-specific log OR times its weight, all divided by the sum of the weights. After you have taken the weighted average on the log scale, you exponentiate to get back to the native scale. The weight is not the variance of the log odds ratio per se but its inverse: the variance of the log odds ratio sits in the denominator of the weight. This makes sense: the bigger the variance, the smaller its inverse, and thus the smaller the weight; the smaller the variance - i.e., the more confident you are that you have nailed down the estimate - the larger the inverse and the more weight. The more precise strata have the smallest variances, and the inverse of a small number is a large number.
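For the record, here is Woolf's estimator as a short Python sketch. The 2x2 layout is (a, b, c, d) with OR = ad/bc, and var(log OR) = 1/a + 1/b + 1/c + 1/d is the standard Woolf variance; the cell counts in the example call are invented for illustration.

import math

def woolf_or(strata):
    """Woolf (inverse-variance-weighted) summary odds ratio.
    Each stratum is (a, b, c, d) with OR = (a*d)/(b*c).
    Fails if any cell is 0: log(0) is undefined and 1/0 blows up."""
    num = den = 0.0
    for a, b, c, d in strata:
        log_or = math.log((a * d) / (b * c))
        weight = 1.0 / (1/a + 1/b + 1/c + 1/d)   # inverse of var(log OR)
        num += weight * log_or
        den += weight
    return math.exp(num / den)   # back-transform from the log scale

# Two invented strata:
print(woolf_or([(10, 20, 15, 40), (8, 16, 12, 30)]))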

16 Calculating a Summary Effect Using the Woolf Estimator
e.g., AZT use, severity of needlestick, and HIV
Crude: ORcrude = 0.61
Stratified: Minor Severity OR = 0.0; Major Severity OR = 0.35

When we actually crank through the numbers - assign each stratum a weight based on the inverse of its variance and take a weighted average - we uncover a problem with Woolf's method. It cannot handle cells that contain zeros, because (a) you cannot take the log of 0, and (b) you cannot divide by zero.

17 Summary Estimators: Woolf’s Method
Conceptually straightforward
Best when: the number of strata is small and the sample size within each stratum is large
Cannot be calculated when any cell in any stratum is zero, because log(0) is undefined; 1/2-cell corrections have been suggested but are subject to bias
Sensitive to small strata and cells with "0"; computationally messy (by hand)
Formulae for Woolf's summary estimates for other measures (e.g., RR, RD) are available in texts and software documentation

I discuss this approach first not only because it was one of the first proposed but also because it is the most conceptually straightforward. It seems most reasonable to weight each stratum according to how sure you are of its inference, and the variance of the estimate is the best measure we have for this. In the days before computers, this was considered computationally messy, so easier methods were sought. So, Woolf's method is conceptually very straightforward and has its uses, especially when the number of strata is small and the sample size within each stratum is large, but it does not work when there are zeros in cells. You can substitute 0.5 in these cells to get answers, but this is an imperfect solution. You can find formulae for the Woolf approach for other measures of association, like the risk ratio and risk difference, in textbooks and software documentation.

18 Summary Estimators: Mantel-Haenszel
A more robust approach is the Mantel-Haenszel method. The Mantel-Haenszel estimate for odds ratios:
ORMH = [sum over strata of (ai x di / Ti)] / [sum over strata of (bi x ci / Ti)]
with weight wi = bi x ci / Ti, where wi is the inverse of the variance of the stratum-specific odds ratio under the null hypothesis (OR = 1).

Given this problem with cells of zero, a second method, the Mantel-Haenszel method, is more widely used. Here we are not working on the log odds scale; instead we work with the plain old native odds ratio scale. The weight is bc/T, which happens to be the inverse of the variance of each stratum under the condition that OR = 1. This is one of those things you are going to have to accept without proof. So the weight is again related to the variance - but a special form of the variance, namely when there is no association - and the same logic as before applies: strata with the smallest variance get the most weight. Now, you might wonder why we are using a variance associated with an odds ratio of no effect if indeed there is an effect in the stratum. It turns out that, despite this apparent illogic, this weight works quite well compared to what is considered the real gold-standard approach: using maximum likelihood estimation to find the adjusted measure. Maximum likelihood estimation is used in some of the multivariable regression techniques you will learn about in the winter and spring; it requires a computer and is not entirely transparent to the observer. Because Mantel-Haenszel works about as well in many situations, and is much simpler, it is often favored. Note: another intuitive way to understand the M-H formula is that ad/T is a pretty good approximation of the numerator of an odds ratio, and bc/T is a pretty good approximation of the denominator.
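And the Mantel-Haenszel estimator in the same illustrative style. Note that it tolerates the zero cell that breaks Woolf's method; the stratum counts below are invented, echoing the minor/major-severity structure of the AZT example rather than reproducing the actual study data.

def mantel_haenszel_or(strata):
    """OR_MH = sum(a*d/T) / sum(b*c/T) over strata of (a, b, c, d),
    where T is the total count in the stratum."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# A zero cell (a = 0) in the first stratum is no problem here:
print(mantel_haenszel_or([(0, 5, 10, 40), (6, 18, 30, 45)]))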

19 Summary Estimators: Mantel-Haenszel
The M-H is the most commonly used estimator. The Mantel-Haenszel estimate for odds ratios is:
relatively resistant to the effects of large numbers of strata with few observations
resistant to cells with a value of "0"
computationally easy
most commonly used

It is fairly resistant - i.e., it doesn't blow up - in these settings. Although really not a factor in the computer era, the computation of the M-H estimator is a breeze. More important is that the M-H closely approximates the MLE estimate, which is generally regarded as the most accurate. So, to review: the Mantel-Haenszel technique, although not as straightforward conceptually as the Woolf technique - which directly uses stratum-specific variances - has many advantages. It is relatively resistant to large numbers of strata and can handle cells with zeros. It is also computationally very easy, although this is not an issue in these days of computers. For all these reasons, it is the most common approach to forming adjusted summary estimates. Note: the M-H technique is only relatively resistant to strata with few observations. If you have a lot of strata with just one observation each, you can see how this will completely waste those observations, since within such a stratum there is nothing to compare against.

20 Calculating a Summary Effect Using the Mantel-Haenszel Estimator
e.g., AZT use, severity of needlestick, and HIV
Crude: ORcrude = 0.61
Stratified: Minor Severity OR = 0.0; Major Severity OR = 0.35
ORMH = 0.30

Let's work through our AZT and health care worker example and obtain a summary estimate for the association between AZT use and HIV seroconversion after adjusting for severity of needlestick. What do we end up with? The summary adjusted estimate is 0.30. This is an example of negative confounding: the crude estimate was 0.61 and not even statistically significant, but after adjustment for severity of needlestick, the OR is an impressive 0.30 - a 70% reduction in the odds of HIV seroconversion associated with using AZT. I really like this example. First, it illustrates how an elegant observational study design determined an extremely important biologic inference, one that could not be determined with a randomized experiment because, as mentioned, health care workers refused to be randomized when such a study was started. Second, it illustrates how the authors would have blown this inference if they had not paid attention to measuring and adjusting for important confounders.

21 Calculating a Summary Effect in Stata
epitab command - Tables for Epidemiologists (see "Survival Analysis and Epidemiological Tables Reference Manual")
To produce crude estimates and 2x2 tables:
For cross-sectional or cohort studies: cs varcase varexposed
For case-control studies: cc varcase varexposed
To stratify by a third variable:
cs varcase varexposed, by(varthird)
cc varcase varexposed, by(varthird)
Default summary estimator is Mantel-Haenszel; adding ", pool" will also produce Woolf's method.

How can we make our lives a lot easier and implement all of this on the computer? The epitab command - Tables for Epidemiologists - is quite a handy little command. Has anyone used it? You know the commands already. For a cross-sectional or cohort study, to get the crude measure of association it is "cs" followed by the outcome variable, followed by the exposure variable. For a case-control study, it is "cc" followed by the outcome variable, followed by the exposure variable. To stratify by a third variable, you add a comma and "by(name of third variable)". The default summary estimator is Mantel-Haenszel, but if you add ", pool" you will also get the summary estimate using the Woolf method.

22 Calculating a Summary Effect Using the Mantel-Haenszel Estimator
e.g., AZT use, severity of needlestick, and HIV

. cc HIV AZTuse, by(severity) pool
severity        | OR
minor           | 0.0
major           | 0.35
Crude           | 0.61
Pooled (direct) | (undefined)
M-H combined    | 0.30
Test of homogeneity (B-D)  chi2(1): Pr>chi2 = 0.44

So, on the bottom is the Stata command: cc outcome predictor, by(third variable), with the pool option added. Here is the crude estimate again, 0.61. Here are the stratum-specific estimates: 0.0 and 0.35. Here is the Woolf estimate, which Stata calls Pooled (direct): note it is undefined, because Stata ran into the same division-by-zero problem we had. And here is the Mantel-Haenszel adjusted measure: 0.30, just as we got by hand on the prior slide.

23 Calculating a Summary Effect Using the Mantel-Haenszel Estimator
In addition to the odds ratio, Mantel-Haenszel estimators are also available in Stata for:
risk ratio: "cs varcase varexposed, by(varthird)"
rate ratio: "ir varcase varexposed vartime, by(varthird)"

In addition to the adjusted summary estimate for the odds ratio, Mantel-Haenszel techniques exist for forming adjusted risk ratios and rate ratios. Again, the epitab commands in Stata - cs for the risk ratio and ir for rate ratios - can be used. Reading about the epitab command in the Stata reference manual can teach you a lot about epidemiology.

24 Assessment of Confounding: Interpretation of Summary Estimate
Compare the "adjusted" estimate to the crude estimate, e.g., compare ORMH (= 0.30 in the example) to ORcrude (= 0.61 in the example).
If the "adjusted" measure "differs meaningfully" from the crude estimate, then confounding is present. E.g., does ORMH = 0.30 "differ meaningfully" from ORcrude = 0.61?
What does "differs meaningfully" mean?
A matter of judgment based on biologic/clinical sense rather than on a statistical test
No one correct answer; the objective is to remove bias
A 10% change from the crude is often used
Your threshold needs to be stated a priori and included in your methods section
So, it's in the hands of the researcher.

Now that we have determined an adjusted estimate for this AZT and needlestick research question - be it with the Woolf method or the Mantel-Haenszel method - what are we going to do with it? We need to compare it to the crude estimate to determine if confounding is indeed present. If the adjusted estimate is meaningfully different from the crude, we conclude that confounding is present. So, of course, your next question is: what does "differ meaningfully" mean? Is this the same arbitrary process we used when looking at interaction? Not really, because here we are guarding against bias, whereas in the case of interaction we weren't trying to prevent a bias but instead were giving a more detailed explanation of the system. So, already you know we have to be more conservative. Most importantly, this decision is based on your biological sense of the system under study - it is not a statistical issue, as I will discuss at greater length on the next slide. There is no one correct answer, but if you forced me to give one, I would say differences of 10% or greater in the measure of association would typically be considered meaningful. This is what most of the leading thinkers in the field also suggest. But this is the kind of thing you want to be intellectually honest about, and you should always state up front, before you embark upon an analysis, what difference you are going to consider big enough.
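The change-in-estimate comparison itself is trivial to compute. A hedged sketch - here "change" is the simple relative change in the point estimate, and the threshold is whatever you pre-specified in your methods section:

def confounding_check(crude, adjusted, threshold=0.10):
    """Relative change between crude and adjusted, against a pre-specified cut-off."""
    change = abs(adjusted - crude) / crude
    return change, change > threshold

# AZT example: crude OR 0.61 vs. M-H adjusted OR 0.30
change, present = confounding_check(0.61, 0.30)
print(f"{change:.0%} change -> confounding present: {present}")   # 51% -> True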

25 Statistical Testing for Confounding is Inappropriate
Testing for statistically significant differences between crude and adjusted measures is inappropriate:
E.g., when examining an association for which a factor is a known confounder (say, age in the association between HTN and CAD), if the study has a small sample size, even large differences between crude and adjusted measures will not be statistically different - yet we know confounding is present. Therefore, the difference between crude and adjusted measures cannot be dismissed as merely chance; it must be reported as confounding.
The issue of confounding is one of internal validity, not of sampling error.
We must live with whatever effects we see after adjustment for a factor for which there is an a priori belief about confounding.
We're not concerned that sampling error is causing confounding, and therefore we don't have to worry about testing for the role of chance.

Many of you are perhaps asking: why don't we just apply a statistical test to see if the difference between the crude and adjusted is significant? If there is anything I want you to leave with today, it is that this is a bad idea. (Even if you wanted to perform such a test, they are not readily available in software packages.) Why is statistical testing inappropriate? Say you are working with a variable that you know is a confounder in every prior instance it was examined, like age in the evaluation of the relationship between hypertension and coronary artery disease. What if you had a small sample size? Then even large differences between the crude and adjusted measures of association might not be statistically significant. What are you going to do - throw out age and just go with the crude estimate of the relationship between hypertension and CAD? Of course not, because you know that confounding is present! You cannot ignore differences between crude and adjusted measures just because they are not statistically significant. In other words, when protecting against bias, you have to do whatever you can, regardless of statistical significance. Looked at another way, the issue of confounding is one of internal validity, not of sampling error. We must be prepared to live with whatever effects we see after adjustment for a factor for which there is an a priori belief about confounding. Because we are not concerned that sampling error is causing confounding, we don't have to worry about testing for the role of chance.

26 Confidence Interval Estimation and Hypothesis Testing for the Mantel-Haenszel Estimator
e.g., AZT use, severity of needlestick, and HIV

. cc HIV AZTuse, by(severity) pool
severity        | OR
minor           | 0.0
major           | 0.35
Crude           | 0.61
Pooled (direct) | (undefined)
M-H combined    | 0.30  [95% Conf. Interval: 0.12, 0.79]
Test that combined OR = 1: Mantel-Haenszel chi2(1) = 6.06  Pr>chi2 = 0.014

What does the p value mean? Now that we have decided that confounding is present, what's left? We know the adjusted summary estimate is the final answer, but what about the precision of this estimate? In addition to calculating the M-H summary estimate, Stata also calculates the 95% confidence interval: 0.12 to 0.79 for the M-H-adjusted odds ratio, thus not crossing 1. There is also a null hypothesis being tested for the M-H-adjusted OR: that the adjusted OR is equal to 1. The test statistic follows a chi-square distribution with 1 degree of freedom, regardless of the number of strata. This is known as the Mantel-Haenszel chi-square statistic; here it is 6.06, with a corresponding p value of 0.014. Who can state in words what the p value means?

27 Mantel-Haenszel Confidence Interval and Hypothesis Testing
It is worth pointing out that Stata is working hard for you behind the scenes in forming the Mantel-Haenszel-adjusted confidence interval and hypothesis test. Whereas the point estimate for the adjusted measure of association is easy to calculate - we just did it - the standard error (shown on the slide) is not. Luckily, Stata calculates all of this for you. Remember, the standard recipe for a 95% confidence interval is the point estimate plus or minus 1.96 times the standard error, here applied on the log odds ratio scale. For the hypothesis test that the adjusted measure of association equals one, the slide shows the formula for the chi-square statistic, which has 1 degree of freedom. There's no need to memorize these formulae, but you do need to know how to interpret them.
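Once Stata has produced the standard error of log(OR_MH) - the formula itself is the part not reproduced in this transcript - the interval is just the usual recipe applied on the log scale. A sketch, treating the SE as a given input; the value 0.48 is an assumed figure chosen so that the output lands near the 0.12-0.79 interval shown earlier:

import math

def ci95_from_log_se(or_mh, se_log_or):
    """95% CI: exponentiate log(OR) +/- 1.96 * SE(log OR)."""
    log_or = math.log(or_mh)
    return (math.exp(log_or - 1.96 * se_log_or),
            math.exp(log_or + 1.96 * se_log_or))

print(ci95_from_log_se(0.30, 0.48))   # roughly (0.12, 0.77)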

28 Mantel-Haenszel Techniques
Mantel-Haenszel estimators
Mantel-Haenszel chi-square statistic
Mantel's test for trend (dose-response)

It is also worth pointing out that, in addition to the Mantel-Haenszel estimators we are discussing, you'll see other techniques named after Mantel and/or Haenszel, and it is worth keeping straight which is which. We described the M-H chi-square statistic on the last slide, and there is also something known as Mantel's test for trend, for which there is a nice discussion in the appendix of our green textbook.

29 Summary Effect in Stata - Example
e.g., spermicide use, maternal age, and Down's syndrome
Crude: OR = 3.5
Stratified: Age < 35 OR = 3.4; Age >= 35 OR = 5.7
Should we pool these? Is there confounding present? Which answer should you report as "final"?

Let's look at another example: our question about the effect of spermicide use on the development of Down's syndrome. After we stratified by maternal age, we saw these two stratum-specific estimates, 3.4 and 5.7, but the p value for the test of homogeneity was 0.71, so we were willing to pass on interaction; now we need to assess for confounding. Here the crude estimate is 3.5. The adjusted OR by the Woolf method is 3.82, and the adjusted OR by the M-H method is similar. Confounding or not? Well, this is a close call - right below the 10% cut-off, I suppose. The point estimate has changed after adjustment from 3.50 to 3.82 at most, which is actually fairly trivial. You would probably be fine if you concluded there was no meaningful confounding by age and went with the OR of 3.5 as your final answer. Part of the answer lies in your feeling or knowledge about age and your certainty about its role as a confounder. On the one hand, if you are absolutely certain that age is associated with both spermicide use and Down's, then it should probably be adjusted for, and you should accept the adjusted OR as the right answer. But here, whereas we might be pretty sure that age is associated with Down's, how certain are we that age is associated with spermicide use? Maybe it is in general, but maybe it actually isn't in your population. If you're saying "why quibble - just adjust and accept the adjusted measure of association," the problem is that when you adjust, you often add a certain amount of statistical imprecision. Note the CIs for the crude vs. adjusted estimates: the adjusted intervals are slightly wider. In this particular example, the CIs are still well above 1 whether you go with the crude or the adjusted, so it does not matter much. However, if you start adding adjustment for other potential confounders, you might start to nudge the 95% CI down toward 1. So, the point is: there is no reason to adjust and take the statistical penalty unless you really need to.

30 No Effect of Third Variable
Crude: ORcrude = 21.0 (95% CI: 16.4, 26.9)
Stratified: Matches Present OR = 21.0; Matches Absent OR = 21.0
ORadj = 21.0 (95% CI: 14.2, 31.1)

Here's an example of what happens when you adjust for something you don't need to adjust for. Remember the effect of matches on the association between smoking and lung cancer. There was no interaction, and furthermore, when we looked at the measure of association in the two match-use strata, we saw the same effect as the unadjusted association. In other words, matches had no effect on the association. In this case, we would report the crude estimate only, right? Why not report the average of the stratum-specific estimates - after all, the average is 21, the same as the crude? Well, one answer is that it is too much work. The second answer is that most of the time when you stratify, you pay a small price in statistical precision. In other words, the confidence interval of the crude estimate will be narrower than the CI of the adjusted measure: both have 21 as their point estimate, but the crude association is more precise. As you can see, the 95% confidence interval for the crude estimate is 16.4 to 26.9, compared to 14.2 to 31.1 for the adjusted (determined by the techniques described today). So, you don't want to adjust for things that are not acting as confounders.

31 Whether or not to accept the “adjusted” summary estimate in favor of the crude?
The methodologic literature is inconsistent on this. The scientifically most rigorous approach would appear to be to create two lists of potential confounders prior to the analysis:
A. Those factors for which you will accept the adjusted result no matter how small the difference from the crude
B. Those factors for which you will accept the adjusted result only if it meaningfully differs from the crude (with some pre-specified difference, e.g., 10%)
For some analyses, you may have no factors on the A list; for others, no factors on the B list. Always putting all factors on the A list may seem conservative, but it is not necessarily the right thing to do, given the penalty in statistical imprecision.

All of this talk about what constitutes a meaningful difference between the crude and adjusted measures of association - and whether to accept the adjusted summary estimate as your final answer in favor of the crude - must be raising questions for some of you. The tension is between trying to get the valid answer, the right answer, without taking unnecessary penalties in precision. I've found that the methodologic literature is vague on this point, leaving the individual on their own. In thinking about it, the only scientifically rigorous approach would seem to be to create two lists of potential confounders prior to conducting the analysis. On one list would be the factors for which you will accept the adjusted result no matter how small the difference from the crude - factors you simply believe or know are confounders, like age in the spermicide use and Down's syndrome example a few slides back. On the other list would be the factors for which you will accept the adjusted result only if it meaningfully differs from the crude (with some pre-specified difference, e.g., 10%) - factors you think may be confounders but are not sure about. For some analyses, you may have no factors on the A list; for other analyses, no factors on the B list. I typically don't have many variables on my A list, if any, and instead put just about everything on the B list. The point is that always putting all factors on the A list may seem conservative, but it is not necessarily the right thing to do to take the penalty in statistical imprecision for these variables.

32 Presence or Absence of Confounding by a Third Variable?
Let's go through a few more numerical examples, assuming that what we are adjusting for is on the B list. Assume one column shows the crude estimate, the next the stratum-specific estimates, and the last the adjusted estimate - the weighted average of the stratum-specific estimates. If the crude is 4.0 and the adjusted is 2.0, you would always report the adjusted estimate. If the crude is 4.0 and the adjusted is 1.1, you would definitely report the adjusted. If the crude is 0.2 and the adjusted is 0.8, I would definitely adjust. However, if the crude is 4.0 and the adjusted is 4.1, I would not adjust; I would ignore this in favor of the crude estimate. Likewise, if the crude is 1.9 and the adjusted is 1.8, I would also probably ignore it. You can run through the other examples on your own.

33 Stratifying by Multiple Potential Confounders
Crude
Stratified into six strata: smokers aged <40, 40-60, and >60; non-smokers aged <40, 40-60, and >60

All of our examples so far today have dealt with situations where there is only one "third variable" - one potential effect modifier, one potential confounder. What if, as shown here in the example of chlamydia and CAD, there were more than one additional variable present?

34 The Need for Evaluation of Joint Confounding
Variables that show no confounding when evaluated alone may show confounding when evaluated jointly.
Crude
Stratified: by Factor 1 alone; by Factor 2 alone; by Factors 1 & 2

The examples I have shown thus far have just one potential confounder to worry about. What should we do when more than one is present? Consider this case-control study where the crude OR is 2.2 and there are two additional variables - typically we call these covariates - factor 1 and factor 2. You could look at each of the other variables one at a time to see if it is a confounder, but this is a dangerous practice because it ignores the joint effects of the third and fourth variables. In this example, the crude estimate is identical to the stratum-specific measures when the two other variables are examined separately: if we just stratified by factor 1, we would see no evidence of interaction or confounding, and if we just stratified by factor 2, we would also see no evidence of interaction or confounding. If you stopped here, you might be tempted to say that factors 1 and 2 play no role in the relationship between the primary exposure and the outcome. But if we formed four strata based on the combinations of the two potential confounders, each of the four strata would show no effect. In other words, when we look at the joint effects of factors 1 and 2, positive confounding is present - such that, after adjustment for factors 1 and 2, there is no longer any association between the exposure and the disease. This illustrates the point that variables which show no effect when evaluated alone (i.e., no confounding or interaction) may show confounding when evaluated jointly.

35 Approaches for When More than One Potential Confounder is Present
Backward versus forward confounder evaluation strategies: relevant both for stratification and especially for multivariable modeling (the heart of model selection).
Backwards strategy:
Initially evaluate all potential confounders together (i.e., look for joint confounding)
Conceptually preferred, because in nature variables are all present and act together
Procedure: with all potential confounders considered, form the adjusted estimate; this is the "gold standard"
One variable can then be dropped and the adjusted estimate re-calculated (adjusted for the remaining variables)
If dropping the first variable results in a non-meaningful (e.g., <10%) change compared to the gold standard, it can be eliminated
The procedure continues until no more variables can be dropped (i.e., all remaining variables are relevant)
Problem: with many potential confounders, cells become very sparse and stratum-specific estimates very imprecise

This introduces the whole topic of model selection; I know you are learning a bit about this in biostatistics. Which is preferable - backward or forward? This gets at the question of what general approach to use when there are multiple potential confounders, which is relevant not only for stratification, our topic today, but also for multivariable regression. In fact, this is the heart of the most difficult aspect of multivariable modeling: model selection. We won't have time to cover this in detail, but this will get you started with some of the vocabulary. In a backwards strategy, you start by looking at all the potential confounders jointly - i.e., you look for joint confounding. This really is conceptually preferred, because in nature these variables all occur together, not in isolation. Procedurally, we form mutually exclusive and exhaustive strata based on all the potential confounders. Assuming interaction is not present, we then - just as when there was only one potential confounder - form an adjusted summary measure by averaging the different strata. This adjusted estimate becomes the gold-standard adjusted measurement, the one to which all others are compared. Then one variable is dropped at a time and the adjusted estimate re-calculated without that variable. If dropping the variable results in an inconsequential change, say less than 10%, compared to the gold-standard estimate, that variable can be eliminated. Often, eliminating a factor whose absence produces a non-meaningful change in the adjusted measure of association will be accompanied by a gain in precision: without that factor, the confidence interval for the newly calculated adjusted measure of association will be narrower. If dropping the variable does result in a meaningful change, it cannot be dropped; instead, another variable is dropped and you recalculate. The procedure continues until no more variables can be dropped. For example, using the last slide, you would adjust for the two factors jointly and get an adjusted OR of 1.0. If you then dropped one of the two variables and recalculated, you would see a very large change and therefore conclude you could not drop that variable. You would then try to drop the other variable and see the same result. You would conclude that you have to keep both variables for adjustment. The problem with this approach is that, depending upon how many potential confounders there are and how many levels each has, the number of strata needed can be very large and the cells in the strata very small, rendering the weighting procedures unusable. In fact, you may not even be able to get off the ground, because the initial stratification is just too thin. A sketch of the backwards loop follows.
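A sketch of the backwards change-in-estimate loop described above. The function adjusted_or is a hypothetical stand-in for whatever produces your adjusted estimate for a given set of confounders (stratification here, regression later); the loop logic is the part being illustrated, and a real analysis would also weigh precision, as noted above.

def backward_selection(confounders, adjusted_or, threshold=0.10):
    """Drop variables one at a time; eliminate a variable only if the
    re-calculated estimate stays within `threshold` of the gold standard
    (the estimate adjusted for ALL candidate confounders)."""
    gold = adjusted_or(set(confounders))
    kept = set(confounders)
    for var in sorted(confounders):
        trial = adjusted_or(kept - {var})
        if abs(trial - gold) / gold < threshold:
            kept.discard(var)   # dropping var barely moved the estimate
    return kept

# Hypothetical stand-in inspired by the MRSA example on the next slide:
# pretend only prior antibiotic use actually matters.
def adjusted_or(vars_in_model):
    return 4.66 if "atbxuse" in vars_in_model else 11.59

print(backward_selection({"age", "gender", "atbxuse"}, adjusted_or))   # {'atbxuse'}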

36 Example: Backwards Selection
Research question: is prior hospitalization associated with the presence of methicillin-resistant S. aureus (MRSA)? (from Kleinbaum 2003)
Outcome variable: MRSA (present or absent)
Primary predictor: prior hospitalization (yes/no)
Potential confounders: age (<55, >=55), gender, prior antibiotic use (atbxuse; yes/no)
Assume no interaction
Which OR to report?

Here's an example using backwards selection. The research question is whether prior hospitalization is associated with the presence of methicillin-resistant Staph aureus (MRSA). The outcome variable is MRSA, present or absent. The primary predictor is prior hospitalization, yes or no. The potential confounders are age (dichotomized at 55 years), gender, and prior antibiotic use. The slide's table shows the various odds ratios: the first column lists the factors adjusted for, the middle column the odds ratio and its 95% confidence interval for that set of variables, and the far-right column the width of the confidence interval. The first line gives the crude measure of association between prior hospitalization and the presence of MRSA. The second line gives the odds ratio when we adjust jointly for all three potential confounders: 4.66, which is very different from the crude, so we would all conclude that confounding is occurring here. What happens if we then drop age? The odds ratio is now 5.04, which is within 10% of the gold standard of 4.66; hence it might be reasonable to eliminate age. What if we drop gender? The odds ratio is 4.63, also not different from the gold standard; it may be reasonable to drop gender. What if we drop prior antibiotic use? You could have guessed that if age and gender did not have much effect but the gold-standard adjustment containing all three factors did, then prior antibiotic use must matter. Indeed, when antibiotic use is dropped - the line containing just age and gender - the odds ratio is 11.59, very different from the gold standard. What if you adjusted for antibiotic use alone? The resulting odds ratio is 5.0, again not very different from the gold standard. Now it is time to report your final result: which odds ratio would you use? From a validity perspective, you could choose the gold standard or any of the lines where the odds ratio is within 10% of 4.66. Among those, you would probably choose the set of adjusted-for variables that yields the narrowest confidence interval. In this example, it turns out that the gold-standard set (adjusting for all three) also produces the narrowest confidence interval, but it does not always turn out like this. Often the gold-standard model is the least precise, and in such cases it is allowable to choose the most precise estimate.

37 Approaches for When More than One Potential Confounder is Present
Forward strategy:
Start with the variable that has the biggest "change-in-estimate" impact
Then add the variable with the second biggest impact
Keep this variable if its presence meaningfully changes the adjusted estimate
The procedure continues until no added variable has an important impact
Advantage: avoids the initial sparse-cell problem of the backwards approach
Problem: does not evaluate the joint confounding effects of many variables

In the forward selection approach, you start by assessing the effects of the potential confounders one variable at a time. After you find the variable with the biggest change in estimate compared to the crude estimate, you make that your gold-standard adjusted estimate. You then add another variable to the mix - the one with the second biggest change in estimate when evaluated individually. If the addition of this variable results in a meaningful change (e.g., 10%) compared to the current adjusted estimate, you keep it; if not, you drop it and consider the remaining variables. This continues until no added variable has an important impact. This procedure has advantages - it avoids the problem of many sparse cells that arises when looking at the joint confounding of many variables - but it has a big disadvantage: it does not fully examine the effects of joint confounding. If possible, we always prefer backward procedures, because they formally assess joint confounding. A sketch follows.
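Forward selection in the same hypothetical style - again, adjusted_or is a stand-in, and the crude value fed in below is invented:

def forward_selection(crude, candidates, adjusted_or, threshold=0.10):
    """Add variables one at a time, biggest single-variable impact first;
    keep each only if it meaningfully changes the current estimate."""
    ranked = sorted(candidates,
                    key=lambda v: abs(adjusted_or({v}) - crude),
                    reverse=True)   # rank by single-variable change-in-estimate
    kept, current = set(), crude
    for var in ranked:
        trial = adjusted_or(kept | {var})
        if abs(trial - current) / current >= threshold:
            kept.add(var)       # meaningful change: keep the variable
            current = trial
    return kept

def adjusted_or(vars_in_model):   # hypothetical stand-in, as before
    return 4.66 if "atbxuse" in vars_in_model else 11.59

print(forward_selection(11.59, {"age", "gender", "atbxuse"}, adjusted_or))   # {'atbxuse'}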

38 Stratification to Reduce Confounding
Although you are all now learning about the wonderful world of multivariable modeling, I would encourage you to examine your data whenever you can with stratification: it is the most native way to see your data and the easiest way to explain your data to others.
Advantages:
Straightforward to implement and comprehend
An easy way to evaluate statistical interaction
Limitations:
Looks at only one exposure-disease association at a time
Requires continuous variables to be discretized: loses information and possibly results in "residual confounding"
Deteriorates with multiple confounders: e.g., suppose 4 confounders, each with 3 levels; 3x3x3x3 = 81 strata needed. Unless you have a huge sample, many cells will contain "0"s and the strata will have undefined effect measures.
Solution: mathematical modeling (multivariable regression), e.g., linear regression, logistic regression, proportional hazards regression

Although we have spent the last several sessions focusing on stratification as a very straightforward approach to evaluating interaction and confounding, stratification does have its limitations - principally, that it breaks down with multiple confounders. First, we can only look at one exposure-disease association at a time: each time you want to use the same data to look at the association of another exposure with the disease under study, you have to re-format your 2x2 tables. Second, what do we do with continuous variables as our exposure or potential confounders - something like age, for example? To use stratification, we have to break these continuous variables into categories to get them into our contingency tables, which is not the richest use of continuous data. But the biggest limitation of stratification, which we have already touched upon, is that it deteriorates with multiple potential confounders. Suppose there are 4 potential confounders present, each with 3 levels. To look for joint confounding by these 4 potential confounders, you would have to form 81 strata, and unless you had an enormous sample size, many of these strata would have unusable data and you could not perform adjustment. The solution to all of these limitations lies in mathematical models, also known as multivariable regression. Mitch Katz will give you a conceptual approach to these in the upcoming sessions of this course, and you will learn the technical aspects in the Biostatistics course (your Thursday sessions) starting in January.

