Confounding, Matching, and Related Analysis Issues Kevin Schwartzman MD Lecture 8a June 22, 2005.


Readings
- Fletcher, chapter 1
- Hennekens and Buring, Epidemiology in Medicine, 1987: Chapter 12, Analysis of Epidemiologic Studies: Evaluating the Role of Confounding [course pack]
Confounding, Matching & Related Analysis Issues

Objectives
Students will be able to:
1. Define confounding
2. Explain what must be true of a confounding variable
3. Describe design strategies for control of confounding
   a. Restriction
   b. Randomization, including stratified design
   c. Matching, including different matching schemes
Confounding, Matching & Related Analysis Issues - Slide 1

Objectives
4. Describe analytic strategies for control of confounding
   a. Stratified analyses
   b. Standardization
   c. Calculation of pooled effect estimates: the example of the Mantel-Haenszel odds ratio
   d. The special case of matched pair case-control studies
   e. Multivariate analyses
5. Identify advantages and disadvantages of matching
6. Define and identify effect modification
Confounding, Matching & Related Analysis Issues - Slide 2

Confounding
- Refers to distortion of the true underlying relationship (or lack thereof) between an exposure and an outcome of interest, because of the influence of a third factor (a “confounder” or a “confounding variable”)
- At the design phase, confounding is potential; its true presence or absence is assessed through appropriate data analyses
Confounding, Matching & Related Analysis Issues - Slide 3

Confounding Variables
A variable is said to be a confounder if:
- it is associated with the exposure of interest
- it is an independent risk factor for the outcome of interest
- it is not an intermediate along the causal pathway from exposure to outcome
[Diagram: Exposure, Confounder, Outcome]
Confounding, Matching & Related Analysis Issues - Slide 4

Confounding, Matching & Related Analysis Issues - Slide 5

Case-Control Study Confounding, Matching & Related Analysis Issues - Slide 6

Smoking as Confounder Smoking was associated with coffee drinking -400/450 coffee drinkers were smokers, vs 80/230 non-coffee drinkers Smoking is an independent risk factor for lung cancer -here, OR = (300 x 160)/(40 x 180) = 6.7 By separating the group into smokers and non-smokers, and examining the relationship between coffee and lung cancer within each subgroup, confounding by smoking was eliminated Confounding, Matching & Related Analysis Issues - Slide 7

Smoking as Confounder There was no independent association of coffee drinking with lung cancer (odds ratio within both smoking subgroups or strata was 1)  The apparent relationship was due entirely to confounding by smoking  Confounding can also reduce, eliminate, exaggerate, or even change the direction of true underlying associations  The presence of confounding can be assessed by comparing crude and adjusted effect estimates (some investigators use 10% “rule of thumb”) Confounding, Matching & Related Analysis Issues - Slide 8

Design Strategies to Control Confounding
- First of all, any potential confounder must be measured appropriately
- The simplest strategy (in terms of design) is restriction, to eliminate variation in the potential confounder
- If there is no variation in the potential confounder, it cannot influence the outcome
- Example: restriction of the lung cancer-coffee study to smokers only
- However, in this particular case, there could still be residual variation in smoking which could influence outcome
Confounding, Matching & Related Analysis Issues - Slide 9

Randomization Goal is to distribute potential confounders equally between study groups  Again, if there is no variation in a potential confounder, it cannot be responsible for differences in outcome  Smaller sample sizes may lead to imbalance between groups with respect to potential confounders, simply by chance Confounding, Matching & Related Analysis Issues - Slide 10

Randomization Stratified randomization (often combined with blocked randomization): promotes equal distribution of treatment groups across strata of variable(s) of interest e.g. gender, age, study centre  Number of strata limited by logistical constraints  All reports of randomized studies include a table for assessing the adequacy of randomization  As soon as analysis is limited to subgroups, the control of confounding disappears e.g. compliance bias (healthy behaviours etc.) Confounding, Matching & Related Analysis Issues - Slide 11

Matching
- Matching is an element of observational study design, introduced to help control potential confounders
- It involves selection of a comparison group that is forced to resemble the index group with respect to the distribution of one or more potential confounders
- In case-control studies → selection of control group (matched to cases with respect to potential confounders)
- In cohort studies → selection of unexposed group (matched to exposed with respect to potential confounders)
Confounding, Matching & Related Analysis Issues - Slide 12

- Subjects can be matched for continuous covariates (e.g. age) or categorical covariates (e.g. sex, HIV serology, etc.)
- Matching may be done at the level of the individual or of the group
- In a case-control study, individual matching means that each case is separately matched to one or more control(s) according to the matching factor(s)
- The matching ratio may be fixed (e.g. 1 case:1 control, 1:2, etc.) or variable
Confounding, Matching & Related Analysis Issues - Slide 13

- We will primarily discuss matching in case-control studies
- For categorical covariates, individual matching means that for each case, the control subject(s) is/are drawn from the same category, e.g. male controls for male cases
- Continuous covariates may also be “categorized”, e.g. age divided into categorical ranges: 20-39, 40-59, 60-79, etc.
Confounding, Matching & Related Analysis Issues - Slide 14

Continuous variables may be matched by
a) Caliper matching: a rule by which values are considered sufficiently close
Example: matching done on sex plus age within 3 years
Potential controls: men aged 28, 35, 39, 49, 57; women aged 31, 34, 43
Case 1: 31 y.o. male → matched to 28 y.o. male
Case 2: 38 y.o. female → no match found → case discarded, or additional controls identified
Confounding, Matching & Related Analysis Issues - Slide 15

Continuous variables may be matched by
b) Nearest available matching: controls are selected based on the closest value of the matching factor
In the above example, the match for the 38 y.o. female case would be a 34 y.o. female control
Advantage: less restrictive, more efficient
Disadvantage: subjects may be less well matched if the distribution of the matching variable is quite different between cases and controls
Confounding, Matching & Related Analysis Issues - Slide 16
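The two rules for continuous variables can be compared on the example above. The sketch below uses the ages from the slides; the function names and the dictionary layout are illustrative assumptions, and for simplicity it does not remove a control from the pool once it has been used.

```python
def caliper_match(case, controls, caliper=3):
    """Closest same-sex control whose age is within `caliper` years, else None."""
    eligible = [c for c in controls
                if c["sex"] == case["sex"]
                and abs(c["age"] - case["age"]) <= caliper]
    return min(eligible, key=lambda c: abs(c["age"] - case["age"]), default=None)

def nearest_match(case, controls):
    """Closest same-sex control by age, with no limit on the age difference."""
    same_sex = [c for c in controls if c["sex"] == case["sex"]]
    return min(same_sex, key=lambda c: abs(c["age"] - case["age"]), default=None)

# Potential controls from the slide.
controls = ([{"sex": "M", "age": a} for a in (28, 35, 39, 49, 57)]
            + [{"sex": "F", "age": a} for a in (31, 34, 43)])

case1 = {"sex": "M", "age": 31}
case2 = {"sex": "F", "age": 38}

print(caliper_match(case1, controls))  # {'sex': 'M', 'age': 28}
print(caliper_match(case2, controls))  # None: no woman within 3 years
print(nearest_match(case2, controls))  # {'sex': 'F', 'age': 34}
```

The caliper rule discards case 2 (or forces recruitment of more controls), whereas nearest available matching keeps her at the cost of a 4-year age gap.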

Example: cases of a disease which affects primarily elderly persons
Controls drawn from the general population with matching based on nearest age may be considerably younger, on average, depending on the number of potential controls identified.
- the same may occur when continuous variables are categorized into wide ranges
- the impact on the study will depend on the nature of the relationship between the matching factor, the exposure, and the outcome of interest
Confounding, Matching & Related Analysis Issues - Slide 17

Group level matching
Cases are stratified according to the matching factor, and then controls are selected to match the grouping of cases
a) Stratified sampling: The levels of the covariate in which sampling occurs are defined. Then preset numbers of cases and controls are drawn from each stratum, with a consistent matching ratio
Confounding, Matching & Related Analysis Issues - Slide 18

Example of stratified sampling: Case-control study examining coffee intake and lung cancer Confounding, Matching & Related Analysis Issues - Slide 19

b) Frequency matching
There is also a constant proportion of controls to cases, but the distribution of cases is not fixed according to the matching factor. However, controls are forced to have the same distribution of the matching factor as do the cases. The distribution of the matching factors is therefore representative of that among the population that gave rise to cases.
Confounding, Matching & Related Analysis Issues - Slide 20

Example of frequency matching: Coffee intake and lung cancer
- here the number of cases in each smoking stratum reflects the distribution of smoking behaviour among lung cancer cases
- the matching ratio is 2 controls per case throughout
Confounding, Matching & Related Analysis Issues - Slide 21
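Frequency matching can be sketched as sampling controls stratum by stratum until their distribution mirrors that of the cases. The counts below are hypothetical (the slide's table was not preserved in this transcript), and `frequency_match` is an illustrative helper, not a function from any library.

```python
import random
from collections import Counter

def frequency_match(cases, control_pool, key, ratio=2, seed=0):
    """Sample controls so that their distribution of `key` mirrors the cases."""
    rng = random.Random(seed)
    case_counts = Counter(key(c) for c in cases)
    selected = []
    for stratum, n_cases in case_counts.items():
        candidates = [c for c in control_pool if key(c) == stratum]
        needed = ratio * n_cases
        if len(candidates) < needed:
            raise ValueError(f"not enough controls in stratum {stratum!r}")
        # Sample without replacement within the stratum.
        selected.extend(rng.sample(candidates, needed))
    return selected

# Hypothetical data: 30 smoking and 10 non-smoking cases, and a large control pool.
cases = [{"smoker": True}] * 30 + [{"smoker": False}] * 10
pool = ([{"smoker": True} for _ in range(100)]
        + [{"smoker": False} for _ in range(100)])

controls = frequency_match(cases, pool, key=lambda p: p["smoker"], ratio=2)
control_counts = Counter(c["smoker"] for c in controls)
print(control_counts[True], control_counts[False])  # 60 20
```

The number of cases per stratum is taken as observed rather than fixed in advance, which is what distinguishes this scheme from stratified sampling.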

Analysis of case-control studies with matching:
- Always requires stratification by the matching factor (or the multivariate equivalent: conditional logistic regression).
- The crude odds ratio will be biased toward the null value.
- This is because matching forces the cases and controls to be more alike with respect to the exposure of interest than would ordinarily be the case.
Confounding, Matching & Related Analysis Issues - Slide 22

Hypothetical example:

Smokers
                     Obesity
                     Yes     No   | Total
Heart disease        480     20   |  500
No heart disease     420     80   |  500
_________________________________
Total                900    100   | 1000
_________________________________
OR = 4.6

Non-smokers
                     Obesity
                     Yes     No   | Total
Heart disease          8     42   |   50
No heart disease       2     48   |   50
_________________________________
Totals                10     90   |  100
_________________________________
OR = 4.6
Confounding, Matching & Related Analysis Issues - Slide 23

Crude analysis of same data

                     Obesity
                     Yes     No   | Total
Heart disease        488     62   |  550
No heart disease     422    128   |  550
____________________
Totals               910    190   | 1100
OR crude = 2.4

- Despite matching, the underlying association between smoking (confounder) and obesity (exposure) remains: smokers were much more likely than non-smokers to be obese.
- However, matching on smoking behaviour made cases and controls more similar with respect to obesity, thereby leading to underestimation of the odds ratio. Stratified analysis corrects this problem.
Confounding, Matching & Related Analysis Issues - Slide 24
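The bias toward the null can be checked directly by collapsing the two smoking strata of the hypothetical obesity example and comparing the crude odds ratio with the stratum-specific ones. The sketch below uses the counts from the stratified tables; variable names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

# (obese cases, non-obese cases, obese controls, non-obese controls)
strata = {
    "smokers": (480, 20, 420, 80),
    "non_smokers": (8, 42, 2, 48),
}

stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Collapse over smoking by summing each cell across strata.
crude_cells = tuple(sum(cell) for cell in zip(*strata.values()))
crude_or = odds_ratio(*crude_cells)

print(crude_cells)                                         # (488, 62, 422, 128)
print({k: round(v, 1) for k, v in stratum_or.items()})     # both 4.6
print(round(crude_or, 1))                                  # 2.4

# Relative difference between crude and stratum-specific estimates,
# which can be compared with the 10% rule of thumb mentioned earlier.
relative_change = abs(stratum_or["smokers"] - crude_or) / stratum_or["smokers"]
print(relative_change > 0.10)                              # True
```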

Matching in cohort studies
- does not lead to inappropriate crude risk/rate ratio estimates
e.g. cohort study of obesity and heart disease

Smokers
                     Obesity
                     Yes     No   | Total
Heart disease
No heart disease
_______________________________________
Total
_______________________________________
RR = 4.6

Non-smokers
                     Obesity
                     Yes     No   | Total
Heart disease         46     10
No heart disease
_______________________________________
Total
_______________________________________
RR = 4.6
Confounding, Matching & Related Analysis Issues - Slide 25

Crude analysis

                     Coffee
                     Yes     No   | Totals
Smokers
Lung cancer          506    110   |  616
No cancer                         | 3384
___________________________________
Totals                            | 4000
RR = 4.6

Here the crude RR is the same as within the individual strata. This is because matching eliminates the association between smoking (confounder) and coffee drinking (the exposure studied).
Confounding, Matching & Related Analysis Issues - Slide 26

Stratified Analysis If effect estimates are identical across strata, then it is easy to report a single summary estimate (e.g. odds ratio)  More often, they are not precisely identical, which may reflect random error/imprecision (e.g. small strata), residual confounding, or truly different effects (effect modification)  Effect modification will be described separately Confounding, Matching & Related Analysis Issues - Slide 27

Combining Effects from Strata
- Can take some type of weighted average
- One approach is to use weights which reflect the distribution of the stratification variable in the population of interest
- For example, age-specific risk ratios could be combined using a weighted average that accounts for the age distribution of the general population
- This is an example of standardization: the effect is adjusted to reflect a standard age distribution
- This does not assume that the effects are homogeneous
- The most heavily weighted strata may not have much information
Confounding, Matching & Related Analysis Issues - Slide 28
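As a rough illustration of the weighted average described above, the sketch below combines age-specific risk ratios using external population weights. All numbers are hypothetical and chosen only to make the arithmetic easy to follow; the function name is illustrative.

```python
def standardized_estimate(stratum_estimates, weights):
    """Weighted average of stratum-specific estimates using external weights."""
    total_weight = sum(weights[s] for s in stratum_estimates)
    weighted_sum = sum(stratum_estimates[s] * weights[s] for s in stratum_estimates)
    return weighted_sum / total_weight

# Hypothetical age-specific risk ratios from a study.
risk_ratios = {"<40": 1.5, "40-64": 2.0, "65+": 3.0}

# Hypothetical age distribution of the standard (general) population.
population_share = {"<40": 0.5, "40-64": 0.3, "65+": 0.2}

standardized_rr = standardized_estimate(risk_ratios, population_share)
print(round(standardized_rr, 2))  # 1.95
```

The weights come from outside the study, so a stratum with few study subjects (and an imprecise estimate) can still dominate the summary if it is large in the standard population.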

Mantel-Haenszel Odds Ratio
- An odds ratio that reflects pooling of effects across strata, to summarize the overall association between exposure and outcome, while adjusting for the effect of the confounder of concern
- Pooling assumes that the effect is homogeneous, and that variation reflects random error
- Is a weighted average of odds ratio estimates across strata
- Weights reflect the quantity of information in each stratum, expressed as bc/T, where b and c are exposed controls and unexposed cases within the stratum, and T is total subjects within the stratum
- Note this differs from standardization using “external” weights
Confounding, Matching & Related Analysis Issues - Slide 29

Mantel-Haenszel Odds Ratio OR MH = Σ [(bc/T) x ad/bc]=Σ(ad/T) ___________________________ Σ(bc/T)Σ(bc/T) For the case-control study of obesity and heart disease, this would be: (480 x 80)/ (8 x 48)/100 __________________________ (20 x 420)/ (42 x 2)/100 = ( )/( ) = 4.6 Confounding, Matching & Related Analysis Issues - Slide 30

Analysis of matched pair data in case-control studies
- can be thought of as a special case of stratified analysis
- each matched pair constitutes a single stratum with 2 subjects
- the only informative strata are those where the exposure status of case and control is discordant
Confounding, Matching & Related Analysis Issues - Slide 31

Recall Mantel-Haenszel OR estimates

$$\mathrm{OR}_{MH} = \frac{\sum (ad/T)}{\sum (bc/T)}$$

Concordant strata:

        E+   E-              E+   E-
D+       1    0     or  D+    0    1
D-       1    0         D-    0    1

→ ad = 0, bc = 0
Confounding, Matching & Related Analysis Issues - Slide 32

The pairs can be grouped as follows:

                     Case
Control              Exposed   Unexposed
Exposed                 r          s
Unexposed               t          u

Then

$$\mathrm{OR}_{MH} = \frac{t}{s} = \frac{N(\text{case exposed, control unexposed})}{N(\text{case unexposed, control exposed})}$$

where N refers to number of pairs
Confounding, Matching & Related Analysis Issues - Slide 33

Example: Marrie et al conducted a study evaluating the relationship between certain infections (the exposure) and the subsequent development of multiple sclerosis (the outcome).
- Data were taken from a general practice database.
- Cases and controls were matched on age (±2 years), sex, physician practice, and date seen.
- Imagine a 1:1 design (in fact it was 1:4, on average).
Confounding, Matching & Related Analysis Issues - Slide 34

Hypothetical data

                     MS (cases)
No MS (controls)     Infection   No infection
Infection               30            5
No infection            20          170

OR = 20/5 = 4
Confounding, Matching & Related Analysis Issues - Slide 35

Suppose the key confounder is physician practice
- the physicians most likely to see and diagnose infections may also be those most likely to pursue and establish the diagnosis of multiple sclerosis

Unmatched analysis
                     MS     No MS
Infection            50       35
No infection        175      190

Crude OR = (190 x 50) / (175 x 35) = 1.6

As before, the unstratified analysis yields an OR estimate biased toward the null. As before, this is because the matching forces the controls to “resemble” the cases with respect to the distribution of exposure in the crude analysis.
Confounding, Matching & Related Analysis Issues - Slide 36
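The matched and unmatched analyses of the hypothetical multiple sclerosis data can be set side by side as follows. The sketch also treats every pair as its own stratum of two subjects to show that the Mantel-Haenszel formula reduces to the ratio of discordant pairs; variable names are illustrative.

```python
# Pair counts, columns = case exposure, rows = control exposure:
# r = both exposed, s = control exposed / case unexposed,
# t = control unexposed / case exposed, u = both unexposed.
r, s, t, u = 30, 5, 20, 170

# Matched-pair odds ratio: ratio of the two kinds of discordant pairs.
matched_or = t / s

# Mantel-Haenszel over pairs: each pair is a stratum with T = 2.
# A type-t pair contributes ad/T = 1/2; a type-s pair contributes bc/T = 1/2;
# concordant pairs (r, u) contribute 0 to both sums.
mh_numerator = t * (1 * 1 / 2)
mh_denominator = s * (1 * 1 / 2)
mh_or = mh_numerator / mh_denominator

# Unmatched (crude) analysis: break the pairs and build the ordinary 2x2 table.
cases_exposed, cases_unexposed = r + t, s + u        # 50, 175
controls_exposed, controls_unexposed = r + s, t + u  # 35, 190
crude_or = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

print(matched_or)          # 4.0
print(mh_or)               # 4.0
print(round(crude_or, 1))  # 1.6
```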

Multivariate Analysis Has become the standard approach for identifying and accounting for confounding  Complex process: computer essentially solves multiple equations to identify “best guess” effect estimate while holding other covariates constant, e.g. effect of obesity while holding smoking behaviour, sex, diabetes constant  Mathematically breaks the data down into numerous strata  Examples: logistic regression for binary outcome data (very frequent), Cox proportional hazards modelling for incidence data, Poisson model for count data Confounding, Matching & Related Analysis Issues - Slide 37

Rationale for Matching
- Matching can be considered a form of partial restriction: the controls are restricted so as to resemble the cases with respect to some factor(s).
- The main purpose of matching is to improve statistical efficiency (precision).
- In principle, stratified analysis alone (including multivariate techniques) should be sufficient to deal with the confounder in question.
- However, matching may be needed to ensure that all strata are sufficiently informative.
Confounding, Matching & Related Analysis Issues - Slide 38

Example: An investigator wishes to investigate a possible association between use of calcium channel blockers (drugs used for blood pressure and heart disease) and Alzheimer’s disease.
- Age is obviously a key confounder: increasing age is associated with use of the drugs in question and with the onset of Alzheimer’s disease
- Unmatched controls drawn from the general population will be younger and hence less likely to be using calcium channel blockers, leading the crude analysis to overestimate any potential association
Confounding, Matching & Related Analysis Issues - Slide 39

- This can be handled through stratified analysis by age (e.g. various age categories)
- If unmatched general population controls are used, there may be few controls in the oldest age strata, leading to imprecise OR estimates in those strata (wide confidence intervals)
- Matching ensures sufficient numbers of subjects for each level of the matching variable(s), in this case age
- Matched cohort studies are also more efficiently analyzed using stratification by the matching factor(s)
Confounding, Matching & Related Analysis Issues - Slide 40

Advantages of matching
1. Promotes efficiency, as discussed above. Studies are most efficient when the ratio of index to referent subjects (e.g. cases:controls) is constant across the different strata of a confounder.
2. Very useful in situations where the confounder is difficult to quantify or control, making stratification impossible. Classic example: using sibling controls.
Confounding, Matching & Related Analysis Issues - Slide 41

Disadvantages of matching
1. Practical: may be cumbersome, expensive, time consuming. Depending on the circumstances, index subjects may be dropped if no matching referent subjects are found → loss of data. Also very onerous when many matching factors are used.
2. The effect of the matching factor on the outcome of interest cannot be evaluated.
3. Potential for overmatching.
Confounding, Matching & Related Analysis Issues - Slide 42

Overmatching
Refers in general to situations where matching interferes with the logistics, statistical efficiency, or scientific validity of a study.
1. Overmatching as a cause of logistical inefficiency
- matching on many factors, or on factors that are difficult to match, adds to the expense and difficulty of study conduct
- difficulty with matching may lead to loss of cases as well as of potential controls (in case-control studies)
Confounding, Matching & Related Analysis Issues - Slide 43

2. Overmatching as a cause of reduced statistical efficiency
- occurs when the matching factor is not a true confounder, e.g. associated with exposure but not with outcome
- simplest example is with the matched pair case-control design
- if cases and controls are made more similar with respect to exposure frequency, then there will be many uninformative pairs
- these do not contribute to the odds ratio estimate and are essentially “wasted”
- conversely, with fewer discordant pairs, the precision of the odds ratio estimate is reduced
- the same holds true for other matching ratios
Confounding, Matching & Related Analysis Issues - Slide 44
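The loss of precision with fewer discordant pairs can be illustrated numerically. The sketch below assumes the usual large-sample approximation for the matched-pair odds ratio, SE(ln OR) = sqrt(1/t + 1/s), which is not stated in the slides; the second set of pair counts is hypothetical.

```python
import math

def matched_pair_or_ci(t, s, z=1.96):
    """Matched-pair OR with an approximate 95% confidence interval.

    t = pairs with case exposed and control unexposed,
    s = pairs with case unexposed and control exposed.
    Assumes SE(ln OR) = sqrt(1/t + 1/s), a large-sample approximation.
    """
    log_or = math.log(t / s)
    se = math.sqrt(1 / t + 1 / s)
    return t / s, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Same odds ratio (4), but fewer discordant pairs in the second study,
# as might happen if matching makes cases and controls alike on exposure.
or_many, lo_many, hi_many = matched_pair_or_ci(20, 5)
or_few, lo_few, hi_few = matched_pair_or_ci(8, 2)

print(or_many, round(lo_many, 2), round(hi_many, 2))
print(or_few, round(lo_few, 2), round(hi_few, 2))
```

Both calls return the same point estimate, but the interval from 10 discordant pairs is considerably wider than the one from 25, regardless of how many concordant pairs were recruited.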

- With weak confounders (e.g. limited effect on outcome), the loss of statistical efficiency may outweigh any apparent benefits of matching
- Recall that stratified analysis and multivariate techniques will still account for potential confounders in the absence of matching
Confounding, Matching & Related Analysis Issues - Slide 45

3. Overmatching as a cause of biased effect estimates
Occurs when the matching factor is:
a) produced by exposure and related to disease (e.g. an intermediate in the pathway), or
b) produced by disease and related to exposure
Confounding, Matching & Related Analysis Issues - Slide 46

Effect Modification
- Effect modification refers to the situation where the biologic effect of exposure on outcome differs according to some additional factor, e.g. different influence of smoking on development of COPD in men and women
- Also known as interaction
- In stratified analysis, will see different exposure-outcome relationships within different strata, e.g. different odds ratios, rate ratios, etc.
Confounding, Matching & Related Analysis Issues - Slide 47

- In the absence of confounding, the overall effect estimate will simply be an average of the stratum-specific estimates, weighted by the size of the strata, e.g. males and females
- Effect modification is NOT the same as confounding: it refers to biologic variation in an effect, not artefactual distortion of results because of inadequate design or analysis
- Effect modification should be noted and reported, rather than “controlled” through design and analysis strategies
- Effect modification is relevant to randomized trials as well as observational studies
Confounding, Matching & Related Analysis Issues - Slide 48

- Effect modification is only evident from stratified analysis, with stratification by the factor(s) of interest
- Analyses/effect estimates restricted to specific strata (e.g. women, young adults) have less precision and statistical power than the study as a whole
- If investigators wish to detect and document effect modification, they need to ensure the necessary sample sizes
Confounding, Matching & Related Analysis Issues - Slide 49