Covariate Selection for Observational Comparative Effectiveness Research
Prepared for: Agency for Healthcare Research and Quality (AHRQ)
www.ahrq.gov
Outline of Material
This presentation will:
  Describe the data source(s) that will be used to identify important covariates.
  Discuss the potential for unmeasured confounding and misclassification.
  Describe the approach to be used to select covariates for statistical models.
Introduction
Nonexperimental studies that compare the effectiveness of treatments are often strongly affected by confounding.
Confounding occurs when patients with a higher risk of experiencing the outcome are more likely to receive one treatment over another.
Causal graphs are often used to illustrate relationships among variables that lead to confounding and other types of bias. They represent the investigator’s beliefs about the mechanisms that generated the data and allow the investigator to better interpret statistical associations observed in the data.
Statistical methods can be used to control for confounding.
Causal Models and the Structural Relationship of Variables (1 of 7)
Treatment effects: Causal graph illustrating a randomized trial in which the assigned treatment (A0) has a causal effect on the outcome (Y1).
Risk factors: Causal graph illustrating a baseline risk factor (C0) for the outcome (Y1).
Causal Models and the Structural Relationship of Variables (2 of 7)
Confounding: A causal graph illustrating confounding by the unmeasured variable U1. Conditioning on the measured variable (C0), as indicated by the box around the variable, removes confounding from U1.
Measured confounders are often proxies for unmeasurable constructs. For example, family history of heart disease is a measured variable that indicates a person’s risk for cardiovascular disease (U1).
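To make the idea of conditioning on a measured confounder concrete, here is a minimal sketch. It assumes a binary confounder C0 (for instance, family history of heart disease), a treatment with no effect on the outcome, and made-up probabilities; none of these numbers come from the presentation. It compares the crude risk difference with one standardized over C0.

```python
# Hypothetical population probabilities (illustrative assumptions only).
P_C = {0: 0.5, 1: 0.5}            # P(C0 = c)
P_A1_GIVEN_C = {0: 0.2, 1: 0.8}   # P(A0 = 1 | C0 = c): C0 drives treatment choice


def outcome_risk(a, c):
    """P(Y1 = 1 | A0 = a, C0 = c). Treatment a has no effect; C0 raises risk."""
    return 0.10 + 0.20 * c


def p_treatment(a, c):
    """P(A0 = a | C0 = c)."""
    return P_A1_GIVEN_C[c] if a == 1 else 1.0 - P_A1_GIVEN_C[c]


def crude_risk(a):
    """P(Y1 = 1 | A0 = a), ignoring the confounder."""
    p_a = sum(p_treatment(a, c) * P_C[c] for c in P_C)
    joint = sum(outcome_risk(a, c) * p_treatment(a, c) * P_C[c] for c in P_C)
    return joint / p_a


def standardized_risk(a):
    """Risk under treatment a, standardized to the population distribution of C0."""
    return sum(outcome_risk(a, c) * P_C[c] for c in P_C)


crude_rd = crude_risk(1) - crude_risk(0)                     # 0.12: pure confounding
adjusted_rd = standardized_risk(1) - standardized_risk(0)    # 0.0: the true null effect
```

Under these assumed numbers the crude comparison suggests harm even though treatment does nothing; standardizing over C0 recovers the null.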
Sources of Confounding
Provider actions
  Confounding by indication: Prescribers choose treatments for patients who they believe are most likely to benefit or least likely to be harmed.
  Confounding by frailty: Patients perceived by a physician to be frail or very sick may be less likely to receive preventive therapies.
Patient actions
  Healthy user bias: Patients who initiate or adhere to a preventive therapy may be more likely than other patients to engage in other prevention-oriented behaviors.
Environmental and social factors
  Access to health care: Patient access may have a direct or indirect relation to treatment and study outcomes.
Causal Models and the Structural Relationship of Variables (3 of 7)
Intermediate variables (1 of 2): A causal graph representing an intermediate causal pathway. Blood pressure after treatment initiation (C1) is on the causal pathway between antihypertensive treatment (A0) and cardiovascular events (Y1). Baseline blood pressure (C0) is a measured confounder that stands in for disease severity (U1); the box around the variable represents adjustment.
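The consequence of adjusting for an intermediate variable can be sketched with a small simulation. This is an illustration under simplifying assumptions that are not in the slides: treatment is randomized (so the baseline confounder is left out), all relationships are linear, the outcome is a continuous risk score, and the effect sizes are invented.

```python
import random


def slope(x, y):
    """Simple least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(y, x):
    """Residuals of y after a simple regression on x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_slope(x, y, z):
    """Coefficient on x in the regression y ~ x + z (Frisch-Waugh-Lovell)."""
    return slope(residualize(x, z), residualize(y, z))


rng = random.Random(0)
n = 20000
# A0: randomized antihypertensive treatment (0/1).
a0 = [rng.randint(0, 1) for _ in range(n)]
# C1: post-treatment blood pressure (standardized); treatment lowers it by 1 unit.
c1 = [-1.0 * a + rng.gauss(0, 1) for a in a0]
# Y1: continuous cardiovascular risk score; treatment acts only through C1.
y1 = [0.5 * c + rng.gauss(0, 1) for c in c1]

total_effect = slope(a0, y1)                   # expected to be near -0.5
effect_given_c1 = adjusted_slope(a0, y1, c1)   # expected to be near 0
```

In this setup the unadjusted estimate captures the full treatment effect, while conditioning on post-treatment blood pressure blocks the pathway and makes an effective treatment look ineffective.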
Causal Models and the Structural Relationship of Variables (5 of 7)
Time-varying confounding: A simplified causal graph illustrating adherence to initial antihypertensive therapy as a time-varying treatment (A0, A1), together with joint predictors of treatment adherence and the outcome (C0, C1). The unmeasured variable (U1) indicates that this is a nonexperimental study.
Causal Models and the Structural Relationship of Variables (6 of 7)
Collider variables: Hypothetical causal diagram illustrating M-type collider stratification bias.
Formulary policy (U1) influences treatment with a calcium channel blocker (A0) and treatment for erectile dysfunction (F0).
Unmeasured alcohol use (U2) influences impotence and erectile dysfunction treatment (F0) and acute liver disease (Y1).
In this example, antihypertensive treatment has no effect on liver disease, but the two would be associated after adjusting for medical treatment of erectile dysfunction.
The box around F0 represents adjustment, and the conditional relationship is represented by the dotted arrow connecting U1 and U2.
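The M-type structure can be reproduced in a short simulation. The sketch below makes assumptions that go beyond the slide: every variable is replaced by a continuous stand-in, all relationships are linear with unit coefficients, and the noise is standard normal. The treatment has no effect on the outcome by construction.

```python
import random


def slope(x, y):
    """Simple least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(y, x):
    """Residuals of y after a simple regression on x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_slope(x, y, z):
    """Coefficient on x in the regression y ~ x + z (Frisch-Waugh-Lovell)."""
    return slope(residualize(x, z), residualize(y, z))


rng = random.Random(1)
n = 20000
u1 = [rng.gauss(0, 1) for _ in range(n)]                        # formulary policy
u2 = [rng.gauss(0, 1) for _ in range(n)]                        # alcohol use
a0 = [u + rng.gauss(0, 1) for u in u1]                          # treatment, driven by U1 only
f0 = [p + q + rng.gauss(0, 1) for p, q in zip(u1, u2)]          # collider: caused by U1 and U2
y1 = [q + rng.gauss(0, 1) for q in u2]                          # outcome, driven by U2 only

crude = slope(a0, y1)                        # expected near 0: no open path
collider_adjusted = adjusted_slope(a0, y1, f0)   # expected near -0.2: spurious association
```

With these coefficients the population value of the collider-adjusted slope is -0.2, so adjusting for F0 manufactures an association where none exists.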
Causal Models and the Structural Relationship of Variables (7 of 7)
Instrumental variables: Bias is amplified (Z-bias) when an instrumental variable (Z0) is added to a model with unmeasured confounders (U1).
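Why bias grows can be seen from a closed-form calculation in a simple linear model. The sketch assumes A0 = a_z*Z0 + a_u*U1 + noise and Y1 = b*A0 + b_u*U1 + noise, with Z0, U1, and both noise terms independent and of unit variance. These structural equations and the coefficient values are illustrative assumptions, not part of the presentation.

```python
def confounding_bias(a_z, a_u, b_u, adjust_for_instrument):
    """Large-sample bias of the least-squares treatment coefficient.

    The bias equals Cov(A*, b_u * U1) / Var(A*), where A* is the part of
    treatment left after removing whatever the model adjusts for.
    Adjusting for Z0 removes the a_z**2 term from Var(A*), which shrinks
    the denominator while the numerator (b_u * a_u) stays the same.
    """
    var_treatment = a_u ** 2 + 1.0          # variation from U1 and noise
    if not adjust_for_instrument:
        var_treatment += a_z ** 2           # variation from the instrument
    return b_u * a_u / var_treatment


# Hypothetical coefficients: a strong instrument and a moderate confounder.
bias_without_z = confounding_bias(a_z=2.0, a_u=1.0, b_u=1.0, adjust_for_instrument=False)
bias_with_z = confounding_bias(a_z=2.0, a_u=1.0, b_u=1.0, adjust_for_instrument=True)
```

Here the bias is 1/6 without the instrument and 1/2 with it: conditioning on Z0 strips out the unconfounded variation in treatment, so the remaining variation is more heavily driven by U1.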
Proxy Confounders and Mismeasured and Unmeasured Confounders
Unmeasured confounders: A measured proxy can sometimes stand in for an unmeasured confounder (for example, use of oxygen canisters as a proxy for failing health).
Mismeasured confounders: For example, self-reported body mass index.
Adjusting for a proxy or mismeasured confounder will reduce bias relative to the unadjusted estimate, provided the confounder acts in a consistent direction: any increase in the confounder should, on average, always affect treatment in the same direction for both treated and untreated groups.
Sensitivity analysis techniques can address misclassified and unmeasured confounding.
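One widely used sensitivity-analysis tool, which the slide does not name and is offered here only as an example, is the E-value of VanderWeele and Ding. For an observed risk ratio it gives the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the estimate. A minimal sketch:

```python
import math


def e_value(risk_ratio):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Protective estimates (RR < 1) are inverted first so the formula applies.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = 1.0 / risk_ratio if risk_ratio < 1 else risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# Hypothetical observed estimate, not a result from any study.
observed_rr = 2.0
needed_strength = e_value(observed_rr)   # about 3.41
```

For a hypothetical observed risk ratio of 2.0, an unmeasured confounder would need risk ratios of roughly 3.4 with both treatment and outcome to move the estimate to the null.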
Selection of Variables To Control Confounding (1 of 3)
Variable selection based on background knowledge
  Causal graph theory: An understanding of the causal structure of variables is required to separate confounders from other bias-inducing variables.
  Adjustment for all observed pretreatment covariates: Propensity score methods estimate the probability of receiving treatment, given the set of observed covariates.
  Adjustment for all possible risk factors of the outcome: Include only variables thought to be direct causes (risk factors) of the outcome to avoid including strong instruments and colliders.
  Disjunctive cause criterion: All observed variables that are a cause of treatment, outcome, or both should be included for statistical adjustment.
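The knowledge-based strategies differ only in which assumed causal roles qualify a variable for adjustment, which a few lines of code can make explicit. In this sketch the covariate names and their causal roles are hypothetical labels an investigator would supply from background knowledge; nothing is inferred from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Covariate:
    name: str
    pretreatment: bool       # measured before treatment starts
    causes_treatment: bool   # investigator believes it affects treatment choice
    causes_outcome: bool     # investigator believes it affects the outcome


# Hypothetical candidate list with assumed causal roles.
CANDIDATES = [
    Covariate("age", True, True, True),
    Covariate("family_history_heart_disease", True, False, True),
    Covariate("formulary_policy", True, True, False),
    Covariate("erectile_dysfunction_treatment", True, False, False),
    Covariate("post_treatment_blood_pressure", False, False, True),
]


def all_pretreatment(covariates):
    """Adjust for every observed pretreatment covariate."""
    return [c.name for c in covariates if c.pretreatment]


def outcome_risk_factors(covariates):
    """Adjust only for pretreatment variables believed to cause the outcome."""
    return [c.name for c in covariates if c.pretreatment and c.causes_outcome]


def disjunctive_cause(covariates):
    """Adjust for pretreatment causes of treatment, outcome, or both."""
    return [
        c.name
        for c in covariates
        if c.pretreatment and (c.causes_treatment or c.causes_outcome)
    ]


by_pretreatment = all_pretreatment(CANDIDATES)
by_risk_factor = outcome_risk_factors(CANDIDATES)
by_disjunctive = disjunctive_cause(CANDIDATES)
```

With these assumed roles, the all-pretreatment rule keeps the collider-like variable, the risk-factor rule drops both it and the instrument-like formulary policy, and the disjunctive rule as stated retains the formulary policy because it is a cause of treatment. All three exclude the post-treatment measurement.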
Selection of Variables To Control Confounding (2 of 3)
Empirical variable-selection approaches
  Forward selection procedures: Begin with an empty set of covariates and then consider whether each covariate is associated with the outcome conditional on treatment (p-value cutoff of 0.05 or 0.10).
  Backward selection procedures: Begin with all covariates in the model and consider whether each covariate is independent of the outcome conditional on treatment and all other covariates (p-value cutoff of 0.05 or 0.10).
Forward selection requires additional assumptions to achieve conditional exchangeability.
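The two procedures can be sketched as follows. This is a simplified illustration, not a reference implementation: it assumes a continuous outcome fitted by ordinary least squares, uses a normal approximation for p-values, always keeps treatment in the model, and lets forward selection also condition on covariates already chosen. The simulated data and variable names are hypothetical.

```python
import random
from statistics import NormalDist


def ols(y, cols):
    """Fit y ~ 1 + cols by least squares; return (coefficients, standard errors)."""
    n, k = len(y), len(cols) + 1
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    # Augmented matrix [X'X | I], inverted by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    inv = [row[k:] for row in M]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return beta, [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]


def p_value(beta, se):
    """Two-sided p-value using a normal approximation (reasonable for large n)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(beta / se)))


def backward_select(y, treatment, covariates, alpha=0.05):
    """Start with all covariates; repeatedly drop the least significant one."""
    kept = dict(covariates)
    while kept:
        names = list(kept)
        beta, se = ols(y, [treatment] + [kept[name] for name in names])
        # Index 0 is the intercept and index 1 is treatment.
        pvals = {name: p_value(beta[i + 2], se[i + 2]) for i, name in enumerate(names)}
        worst = max(pvals, key=pvals.get)
        if pvals[worst] < alpha:
            break
        del kept[worst]
    return list(kept)


def forward_select(y, treatment, covariates, alpha=0.05):
    """Start with none; repeatedly add the most significant remaining covariate."""
    chosen, remaining = [], dict(covariates)
    while remaining:
        pvals = {}
        for name, col in remaining.items():
            beta, se = ols(y, [treatment] + [covariates[c] for c in chosen] + [col])
            pvals[name] = p_value(beta[-1], se[-1])
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        chosen.append(best)
        del remaining[best]
    return chosen


rng = random.Random(2)
n = 2000
c1 = [rng.gauss(0, 1) for _ in range(n)]   # true confounder
c2 = [rng.gauss(0, 1) for _ in range(n)]   # pure noise
a = [1.0 if c + rng.gauss(0, 1) > 0 else 0.0 for c in c1]
y = [1.0 * ai + 2.0 * ci + rng.gauss(0, 1) for ai, ci in zip(a, c1)]
covs = {"c1": c1, "c2": c2}

kept_backward = backward_select(y, a, covs)
kept_forward = forward_select(y, a, covs)
```

Both procedures retain the strong confounder c1 in this simulation. Whether the noise variable c2 survives depends on chance at the chosen cutoff, which is one reason purely p-value-driven selection is usually paired with background knowledge.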
Selection of Variables To Control Confounding (3 of 3)
Empirical variable-selection approaches
  Automatic high-dimensional “proxy” adjustment
    An algorithm creates a very large set of empirically defined covariates from health care utilization data.
    Created variables capture the frequency of codes for procedures, diagnoses, and medication fills during the baseline period.
    Covariates are required to have a minimal prevalence and a marginal association with exposure and outcome.
    Covariates are entered into propensity score models, where statistical control of a large number of covariates is possible.
  Causal analysis with empirical selection
    A practical approach to variable selection may involve a combination of:
    1. A priori variable selection based on the investigator’s knowledge
    2. Empirical selection using the high-dimensional approach
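The screening steps of the high-dimensional approach can be sketched on toy data. This is a heavily simplified illustration with several assumptions: each code becomes a single "appears at least once" indicator (the published algorithm also builds frequency-based indicators), the prevalence threshold is an arbitrary count, and candidates are ranked by the Bross bias multiplier used in the published high-dimensional propensity score algorithm. Patient records and code names are invented.

```python
import math

# Hypothetical baseline-period records: code counts, exposure, and outcome.
PATIENTS = [
    {"codes": {"dx_I10": 2, "rx_statin": 1}, "exposed": 1, "outcome": 1},
    {"codes": {"dx_I10": 1}, "exposed": 1, "outcome": 1},
    {"codes": {"dx_I10": 1, "rx_statin": 1}, "exposed": 1, "outcome": 0},
    {"codes": {"rx_statin": 1}, "exposed": 1, "outcome": 1},
    {"codes": {"dx_I10": 1}, "exposed": 0, "outcome": 1},
    {"codes": {"rx_statin": 1}, "exposed": 0, "outcome": 0},
    {"codes": {"proc_rare": 1}, "exposed": 0, "outcome": 0},
    {"codes": {}, "exposed": 0, "outcome": 0},
]


def proportion(flags):
    """Share of 1s in a non-empty list of 0/1 flags."""
    return sum(flags) / len(flags)


def rank_covariates(patients, min_patients=2):
    """Screen code indicators by prevalence, then rank by potential bias."""
    codes = sorted({code for p in patients for code in p["codes"]})
    ranked = []
    for code in codes:
        has = [1 if p["codes"].get(code, 0) >= 1 else 0 for p in patients]
        # Minimal-prevalence screen; also skip codes that everyone has.
        if sum(has) < min_patients or sum(has) == len(patients):
            continue
        p_c1 = proportion([h for h, p in zip(has, patients) if p["exposed"]])
        p_c0 = proportion([h for h, p in zip(has, patients) if not p["exposed"]])
        risk_with = proportion([p["outcome"] for h, p in zip(has, patients) if h])
        risk_without = proportion([p["outcome"] for h, p in zip(has, patients) if not h])
        if risk_with == 0 or risk_without == 0:
            continue  # relative risk is zero or undefined in this toy handling
        rr_cd = risk_with / risk_without
        # Bross multiplier: combines covariate-exposure imbalance with
        # the covariate-outcome relative risk.
        bias = (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
        ranked.append((code, bias))
    ranked.sort(key=lambda item: abs(math.log(item[1])), reverse=True)
    return ranked


ranking = rank_covariates(PATIENTS)
```

In this toy dataset the rare procedure code is screened out for low prevalence, the diagnosis code ranks first because it is imbalanced across exposure groups and associated with the outcome, and the medication code ranks last because it has no association with the outcome. The top-ranked indicators would then be candidates for a propensity score model.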
Conclusions
Confounding can strongly affect observational comparative effectiveness studies.
Clearly describe how variables are measured and provide the rationale for a priori selection of potential confounders.
If augmenting a model by using an empirical variable-selection technique, present both models and describe how additional variables were measured and selected.
All variables included for adjustment should be listed in the manuscript or final report.
Sensitivity analyses can address residual confounding.
Summary Checklist

Guidance: Describe the data source(s) that will be used to identify important covariates.
Key considerations: Provide information about the source(s) of data for key covariates, acknowledging the strengths and weaknesses of the data source (e.g., administrative claims, electronic medical records, chart review, patient self-report) for measuring each type of covariate.

Guidance: Discuss the potential for unmeasured confounding and misclassification.
Key considerations: Discuss the potential impact of unmeasured confounders and misclassification or measurement error. Propose specific formal sensitivity analysis of the impact of unmeasured confounders or misclassified variables.

Guidance: Describe the approach to be used to select covariates for statistical models.
Key considerations: Use approaches based on background knowledge (e.g., selection of all hypothesized common causes, disjunctive cause criterion, directed acyclic graphs, or selection of all variables thought to be risk factors for the outcome). Describe model-reduction techniques to be used (e.g., forward or backward selection). Describe empirical variable-selection techniques and how variables were removed from consideration when they were thought to be bias-inducing rather than bias-reducing variables.