Quantitative Synthesis I Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide www.ahrq.gov.

Presentation transcript:

Quantitative Synthesis I Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide

Systematic Review Process Overview

Learning Objectives
• To list the basic principles of combining data
• To recognize common metrics for meta-analysis
• To describe the role of weights to combine results across studies
• To distinguish between clinical and methodological diversity and statistical heterogeneity
• To define fixed effect model and random effects model

Synonyms for Meta-Analysis
• Quantitative overview/synthesis
• Pooling
  • Less precise
  • Suggests that data from multiple sources are simply lumped together
• Combining
  • Preferred by some
  • Suggests applying statistical procedures to data

Reasons To Conduct Meta-Analyses
• Improve the power to detect a small difference if the individual studies are small
• Improve the precision of the effect measure
• Compare the efficacy of alternative interventions and assess consistency of effects across study and patient characteristics
• Gain insights into statistical heterogeneity
• Help to understand controversy arising from conflicting studies or generate new hypotheses to explain these conflicts
• Force rigorous assessment of the data

Commonly Encountered Comparative Effect Measures
Type of Data     Corresponding Effect Measure
Continuous       Mean difference (e.g., mmol, mmHg); standardized mean difference (effect size); correlation
Dichotomous      Odds ratio, risk ratio, risk difference
Time to event    Hazard ratio

Principles of Combining Data for Basic Meta-Analyses
• For each analysis, one study should contribute only one treatment effect.
• The effect estimate may be for a single outcome or a composite.
• The outcome being combined should be the same (or similar, based on clinical plausibility) across studies.
• Know the research question. The question drives study selection, data synthesis, and interpretation of the results.

Things To Know About the Data Before Combining Them
• Biological and clinical plausibility
• Scale of effect measure
• Studies with small numbers of events do not give reliable estimates

True Associations May Disappear When Data Are Combined Inappropriately

An Association May Be Seen When There Is None

Changes in the Same Scale May Have Different Meanings
• Both A–B and C–D involve a change of one absolute unit
• A–B change (1 to 2) represents a 100% relative change
• C–D change (7 to 8) represents only a 14% relative change

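Effect of the Choice of Metric on Meta-analysis
[Table: Study | Treatment: Events, Total, Rate | Control: Events, Total, Rate | Relative Risk | Risk Difference. Study A: relative risk 0.5, risk difference 10%; the remaining cell values were not preserved.]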

Effect of Small Changes on the Estimate
Baseline case     Decrease of 1 event   Increase of 1 event   Relative change of estimate
2/10 (20%)        1/10 (10%)            3/10 (30%)            ±50%
20/100 (20%)      19/100 (19%)          21/100 (21%)          ±5%
200/1,000 (20%)   199/1,000 (19.9%)     201/1,000 (20.1%)     ±0.5%
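The arithmetic behind this slide can be checked with a few lines of Python. This is only a sketch: the function name `relative_change` is an illustrative choice, not something taken from the module.

```python
def relative_change(events, total, delta=1):
    """Relative change in the event rate when the event count shifts by `delta`."""
    baseline = events / total
    shifted = (events + delta) / total
    return (shifted - baseline) / baseline


# The three baseline cases from the slide, all with a 20% event rate.
for events, total in [(2, 10), (20, 100), (200, 1000)]:
    down = relative_change(events, total, -1)
    up = relative_change(events, total, +1)
    print(f"{events}/{total}: {down:+.1%} / {up:+.1%}")
```

The same one-event shift moves the estimate far more in the 10-patient study than in the 1,000-patient study, which is why small studies give unstable estimates.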

Binary Outcomes
• Outcomes that have two states (e.g., dead or alive, success or failure)
• The most common type of outcome reported in clinical trials
• 2x2 tables commonly used to report binary outcomes

A Sample 2x2 Table
Binary outcomes data to be extracted from studies
                Vascular deaths   Survival   Total
Streptokinase   791               7,801      8,592
Placebo         1,029             7,566      8,595
ISIS-2 Collaborative Group. Lancet 1988;2:

Treatment Effect Metrics That Can Be Calculated From a 2x2 Table
OR = (a / b) / (c / d)

Some Characteristics and Uses of the Risk Difference
• Value ranges from -1 to +1
• Magnitude of effect is directly interpretable
• Has the same meaning for the complementary outcome (e.g., 5% more people dying is 5% fewer living)
• Across studies in many settings, tends to be more heterogeneous than relative measures
• Inverse is the number needed to treat (NNT) and may be clinically useful
• If heterogeneity is present, a single NNT derived from the overall risk difference could be misleading

Some Characteristics and Uses of the Odds Ratio
• Value ranges from 0 to +∞
• Has desirable statistical properties; better normality approximation in log scale than risk ratio
• Symmetrical meaning for complementary outcome (the odds ratio of dying is equal to the inverse of the odds ratio of living)
• Ratio of two odds is not intuitive to interpret
• Often used to approximate risk ratio (but gives inflated values at high event rates)

Some Characteristics and Uses of the Risk Ratio
• Value ranges from 0 to +∞
• Like its derivative, relative risk reduction, is easy to understand and is preferred by clinicians
  • Example: a risk ratio of 0.75 is a 25% relative reduction of the risk
• Requires a baseline rate for proper interpretation
  • Example: an identical risk ratio for a study with a low event rate and another study with higher event rate may have very different clinical and public health implications
• Asymmetric meaning for the complementary outcome
  • Example: the risk ratio of dying is not the same as the inverse of the risk ratio of living

When the Complementary Outcome of the Risk Ratio Is Asymmetric
            Dead   Alive   Total
Treatment   20     80      100
Control     40     60      100
• Odds Ratio (Dead) = (20 x 60) / (40 x 80) = 3/8 = 0.375
• Odds Ratio (Alive) = (80 x 40) / (20 x 60) = 8/3 = 2.67
• Risk Ratio (Dead) = (20/100) / (40/100) = 1/2 = 0.5
• Risk Ratio (Alive) = (80/100) / (60/100) = 4/3 = 1.33
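A short Python check of the symmetry claim, using the counts that appear in the slide's own calculations (20 dead and 80 alive on treatment, 40 dead and 60 alive on control). The helper names are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """a, b = events / non-events on treatment; c, d = events / non-events on control."""
    return (a / b) / (c / d)


def risk_ratio(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


# Outcome coded as "dead", then recoded as the complementary outcome "alive".
or_dead = odds_ratio(20, 80, 40, 60)
or_alive = odds_ratio(80, 20, 60, 40)
rr_dead = risk_ratio(20, 80, 40, 60)
rr_alive = risk_ratio(80, 20, 60, 40)

print(f"OR dead {or_dead:.3f}, OR alive {or_alive:.2f}, product {or_dead * or_alive:.2f}")
print(f"RR dead {rr_dead:.2f}, RR alive {rr_alive:.2f}, product {rr_dead * rr_alive:.2f}")
```

The two odds ratios multiply to 1, so either coding of the outcome carries the same information. The two risk ratios do not, so a meta-analysis of risk ratios has to fix which outcome is being counted.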

Calculation of Treatment Effects in the Second International Study of Infarct Survival (ISIS-2)
                Vascular deaths   Survival   Total
Streptokinase   791               7,801      8,592
Placebo         1,029             7,566      8,595
• Treatment-Group Event Rate = 791 / 8592 = 0.092
• Control-Group Event Rate = 1029 / 8595 = 0.120
• Risk Ratio = 0.092 / 0.120 = 0.77
• Odds Ratio = (791 x 7566) / (1029 x 7801) = 0.75
• Risk Difference = 0.092 - 0.120 = -0.028
ISIS-2 Collaborative Group. Lancet 1988;2:
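The ISIS-2 calculation can be reproduced directly from the four cell counts. The sketch below assumes the usual 2x2 labeling (a, b = events and non-events on treatment; c, d = events and non-events on control), which matches the slide's formula OR = (a / b) / (c / d). The function name and dictionary keys are illustrative.

```python
def effect_measures(a, b, c, d):
    """Effect measures from a 2x2 table.

    a, b = events / non-events in the treatment arm
    c, d = events / non-events in the control arm
    """
    risk_t = a / (a + b)
    risk_c = c / (c + d)
    rd = risk_t - risk_c
    return {
        "risk_treatment": risk_t,
        "risk_control": risk_c,
        "risk_ratio": risk_t / risk_c,
        "odds_ratio": (a / b) / (c / d),
        "risk_difference": rd,
        # NNT is the inverse of the absolute risk difference (undefined if rd == 0).
        "nnt": 1 / abs(rd),
    }


# Streptokinase vs. placebo, vascular deaths (ISIS-2 counts from the slide).
isis2 = effect_measures(791, 7801, 1029, 7566)
for name, value in isis2.items():
    print(f"{name}: {value:.3f}")
```

Rounded to two decimals, the risk ratio and odds ratio agree with the 0.77 and 0.75 shown on the slide, and the NNT rounds to 36.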

Treatment Effect Estimates in Different Metrics: Second International Study of Infarct Survival (ISIS-2)
Streptokinase vs. Placebo, Vascular Death
                         Estimate   95% Confidence Interval
Risk ratio               …          … to 0.84
Odds ratio               …          … to 0.82
Risk difference          …          … to …
Number needed to treat   36         27 to 54
ISIS-2 Collaborative Group. Lancet 1988;2:

Example: Meta-Analysis Data Set
Beta-Blockers after Myocardial Infarction - Secondary Prevention
[Table: N | Study | Year | Experiment (Obs, Tot) | Control (Obs, Tot) | Odds Ratio | 95% CI (Low, High). Studies listed: Reynolds, Wilhelmsson, Ahlmark, Multctr. Int, Baber, Rehnqvist, Norweg.Multr, Taylor, BHAT, Julian, Hansteen, Manger Cats, Rehnqvist, ASPS, EIS, LITRG, Herlitz.]

Simpson’s Paradox (I)
• A 1986 study by Charig et al. compared the treatment of renal calculi by open surgery and percutaneous nephrolithotomy.
• The authors reported that success was achieved in 78% of patients after open surgery and in 83% after percutaneous nephrolithotomy.
• When the size of the stones was taken into account, the apparently higher success rate of percutaneous nephrolithotomy was reversed.
Charig CR, et al. BMJ 1986;292:879–82.

Simpson’s Paradox (II)
Stones < 2 cm: Open (93%) > PN (87%)
        Success   Failure
Open    81        6
PN      234       36
Stones ≥ 2 cm: Open (73%) > PN (69%)
        Success   Failure
Open    192       71
PN      55        25
Pooling Tables 1 and 2: Open (78%) < PN (83%)
        Success   Failure
Open    273       77
PN      289       61
PN = percutaneous nephrolithotomy
Charig CR, et al. BMJ 1986;292:
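The reversal is easy to reproduce from the counts on this slide. The sketch below naively adds the cells of the two stone-size tables, which is exactly the kind of pooling the slide warns against. The names `small`, `large`, `success_rate`, and `pool` are illustrative.

```python
# (successes, failures) per arm, as shown on the slide (Charig et al.)
small = {"Open": (81, 6), "PN": (234, 36)}     # stones < 2 cm
large = {"Open": (192, 71), "PN": (55, 25)}    # stones >= 2 cm


def success_rate(successes, failures):
    return successes / (successes + failures)


def pool(*strata):
    """Naively add up the 2x2 cells across strata."""
    pooled = {}
    for stratum in strata:
        for arm, (s, f) in stratum.items():
            ps, pf = pooled.get(arm, (0, 0))
            pooled[arm] = (ps + s, pf + f)
    return pooled


pooled = pool(small, large)
for label, table in [("< 2 cm", small), (">= 2 cm", large), ("pooled", pooled)]:
    rates = {arm: f"{success_rate(*cells):.0%}" for arm, cells in table.items()}
    print(label, rates)
```

Open surgery has the higher success rate within each stratum, yet the lower rate after pooling, because PN was used mostly on small stones, which are easier to treat.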

Combining Effect Estimates
What is the average (overall) treatment-control difference in blood pressure?
Study   N     Mean difference (mm Hg)   95% Confidence Interval
A       554   -6.2                      … to -5.5
B       304   -7.7                      … to -5.2
C       39    -0.1                      … to 6.3

Simple Average
[(-6.2) + (-7.7) + (-0.1)] / 3 = -4.7 mm Hg

Weighted Average
[(554 x -6.2) + (304 x -7.7) + (39 x -0.1)] / (554 + 304 + 39) = -6.4 mm Hg
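Both averages can be reproduced from the three studies' sample sizes and mean differences. In this sketch the weights are simply the sample sizes, as on the slide; the function names are illustrative.

```python
studies = [  # (label, N, mean difference in mm Hg)
    ("A", 554, -6.2),
    ("B", 304, -7.7),
    ("C", 39, -0.1),
]


def simple_average(effects):
    return sum(effects) / len(effects)


def weighted_average(effects, weights):
    return sum(w * d for d, w in zip(effects, weights)) / sum(weights)


effects = [d for _, _, d in studies]
sizes = [n for _, n, _ in studies]

print(f"simple:   {simple_average(effects):.1f} mm Hg")
print(f"weighted: {weighted_average(effects, sizes):.1f} mm Hg")
```

The small study C pulls the simple average toward zero, while the weighted average stays close to the two large studies.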

General Formula: Weighted Average Effect Size (d_+)
$$ d_+ = \frac{\sum_{i=1}^{k} w_i d_i}{\sum_{i=1}^{k} w_i} $$
Where:
d_i = effect size of the i-th study
w_i = weight of the i-th study
k = number of studies

Calculation of Weights
• The weight is generally the inverse of the variance of the treatment effect estimate, which captures both study size and precision
• Different formulas apply for odds ratio, risk ratio, and risk difference
• Readily available in books and software
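As one concrete case, the sketch below computes an inverse-variance weight for a log odds ratio. It assumes the standard large-sample (Woolf) variance, 1/a + 1/b + 1/c + 1/d, which the slides do not spell out. The ISIS-2 counts come from the earlier slide; the second trial's counts are hypothetical and are included only for contrast.

```python
import math


def log_odds_ratio(a, b, c, d):
    return math.log((a * d) / (b * c))


def var_log_odds_ratio(a, b, c, d):
    """Large-sample (Woolf) variance of the log odds ratio."""
    return 1 / a + 1 / b + 1 / c + 1 / d


def inverse_variance_weight(variance):
    return 1 / variance


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval, computed on the log scale."""
    centre = log_odds_ratio(a, b, c, d)
    half_width = z * math.sqrt(var_log_odds_ratio(a, b, c, d))
    return math.exp(centre - half_width), math.exp(centre + half_width)


isis2 = (791, 7801, 1029, 7566)   # from the ISIS-2 slide
small_trial = (8, 92, 12, 88)     # hypothetical counts, for contrast only

w_large = inverse_variance_weight(var_log_odds_ratio(*isis2))
w_small = inverse_variance_weight(var_log_odds_ratio(*small_trial))
lower, upper = odds_ratio_ci(*isis2)

print(f"weight ISIS-2: {w_large:.1f}, weight small trial: {w_small:.1f}")
print(f"ISIS-2 OR 95% CI: {lower:.2f} to {upper:.2f}")
```

The large trial receives a weight roughly two orders of magnitude greater than the small one, which is how the inverse variance rule turns precision into influence on the pooled estimate.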

Heterogeneity (Diversity)
• Is it reasonable?
• Are the characteristics and effects of studies sufficiently similar to estimate an average effect?
• Types of heterogeneity:
  • Clinical diversity
  • Methodological diversity
  • Statistical heterogeneity

Clinical Diversity
• Are the studies of similar treatments, populations, settings, design, et cetera, such that an average effect would be clinically meaningful?

Example: A Meta-analysis With a Large Degree of Clinical Diversity
• 25 randomized controlled trials compared endoscopic hemostasis with standard therapy for bleeding peptic ulcer.
• 5 different types of treatment were used: monopolar electrode, bipolar electrode, argon laser, neodymium-YAG laser, and sclerosant injection.
• 4 different conditions were treated: active bleeding, a nonspurting blood vessel, no blood vessels seen, and undesignated.
• 3 different outcomes were assessed: emergency surgery, overall mortality, and recurrent bleeding.
Sacks HS, et al. JAMA 1990;264:494-9.

Methodological Diversity
• Are the studies of similar design and conduct such that an average effect would be clinically meaningful?

Statistical Heterogeneity
• Is the observed variability of effects greater than that expected by chance alone?
• Two statistical measures are commonly used to assess statistical heterogeneity:
  • Cochran’s Q statistic
  • I² index

Cochran’s Q Statistic: Chi-square (χ²) Test for Homogeneity
$$ Q = \sum_{i=1}^{k} w_i (d_i - d_+)^2 $$
d_i = effect measure; d_+ = weighted average
The Q statistic measures between-study variation.

The I² Index and Its Interpretation
• Describes the percentage of total variation in study estimates that is due to heterogeneity rather than to chance
• Value ranges from 0 to 100 percent
• A value of 25 percent is considered to be low heterogeneity, 50 percent to be moderate, and 75 percent to be large
• Is independent of the number of studies in the meta-analysis, so it can be compared directly between meta-analyses
Higgins JP, et al. BMJ 2003;327:557–60.
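A minimal sketch of both heterogeneity measures, assuming inverse-variance weights and the usual definition I² = (Q - df) / Q with df = k - 1, truncated at zero (following Higgins et al.). The effect sizes and variances below are hypothetical and serve only to exercise the functions.

```python
def fixed_effect_mean(effects, variances):
    """Inverse-variance weighted average (d_+)."""
    weights = [1 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def cochran_q(effects, variances):
    """Weighted sum of squared deviations of each study from d_+."""
    d_plus = fixed_effect_mean(effects, variances)
    return sum((d - d_plus) ** 2 / v for d, v in zip(effects, variances))


def i_squared(q, k):
    """I^2 as a percentage: (Q - df) / Q with df = k - 1, truncated at 0."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical log odds ratios and their variances (not from the slides).
effects = [-0.40, -0.10, 0.20, -0.60]
variances = [0.04, 0.02, 0.05, 0.03]

q = cochran_q(effects, variances)
i2 = i_squared(q, len(effects))
print(f"Q = {q:.2f} on {len(effects) - 1} df, I^2 = {i2:.1f}%")
```

Under homogeneity Q is compared against a chi-square distribution with k - 1 degrees of freedom. I² rescales the excess of Q over its degrees of freedom into a percentage.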

Example: A Fixed Effect Model
• Suppose that we have a container with a very large number of black and white balls.
• The ratio of white to black balls is predetermined and fixed.
• We wish to estimate this ratio.
• Now, imagine that the container represents a clinical condition and the balls represent outcomes.

Random Sampling From a Container With a Fixed Number of White and Black Balls (Equal Sample Size)

Random Sampling From a Container With a Fixed Number of Black and White Balls (Different Sample Size)

Different Containers With Different Proportions of Black and White Balls (Random Effects Model)

Random Sampling From Containers To Get an Overall Estimate of the Proportion of Black and White Balls

Statistical Models of Combining 2x2 Tables
• Fixed effect model: assumes a common treatment effect.
  • For the inverse variance weighted method, the precision of the estimate determines the importance of the study.
  • The Peto and Mantel-Haenszel methods are fixed effect models that do not use inverse variance weights.
• Random effects model: in contrast to the fixed effect model, accounts for among-study variation in addition to within-study variation.
  • The most popular random effects model in use is the DerSimonian and Laird inverse variance weighted method, which bases each study's weight on the sum of the within-study variation and the among-study variation.
  • Random effects models can also be implemented with Bayesian methods.

Example Meta-analysis Where Fixed and the Random Effects Models Yield Identical Results

Example Meta-analysis Where Results from Fixed and Random Effects Models Will Differ
Gross PA, et al. Ann Intern Med 1995;123: Reprinted with permission from the American College of Physicians.

Weights of the Fixed Effect and Random Effects Models
Fixed effect weight: $w_i = \frac{1}{v_i}$
Random effects weight: $w_i^* = \frac{1}{v_i + v^*}$
where: v_i = within-study variance; v^* = between-study variance
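The two weighting schemes can be compared side by side. The sketch below estimates the between-study variance (v* on the slide, `tau2` in the code) with the DerSimonian and Laird moment estimator, tau² = max(0, (Q - (k - 1)) / (Σw - Σw² / Σw)), which is assumed here because the slides name the method but do not show the formula. The data are hypothetical.

```python
import math


def pooled(effects, weights):
    """Weighted average and its standard error."""
    total = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total
    return estimate, math.sqrt(1 / total)


def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of the between-study variance, truncated at 0."""
    w = [1 / v for v in variances]
    d_plus, _ = pooled(effects, w)
    q = sum(wi * (d - d_plus) ** 2 for wi, d in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(effects) - 1)) / c)


# Hypothetical log odds ratios and within-study variances (not from the slides).
effects = [-0.40, -0.10, 0.20, -0.60]
variances = [0.04, 0.02, 0.05, 0.03]

tau2 = dersimonian_laird_tau2(effects, variances)
fixed_w = [1 / v for v in variances]             # w_i = 1 / v_i
random_w = [1 / (v + tau2) for v in variances]   # w_i* = 1 / (v_i + v*)

fixed_est, fixed_se = pooled(effects, fixed_w)
random_est, random_se = pooled(effects, random_w)
print(f"tau^2 = {tau2:.4f}")
print(f"fixed:  {fixed_est:.3f} (SE {fixed_se:.3f})")
print(f"random: {random_est:.3f} (SE {random_se:.3f})")
```

Adding the same between-study variance to every study makes the random effects weights more nearly equal and widens the standard error of the pooled estimate. When tau² is zero the two models coincide.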

Commonly Used Statistical Methods for Combining 2x2 Tables
Odds Ratio | Risk Ratio | Risk Difference
Fixed Effect Model: Mantel-Haenszel, Peto, Exact, Inverse variance weighted, Mantel-Haenszel, Inverse variance weighted
Random Effects Model: DerSimonian and Laird

Dealing With Heterogeneity Lau J, et al. Ann Intern Med 1997;127: Reprinted with permission from the American College of Physicians.

Summary: Statistical Models of Combining 2x2 Tables
• Most meta-analyses of clinical trials combine treatment effects (risk ratio, odds ratio, risk difference) across studies to produce a common estimate, by using either a fixed effect or random effects model.
• In practice, the results from using these two models are similar when there is little or no heterogeneity.
• When heterogeneity is present, the random effects model generally produces a more conservative result (smaller Z-score) with a similar estimate but also a wider confidence interval; however, there are rare exceptions of extreme heterogeneity where the random effects model may yield counterintuitive results.

Caveats
• Many assumptions are made in meta-analyses, so care is needed in the conduct and interpretation.
• Most meta-analyses are retrospective exercises, suffering from all the problems of being an observational design.
• Researchers cannot make up missing information or fix poorly collected, analyzed, or reported data.

Key Messages
• Basic meta-analyses can be easily carried out with readily available statistical software.
• Relative measures are more likely to be homogeneous across studies and are generally preferred.
• The random effects model is the appropriate statistical model in most instances.
• The decision to conduct a meta-analysis should be based on:
  • a well-formulated question,
  • appreciation of the heterogeneity of the data, and
  • understanding of how the results will be used.

References (I)
• Charig CR, Webb DR, Payne SR, et al. Comparison of treatment of renal calculi by operative surgery, percutaneous nephrolithotomy, and extracorporeal shock wave lithotripsy. BMJ 1986;292:879–82.
• Gross PA, Hermogenes AW, Sacks HS, et al. The efficacy of influenza vaccine in elderly persons: a meta-analysis and review of the literature. Ann Intern Med 1995;123:
• Higgins JPT, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–60.
• Lau J, Ioannidis JPA, Schmid CH. Quantitative synthesis in systematic reviews. Ann Intern Med 1997;127:

References (II)
• ISIS-2 (Second International Study of Infarct Survival) Collaborative Group. Randomized trial of intravenous streptokinase, oral aspirin, both, or neither among 17,817 cases of suspected acute myocardial infarction: ISIS-2. Lancet 1988;2:
• Sacks HS, Chalmers TC, Blum AL, et al. Endoscopic hemostasis: an effective therapy for bleeding peptic ulcers. JAMA 1990;264:494-9.

Authors
• This presentation was prepared by Joseph Lau, M.D., and Thomas Trikalinos, M.D., Ph.D., members of the Tufts Medical Center Evidence-based Practice Center.
• The information in this module is based on Chapter 9 in Version 1.0 of the Methods Guide for Comparative Effectiveness Reviews (available at: es/2007_10DraftMethodsGuide.pdf).