Presentation on theme: "Mark E. Nunnally, MD, FCCM Co-Director, Critical Care Fellowship and Associate Professor in the Department of Anesthesia and Critical Care University of."— Presentation transcript:
1 Mark E. Nunnally, MD, FCCM. Co-Director, Critical Care Fellowship and Associate Professor in the Department of Anesthesia and Critical Care, University of Chicago Medical Center, Chicago, Illinois. GRADE Methodology Expert. Contributing Author, “Surviving Sepsis Campaign: International Guidelines for Management of Sepsis and Septic Shock: 2012”
2 Making GRADE work: a how-to for guidelines authors. Mark E. Nunnally, MD, FCCM, Associate Professor, Department of Anesthesia & Critical Care, The University of Chicago
3 Course objectives I Translate evidence into graded recommendations. Identify the features that reduce or increase the quality of evidence.
4 Course objectives II. Appraise clinical data to determine quality of evidence. Integrate quality of evidence for an intervention with costs, the balance between desirable and undesirable effects, and values to determine the strength of a recommendation.
5 Contents. GRADE- why? Transparency and Certainty. The Guidelines process: a methodologist’s perspective. GRADE- components. Summary.
6 Conflict of interest. I am a GRADE advisor for the Surviving Sepsis Campaign
7 Conflict of interest. I am also only a consultant. YOU are the experts.
9 Many guidelines, little standardization. Some inform… Some restrict… All claim to be evidence-based… …how can we be certain a guideline is supported by the evidence? …how can we be certain its recommendations will hold over time? …how relevant is the recommendation to the things that matter to me? American healthcare is undergoing a transition to a new era of control.
10 Should we rate evidence? ‘Quality’ is a diluted term. Quality is a continuum. Decisions are always somewhat arbitrary. ‘Experts’ and clinicians don’t always share the same view. This is one reason evidence and recommendations should be separate.
11 Should we rate evidence? You need some reference. Simplicity. Transparency. Vividness.
12 Grading of Recommendations Assessment, Development and Evaluation. International consensus document. Template for systematic reviews, recommendations. You could mention the webinars done and recorded recently; I think I was sending you a link.
15 QOE- definition for guidelines authors. “Extent to which confidence in an estimate of the effect is adequate to support recommendations.” Guyatt G, BMJ 336, 2008. I find it a difficult concept. For reviews we look at QoE as certainty that we know the true effect. Here the idea goes further: the same level of certainty in effects may lead to a different level of certainty in the recommendation (for example, inexpensive versus expensive intervention, availability of alternatives).
16 QOE- Philosophical Bent. We are going to make recommendations that we (or others) will subsequently change. GRADE lets us: try to define how likely that is; communicate our certainty in any effect; translate findings to clinical realities, by accounting for the costs, tradeoffs and effort behind following a recommendation.
17 Example- Glycemic Control. 2001: Van den Berghe publishes sentinel article: NEJM 2001, 345: Guidelines, protocols, quality metrics proposed. 2009: NICE-SUGAR. 2009-present: Re-write or retire.
18 Be Explicit. What are the data? What are their limitations? How easy is it to do something? How confident are you in recommending?
19 The guidelines process: a methodologist’s perspective
20 Getting from evidence to guidelines. Evidence Hierarchy: Experience, Reports, Observational Studies, RCTs, Meta-analyses. Guidelines Hierarchy: Clinical biases, Experience-based tendencies, Cost analyses, Decision analyses, Formal Guidelines. Not all guidelines are created equal.
21 Flowchart: from question to recommendation.
Systematic review of evidence: formulate question (PICO); select outcomes; rate importance of outcomes (Outcome 1: critical; Outcome 2: critical; Outcome 3: important; Outcome 4: not important); systematic review (outcomes across studies); Evidence Profile (GRADEpro): 1. pooled estimate of effect for each outcome, 2. quality of evidence for each outcome (High | Moderate | Low | Very low).
Rating quality of evidence: start: RCT: high; observational: low. Rate down: risk of bias, inconsistency, indirectness, imprecision, publication bias. Rate up: large effect, dose-response, antagonistic bias.
PICO: P: transfusions; I vs C: NA (versus dopamine); 4-6 vs 10-12; 110 vs 180; vs flat; O: bleeding vs mortality vs VAP; symptomatic vs. asymptomatic DVT.
Guideline panel recommendation: rate overall quality of evidence across outcomes; formulate recommendations: for or against an action; strong or weak (strength). Strong or weak: quality of evidence; balance benefits/downsides; values and preferences; resource use (cost).
Wording: “We recommend…” | “Clinicians should…”; “We suggest…” | “Clinicians might…”; unambiguous; clear implications for action; transparent (values & preferences statement).
24 PICO. Population: ventilated patients, APACHE scores. Intervention: medicine, therapy, education, systems intervention. Comparison: high (how high) versus low (how low) tidal volume. Outcome: FBI: mortality (at what follow-up), LOS, VAP.
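A question framed this way can be written down as a small record, which makes it easy to spot a missing part before any evidence is collected. The sketch below is only an illustration: the class name `PicoQuestion`, the helper `missing_parts`, and the choice of low tidal volume as the intervention (with high tidal volume as the comparison) are assumptions, not part of the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicoQuestion:
    """One guideline question broken into its PICO parts (hypothetical helper)."""
    population: str
    intervention: str
    comparison: str
    outcomes: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the names of PICO parts that are still empty."""
        parts = {
            "population": self.population.strip(),
            "intervention": self.intervention.strip(),
            "comparison": self.comparison.strip(),
            "outcomes": [o for o in self.outcomes if o.strip()],
        }
        return [name for name, value in parts.items() if not value]


# Example loosely based on the slide's tidal-volume wording (assumed direction).
question = PicoQuestion(
    population="ventilated patients",
    intervention="low tidal volume",
    comparison="high tidal volume",
    outcomes=["mortality", "LOS", "VAP"],
)
print(question.missing_parts())
```

A statement with no comparison or no outcomes would show up here as a non-empty list, which is a cue to sharpen the question before grading it.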
25 Overall quality of evidence. Most systems just use evidence about the primary benefit outcome. But what about others (harms)? Options: ignore all but primary; any outcome; blended approach; crucial (critical) outcomes (SUP and pneumonia).
26 Rating outcomes. 7-9: critical [death, disability or both]. 4-6: important [skin breakdown, sepsis]. 1-3: limited [ileus, ICU stay]. The importance may be arbitrary: sepsis may be critical, and so may LOS.
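The 1-9 scale above maps directly onto three bands. A minimal sketch, assuming integer panel scores and a hypothetical function name `outcome_importance`:

```python
def outcome_importance(score: int) -> str:
    """Map a 1-9 panel rating to the band named on the slide."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    if score >= 7:
        return "critical"
    if score >= 4:
        return "important"
    return "limited"


# Hypothetical panel scores, for illustration only.
ratings = {"death": 9, "sepsis": 5, "ileus": 2}
bands = {outcome: outcome_importance(s) for outcome, s in ratings.items()}
print(bands)
```

The scores themselves remain a panel judgment; the function only makes the cut-offs explicit.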
28 Collect evidence. Be thorough. Use explicit search strategies. Decide on published v unpublished data. Consider gray literature in some cases: proceedings papers, abstracts, Clinicaltrials.gov. ALWAYS consider comparator. Sometimes, when faced with 50 questions, you may need to be pragmatic.
29 Assembling Evidence is Hard Data have to be summarized to inform
30 GRADE pragmatic approach. Get a good meta-analysis (MA). If no MA, identify main studies. If possible, do your own MA. If no MA, describe main studies/results. Be explicit (inclusion/exclusion, flaws). Keep the link between recommendation and evidence.
31 Meta Analysis- the Good and the Bad. Good: one-stop synthesis; exploration of variability; improved power. Bad: important detail lost; heterogeneity; N-omegalic significance. Ideally- data shown as sum and parts. A stew is the sum of its ingredients.
32 Don’t GRADE everything. No plausible alternative: surveying for infection, resuscitating shock, practicing quality improvement. Recommend to consider: as opposed to not considering? Statements lacking specificity: Intervention, Comparison, relevant Outcomes (good and bad).
37 Entering the GRADE meat-grinder. RCT- High quality. Observational study- Low quality. Expert report- Very Low quality.
39 Grade Down Study limitations Inconsistency Indirectness Imprecision Publication Bias
40 Grade Down. Study limitations: allocation concealment, blinding, loss to follow-up, no intent-to-treat, stopping early, failure to report outcomes. Inconsistency. Indirectness. Imprecision. Publication Bias. Remember that you are looking at a ‘number of studies’ at once.
41 Study Limitations/Risk of Bias Bias definition: 1. Unequal distribution of risk factors (confounders) across study groups. 2. Factors that systematically change study effects to result in a directional change in the signal.
42 Risk of Bias. GRADE treats bias by individual outcomes. Pain scores- strong effect if unblinded. Mortality- effect of blinding less clear. Loss to follow-up for different outcome windows. With multiple studies and different risks of bias, quality should be judged by the relative contribution of studies to the confidence in the effect.
43 Risk of Bias. Blinding: patient, clinician, data assessor. Concealment of allocation. Intention-to-treat principle: absence negates the balance from randomization.
44 Risk of Bias. Stopping early for benefit, especially if trials have < 500 events (Bassler D, et al. JAMA, 2010;303(12):1180-7). Selective outcome reporting: only positive outcomes, composite results only, or lack of pre-specified outcomes. Loss to follow-up: significance relates to # of events.
45 Risk of bias- Observational Studies. Prognosis can differ. Groups can have multiple differences: time, place, population, co-morbidity. This is why observational studies typically enter as “Low” quality of evidence.
46 Grade Down. Study limitations. Inconsistency: widely differing estimates of treatment effect; heterogeneity not explained; differences: populations, interventions, outcomes. Indirectness. Imprecision. Publication Bias.
47 Inconsistency. Definition: 1. Heterogeneity. 2. Lack of similarity of point estimates or confidence intervals. 3. Variable findings unexplained by a priori hypotheses. 4. Subgroup effects that cannot be sufficiently explained.
48 Inconsistency. Generally, effects are looked at in relative terms, rather than absolute. Subgroups may have different baseline rates, but similar relative effects.
49 Inconsistency. Inconsistency can come from study diversity: populations, interventions, outcomes, study methods. Credible inconsistency may lead to split recommendations.
50 Basic assessments of inconsistency. Point estimates vary widely. Little or no CI overlap. Test of heterogeneity (χ²) shows a low p value (P ≤ 0.10 may be sufficient). I² is large: <40%: low; 30-60%: moderate; 50-90%: substantial; [...]%: considerable.
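For readers who want to see where the I² figure comes from, here is a minimal sketch that computes Cochran's Q with inverse-variance weights and converts it to I² = (Q − df)/Q. The function names are hypothetical. The bands overlap, as on the slide, so the helper returns every label that applies. The lower bound of the "considerable" band is missing from this transcript; the 75% used below is an assumption taken from the Cochrane Handbook's rough guide.

```python
from typing import List, Sequence


def cochran_q(effects: Sequence[float], variances: Sequence[float]) -> float:
    """Cochran's Q for study effects (e.g. log RR) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q: float, n_studies: int) -> float:
    """I^2 in percent: share of variability in effects beyond chance."""
    df = n_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def i2_bands(i2: float) -> List[str]:
    """All overlapping bands that contain this I^2 value."""
    labels = []
    if i2 < 40:
        labels.append("low")
    if 30 <= i2 <= 60:
        labels.append("moderate")
    if 50 <= i2 <= 90:
        labels.append("substantial")
    if 75 <= i2 <= 100:  # assumed cut-off, not from the slide
        labels.append("considerable")
    return labels


# Two hypothetical studies with log effects 0.0 and 1.0, each with variance 0.1.
q = cochran_q([0.0, 1.0], [0.1, 0.1])
i2 = i_squared(q, n_studies=2)
print(round(q, 2), round(i2, 1), i2_bands(i2))
```

Because the bands overlap, the label is a starting point for judgment; whether to rate down still depends on whether the variability would change a clinical decision.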
55 Context. It is only significant inconsistency if the variability would influence a clinical decision. If point estimates and CIs favor treatment over costs/burdens/side effects, no need to downgrade.
56 Inconsistency. Example: low-dose steroids in sepsis: 6 studies, 3 with high baseline mortality, 3 with low, with a difference in effect (Patel GP. Am J Respir Crit Care Med 2012;185:). Placebo mortality: 30-63%.
57 Grade Down. Study limitations. Inconsistency. Indirectness: if a>>b and c>b, is a>c? Differences from intervention and outcome of interest: population, intervention, comparator. Imprecision. Publication Bias.
58 Indirectness. Definition: 1. Evidence does not directly compare to the clinical question of interest. 2. Differing patients, interventions, comparisons or outcomes in available studies necessitate extrapolation of evidence to the question being addressed.
59 Indirectness. Examples: Animal studies: downgrade 1 or 2 levels, in general, but consider the relevance of the data (toxicity v therapeutic benefit). If drug A>B and B>C, is A>C? Low-fat diet: US versus French population (setting, co-“interventions,” genetics). Surrogate outcomes: blood pressure control versus cardiovascular events. Vegetarians often have lifestyle differences from the general population.
60 Indirectness. Example: H2RA and PPI: C. difficile infection: observational study not direct to critically ill patients, but with interesting effect: Very Low QOE. Leonard J et al. Am J Gastroenterol 2007;102:2047. Case-control study of inpatients and outpatients. Risk of GIB probably not the same.
61 Grade Down. Study limitations. Inconsistency. Indirectness. Imprecision: few patients, outcomes; wide confidence intervals. Publication Bias.
62 Imprecision. Definition: 1. High impact of random error on evidence quality. 2. Wide range of results to be expected from repetitive study. 3. Wide range in which the truth likely lies.
63 Imprecision. Driven by # of events and by degree of effect. 95% confidence intervals may encompass harm and benefit. Taken in the context of the recommendation. More important: 95% CIs embrace absolute values that reduce our confidence in a recommendation.
64 With rare events, relative CIs can be broad, but absolute differences can be negligible, even when the intervals cross RR=1. Ex: 16/1482 v 19/1465 (RR 0.85 (0.43, 1.66)) for stroke with angioplasty v CEA. Absolute difference: (-0.5%, 1.0%): not clinically significant. Use absolute effects.
65 You mean the difference is not significant at either end of the CI for the absolute effect. With rare events, relative CIs can be broad, but absolute differences can be negligible, even when the intervals cross RR=1. Ex: 16/1482 v 19/1465 (RR 0.85 (0.43, 1.66)) for stroke with angioplasty v CEA. Absolute difference: (-0.5%, 1.0%): not clinically significant.
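The contrast between a wide relative interval and a narrow absolute one can be reproduced from the raw counts. The sketch below uses the standard large-sample (Wald) formulas for the risk ratio and the risk difference; the function names are hypothetical, and because the slide does not say how its figures were derived, the computed values may differ somewhat from those quoted.

```python
import math
from typing import Tuple


def risk_ratio_ci(e1: int, n1: int, e2: int, n2: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Risk ratio with a Wald confidence interval computed on the log scale."""
    rr = (e1 / n1) / (e2 / n2)
    se_log = math.sqrt(1 / e1 - 1 / n1 + 1 / e2 - 1 / n2)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


def risk_difference_ci(e1: int, n1: int, e2: int, n2: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Absolute risk difference (group 1 minus group 2) with a Wald interval."""
    p1, p2 = e1 / n1, e2 / n2
    rd = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return rd, rd - z * se, rd + z * se


# Counts from the slide: 16/1482 v 19/1465.
rr, rr_low, rr_high = risk_ratio_ci(16, 1482, 19, 1465)
rd, rd_low, rd_high = risk_difference_ci(16, 1482, 19, 1465)
print(f"RR {rr:.2f} ({rr_low:.2f}, {rr_high:.2f})")
print(f"RD {rd * 100:.2f}% ({rd_low * 100:.2f}%, {rd_high * 100:.2f}%)")
```

The relative interval runs from roughly a halving to a substantial increase in risk, while the absolute interval stays within about one percentage point in either direction, which is the point the slide makes.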
69 Imprecision. Example: NE v Vasopressin: mortality CI wide, spanned RR = 1. For ventricular arrhythmias, RR 0.47 (0.38, 0.58), but 21 events: FRAGILE. H2RA and pneumonia: unable to exclude harm. Negative factors may require tighter CIs: side effects/toxicity, burdens/costs.
70 Grade Down. Study limitations. Inconsistency. Indirectness. Imprecision. Publication Bias: few trials; industry funding; asymmetric funnel plot.
71 Publication Bias. Definition: 1. Studies with statistically significant results more likely to be counted than negative studies. 2. Smaller, high-effect studies disproportionately impact published literature. 3. Published commercially-funded studies are more likely to be positive.
73 Publication Bias. How to detect? It’s more difficult than one might think. Look for: small trials; conflicts in authors/study sponsors; duplications; abstracts, grey literature with negative findings; unpublished data. Ideally, we would trend MAs over time.
77 Publication Bias- Testing. Tests of asymmetry. Imputing missing information. Repeated MA over time.
78 Publication Bias- Addressing the Problem. Thorough research: gray literature; FDA submissions; abstracts, proceedings; author contact; Clinicaltrials.gov. N.B.: only for RCTs, not observational studies.
79 Grade Up Large magnitude of effect Dose response gradient Bias likely to blunt results
80 Grade Up. Large magnitude of effect: stronger signals signal stronger evidence. Dose response gradient. Bias likely to blunt results.
81 Grade Up. Large magnitude of effect. Dose response gradient: signal pattern consistent with physiologic model. Bias likely to blunt results.
82 Grade Up. Large magnitude of effect. Dose response gradient. Bias likely to blunt results: some studies run up against mitigating factors that work against them.
83 Moving Up- Examples. Very strong, consistent association; no plausible confounders: up 2 grades (insulin in diabetic ketoacidosis, antibiotics in septic shock). Strong, consistent association with no plausible confounders: up 1 grade.
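The rate-down and rate-up steps can be summarised as simple bookkeeping: start from the study design, subtract levels for each concern, add levels for each strengthening factor, and clamp to the four-level scale. The sketch below is an illustration under that assumption, with hypothetical names; it records panel judgments but does not make them.

```python
from typing import Dict, Optional

LEVELS = ["Very low", "Low", "Moderate", "High"]
START = {"rct": 3, "observational": 1, "expert report": 0}
RATE_DOWN = {"study limitations", "inconsistency", "indirectness",
             "imprecision", "publication bias"}
RATE_UP = {"large effect", "dose response", "bias blunts results"}


def rate_quality(design: str,
                 downgrades: Optional[Dict[str, int]] = None,
                 upgrades: Optional[Dict[str, int]] = None) -> str:
    """Combine starting level with levels removed (1 or 2 each) and levels added."""
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - RATE_DOWN) | (set(upgrades) - RATE_UP)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    score = START[design] - sum(downgrades.values()) + sum(upgrades.values())
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


# RCT evidence rated down one level for inconsistency.
print(rate_quality("rct", downgrades={"inconsistency": 1}))
# Observational evidence rated up two levels for a very large effect.
print(rate_quality("observational", upgrades={"large effect": 2}))
```

Keeping the factors as named entries, rather than a single net number, preserves the explicit link between each judgment and the final rating.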
88 GRADE output: Evidence Profile
Question: Should longer term (7 day) low dose (up to 300 mg/day of hydrocortisone) glucocorticosteroids be used in severe sepsis and septic shock?
Settings: ICU
Bibliography: Annane 2009
Columns: Quality assessment (No of studies, Design, Limitations, Inconsistency, Indirectness, Imprecision, Other considerations); Summary of findings (No of patients: longer term low dose glucocorticosteroids, control; Effect: Relative (95% CI), Absolute; Quality); Importance.
Mortality, 28 days: 12 randomised trials; no serious limitations; serious inconsistency (1); no serious indirectness; no serious imprecision; other considerations: none; 236/629 (37.5%) v 264/599 (44.1%); RR 0.84 (0.72 to 0.97); 71 fewer per 1000 (from 13 fewer to 123 fewer); ⊕⊕⊕O MODERATE; CRITICAL.
GI bleeding (footnotes 2, 3): no serious inconsistency (3); serious (4); 65/827 (7.9%) v 56/767 (7.3%); RR 1.12 (0.81 to 1.53); 9 more per 1000 (from 14 fewer to 39 more); IMPORTANT.
Superinfections (footnotes 4, 5): no serious inconsistency (6); no serious imprecision (7); 184/983 (18.7%) v 170/934 (18.2%); RR 1.01 (0.82 to 1.25); 2 more per 1000 (from 33 fewer to 46 more); ⊕⊕⊕⊕ HIGH.
Footnotes: 1 Meta-regression examining the effect of severity of illness (baseline mortality) on efficacy suggested an effect: p value 0.04 using fixed effect and 0.06 using random effect model. JAMA 2009; 302: Reported for all trials. 3 I²=0. 4 RR up to need to check. 6 I²=8%.
Please notice that these two slides use a different body of evidence: one looks at 6 studies, one at 12. People may get confused.
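The absolute effects in a profile like this follow from the control-group risk and the relative risk. A minimal sketch, assuming the absolute column is expressed per 1000 patients and computed as baseline risk × (RR − 1); the function name is hypothetical:

```python
def absolute_per_1000(control_events: int, control_total: int, rr: float) -> int:
    """Events per 1000 patients gained (+) or avoided (-) relative to control."""
    baseline = control_events / control_total
    return round(1000 * baseline * (rr - 1))


# Mortality, 28 days: control 264/599, RR 0.84 (0.72 to 0.97).
mortality = [absolute_per_1000(264, 599, rr) for rr in (0.84, 0.97, 0.72)]
# GI bleeding: control 56/767, RR 1.12 (0.81 to 1.53).
gi_bleeding = [absolute_per_1000(56, 767, rr) for rr in (1.12, 0.81, 1.53)]
# Superinfections: control 170/934, RR 1.01 (0.82 to 1.25).
superinfections = [absolute_per_1000(170, 934, rr) for rr in (1.01, 0.82, 1.25)]

print(mortality)        # [-71, -13, -123]
print(gi_bleeding)      # [9, -14, 39]
print(superinfections)  # [2, -33, 46]
```

Negative numbers read as "fewer" and positive numbers as "more", matching the wording of the profile's absolute column.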
89 Final QOE. High: A, ++++, ↑↑↑↑. Moderate: B, +++-, ↑↑↑. Low: C, ++--, ↑↑. Very Low: D, +---, ↑.
90 Alternate QOE interpretation. High- further research very unlikely to change confidence. Moderate- likely to have an important impact. Low- very likely to impact. Very Low- uncertain.
91 Separate QOE and Strength of Recommendation. GRADE’s defining feature. Evidence: high or low quality? Likelihood estimates are true and adequate. Recommendation: weak or strong? Confidence that following recommendation will cause more good than harm. You see, this definition is for ‘systematic review’ = confidence in estimates, not that estimates support recommendation.
92 Factors- STRONG vs WEAK. Balance good & bad. QOE. Uncertainty (values, preferences). Cost.
93 Factors- STRONG vs WEAK. Balance good & bad: GI bleed v C. difficile; early antibiotics v inappropriate antibiotics. QOE. Uncertainty (values, preferences). Cost.
94 Factors- STRONG vs WEAK. Balance good & bad. QOE: A or B can support STRONG; C or D should usually be WEAK. Uncertainty (values, preferences). Cost.
95 Factors- STRONG vs WEAK. Balance good & bad. QOE. Uncertainty (values, preferences): cancer remission v quality of life; delirium v pain control. Cost.
96 Factors- STRONG vs WEAK. Balance good & bad. QOE. Uncertainty (values, preferences). Cost: $/QALY; allocating limited resources; burdens for patients and providers.
97 STRONG to stakeholders. Patient: most people would want it. Clinician: most should receive, uniform behavior. Policymaker: adopt as policy, use as quality indicator.
98 WEAK to stakeholders. Patient: many people would not want it. Clinician: help patient make a balanced decision; decision aid might be needed. Policymaker: debate.
99 Final Strength of Recommendations. STRONG: do it, or don’t do it; “We recommend”; GRADE 1. WEAK: probably do it, or probably don’t; “We suggest”; GRADE 2.
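Strength and quality of evidence are judged separately but reported together. The sketch below shows one way to generate consistent wording; combining the strength number with the QOE letter into a label such as "1B" is an assumption for illustration, as are the function names and the example action.

```python
QOE_LETTERS = {"High": "A", "Moderate": "B", "Low": "C", "Very low": "D"}


def recommendation_text(action: str, strong: bool, qoe: str) -> str:
    """Phrase a recommendation: 'We recommend' (GRADE 1) or 'We suggest' (GRADE 2)."""
    verb = "We recommend" if strong else "We suggest"
    grade = ("1" if strong else "2") + QOE_LETTERS[qoe]
    return f"{verb} {action} (GRADE {grade})."


def needs_extra_justification(strong: bool, qoe: str) -> bool:
    """Flag strong recommendations resting on Low or Very low quality evidence."""
    return strong and qoe in ("Low", "Very low")


# Hypothetical action text, for illustration only.
print(recommendation_text("early antibiotics", strong=True, qoe="Moderate"))
print(needs_extra_justification(strong=True, qoe="Low"))
```

The flag mirrors the rule of thumb that C or D evidence should usually lead to a weak recommendation, so a strong one on that basis deserves an explicit rationale.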
101 Useful Resources. BMJ: GRADE series. GRADE Introduction: Overview of Quality of Evidence: BMJ 2008;336; Translating Evidence to Recommendations: BMJ 2008;336; How to handle disagreements in guidelines panels: BMJ 2008;337:a744
102 Useful Resources II. Journal of Clinical Epidemiology GRADE Guidelines Series. April, 2011 (64(4)): 1-4: Intro, framing the question and outcomes, rating quality of evidence, risk of bias. December, 2011 (64(12)): 5-9: Publication bias, imprecision, inconsistency, indirectness, rating up.