Grading evidence and recommendations Holger Schünemann Gunn Vist Gordon Guyatt for the GRADE Working Group.


1

2 Grading evidence and recommendations Holger Schünemann Gunn Vist Gordon Guyatt for the GRADE Working Group

3 Today’s talk
- Introduction to GRADE
- Example of applying GRADE
- Demonstration of the GRADE profiler software

4 Intro Exercise 0

5

6 Where would you prefer to live?

7 ← Option 1 Option 2 →

8 ← Option 1 (pink card) Option 2 → (green card)

9 Introduction to GRADE 1

10 When to make a recommendation?
- never
  - patient values differ
  - guidelines should just lay out benefits and risks
- when evidence strong enough
  - if very weak, too uncertain
- clinicians need guidance
  - intense study demands decision

11 What type of recommendations?
Strong recommendations:
- high quality methods
- large precise effect
- few down sides of therapy
Weak recommendations:
- low quality methods
- imprecise estimate
- small overall effect
- substantial down sides
- people react differently to same outcomes/circumstances

12 Should we grade recommendations?
People draw conclusions about the
- quality of evidence
- strength of recommendations
Systematic and explicit approaches can help
- protect against errors
- resolve disagreements
- facilitate critical appraisal
- communicate information

13 Why grade recommendations?
Change practitioner behavior
- Strong: apply uniformly
  - just do it
- Weak: think about it
  - examine evidence yourself
  - consider patient circumstances very carefully
  - explore with the patient
However, there is wide variation in currently used approaches

14 Which grading system?
Organization: Evidence / Recommendation
- USPSTF: II-2 / B
- ACCP: C+ / 1
- GCPS: Strong / Strongly recommended

15 Still not confused?
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
Organization: Evidence / Recommendation
- AHA: B / Class I
- ACCP: C+ / 1
- SIGN: IV / C

16 Grading System Current profusion: can there be consensus?

17 GRADE: Grades of Recommendation Assessment, Development and Evaluation

18 What do you know about GRADE?
- Have prepared a guideline
- Read the BMJ paper
- Have prepared a systematic review and a summary of findings table
- Have attended a GRADE meeting, workshop or talk

19 About GRADE
- Began as informal working group in 2000
- Researchers/guideline developers with interest in methodology
- Aim: to develop a common system for grading the quality of evidence and the strength of recommendations that is sensible and to explore the range of interventions and contexts for which it might be useful*
- 13 meetings (~10 – 35 attendees)
- Evaluation of existing systems and reliability*
- Workshops at various places including Cochrane Colloquia, WHO and GIN since 2000
*GRADE Working Group. CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005

20 GRADE Working Group
David Atkins, chief medical officer (a); Dana Best, assistant professor (b); Peter A Briss, chief (c); Martin Eccles, professor (d); Yngve Falck-Ytter, associate director (e); Signe Flottorp, researcher (f); Gordon H Guyatt, professor (g); Robin T Harbour, quality and information director (h); Margaret C Haugh, methodologist (i); David Henry, professor (j); Suzanne Hill, senior lecturer (j); Roman Jaeschke, clinical professor (k); Gillian Leng, guidelines programme director (l); Alessandro Liberati, professor (m); Nicola Magrini, director (n); James Mason, professor (d); Philippa Middleton, honorary research fellow (o); Jacek Mrukowicz, executive director (p); Dianne O’Connell, senior epidemiologist (q); Andrew D Oxman, director (f); Bob Phillips, associate fellow (r); Holger J Schünemann, associate professor (g, s); Tessa Tan-Torres Edejer, medical officer/scientist (t); Helena Varonen, associate editor (u); Gunn E Vist, researcher (f); John W Williams Jr, associate professor (v); Stephanie Zaza, project director (w)
Affiliations: a) Agency for Healthcare Research and Quality, USA; b) Children's National Medical Center, USA; c) Centers for Disease Control and Prevention, USA; d) University of Newcastle upon Tyne, UK; e) German Cochrane Centre, Germany; f) Norwegian Centre for Health Services, Norway; g) McMaster University, Canada; h) Scottish Intercollegiate Guidelines Network, UK; i) Fédération Nationale des Centres de Lutte Contre le Cancer, France; j) University of Newcastle, Australia; k) McMaster University, Canada; l) National Institute for Clinical Excellence, UK; m) Università di Modena e Reggio Emilia, Italy; n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy; o) Australasian Cochrane Centre, Australia; p) Polish Institute for Evidence Based Medicine, Poland; q) The Cancer Council, Australia; r) Centre for Evidence-based Medicine, UK; s) National Cancer Institute, Italy; t) World Health Organisation, Switzerland; u) Finnish Medical Society Duodecim, Finland; v) Duke University Medical Center, USA; w) Centers for Disease Control and Prevention, USA

21 How can we judge the extent of our confidence that adherence to a recommendation will do more good than harm?

22 Grading System
Strength of the recommendation:
- do it (or don’t do it) / recommend
- probably do it (or probably don’t do it) / suggest
Quality of underlying evidence:
- high quality (well done RCT)
- moderate (quasi-RCT)
- low (well done observational)
- very low (anything else)

23 Moving quality down
- poor (RCT) design, implementation
  - randomization, blinding, concealment, follow-up, intention to treat principle, early stopping for benefit
- inconsistency
- indirect evidence
  - patients, interventions, outcomes
  - A vs B, but have A to C, B to C
- sparse or imprecise data
- reporting bias

24 Reporting bias high likelihood of reporting bias can lower quality reporting of outcomes reporting of studies publication bias

25 Moving quality up
Observational studies – high or moderate quality?
- Strong association
  - strong association: RR > 2 or RR < 0.5
  - very strong association: RR > 5 or RR < 0.2
- Dose response relationship
  - bleeding risk associated with increasing INR (blood thinning with warfarin)
- Plausible confounders would have reduced the effect
  - For example, plausible explanatory factors that were not adjusted for in studies comparing mortality rates of for-profit and not-for-profit hospitals would have reduced the observed effect. Thus, the evidence showing that for-profit hospitals have a higher risk of mortality is more convincing.
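The relative-risk thresholds on this slide can be written as a small check. This is only a sketch: the function name `association_strength` and the label "not strong" are illustrative and not part of GRADE.

```python
def association_strength(rr: float) -> str:
    """Classify a relative risk (RR) using the thresholds quoted on the slide.

    strong:      RR > 2 or RR < 0.5
    very strong: RR > 5 or RR < 0.2
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    if rr > 5 or rr < 0.2:
        return "very strong"
    if rr > 2 or rr < 0.5:
        return "strong"
    return "not strong"


# RR = 0.46 is the value quoted later in the deck for disabling or fatal stroke.
print(association_strength(0.46))
```

The comparisons are strict, as on the slide, so an RR of exactly 2 or 0.5 does not count as strong.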

26 Quality assessment criteria

27 Categories of quality
“The extent to which one can be confident that an estimate of effect or association is correct.”
- High: Further research is very unlikely to change our confidence in the estimate of effect.
- Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
- Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
- Very low: Any estimate of effect is very uncertain.
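The preceding slides describe a starting level set by study design, factors that move quality down, and factors that move it up. A minimal sketch of that bookkeeping follows. It assumes each factor moves the rating by exactly one level and that the result is clamped to the four categories. The slides do not state these rules in this form, and the names `LEVELS`, `START` and `rate_quality` are hypothetical.

```python
# The four categories of quality, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as listed on the "Grading System" slide.
START = {
    "rct": "high",
    "quasi-rct": "moderate",
    "observational": "low",
    "other": "very low",
}


def rate_quality(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return the quality category after applying downgrades and upgrades.

    Assumption: every factor shifts the rating by one level; the result
    cannot go below "very low" or above "high".
    """
    index = LEVELS.index(START[design]) - downgrades + upgrades
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# RCT evidence with one downgrade (for example, imprecise data).
print(rate_quality("rct", downgrades=1))
```

In practice every downgrade or upgrade is a judgment that should be recorded with its reason. The integer counts here stand in for those judgments.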

28 Strength of recommendation “The extent to which one can be confident that adherence to a recommendation will do more good than harm.” quality of the evidence translation of the evidence into practice in a specific setting uncertainty about baseline risk trade-offs (the relative value attached to the expected benefits, harms and costs)

29 More practice using the voting instrument….

30 ← Option 1 (pink card) Option 2 → (green card) Remember

31 You are hiking. Which of the following animals would you prefer to encounter?

32 ← Option 1 (pink card) Option 2 → (green card)

33 You are buying an ice cream. Which flavor do you prefer?

34 ← Option 1 (pink card) Option 2 → (green card) Chocolate Strawberry

35 You are buying a new car. Which one would you buy?

36 ← Option 1 (pink card) Option 2 → (green card) Yellow fox Red Ferrari

37 What determines your choices?

38 Values and preferences underlying values and preferences always present sometimes crucial important to make explicit

39 Judgements about the balance between benefits and harms Before considering cost and making a recommendation

40 Judgment: Benefits vs Risks/Costs quality of evidence seriousness of outcome magnitude of effect precision of treatment effect risk of target event risk of adverse events cost of therapy values

41 Judgements about recommendations

42 Strong recommendation when evidence is weak? Balance of benefits and downsides clearly on one side Not frequent if quality is low or very low

43 Comparison of GRADE and other systems Explicit definitions Explicit, sequential judgements Components of quality Overall quality Relative importance of outcomes Balance between health benefits and harms Balance between incremental health benefits and costs Consideration of equity Evidence profiles International collaboration Software Consistent judgements? Communication?

44 Who is interested in GRADE American College of Chest Physicians (ACCP) WHO American Endocrine Society UpToDate Clinical Evidence American Society of Clinical Oncology (ASCO) American Thoracic Society (ATS) Urologists worldwide (EBUro) NICE

45 Conclusion Challenges in grading – –judgment always required Must consider study design, execution, consistency, directness, reporting bias – –magnitude, precision Balance of benefits and risks/cost – –magnitude of effects; precision of effects; values and preferences Separation of recommendation from quality of the evidence GRADE working group active in obtaining feedback and dissemination

46 Making judgments using GRADE 2

47 The clinical question
Population: In patients with chronic atrial fibrillation and no prior history of stroke
Intervention (comparison): does oral anticoagulation compared with no therapy
Outcome: reduce the risk for embolic stroke, hemorrhage and death?

48 The evidence
- Systematic Review*
- 5 RCTs
- 2,313 patients randomised
- Warfarin in all studies
- 1.5 years mean follow-up
- Outcomes: ischemic stroke, hemorrhage (major, including intracranial), death (vascular and all cause) and dependency
*Systematic Review: Aguilar & Hart. Cochrane Database of Systematic Reviews 2005, Issue 3.

49   All disabling or fatal stroke (isch. and hemorrh.)   Major hemorrhage (non IC)   All cause mortality   Minor bleeding (hematoma, prolonged bleeding of minor wounds) *Systematic Review: Aguilar & Hart. Cochrane Database of Systematic Reviews 2005, Issue 3. Outcomes/endpoints

50 How important is the endpoint for decision making? Judgment about the relative importance for each endpoint on a scale from 9 (most important) to 1 (least important): 7 – 9: the endpoint is critical for decision making. 4 – 6: the endpoint is important but not critical. 1 – 3: the endpoint is not important. Outcomes/endpoints

51 Outcomes/endpoints (with importance ratings)
- All disabling or fatal stroke (isch. and hemorrh.): 9
- Major hemorrhage (non IC): 7
- All cause mortality: 9
- Minor bleeding (hematoma, prolonged bleeding of minor wounds): 5
*Systematic Review: Aguilar & Hart. Cochrane Database of Systematic Reviews 2005, Issue 3.
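The 1 to 9 importance scale and the ratings shown on this slide can be combined in a short sketch. It assumes the four ratings (9, 7, 9, 5) apply to the outcomes in the order they are listed. The function and variable names are illustrative.

```python
def importance_category(rating: int) -> str:
    """Map a 1-9 importance rating to the category used for decision making."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating >= 7:
        return "critical"
    if rating >= 4:
        return "important but not critical"
    return "not important"


# Ratings from the slide, assumed to follow the order of the listed outcomes.
ratings = {
    "All disabling or fatal stroke": 9,
    "Major hemorrhage (non IC)": 7,
    "All cause mortality": 9,
    "Minor bleeding": 5,
}

for outcome, rating in ratings.items():
    print(f"{outcome}: {rating} ({importance_category(rating)})")
```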

52 Quality assessment criteria

53 Disabling or fatal stroke Study design:   5 RCTs Quality of evidence for this endpoint:   High

54 Disabling or fatal stroke
Detailed design and execution: concealment, follow-up
- In two studies (CAFA; SPINAF) both patients and outcome assessors were blinded; in the other studies only outcome assessors.
Quality of evidence for this endpoint now: High (or -1 → Moderate)

55 Disabling or fatal stroke Consistency:

56

57 Disabling or fatal stroke Consistency: No inconsistency Quality of evidence for this endpoint now:   High

58 Directness of evidence
indirect treatment comparisons
- interested in A versus B
- have A versus C and B versus C
alendronate vs risedronate (bisphosphonates)
- both versus placebo, no head-to-head

59 Directness - patients patients meet trials’ eligibility criteria not included, but no reason to question – –slight age difference, comorbidity, race some question, bottom line applicable – –valvular atrial fibrillation serious question about biology – –heart failure trials applicability to aortic stenosis

60 Directness - interventions same drugs and doses – –captopril 100 mg. tid in heart failure similar drugs and doses – –captopril in lower doses same class and biology – –other ACEI in heart failure questionable class and biology – –ARB in heart failure

61 Directness - outcomes same outcomes – –alendronate over 3 years on fracture similar but questionable – –alendronate over long-term serious question – –surrogate outcomes – –bone density; arrhythmia suppression; cholesterol levels (clofibrate)

62 Disabling or fatal stroke Directness of the evidence: Population, Intervention, Outcomes Direct Quality of evidence for this endpoint now:   High

63 Disabling or fatal stroke Imprecise or sparse data:   Would few additional events or larger studies likely alter the results?

64

65 Disabling or fatal stroke Imprecise or sparse data:   No imprecise or sparse data Quality of evidence for this endpoint now:   High

66 Disabling or fatal stroke   Reporting bias: Not present Quality of evidence for this endpoint now:   High

67 Disabling or fatal stroke
- Strong association? Present (RR = 0.46)
Quality of evidence for this endpoint now: High [or +1 → High (from moderate)]
Strong, no plausible confounder, consistent and direct evidence

68 Endpoint: Major extracranial hemorrhage
Study design: 4 RCTs → Quality: High
Study details and execution: No serious limitations
No inconsistency and direct
Imprecise or sparse data?

69

70 Imprecise or sparse data There is not an empirical basis for defining imprecise or sparse data. Two possible definitions are: Data are sparse if the results include just a few events or observations and they are uninformative Data are imprecise if the confidence intervals are sufficiently wide that an estimate is consistent with either important harms or important benefits. These different definitions can result in different judgments.
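The second definition (a confidence interval wide enough to be consistent with both important harm and important benefit) can be sketched as a check on a relative-risk interval. The thresholds 0.75 and 1.25 are hypothetical placeholders. The slide stresses that there is no empirical basis for a fixed cut-off, so a panel would have to choose and justify its own values.

```python
def is_imprecise(ci_low: float, ci_high: float,
                 important_benefit: float = 0.75,
                 important_harm: float = 1.25) -> bool:
    """Return True if a relative-risk CI includes both important benefit and harm.

    Convention assumed here: RR < 1 favours treatment. The default
    thresholds are illustrative, not GRADE-defined values.
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    return ci_low <= important_benefit and ci_high >= important_harm


# Hypothetical intervals: a wide one and a narrow one.
print(is_imprecise(0.50, 2.00))  # wide: spans benefit and harm
print(is_imprecise(0.30, 0.70))  # narrow: consistent with benefit only
```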

71 Major extracranial hemorrhage
Study design: 4 RCTs → Quality: High
Study details and execution: no serious limitations
No inconsistency and direct
Imprecise or sparse data? Imprecise data (wide confidence intervals; benefit and harm uncertain)

72

73

74 Judgements about the overall quality of evidence Most systems not explicit Options: – –strongest outcome – –primary outcome – –benefits – –weighted – –separate grades for benefits and harms – –no overall grade – –weakest outcome Based on lowest of all the critical outcomes Beyond the scope of a systematic review
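The rule "based on lowest of all the critical outcomes" can be sketched as follows. Critical is taken to mean an importance rating of 7 to 9, as on the endpoint-importance slide. The outcome names and quality levels in the example are hypothetical and are not the deck's final ratings.

```python
LEVELS = ["very low", "low", "moderate", "high"]  # lowest to highest


def overall_quality(outcomes: dict[str, tuple[int, str]]) -> str:
    """Return the lowest quality level among critical outcomes.

    `outcomes` maps an outcome name to (importance rating 1-9, quality level).
    Outcomes rated 7-9 are treated as critical.
    """
    critical = [quality for importance, quality in outcomes.values()
                if importance >= 7]
    if not critical:
        raise ValueError("no critical outcomes to base the overall grade on")
    return min(critical, key=LEVELS.index)


# Hypothetical profile: the non-critical outcome does not affect the result.
example = {
    "stroke": (9, "high"),
    "major hemorrhage": (7, "moderate"),
    "minor bleeding": (5, "low"),
}
print(overall_quality(example))
```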

75 Quality across all endpoints

76 Risk groups
Risk for cardio-embolic stroke:
- High (prior TIA or stroke*, > 75 yrs, ↓ LVEF/CHF, HTN or DM): 10%/year
- Moderate risk (65 to 75 years) or one risk factor: 3 to 4%/year
- Low risk (< 65 years): 0.5%/year

77

78 Risk groups
Risk for cardio-embolic stroke:
- High (prior TIA or stroke*, > 75 yrs, ↓ LVEF/CHF, HTN or DM): 10%/year
  - Benefits greater than downsides: do it, high
- Moderate risk (65 to 75 years) or one risk factor: 3 to 4%/year
  - Benefits greater than downsides: do it, high
- Low risk (< 65 years): 0.5%/year
  - Benefits smaller than downsides: values: probably do not do it, high
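To see why the same relative effect leads to different recommendations across risk groups, the standard absolute risk reduction (ARR = baseline risk × (1 − RR)) and number needed to treat (NNT = 1 / ARR) can be computed. This is an illustration only. It applies the RR of 0.46 quoted earlier for disabling or fatal stroke to the baseline risks for cardio-embolic stroke, which are not the same outcome, and it uses 3.5% as an assumed midpoint for the moderate group.

```python
def absolute_risk_reduction(baseline_risk: float, relative_risk: float) -> float:
    """ARR = baseline risk * (1 - RR), with risks given as proportions per year."""
    return baseline_risk * (1 - relative_risk)


def number_needed_to_treat(baseline_risk: float, relative_risk: float) -> float:
    """NNT = 1 / ARR; only defined when the treatment reduces risk."""
    arr = absolute_risk_reduction(baseline_risk, relative_risk)
    if arr <= 0:
        raise ValueError("no risk reduction, NNT is undefined")
    return 1 / arr


RR = 0.46  # from the strong-association slide; reuse here is an assumption
for label, baseline in [("high", 0.10), ("moderate", 0.035), ("low", 0.005)]:
    arr = absolute_risk_reduction(baseline, RR)
    nnt = number_needed_to_treat(baseline, RR)
    print(f"{label} risk: ARR {arr:.2%} per year, NNT about {nnt:.0f}")
```

Under these assumptions the yearly absolute benefit in the low-risk group is a small fraction of that in the high-risk group, while bleeding risk does not shrink in the same way. That difference is what shifts the balance of benefits and downsides.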

79 Value and preference statements underlying values and preferences always present sometimes crucial important to make explicit

80 Observational studies – high or moderate quality? Strong association Dose response relationship – –bleeding risk associated with increasing INR (blood thinning with warfarin) Plausible confounders would have reduced the effect For example, plausible explanatory factors that were not adjusted for in studies comparing mortality rates of for-profit and not-for-profit hospitals would have reduced the observed effect. Thus, the evidence that for-profit hospitals have a higher risk of mortality is more convincing

81 GRADEpro 3

82 Guideline development process
Prioritise problems, establish panel → Systematic Review → Evidence Profile → Relative importance of outcomes → Overall quality of evidence → Benefit – downside evaluation → Strength of recommendation → Implementation and evaluation of guidelines
GRADE

83 Guideline development process
Prioritise problems, establish panel → Systematic Review → Evidence Profile → Relative importance of outcomes → Overall quality of evidence → Benefit – downside evaluation → Strength of recommendation → Implementation and evaluation of guidelines
GRADE
Summary of Findings

84 GRADE Profiler

85 GRADEpro© Visual studio.net Windows based (Mac version coming) Simple installation Help file Will be integrated with Revman (trial) Free availability Beta version

86 Development of GRADE profiles

87

88

89

90

91

92

93 8. In two studies (CAFA; SPINAF) patients and outcome assessors were blind to OAC administration, while in the remaining trials treatment was given open label with outcomes verified by those unaware of treatment assignment.

94

95

96

97

98

99

100

101 GRADEpro Reproducible Transparent – –Footnotes – –Judgments GRADE profiles – –Summary Integration with Revman Real time

102 Judgements about recommendations
- “We recommend” … “should” … “Do it”
- “We suggest” … “may” … “Probably do it”
- “We suggest not” … “may not” … “Probably don’t do it”
- “We recommend not” … “should not” … “Don’t do it”
- No recommendation
This could include considerations of costs; i.e. “Is the net gain (benefits-downsides) worth the costs?”
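The wording on this slide amounts to a lookup from strength and direction to standard phrases. A minimal sketch follows. The keys "strong"/"weak" and "for"/"against" are labels chosen here for illustration; the phrases are those on the slide.

```python
# (strength, direction) -> (guideline wording, modal verb, plain-language form)
WORDING = {
    ("strong", "for"): ("We recommend", "should", "Do it"),
    ("weak", "for"): ("We suggest", "may", "Probably do it"),
    ("weak", "against"): ("We suggest not", "may not", "Probably don't do it"),
    ("strong", "against"): ("We recommend not", "should not", "Don't do it"),
}


def recommendation_wording(strength: str, direction: str) -> tuple[str, str, str]:
    """Return the phrases for a recommendation of the given strength and direction."""
    try:
        return WORDING[(strength, direction)]
    except KeyError:
        raise ValueError(f"unknown combination: {strength!r}, {direction!r}") from None


print(recommendation_wording("weak", "against"))
```

The "No recommendation" option is deliberately not in the table. A panel that cannot reach a judgment would not call the function.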

103 Questions?

104 What are we concerned about? methodological quality of evidence – –likelihood of bias – –high, moderate, low and very low quality strength of recommendations – –must do to might do – –strong recommendation or weak

105 Strong recommendation when evidence is weak? recommendations against – –uncertainty of benefit – –confidence in down sides whole body CT or MRI screening – –maybe benefit, maybe not – –true positives some harm – –false positive some harm

106 Strong recommendation when evidence is weak? known benefit, strong recommendation for one of two alternatives – –CABG for left main stem disease – –but venous or mammary artery graft benefit: weak evidence suggests appreciable benefit for mammary artery harm: strong evidence that little difference

107 Population: In patients with chronic atrial fibrillation and no prior history of stroke
Intervention (comparison): does oral anticoagulation compared with no therapy
Outcome: reduce the risk for embolic stroke, hemorrhage and death?
Different risk groups: Low, moderate, high
Other outcomes: Inconvenience, quality of life

108 Strong recommendation when evidence is weak? known benefit, strong recommendation for one of two alternatives benefit: strong evidence of equivalence harm: weak evidence that harm differs appreciably

