The GRADE approach: an introductory workshop

The GRADE approach: an introductory workshop Holger Schünemann, MD, PhD Professor and Chair, Dept. of Clinical Epidemiology & Biostatistics Professor of Medicine Michael Gent Chair in Healthcare Research McMaster University, Hamilton, Canada NTP, Raleigh June 22, 2011

The Department of Clinical Epidemiology & Biostatistics at McMaster History - 1967 – Founded by David Sackett - 6 chairs since - Instrumental in specialty of Clinical Epidemiology, origin of “Evidence-Based Medicine” People 45 full time and joint faculty ~ 120 associate & part time faculty; 19 emeritus ~ 180 staff ~ 200 PhD and Master students

Time: Session title
8.30 – 10.00: Overview of GRADE
10.00 – 10.15: BREAK
10.15 – Noon: Details of quality assessment (small group sessions)
Noon – 1.00: LUNCH
1.00 – 3.00: Framework for developing recommendations (small group sessions)
3.00 – 5.00: OHAT assessment tool

Content Guidelines and GRADE Quality of evidence Background about GRADE Quality of evidence Going from evidence to recommendations

What is a guideline? "Guidelines are recommendations intended to assist providers and recipients of health care and other stakeholders to make informed decisions. Recommendations may relate to clinical interventions, public health activities, or government policies." WHO 2003, 2007

Evidence based healthcare decisions (Clinical) state and circumstances Population values and preferences Expertise Research evidence Haynes et al. 2002

Confidence in evidence There always is evidence “When there is a question there is evidence” Better research → greater confidence in the evidence and decisions

Hierarchy of evidence based on quality BIAS STUDY DESIGN Randomized Controlled Trials Cohort Studies and Case Control Studies Case Reports and Case Series, Non-systematic observations Expert Opinion

“Everything should be made as simple as possible but not simpler.” Explain the following? Confounding, effect modification & ext. validity Concealment of randomization Blinding (who is blinded in a double blinded study?) Intention to treat analysis and its correct application P-values and confidence intervals

BMJ, 2003

Relative risk reduction: > 99.9% (1/100,000). The U.S. Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007. BMJ 2003
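As a rough check of the figures on this slide, the sketch below computes the per-jump risk of death from the reported counts and the relative risk reduction this implies. The control risk of 1.0 (every jump without a parachute is fatal) is an assumption made here for illustration and is not a figure from the slide.

```python
# Counts reported on the slide (U.S. Parachute Association, 2007).
deaths = 18
jumps = 2_200_000


def relative_risk_reduction(risk_treated: float, risk_control: float) -> float:
    """RRR = 1 - (risk with intervention / risk without intervention)."""
    return 1.0 - risk_treated / risk_control


risk_with_parachute = deaths / jumps

# ASSUMPTION (illustrative only): a jump without a parachute is always fatal.
assumed_risk_without = 1.0

rrr = relative_risk_reduction(risk_with_parachute, assumed_risk_without)
print(f"Risk of death per jump: about 1 in {round(1 / risk_with_parachute):,}")
print(f"Relative risk reduction: {rrr:.4%}")
```

Under that assumption the per-jump risk is of the same order as the 1/100,000 quoted on the slide, and the relative risk reduction is above 99.9%.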

Simple hierarchies are (too) simplistic STUDY DESIGN Randomized Controlled Trials Cohort Studies and Case Control Studies Case Reports and Case Series, Non-systematic observations BIAS Expert Opinion Schünemann & Bone, 2003

Which hierarchy? Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
Organization: Evidence / Recommendation
AHA: B / Class I
ACCP: A / 1
SIGN: IV / C

Oxford Centre for Evidence Based Medicine This is a more sophisticated development of Sackett’s initial approach by the Oxford Centre for Evidence-Based Medicine. This approach, however, is quite different from what we have used for the ACCP Consensus Conference and is too complicated.

USPSTF - Grade Definitions After May 2007: Levels of Certainty
High: The available evidence usually includes consistent results from well-designed, well-conducted studies in representative primary care populations. These studies assess the effects of the preventive service on health outcomes. This conclusion is therefore unlikely to be strongly affected by the results of future studies.
Moderate: The available evidence is sufficient to determine the effects of the preventive service on health outcomes, but confidence in the estimate is constrained by such factors as: the number, size, or quality of individual studies; inconsistency of findings across individual studies; limited generalizability of findings to routine primary care practice; lack of coherence in the chain of evidence. As more information becomes available, the magnitude or direction of the observed effect could change, and this change may be large enough to alter the conclusion.
Low: The available evidence is insufficient to assess effects on health outcomes. Evidence is insufficient because of: the limited number or size of studies; important flaws in study design or methods; gaps in the chain of evidence; findings not generalizable to routine primary care practice; lack of information on important health outcomes. More information may allow estimation of effects on health outcomes.
The USPSTF defines certainty as "likelihood that the USPSTF assessment of the net benefit of a preventive service is correct."

Recommendations for prognosis Use prognostic information to determine baseline risk for healthcare decisions

Centers for Disease Control and Prevention (CDC) This is the approach used by the CDC. One of the main criticisms of this approach is its lack of standardization and its probable lack of reproducibility.

“Long term perspective” “Healthy people” “Herd immunity” “Long term perspective” “Disease perception” “Lots of other things” Healthcare problem recommendation

GRADE Working Group: Grades of Recommendation Assessment, Development and Evaluation
Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations
International group of guideline developers, methodologists & clinicians from around the world (>250 contributors), since 2000
International group: ACCP, AHRQ, Australian NHMRC, BMJ Clinical Evidence, Cochrane Collaboration, CDC, McMaster, NICE, Oxford CEBM, SIGN, UpToDate, USPSTF, WHO
CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008

GRADE Uptake: World Health Organization; CDC-ACIP; Allergic Rhinitis in Asthma Guidelines (ARIA); American Thoracic Society; American College of Physicians; European Respiratory Society; European Society of Thoracic Surgeons; British Medical Journal; Infectious Diseases Society of America; American College of Chest Physicians; UpToDate®; National Institute for Health and Clinical Excellence (NICE); Scottish Intercollegiate Guidelines Network (SIGN); Cochrane Collaboration; Clinical Evidence; Agency for Healthcare Research and Quality (AHRQ); Partner of GIN. Over 40 major organizations

Guideline development Process

Case scenario A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chickens share this room too, and several chickens had died unexpectedly a few days before the girl fell sick. Potential interventions: antivirals, such as the neuraminidase inhibitors oseltamivir and zanamivir

Types of questions Background Questions Definition: What is Avian Influenza? Mechanism: What is the mechanism of action of oseltamivir? Foreground Questions Benefit > harm: In patients with avian influenza, does oseltamivir therapy improve survival, …?

Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schünemann et al., The Lancet ID, 2007

Choosing outcomes
Desirable outcomes: lower mortality, reduced hospital stay, reduced duration of disease, reduced resource expenditure
Undesirable outcomes: adverse reactions, the development of resistance, costs of treatment
Every decision comes with desirable and undesirable consequences. Developing recommendations must include a consideration of desirable and undesirable outcomes.

Relative importance of outcomes Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation. Relative importance varies across populations. Relative importance may vary across patient groups within the same population. When considered critical - evaluate

GRADE: recommendation – quality of evidence
Clear separation:
1) Recommendation: 2 grades – weak/conditional/optional or strong (for or against an intervention). Balance of benefits and downsides, values and preferences, resource use and quality of evidence.
2) Quality of evidence: 4 categories – High, Moderate, Low, Very low. Methodological quality of evidence; likelihood of bias by outcome and across outcomes.
*www.GradeWorking-Group.org

GRADE Quality of Evidence In the context of a systematic review The quality of evidence reflects the extent to which we are confident that an estimate of effect is correct. In the context of making recommendations The quality of evidence reflects the extent to which our confidence in estimates of the effects is adequate to support a particular recommendation.

Likelihood of and confidence in an outcome

Definition of grades of evidence
High (A): Further research is very unlikely to change confidence in the estimate of effect.
Moderate (B): Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate.
Low (C): Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate.
Very low (D): Any estimate of effect is very uncertain.

Confidence in evidence /A/High: We are very confident that the true effect lies close to that of the estimate of the effect. /B/Moderate: : We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.   /C/Low : Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect. /D/Very low : We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

Determinants of quality
RCTs start as high quality; observational studies start as low quality
5 factors that can lower quality:
- limitations in detailed design and execution (risk of bias criteria)
- inconsistency (or heterogeneity)
- indirectness (PICO and applicability)
- imprecision (number of events and confidence intervals)
- publication bias
3 factors that can increase quality:
- large magnitude of effect
- all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
- dose-response gradient
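The rating logic on this slide can be sketched as simple bookkeeping: start from the study design, subtract levels for the five lowering factors, and add levels for the three raising factors. This is a minimal sketch, not a GRADE tool; the function name, the factor labels and the plain addition of levels are assumptions made for illustration, and real GRADE ratings are structured judgments rather than arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]

DOWNGRADE_FACTORS = {
    "risk of bias", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_FACTORS = {"large effect", "plausible confounding", "dose-response"}


def rate_quality(randomized, downgrades=None, upgrades=None):
    """Return a quality level for one outcome (hypothetical helper).

    downgrades / upgrades map a factor name to the number of levels
    (1 = serious, 2 = very serious; 1 or 2 levels up for upgrades).
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"Unknown factor(s): {sorted(unknown)}")

    # RCTs start as high quality, observational studies as low quality.
    start = LEVELS.index("high") if randomized else LEVELS.index("low")
    score = start - sum(downgrades.values()) + sum(upgrades.values())

    # The scale is bounded: nothing rates below very low or above high.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


# RCT evidence with serious risk of bias and serious imprecision.
print(rate_quality(True, downgrades={"risk of bias": 1, "imprecision": 1}))
# Observational evidence with a large effect.
print(rate_quality(False, upgrades={"large effect": 1}))
```

The clamping at both ends mirrors the four-category scale; whether a given limitation counts as serious or very serious remains a judgment by the reviewers.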

1. Design and Execution/Risk of Bias Examples: Inappropriate selection of exposed and unexposed groups Failure to adequately measure/control for confounding Selective outcome reporting Failure to blind (e.g. outcome assessors) High loss to follow-up Lack of concealment in RCTs Intention to treat principle violated

Design and Execution/RoB From Cates, CDSR 2008

Design and Execution/RoB Overall judgment required

Design and Execution/RoB Cancer, anticoagulation and mortality Akl E, Barba M, Rohilla S, Terrenato I, Sperati F, Schünemann HJ. “Anticoagulation for the long term treatment of venous thromboembolism in patients with cancer”. Cochrane Database Syst Rev. 2008 Apr 16;(2):CD006650.

2. Inconsistency of results (Heterogeneity)
If inconsistency, look for explanation: patients, intervention, comparator, outcome
If unexplained inconsistency → lower quality

Reminders for immunization uptake

Indoor air pollution: ALRI

Non-steroidal drug use and risk of pancreatic cancer Capurso G, Schünemann HJ, Terrenato I, Moretti A, Koch M, Muti P, Capurso L, Delle Fave G. Meta-analysis: the use of non-steroidal anti-inflammatory drugs and pancreatic cancer risk for different exposure categories. Aliment Pharmacol Ther. 2007 Oct 15;26(8):1089-99.

3. Directness of Evidence
Differences in:
- populations/patients (children – neonates, women in general – pregnant women)
- interventions (all vaccines, new – old)
- comparator appropriate (new policy – old or no policy)
- outcomes (important – surrogate: cases prevented – seroconversion)
Indirect comparisons: interested in A versus B, but have A versus C and B versus C (e.g. Vaccine A versus Placebo versus Vaccine B)
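To illustrate why an indirect comparison carries less certainty than a head-to-head one, the sketch below derives A versus B from A versus C and B versus C on the log odds ratio scale. It uses one common approach (an adjusted indirect comparison through a shared comparator); the slide does not name a method, and all numbers here are hypothetical.

```python
import math


def indirect_odds_ratio(or_ac, se_log_ac, or_bc, se_log_bc, z=1.96):
    """Estimate OR for A vs B from A vs C and B vs C (common comparator C).

    log OR_AB = log OR_AC - log OR_BC, and the variances add, so the
    indirect estimate is always less precise than either direct one.
    Returns (estimate, lower 95% limit, upper 95% limit).
    """
    log_ab = math.log(or_ac) - math.log(or_bc)
    se_ab = math.sqrt(se_log_ac ** 2 + se_log_bc ** 2)
    return (
        math.exp(log_ab),
        math.exp(log_ab - z * se_ab),
        math.exp(log_ab + z * se_ab),
    )


# HYPOTHETICAL inputs: vaccine A vs placebo and vaccine B vs placebo.
estimate, lower, upper = indirect_odds_ratio(0.50, 0.20, 0.80, 0.25)
print(f"Indirect OR (A vs B): {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the standard errors combine, the interval for A versus B is wider than for either comparison against placebo, which is one reason indirectness can lower the quality rating.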

Possibly. The “high” dose effects of bisphenol A in laboratory animals that provide clear evidence for adverse effects on development, i.e., reduced survival, birth weight, and growth of offspring early in life, and delayed puberty in female rats and male rats and mice, are observed at levels of exposure that far exceed those encountered by humans. However, estimated exposures in pregnant women and fetuses, infants, and children are similar to levels of bisphenol A associated with several “low” dose laboratory animal findings of effects on the brain and behavior, prostate and mammary gland development, and early onset of puberty in females. When considered together, these laboratory animal findings provide limited evidence that bisphenol A has adverse effects on development.

Hierarchy of outcomes according to their importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia
[Figure: outcomes placed on an importance scale from 9 to 1, in three bands: critical for decision making; important, but not critical for decision making; low importance for decision making]
Outcomes shown: Mortality, Myocardial infarction, Fractures, Pain due to soft tissue calcification / function, Flatulence
Surrogates (relation to important outcomes increasingly uncertain): Coronary calcification, Ca2+/P- product, Bone density, Soft tissue calcification

4. Publication Bias Should always be suspected Only small “positive” studies (hypothesis confirming) For profit interest Various methods to evaluate – none perfect, but clearly a problem

I.V. Mg in acute myocardial infarction. ISIS-4, Lancet 1995. Meta-analysis: Yusuf S. Circulation 1993. Publication bias: Egger M, Smith DS. BMJ 1995;310:752-54

Funnel plot. Symmetrical: no publication bias
[Figure: funnel plot; x-axis: odds ratio (0.1 to 10); y-axis: standard error]

Funnel plot. Asymmetrical: publication bias? File drawer problem: no interest in publishing or being published
[Figure: funnel plot; x-axis: odds ratio (0.1 to 10); y-axis: standard error]

Indoor air pollution: ALRI

5. Imprecision
Small sample size, small number of events → wide confidence intervals → uncertainty about magnitude of effect
Extent to which confidence in estimate of effect is adequate to support decision

Example: Immunization in children

What can raise quality?
1. large magnitude can upgrade one level (RRR 50%/RR 2); very large, two levels (RRR 80%/RR 5)
Criteria: everyone used to do badly; almost everyone does well
Example: parachutes to prevent death when jumping from airplanes
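The thresholds on this slide can be written as a small lookup. This is a minimal sketch with a hypothetical function name; it treats RR 2 and RR 5 (or, for protective effects, RR 0.5 and RR 0.2, which correspond to RRR 50% and RRR 80%) as cut-points, and it leaves out the judgment about whether the underlying evidence is otherwise sound enough to upgrade.

```python
def large_effect_upgrade(relative_risk: float) -> int:
    """Return how many levels a large effect could raise the rating.

    Harmful direction: RR >= 2 -> 1 level, RR >= 5 -> 2 levels.
    Protective direction: RR <= 0.5 (RRR 50%) -> 1 level,
    RR <= 0.2 (RRR 80%) -> 2 levels.
    """
    if relative_risk <= 0:
        raise ValueError("Relative risk must be positive")
    if relative_risk >= 5 or relative_risk <= 0.2:
        return 2
    if relative_risk >= 2 or relative_risk <= 0.5:
        return 1
    return 0


for rr in (0.9, 0.5, 0.15, 2.5, 6.0):
    print(f"RR {rr}: upgrade by {large_effect_upgrade(rr)} level(s)")
```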

Reminders for immunization uptake

What can raise quality?
2. dose response relation (higher INR – increased bleeding)
Example: childhood lymphoblastic leukemia, risk for CNS malignancies 15 years after cranial irradiation
- no radiation: 1% (95% CI 0% to 2.1%)
- 12 Gy: 1.6% (95% CI 0% to 3.4%)
- 18 Gy: 3.3% (95% CI 0.9% to 5.6%)
3. all plausible confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed

All plausible residual confounding would result in an overestimate of effect
Hypoglycaemic drug phenformin causes lactic acidosis. The related agent metformin is under suspicion for the same toxicity. Large observational studies have failed to demonstrate an association. Clinicians would be more alert to lactic acidosis in the presence of the agent.
Vaccine – adverse effects: Meningococcal Conjugate Vaccine (Menactra) – Guillain-Barré

Quality assessment criteria

Evidence Profiles/Summaries

Content Background Quality of evidence Moving from evidence to recommendations

Strength of recommendation “The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.” Strong or weak/conditional

Determinants of the strength of recommendation

Developing recommendations

Case scenario A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chickens share this room too, and several chickens had died unexpectedly a few days before the girl fell sick.

Methods – WHO Rapid Advice Guidelines for management of Avian Flu
Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR
Group composition (including panel of 13 voting members):
- clinicians who treated influenza A(H5N1) patients
- infectious disease experts
- basic scientists
- public health officers
- methodologists
Independent scientific reviewers: identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection

Oseltamivir for Avian Flu
Summary of findings:
- No clinical trial of oseltamivir for treatment of H5N1 patients.
- 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza. Hospitalization: OR 0.22 (0.02 – 2.16). Pneumonia: OR 0.15 (0.03 – 0.69).
- 3 published case series.
- Many in vitro and animal studies.
- No alternative that is more promising at present.
- Cost: $40 per treatment course
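The two odds ratios above can be read for precision with a short check: does the confidence interval include 1 (no effect), and how wide is it on the log scale? The sketch below is illustrative only; the helper names are hypothetical, and it assumes the intervals are 95% intervals that are symmetric on the log odds ratio scale, which the slide does not state.

```python
import math


def ci_excludes_no_effect(lower: float, upper: float) -> bool:
    """True if the odds ratio interval lies entirely below or above 1."""
    return upper < 1.0 or lower > 1.0


def se_log_or(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error of log(OR) recovered from a 95% CI.

    ASSUMPTION: the interval was built as log(OR) +/- z * SE.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Odds ratios and intervals as reported on the slide (seasonal influenza).
findings = {
    "Hospitalization": (0.22, 0.02, 2.16),
    "Pneumonia": (0.15, 0.03, 0.69),
}

for outcome, (odds_ratio, lower, upper) in findings.items():
    verdict = "excludes" if ci_excludes_no_effect(lower, upper) else "includes"
    print(
        f"{outcome}: OR {odds_ratio} ({lower} to {upper}) {verdict} no effect; "
        f"SE of log OR about {se_log_or(lower, upper):.2f}"
    )
```

Both intervals are wide, and the one for hospitalization spans no effect; together with the evidence coming from seasonal influenza rather than H5N1, this matches the low confidence expressed in the recommendation that follows.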

From evidence to recommendation

Example: Oseltamivir for Avian Flu Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (????? recommendation, very low quality evidence). Schünemann et al. The Lancet ID, 2007

Example: Oseltamivir for Avian Flu Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence). Values and Preferences Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment. Schünemann et al. The Lancet ID, 2007

Implications of a strong recommendation Patients: Most people in this situation would want the recommended course of action and only a small proportion would not Clinicians: Most patients should receive the recommended course of action Policy makers: The recommendation can be adopted as a policy in most situations

Implications of a conditional/weak recommendation Patients: The majority of people in this situation would want the recommended course of action, but many would not Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making Policy makers: There is a need for substantial debate and involvement of stakeholders

Systematic review:
- Panel: formulate question (P I C O)
- Select outcomes and rate their importance (critical, important, not important)
- Outcomes across studies: create evidence profile with GRADEpro
- Rate quality of evidence for each outcome (High, Moderate, Low, Very low). Randomization increases initial quality. Grade down: risk of bias, inconsistency, indirectness, imprecision, publication bias. Grade up: large effect, dose response, confounders.
- Create Summary of findings & estimate of effect for each outcome
Guideline development:
- Panel: grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
- Formulate recommendations: for or against (direction), strong or weak (strength)
- By considering: quality of evidence, balance benefits/harms, values and preferences
- Revise if necessary by considering: resource use (cost)
- “We recommend using…” “We suggest using…” “We recommend against using…” “We suggest against using…”
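Two of the steps above are mechanical enough to sketch: taking the overall quality as the lowest quality among the critical outcomes, and turning direction and strength into the standard wording. The function names and the example outcome ratings below are hypothetical and serve only to illustrate the rule.

```python
QUALITY_ORDER = ["very low", "low", "moderate", "high"]


def overall_quality(outcomes: dict) -> str:
    """Lowest quality rating among outcomes rated critical.

    outcomes maps an outcome name to (importance, quality), where
    importance is "critical", "important" or "not important".
    """
    critical = [
        quality for importance, quality in outcomes.values()
        if importance == "critical"
    ]
    if not critical:
        raise ValueError("At least one critical outcome is required")
    return min(critical, key=QUALITY_ORDER.index)


def recommendation_phrase(direction: str, strength: str) -> str:
    """Map direction ("for"/"against") and strength ("strong"/"weak") to wording."""
    if direction not in ("for", "against") or strength not in ("strong", "weak"):
        raise ValueError("Unexpected direction or strength")
    verb = "recommend" if strength == "strong" else "suggest"
    return f"We {verb} using..." if direction == "for" else f"We {verb} against using..."


# HYPOTHETICAL ratings for illustration only.
example = {
    "mortality": ("critical", "low"),
    "hospitalization": ("critical", "very low"),
    "adverse reactions": ("important", "moderate"),
}
print(overall_quality(example))
print(recommendation_phrase("for", "strong"))
```

Outcomes that are important but not critical do not pull the overall rating down, which is why only the critical ones are inspected.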

Issues in guideline development in Public Health Causation versus effects of intervention Causation not equivalent to efficacy of interventions Bradford Hill Nearly half a century old – tablet from the mountain? Harms caused by medications Assumption is that removal of exposure leads to NO adverse effects How confident can one be that removal of the exposure is effective in preventing disease? Whether drugs or environmental factors it will depend on the intervention to remove exposure

GRADE and immunizations
Can herd immunity following immunisation and indirect effects on the co-circulation of other pathogens typically be ascertained only through the use of observational epidemiological methods?
- Innovative randomized controlled trials (RCTs) using cluster-randomization are increasingly being conducted to provide this evidence.
A 94% protective effect of a live, monovalent vaccine against measles is classified as “moderate level of scientific evidence.”
- GRADE’s strength of association criteria may be applied to increase the grade by 2 levels, from “low” to “high”, which is possible in this situation.
GRADE ratings do not give credit to “gradient of effects with scale of population level impact compatible with degree of coverage.”
- GRADE’s dose-response criterion would apply to such gradients.
May anti-vaccination lobby groups abuse the ratings?
- Abuse of any system is possible: it is equally likely that increased transparency provided by the GRADE framework can strengthen, rather than undermine, the trust in vaccines and other interventions.

Schünemann et al. JECH 2010

Bradford Hill Criteria Would agree with modifying if these criteria were not already considered Emphasis is on strength of recommendation, quality of evidence is only one factor

Conclusions
Clinical practice guidelines should be based on the best available evidence to be evidence based
GRADE combines what is known in health research methodology and provides a structured approach to improve communication
- Criteria for evidence assessment across questions and outcomes
- Criteria for moving from evidence to recommendations
- Transparent, systematic: four categories of quality of evidence, two grades for strength of recommendations
Transparency in decision making and judgments is key

Formulating Questions and Choosing Outcomes

Outline Type of questions Framing a foreground question Choosing outcomes Relative importance of outcomes

Guidelines and questions Guidelines are a way of answering questions about clinical, communication, organisational or policy interventions, in the hope of improving health care or health policy. It is therefore helpful to structure a guideline in terms of answerable questions. WHO Guideline Handbook, 2008

Types of questions Background Questions Definition: What is COPD? Mechanism: What is the mechanism of action of mucolytic therapy? Foreground Questions Efficacy: In patients with COPD, does mucolytic therapy improve survival?

Framing a foreground question P I C O

Framing a foreground question Population: Intervention: Comparison: Outcomes:

Case scenario A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chickens share this room too, and several chickens had died unexpectedly a few days before the girl fell sick. Potential interventions: antivirals, such as the neuraminidase inhibitors oseltamivir and zanamivir

What are examples of: Background questions Foreground questions Population: Intervention: Comparison: Outcomes:

Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir (or Zanamivir)
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schünemann, Hill et al., The Lancet ID, 2007

Choosing outcomes Every decision comes with desirable and undesirable consequences Developing recommendations must include a consideration of desirable and undesirable outcomes Outcomes should be patient important outcomes.

Choosing outcomes Desirable outcomes: lower mortality, reduced hospital stay, reduced duration of disease, reduced resource expenditure. Undesirable outcomes: adverse reactions, the development of resistance, costs of treatment.

Choosing outcomes What if what is important is not measured? What if what is measured is not important? How do we make sure we’ve covered all important outcomes?

Relative importance of outcomes Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation. Relative importance may vary across populations. Relative importance may vary across patient groups within the same population. When an outcome is considered critical, evaluate it.

Relative importance of outcomes Rating scale from 9 (highest) to 1 (lowest): 7 to 9, critical for decision making; 4 to 6, important, but not critical for decision making; 1 to 3, of low importance.
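
The 1 to 9 scale can be turned into a simple lookup. The sketch below assumes the usual GRADE bands (7 to 9 critical, 4 to 6 important but not critical, 1 to 3 of low importance). The function name `importance_category` and the example ratings are hypothetical and only illustrate the mapping; they are not ratings from any real panel.

```python
def importance_category(rating: int) -> str:
    """Map a 1-9 importance rating to its category (assumed GRADE bands)."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating >= 7:
        return "critical"
    if rating >= 4:
        return "important"
    return "low importance"


# Hypothetical ratings, for illustration only.
example_ratings = {
    "mortality": 9,
    "hospitalization": 7,
    "adverse reactions": 5,
    "costs of treatment": 3,
}

categories = {name: importance_category(r) for name, r in example_ratings.items()}
for name, category in categories.items():
    print(f"{name}: {category}")
```

Only the outcomes that land in the critical band drive the overall quality rating later in the process.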

Using GRADEpro

Creating a new GRADEpro file

Profile groups Profiles

Managing outcomes

Content Quality of evidence Going from evidence to recommendations

Healthcare problem recommendation

Strength of recommendation “The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.” Strong or conditional

Strength of recommendation The degree of confidence that the desirable effects of adherence to a recommendation outweigh the undesirable effects. Desirable effects health benefits less burden savings Undesirable effects harms more burden costs

Determinants of the strength of recommendation

Balancing benefits and downsides ↑ Allergic reactions ↑ Local skin reactions ↑ Nausea ↑ Resources ↑ QoL ↓ Death ↓ Morbidity ↑ herd immunity Conditional Strong For Against

Implications of a strong recommendation Policy makers: The recommendation can be adapted as a policy in most situations Patients: Most people in this situation would want the recommended course of action and only a small proportion would not Clinicians: Most patients should receive the recommended course of action

Implications of a conditional recommendation Policy makers: There is a need for substantial debate and involvement of stakeholders Patients: The majority of people in this situation would want the recommended course of action, but many would not Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making

Case scenario A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chickens share this room too, and several chickens had died unexpectedly a few days before the girl fell sick.

Methods – WHO Rapid Advice Guidelines for Avian Flu Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR. Group composition (including panel of 13 voting members): clinicians who treated influenza A(H5N1) patients, infectious disease experts, basic scientists, public health officers, methodologists. Independent scientific reviewers: identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection.

Oseltamivir for Avian Flu Summary of findings: No clinical trial of oseltamivir for treatment of H5N1 patients. 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza. Hospitalization: OR 0.22 (0.02 to 2.16). Pneumonia: OR 0.15 (0.03 to 0.69). 3 published case series. Many in vitro and animal studies. No alternative that was more promising at present. Cost: $40 per treatment course.
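
One quick check on the odds ratios reported above is whether each interval includes 1, the value of no effect. The sketch below is an illustration only: the helper `ci_excludes_no_effect` is a hypothetical name, and the numbers are the two estimates from the slide.

```python
def ci_excludes_no_effect(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """Return True when the interval lies entirely on one side of the null value."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return upper < null_value or lower > null_value


# (odds ratio, lower bound, upper bound) as reported on the slide.
estimates = {
    "hospitalization": (0.22, 0.02, 2.16),
    "pneumonia": (0.15, 0.03, 0.69),
}

for outcome, (odds_ratio, lower, upper) in estimates.items():
    if ci_excludes_no_effect(lower, upper):
        verdict = "interval excludes 1"
    else:
        verdict = "interval includes 1"
    print(f"{outcome}: OR {odds_ratio} ({lower} to {upper}), {verdict}")
```

The hospitalization interval includes 1 and the pneumonia interval does not. A wide interval that crosses 1 is one reason a panel might consider rating down for imprecision, although that judgment involves more than this single check.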

From evidence to recommendation

Complex data & decisions: yes/no?

Recommendation The Guidelines Group recommends that fluoroquinolones are / are not used in the treatment of all patients with MDR (strong/conditional recommendation; low/moderate/high quality evidence)

Recommendation: In women with histologically confirmed CIN, the expert panel recommends/suggests cryotherapy/LEEP over cryotherapy/LEEP.
Population: Women with histologically confirmed CIN
Intervention: Cryotherapy versus LEEP
Factor: High or moderate evidence (is there high or moderate quality evidence?) The higher the quality of evidence, the more likely is a strong recommendation.
Decision: Yes / No; ⊕⊕○○
Explanation: There is moderate quality evidence from both randomised and observational controlled studies for recurrence rates. However, there is low quality evidence for other outcomes which were considered critical and important for decision making (e.g., severe adverse events, cervical cancer). There is uncertainty for fertility and other obstetrical outcomes, and HIV acquisition/transmission was not measured.
Factor: Certainty about the balance of benefits versus harms and burdens (is there certainty?) The larger the difference between the desirable and undesirable consequences and the certainty around that difference, the more likely a strong recommendation. The smaller the net benefit and the lower the certainty for that benefit, the more likely is a conditional/weak recommendation.
Decision: No
Explanation: Benefits of LEEP were greater, and harms were fewer or similar. Recurrence rates of CIN I, CIN II-III and all CINs are probably greater with cryotherapy: CIN II-III, OR 3.3 (1.04 to 10.46); CIN I, OR 2.74 (0.62 to 12.07); all CIN, OR 2.14 (1.05 to 4.33). Cryotherapy may be less acceptable to patients than LEEP. There may be little difference in serious adverse events between cryotherapy and LEEP, but there may be fewer minor adverse events (such as pain) with cryotherapy. It is unclear whether there is a difference in fertility/obstetric outcomes.
Factor: Certainty in or similar values (is there certainty or similarity?) The more certainty or similarity in values and preferences, the more likely a strong recommendation.
Decision: Yes
Explanation: Similar values across women. High value was placed on CIN recurrence, serious adverse events and acceptability to the patient. Low value was placed on minor adverse events.
Factor: Resource implications (are the resources worth the intervention?) The lower the cost of an intervention compared to the alternative that is considered and other costs related to the decision (that is, the fewer resources consumed), the more likely is a strong recommendation.
Explanation: More resources required for LEEP. Need for more skilled providers to perform LEEP. Need for more or expensive equipment/supplies for LEEP; electrical supply for LEEP. Need for local anaesthesia with LEEP.
Overall strength of recommendation: Conditional

Example: Oseltamivir for Avian Flu Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence). Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment. Schunemann et al. The Lancet ID, 2007

Issues in guideline development for immunization Causation versus effects of intervention Causation not equivalent to efficacy of interventions Bradford Hill Nearly half a century old – tablet from the mountain? Harms caused by interventions Assumption is that removal of vaccine (or no exposure) leads to NO adverse effects How confident can one be that removal of the exposure is effective in preventing disease? Whether immunization or environmental factors: will depend on the intervention to remove exposure

Current state of recommendations

Current state of recommendations Reviewed 7527 recommendations, 1275 randomly selected. Inconsistency across/within. 31.6% did not present recommendations clearly. Most of them not written as executable actions. 52.7% did not indicate strength.

Recommendation The Guideline Group recommends rapid DST testing for resistance to INH and RIF or RIF alone over conventional testing or no testing at the time of diagnosis of TB (conditional recommendation, low quality evidence). Values and preferences: A high value was placed on outcomes such as preventing death and transmission of MDR as a result of delayed diagnosis as well as avoiding spending resources.

Group composition Group composition might affect the recommendation. Common principle: include all affected by the recommendations (multi-disciplinary groups incl. patients/carers). Industry? Keep a manageable size.

The Process: How to make it constructive? Group members are heterogeneous and might have different objectives. Chair facilitates rather than leads the group. Common understanding of goal, tasks and ground rules. Similar level of required know-how and skills. Sufficient technical support.

Balanced participation and formal agreement Key task of chair Formal consensus processes Delphi Method Nominal group process Voting

Group processes

How to present controversies Lay out the controversies Describe the evidence Ask members to focus on the agreed upon evidence and the factors leading to a decision Ask whether there still is disagreement Vote Make voting explicit and transparent (ways of doing this to come tomorrow)

Conclusions - Process Success depends on strong chair(s), training of group, good facilitation and technical support Clinical and methods co-chairs Formal consensus developing methods might support agreement on recommendations Voting represents forced consensus Guideline development will require sufficient resources.

GRADE Grid

Overview of the GRADE process (flowchart):
Systematic review: Formulate question (P I C O). Select outcomes and rate their importance (critical, important, not important). Summarise outcomes across studies. Create evidence profile with GRADEpro: summary of findings and estimate of effect for each outcome.
Rate quality of evidence for each outcome (high, moderate, low, very low): Randomization increases initial quality. Grade down for risk of bias, inconsistency, indirectness, imprecision, publication bias. Grade up for large effect, dose response, confounders.
Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes.
Guideline development (panel): Formulate recommendations: for or against (direction), strong or conditional (strength). By considering: quality of evidence, balance of benefits/harms, values and preferences. (Revise by considering:) resource use (cost).
Wording: “We recommend using…/should”; “We suggest using…/might”; “We recommend against using…/should not”; “We suggest against using…/might not”.
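
The rating steps in the overview above can be sketched in a few lines. The sketch assumes the usual GRADE convention that randomized evidence starts at high and observational evidence at low, and that each grade-down or grade-up judgment moves the rating by one level. All names (`Outcome`, `outcome_quality`, `overall_quality`) and the example outcomes are hypothetical; real ratings are structured judgments, not arithmetic.

```python
from dataclasses import dataclass
from typing import List

# Ordered from lowest to highest so that list position reflects rank.
LEVELS = ["very low", "low", "moderate", "high"]


@dataclass
class Outcome:
    name: str
    importance: str      # "critical", "important" or "not important"
    randomized: bool     # True if the evidence comes from randomized trials
    grade_down: int = 0  # number of one-level downgrades (risk of bias, imprecision, ...)
    grade_up: int = 0    # number of one-level upgrades (large effect, dose response, ...)


def outcome_quality(outcome: Outcome) -> str:
    """Rate one outcome: start high (randomized) or low (observational), then adjust."""
    start = LEVELS.index("high") if outcome.randomized else LEVELS.index("low")
    index = start - outcome.grade_down + outcome.grade_up
    index = max(0, min(len(LEVELS) - 1, index))  # clamp to the four categories
    return LEVELS[index]


def overall_quality(outcomes: List[Outcome]) -> str:
    """Overall quality is the lowest quality among the critical outcomes."""
    critical = [o for o in outcomes if o.importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min((outcome_quality(o) for o in critical), key=LEVELS.index)


# Hypothetical outcomes, for illustration only.
example = [
    Outcome("outcome A", "critical", randomized=True, grade_down=1),
    Outcome("outcome B", "critical", randomized=False),
    Outcome("outcome C", "important", randomized=False, grade_down=1),
]

for o in example:
    print(f"{o.name}: {outcome_quality(o)}")
print("overall:", overall_quality(example))
```

In this example outcome C is rated very low but does not pull the overall rating down, because only critical outcomes count toward it.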

Conclusions WHO guidelines should be based on the best available evidence to be evidence based. GRADE is the approach used by WHO and is gaining acceptance internationally. It combines what is known in health research methodology and provides a structured approach to improve communication. Does not avoid judgments but provides a framework. Criteria for evidence assessment across questions and outcomes. Criteria for moving from evidence to recommendations. Transparent, systematic: four categories of quality of evidence, two grades for strength of recommendations. Transparency in decision making and judgments is key.

Format Mix of seminars/interactive lectures, self directed learning and simulation Large group and smaller group discussion Computer work Simulate guideline panel work Select rapporteur (both for large group and any small group work)

Format Mix of seminars/interactive lectures, self directed learning and simulation Large group and smaller group discussion Computer work? Simulate guideline panel work Select rapporteur (for any small group work)