Assessing Applicability Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide

2 Systematic Review Process Overview

3 Learning Objectives
• To describe applicability and substantiate its importance
• To delineate a systematic approach to assessing applicability by using PICOS domains
• To recognize common features that may impact the applicability of studies
PICOS domains = population, intervention, comparators, outcomes, and setting

4 Defining Applicability
• Applicability is defined as the extent to which the results observed in published studies are likely to reflect the expected outcomes when an intervention is applied to broader populations under real-world conditions.
• Similar terms include external validity, generalizability, directness, and relevance.

5 The Importance of Applicability in Systematic Reviews
• Unless the evidence synthesized within a systematic review is applicable to relevant populations under real-world practice conditions, the conclusions may be inappropriate for guiding clinical or policy decisions.

6 Framing Applicability Issues
• Frame issues of applicability with reference to specific clinical or policy questions the review is intended to inform.
• Applicability needs to be considered at the outset:
  • when the scope of the review is determined, and
  • when the key questions are identified.
• This module focuses on handling applicability issues at the analyze-and-synthesize-data stage.

7 Applicability Resources
• Clinical experts and stakeholders can provide general information important to framing applicability issues.
  • What does the population of interest look like?
  • What types of care or procedures are routine or represent standard of care?
  • Are certain subpopulations characteristically different from others?

8 Other Applicability Resources
• Registry or epidemiological information, practice guidelines, consensus papers, book chapters, and general reviews can provide useful applicability information.
• Applicability issues do not have to be reviewed for each study.
• They are used to place the available literature in context.

9 How Individual Studies Consider Applicability
• Studies often do not report on factors needed to judge applicability.
• Increased use of CONSORT diagrams helps, but deficiencies remain.
• Only a small number of studies are designed with applicability in mind.
• Most trials are efficacy trials that maximize internal validity at the expense of applicability.
CONSORT = Consolidated Standards of Reporting Trials

10 Study Parameters Affect Applicability
• Study parameters may affect applicability through these different mechanisms:
  • Effect modifiers
  • Modifiers of baseline rates
  • Factors that reduce ability to generalize results to everyday practice

11 Applicability Judged for Each Question
• Applicability should be judged separately for each important question.
• Factors influencing benefits and harms often depend on distinct physiologic processes.
• The most applicable evidence may differ when considering a benefit or a harm.

12 Applicability of a Body of Evidence
• Applicability of a body of evidence is not the same as applicability of individual studies.
• A collection of studies may provide broadly applicable evidence even if individual studies do not.
• Consistency of study results across different populations and settings increases confidence in the applicability of the results.
• Applicability depends on context and cannot be assessed with a simple rating scale.
• There are no empiric data validating any scoring system for applicability.

13 Judge Applicability and Strength of Evidence Separately
• Applicability is best reported separately from quality or strength of evidence.
• Internal validity concerns are not altered by changes in perspective, but applicability is.
• Some evidence hierarchies (e.g., GRADE) consider applicability (or "directness") when assessing the quality of evidence.
• For systematic reviews in which multiple perspectives (clinician, patient, policymaker) are considered, applicability should be assessed separately.

14 Three Steps To Assess Applicability
• Stepwise approach to applicability:
  1. Identify beforehand a limited number of factors that are most likely to influence applicability of evidence for the specific issue being reviewed and report the relevant elements from individual studies.
  2. Judge limitations to the applicability of individual studies based on those factors.
  3. Summarize the applicability of the body of evidence from multiple studies with respect to specific populations, interventions, comparisons, and outcomes.

15 Step 1. Determine and Report the Most Important Factors in PICOS Format
• Identify most important applicability issues
• Understand the real-world context
• Understand how studies differ from the real-world context
• Use PICOS format to organize factors
• Report features from individual studies that impact applicability in evidence tables
PICOS = population, intervention, comparators, outcomes, and setting

16 Population and Applicability: Examples
• In a trial of women with osteoporosis, only 4,000 of the 54,000 women screened were enrolled; the enrollees were younger, healthier, and more adherent to therapy than is typical of women with osteoporosis.
• A trial of etanercept for juvenile arthritis excluded patients with side effects during an active run-in period; the trial found a low incidence of adverse events.
• Clinical trials used to inform Medicare decisions enrolled patients who were younger (60 vs. 75 years of age) and more often male (75% vs. 42%) than is typical of Medicare patients with cardiovascular disease.
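The first example above can be restated as the proportion of screened individuals who were enrolled, one of the features suggested for extraction into evidence tables. The short Python sketch below works through that arithmetic; the function name `enrollment_proportion` is hypothetical and used only for illustration.

```python
def enrollment_proportion(enrolled: int, screened: int) -> float:
    """Return the fraction of screened individuals who were enrolled."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    if not 0 <= enrolled <= screened:
        raise ValueError("enrolled must be between 0 and screened")
    return enrolled / screened


# Figures from the osteoporosis trial example: 4,000 enrolled of 54,000 screened.
proportion = enrollment_proportion(4_000, 54_000)
print(f"{proportion:.1%} of screened women were enrolled")
```

A low proportion does not by itself show that results are inapplicable, but it flags that the enrolled population may differ from patients seen in routine practice.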

17 Population and Applicability
Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables
Narrow eligibility criteria, high exclusion rate, low enrollment | Eligibility criteria, proportion of screened individuals enrolled
Differences between patients in study and the community | Demographics (range and mean): age, gender, race, ethnicity
Narrow or unrepresentative severity or stage of illness | Severity or stage of illness (referral or primary care population)
Run-in periods with high exclusion rates | Run-in period: attrition rate before randomization and reasons (e.g., nonadherence, adverse drug events, no response)
Event rates markedly different than in community | Event rates in treatment and control groups
Disease prevalence in study population different than community | Prevalence of disease (for diagnostic studies)

18 Intervention and Applicability: Examples
• Studies of behavioral modification to promote healthy diet employ a larger number and longer duration of visits than those available to most community patients.
• The use of pill counts in antiretroviral trials does not always translate into effectiveness in real-world practice.
• Combining iron and zinc attenuates the ability of iron to raise hemoglobin levels.
• Trials of carotid endarterectomy select surgeons with extensive experience and low complication rates and are not representative of average vascular surgeons.

19 Intervention and Applicability
Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables
Regimen not reflective of current practice | Medication dose, schedule, duration
Intensity of intervention not feasible for routine use | Intensity of behavioral interventions
Monitoring practices or visit frequency not used in practice | Adherence to interventions
Versions not in common use | Version of rapidly changing technology
Cointerventions that likely modify effectiveness of therapy | Cointerventions
Level of training not widely available | Training/skill level of intervention team (surgery/diagnostics)

20 Comparator and Applicability: Examples
• A fixed-dose study that compared high-dose duloxetine to low-dose paroxetine
• Many trials evaluating magnesium as a treatment for acute myocardial infarction that were conducted before thrombolytic drugs, antiplatelet drugs, β-blockers, and primary percutaneous coronary intervention (PCI) were used
• Only 1 of 23 trials that compared bypass surgery to PCI used drug-eluting stents

21 Outcomes and Applicability: Examples
• Trials of biologics for rheumatoid arthritis that used radiographic progression rather than symptom evaluations as an outcome measure
• Trials comparing cyclooxygenase-2 inhibitors and nonsteroidal antiinflammatory drugs that used endoscopy-evaluated ulceration rather than symptomatic ulcers as an outcome measure

22 Comparator, Outcomes, and Applicability
Domain | Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables
Comparator | Regimen not reflective of current practice | Medication dose, schedule, duration (if applicable)
Comparator | Use of substandard alternative therapy | Comparator chosen vs. others available (if applicable)
Outcomes | Surrogate endpoints, brief follow-up periods, improper definitions for outcomes, composite endpoints | Outcomes (benefits and harms) and how they were defined

23 Setting and Applicability: Examples
• Studies evaluating the benefits of breast self-examinations conducted in China and Russia, countries that do not employ routine mammography screening as in the United States
• Studies of open surgical abdominal aortic aneurysm repair showing an inverse relationship between hospital volume and short-term mortality

24 Setting and Applicability
Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables
Settings in which standards of care differ markedly from setting of interest | Geographic setting
Specialty population or level of care that differs from community | Clinical setting

25 Step 2. Make and Report Judgments About Major Limitations to the Applicability of Individual Studies
• Make judgments about applicability of individual studies
  • Look for effectiveness trials
    • Specifically designed to produce applicable results
• Report on applicability
  • Highlight effectiveness trials
  • Compare and contrast effectiveness and efficacy trials
  • Clearly report and highlight major applicability issues of efficacy trials and what impact they may have on the interpretation of the study

26 Effectiveness Trials
Seven criteria:
  1. Enroll a primary-care population
  2. Have less-stringent eligibility criteria
  3. Assess health-related outcomes
  4. Have a long study duration and clinically relevant treatment modalities
  5. Assess for adverse events
  6. Have an adequate sample size to assess minimally important differences from a patient perspective
  7. Use intention-to-treat analysis
Gartlehner G, et al. J Clin Epidemiol 2006;59:1040-8; Gartlehner G, et al. Int J Technol Assess Health Care 2009;25:
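One way a reviewer might record the seven criteria for a given trial is as a simple checklist. The Python sketch below is illustrative only: the function `review_criteria`, the shortened criterion labels, and the example trial are assumptions, and the sketch deliberately applies no cutoff or score, because the slide does not state one.

```python
from typing import Dict, List, Set

# Shortened labels for the seven criteria listed above (wording is ours).
CRITERIA = (
    "primary-care population",
    "less-stringent eligibility criteria",
    "health-related outcomes",
    "long duration and clinically relevant treatment modalities",
    "adverse events assessed",
    "adequate sample size for minimally important differences",
    "intention-to-treat analysis",
)


def review_criteria(met: Set[str]) -> Dict[str, List[str]]:
    """Split the seven criteria into those a trial meets and those it does not."""
    unknown = met - set(CRITERIA)
    if unknown:
        raise ValueError(f"unrecognized criteria: {sorted(unknown)}")
    return {
        "met": [c for c in CRITERIA if c in met],
        "unmet": [c for c in CRITERIA if c not in met],
    }


# Hypothetical trial, invented for illustration.
result = review_criteria({
    "health-related outcomes",
    "adverse events assessed",
    "intention-to-treat analysis",
})
print("Met:", result["met"])
print("Unmet:", result["unmet"])
```

Listing the unmet criteria, rather than reducing them to a number, keeps the output usable as text for a "comments" or "limitations" column.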

27 Make Judgments About the Applicability of Individual Studies
• For efficacy trials, clearly report characteristics that may limit applicability.
• Describe those characteristics in the text or in evidence tables under the heading "comments" or "limitations."
• Describe how important factors would affect applicability and the expected direction and magnitude of bias.

28 Evidence Table Template for Applicability
Trial: Smith et al. 24
Population: Heart failure population; mean age: 65 years; NYHA class II or III: 83%
Intervention: Surgical debulking of myocardium
Comparator: Watchful waiting (ACE inhibitor use, 34%; β-blocker use, 40%)
Outcomes, Setting: Hospitalizations and survival; median followup at 1 year; single, large, tertiary care hospital
Comments: An efficacy trial; limited standardization of intervention; comparator did not include optimal medical therapy; unclear how the benefits and harms would compare in a smaller community hospital
After reviewing this sample evidence table, assume that four other trials are available with similar PICOS domains.
Slutsky J, et al. In: Methods guide for comparative effectiveness reviews. Available at:
ACE = angiotensin-converting enzyme; NYHA = New York Heart Association; PICOS = population, intervention, comparators, outcomes, and setting
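For reviewers who keep evidence tables in a structured form, the sample row above could be captured roughly as follows. This is a minimal Python sketch, not a prescribed format: the `EvidenceRow` class and the `render` helper are hypothetical, and splitting the combined "Outcomes, Setting" column into two fields is a choice made here for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class EvidenceRow:
    """One evidence-table row organized by PICOS domain plus comments."""
    trial: str
    population: str
    intervention: str
    comparator: str
    outcomes: str
    setting: str
    comments: str


# Values taken from the sample evidence table on this slide.
row = EvidenceRow(
    trial="Smith et al.",
    population="Heart failure; mean age 65 years; NYHA class II or III: 83%",
    intervention="Surgical debulking of myocardium",
    comparator="Watchful waiting (ACE inhibitor use, 34%; beta-blocker use, 40%)",
    outcomes="Hospitalizations and survival; median followup at 1 year",
    setting="Single, large, tertiary care hospital",
    comments="Efficacy trial; limited standardization of intervention; "
             "comparator did not include optimal medical therapy",
)


def render(r: EvidenceRow) -> str:
    """Format a row as one 'Field: value' line per column."""
    return "\n".join(
        f"{f.name.capitalize()}: {getattr(r, f.name)}" for f in fields(r)
    )


print(render(row))
```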

29 Step 3. Consider and Summarize the Applicability of a Body of Evidence From Individual Studies
• Describe the limitations of aggregate evidence in the conclusions.
• Generate a summary applicability table with a PICOS structure.
• Aggregate applicability is not the sum of applicability of individual studies.
• Each individual study can have relatively low applicability in at least one PICOS domain, but the body of evidence can have high applicability.
PICOS = population, intervention, comparators, outcomes, and setting
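The last point can be illustrated with a single population characteristic such as age. The Python sketch below checks whether a set of study age ranges, taken together, spans a target age range even though no single study does. All numbers are hypothetical, the function name `covers` is ours, and range overlap is only a crude proxy: actual applicability judgments depend on context and cannot be reduced to a rule like this.

```python
from typing import List, Tuple

Range = Tuple[float, float]  # (youngest, oldest) enrolled age in years


def covers(target: Range, study_ranges: List[Range]) -> bool:
    """Return True if the union of study ranges spans the whole target range."""
    low, high = target
    reach = low  # oldest age covered so far without a gap
    for start, end in sorted(study_ranges):
        if start > reach:
            break  # a gap: no study covers ages just above `reach`
        reach = max(reach, end)
        if reach >= high:
            return True
    return reach >= high


target_ages = (40, 85)  # hypothetical population of interest

# Hypothetical body of evidence: each study is narrow, together they are broad.
spread_out = [(40, 55), (50, 65), (60, 75), (70, 85)]
# Hypothetical body of evidence clustered at younger ages.
clustered = [(40, 50), (45, 55), (50, 60), (52, 62)]

print(covers(target_ages, spread_out))                    # whole body of evidence
print(any(covers(target_ages, [s]) for s in spread_out))  # any single study alone
print(covers(target_ages, clustered))
```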

30 Which Series of Studies Has an Applicability Issue?
[Figure: two scenarios (Scenario 1 and Scenario 2), each showing Studies 1 to 4 positioned along an axis running from lower age to higher age]

31 Step 3. Applicability Summary Table Template
Domain | Description of Applicability Evidence for a Key Question
Population | Describe the general characteristics of enrolled populations and how they may differ from the target population. Where possible, describe proportions (percentage older than 65) rather than an age range or average age.
Intervention | Describe the general characteristics of the interventions, how they compare to routine practice, and how this may affect benefits or harms.
Comparators | Describe comparators, whether they reflect best alternative therapy, and how this may impact treatment effect size.
Outcomes | Describe outcomes most frequently reported and over what period of time. Describe whether or not these outcomes reflect the most important benefits or harms.
Setting | Describe the settings in which the studies were conducted and the potential implications of using the intervention elsewhere.

32 Step 3. Completed Applicability Summary Table
Domain | Description of Applicability Evidence for a Key Question
Population | The population and disease stage are representative of the United States population with heart failure.
Intervention | The intervention is plausible.
Comparators | Watchful waiting is reasonable if the baseline treatment in both groups was standard medical therapy. Standard medical therapy is not being used in most patients. Subgroup analyses suggest that benefits are predominantly in those patients not receiving standard therapy.
Outcomes | Although hospitalizations and survival are being evaluated, other outcomes, including harms, are not.
Setting | The settings for the studies are large tertiary medical centers, which may overestimate the benefits of therapy in actual practice and accentuate the harms.

33 Summary Table for Body of Evidence Applicability
Comparison | Strength of Evidence | Conclusions With Description of Applicability
Myocardial debulking versus watchful waiting | Low | When compared with watchful waiting, the use of myocardial debulking reduced hospitalizations without affecting survival. These results are predominantly limited to patients 65 years of age and older with NYHA class II and III disease. This is similar to the population with heart failure in the United States. In a subgroup analysis, benefits were accentuated in patients not receiving standard medical therapy with ACE inhibitors and β-blockers, thus limiting applicability. No harms were evaluated. These trials were conducted in select, large, tertiary medical centers and may not be applicable to other settings.
ACE = angiotensin-converting enzyme; NYHA = New York Heart Association

34 Key Messages (I)
• The PICOS framework is useful for organizing contextual factors of interest.
• Input from clinical experts/stakeholders and a review of current practices can help identify specific applicability issues.
• Population-based surveys, pharmacoepidemiologic studies, and large case series or registries can help determine how trial populations and circumstances differ from normal practice.
PICOS = population, intervention, comparators, outcomes, and setting

35 Key Messages (II)
• Reviewers should routinely select the most important characteristics that may limit applicability and assess whether variation within the factors may modify outcome effects.
• Reports should clearly highlight issues about applicability of individual studies in a "comments" or "limitations" section in evidence tables or in the text.
• Meta-regression and/or separate applicability summary tables may help reviewers, and those using the reports, to see how the body of evidence applies to the question at hand.
• Judgments about applicability of the evidence should consider the entire body of studies.
• Important limitations of the applicability of the evidence should be described within each summary conclusion.

36 References
• Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8.
• Gartlehner G, Thieda P, Hansen RA, et al. Inadequate reporting of trials compromises the applicability of systematic reviews. Int J Technol Assess Health Care 2009;25:
• Shadish W, Cook T. Experimental and quasi-experimental design for generalized causal inference. Boston: Houghton Mifflin;
• Thorpe KE, Zwarenstein M, Oxman AD, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol 2009;62:

37 Author
• This presentation was prepared by C. Michael White, Pharm.D., FCP, FCCP, a member of the University of Connecticut/Hartford Hospital Evidence-based Practice Center.
• This module is based on chapter 6 in version 1.0 of the Methods Reference Guide for Effectiveness and Comparative Effectiveness Reviews (available at: 07_10Draft MethodsGuide.pdf).

