Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Extraction Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide

Similar presentations

Presentation on theme: "Data Extraction Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide"— Presentation transcript:

1 Data Extraction Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide

2 Systematic Review Process Overview

3  To describe why data extraction is important  To identify challenges in data extraction  To describe the general layout of a data extraction form  To suggest methods for collecting data accurately and efficiently  To discuss the pros and cons for querying original authors Learning Objectives

4  To summarize studies in a common format to facilitate synthesis and coherent presentation of data  To identify numerical data for meta-analyses  To obtain information to assess more objectively the risk of bias in and applicability of studies  To identify systematically missing or incorrectly assessed data, outcomes that are never studied, and underrepresented populations Why Is Data Extraction Important?

5  Extracted data should:  Accurately reflect information reported in the publication  Remain in a form close to the original reporting, so that disputes can be easily resolved  Provide sufficient information to understand the studies and to perform analyses  Extract only the data needed, because the extraction process:  Is labor intensive  Can be costly and error prone  Different research questions may have different data needs On Data Extraction (I)

6  Data extraction involves more than copying words and numbers from the publication to a form.  Clinical domain, methodological, and statistical knowledge is needed to ensure the right information is captured.  Interpretation of published data is often needed.  What is reported is sometimes not what was carried out.  Data extraction and evaluation of risk of bias and of applicability typically occur at the same time. On Data Extraction (II)

7 “It is an eye-opening experience to attempt to extract information from a paper that you have read carefully and thoroughly understood only to be confronted with ambiguities, obscurities, and gaps in the data that only an attempt to quantify the results reveals.” — Gurevitch and Hedges (1993) Gurevitch J, Hedges LV. In: Design and analysis of ecological experiments; Data Extraction: A Boring Task?

8  In the Evidence-based Practice Center Program, we often refer to two types of tables:  Evidence Tables  Essentially are data extraction forms  Typically are study specific, with data from each study extracted into a set of such tables  Are detailed and typically not included in main reports  Summary Tables  Are used in main reports facilitate the presentation of the synthesis of the studies  Typically contain context-relevant pieces of the information included in study-specific evidence tables  Address particular research questions Comparative Effectiveness Reviews: Clarifying Research Terminology

9  Use key questions and eligibility criteria as a guide  Anticipate what data summary tables should include:  To describe studies  To assess outcomes, risk of bias, and applicability  To conduct meta-analyses  Use the PICOTS framework to choose data elements:  Population  Intervention (or exposure)  Comparator (when applicable)  Outcome (remember numerical data)  Timing  Study design (study setting) What Data To Collect?

10  Population-generic elements may include patient characteristics, such as age, gender distribution, and disease stage.  More specific items may be needed, depending upon the topic.  Intervention or exposure and comparator items depend upon the extracted study.  Study types include randomized trial, observational study, diagnostic test study, prognostic factor study, family-based or population-based genetic study, et cetera. Data Elements: Population, Intervention, and Comparator

11  Outcomes should be determined a priori with the Technical Expert Panel.  Criteria often are unclear about which outcomes to include and which to discard.  Example: mean change in ejection fraction versus the proportion of subjects with an increase in ejection fraction by ≥5 percent  Record different definitions of “outcome” and consult with content experts before making a decision about which definition to use. Data Elements: Outcome (I)

12  Apart from outcome definitions, quantitative data are needed for meta-analysis:  Dichotomous variables (e.g., deaths, patients with at least one stroke)  Count data (e.g., number of strokes, counting multiple ones)  Continuous variables (e.g., mm Hg, pain score)  Survival data  Sensitivity, specificity, receiver operating characteristic  Correlations  Slopes Data Elements: Outcome (II)

13  The data elements to be extracted vary by type of study.  Consider collecting this information when recording study characteristics for randomized trials:  Number of centers (multicenter studies)  Method of randomization (adequacy of allocation concealment)  Blinding  Funding source  Whether or not an intention-to-treat analysis was used Data Elements: Timing and Study Design

14  Provide “operational definitions” (instructions) indicating exactly what should be extracted in each field of the form.  Make sure that all data extractors understand the operational definitions the same way.  Pilot-test the forms on several published papers.  Encourage communication to clarify even apparently mundane questions. Always Provide Instructions

15  Independent extraction of data by at least two experienced reviewers is ideal but is also resource intensive.  There is a tradeoff between cost and the quality of data extraction.  Data extraction often takes longer than 2 hours per paper.  A reduction in the scope of the work may be necessary if independent data extraction is desired.  Careful single extraction by experienced reviewers, with or without crosschecking of selected items by a second reviewer, is a good compromise. Single Versus Double Extraction

16  To address all needs, a generic data extraction form will have to be very comprehensive.  Although there are common generic elements, forms need to be adapted to each topic or study design to be most efficient.  Organization of information in the PICOTS (population, intervention, comparator, outcome, timing, and setting) format is highly desirable.  Balance the structure of the form with the flexibility of its use.  Anticipate the need to capture unanticipated data.  Use an iterative process and have several individuals test the form on multiple studies. Developing Data Extraction Forms (Evidence Tables)

17  Forms have to be constructed before any serious data extraction is underway.  Original fields may turn out to be inefficient or unusable when coding begins.  Reviewers must:  be as thorough as possible in the initial set-up,  reconfigure the tables as needed, and  use a dual review process to fill in gaps. Common Problems Encountered When Creating Data Extraction Forms (Evidence Tables) (I)

18 First DraftSecond Draft Evidence Tables: Example (I)

19 Final Draft Evidence Tables: Example (II)

20  Lack of uniformity among outside reviewers:  No matter how clear and detailed are the instructions, data will not be entered identically by one reviewer to the next.  Solutions:  Develop an evidence table guidance document—instructions on how to input data.  Limit the number of core members handling the evidence tables to avoid discrepancies in presentation. Common Problems Encountered When Creating Data Extraction Forms (Evidence Tables) (II)

21  In the “country, setting” field, data extractors should list possible settings that could be encountered in the literature:  Academic medical center(s), community, database, tertiary care hospital(s), specialty care treatment center(s), substance abuse center(s), level I trauma center(s), et cetera.  In the “study design” field, data extractors should list one of the following:  Randomized controlled trial, cross-sectional study, longitudinal study, case-control study, et cetera. Sample Fields From a Table Guidance Document: Vanderbilt University Evidence-based Practice Center

22 Reviewer AReviewer B Example: Two Reviewers Extract Different Data

23 Trikalinos TA, et al. AHRQ Technology Assessment. Available at:  For evidence reports or technology assessments that have many key questions, data extraction forms may be several pages long.  The next few slides are examples of data extraction forms.  Remember, there is more than one way to structure a data extraction form. Samples of Final Data Extraction Forms (Evidence Tables)

24 Trikalinos TA, et al. AHRQ Technology Assessment. Available at: Examples: Differential Data Extraction by Two Reviewers

25 Trikalinos TA, et al. AHRQ Technology Assessment. Available at: Characteristics of the Index Test and Reference Standard

26 Trikalinos TA, et al. AHRQ Technology Assessment. Available at: Results (Concordance/Accuracy)

27 Trikalinos TA, et al. AHRQ Technology Assessment. Available at: Results (Nonquantitative)

28  Pencil and paper  Word processing software (e.g., Microsoft Word)  Spreadsheet (e.g., Microsoft Excel)  Database software (e.g., Microsoft Access, Epi Info™)  Dedicated off-the-shelf commercial software  Homegrown software Tools Available for Data Extraction and Collection

29 Berlin J, for the University of Pennsylvania Meta-analysis Blinding Study Group. Lancet 1997;350:  Who should extract the data?  Domain experts versus methodologists  What extraction method should be used?  Single or double independent extraction followed by reconciliation versus single extraction and independent verification  Should data extraction be blinded (to authors, journal, results)? Extracting the Data

30  Problems in data reporting  Inconsistencies in published papers  Data reported in graphs Challenges in Data Extraction

31 “Data for the 40 patients who were given all 4 doses of medications were considered evaluable for efficacy and safety. The overall study population consisted of 10 (44%) men and 24 (56%) women, with a racial composition of 38 (88%) whites and 5 (12%) blacks.” Examples of Data Reporting Problems (I)

32 Examples of Data Reporting Problems (II)

33 Examples of Data Reporting Problems (III)

34  Let us extract the number of deaths in two study arms, at 5 years of followup... Inconsistencies in Published Papers

35 PCI(205)MED(203) Dead2425 Overall Mortality […] 24 deaths occurred in the PCI group, […] and 25 in the MT group […] MED and MT = medical treatment; PCI = percutaneous coronary intervention Results Text

36 PCI(205)MT(203) Dead MT = medical treatment PCI = percutaneous coronary intervention Overall Mortality Figure

37 PCI (205) MT (203) Dead CABG = coronary artery bypass graft MT = medical treatment PCI = percutaneous coronary intervention Clinical Events Table

38 Green BF, Hall JA. Annu Rev Psychol 1984;35:  Because so few research reports give effect size, standard normal deviates, or exact p-values, the quantitative reviewer must calculate almost all indices of study outcomes.  Little of this calculation is automatic, because results are presented in a bewildering variety of forms and are often obscure. Why Do Such Problems Exist?

39 Source Forge Web site. Engauge Digitizer. Available at:  Engauge Digitizer, an open-source software:  Each data point is marked with an “X,” and the coordinates are given in a spreadsheet. Using Digitizing Software

40  Missing information in published papers  Publications with at least partially overlapping patient subgroups  Potentially fraudulent data Additional Common Issues

41  Data extraction is laborious and tedious.  To err is human: data extractors will identify errors and will err themselves.  Interpretation and subjectivity are unavoidable.  Data are often not reported in a uniform manner (e.g., quality, location in paper, metrics, outcomes, numerical value vs. graphs). Conclusions

42  Key questions will guide reviewers in choosing which information to extract.  There is no single correct way to record extracted data.  Extracting data requires familiarity with the content and knowledge of epidemiological principles and statistical concepts.  Be persistent:  Often, one can extract more information than the paper initially appears to contain (e.g., by digitizing graphs).  Be comprehensive:  Try to verify the same piece of information from different places in the same article.  Sometimes there are surprising inconsistencies.  Inconsistencies indicate suboptimal reporting quality at least. Key Messages

43 Berlin J, for the University of Pennsylvania Meta-analysis Blinding Study Group. Does blinding of readers affect the results of meta-analysis? Lancet 1997;350: Green BF, Hall JA. Quantitative methods for literature reviews. Annu Rev Psychol 1984;35: Gurevitch J, Hedges LV. Meta-analysis: combining the results of independent experiments. In: Scheiner AM and Gurevich J, eds. Design and analysis of ecological experiments. New York: Chapman & Hall; p Source Forge Web site. Engauge Digitizer. Available at: %20Digitizer/. References (I)

44  Trikalinos TA, Ip S, Raman G, et al. Home Diagnosis of Obstructive Sleep Apnea-Hypopnea Syndrome. Technology Assessment (Prepared by Tufts–New England Medical Center Evidence- based Practice Center). Rockville, MD: Agency for Healthcare Research and Quality; August Available at: wnloads/ id48TA.pdf. References (II)

45  This presentation was prepared by Joseph Lau, M.D., and Thomas Trikalinos, M.D., Ph.D., members of the Tufts–New England Medical Center Evidence-based Practice Center, and Melissa L. McPheeters, Ph.D., M.P.H., and Jeff Seroogy, B.S., members of the Vanderbilt University Evidence-based Practice Center.  The information in this module is currently not included in Version 1.0 of the Methods Guide for Comparative Effectiveness Reviews (available at: uide.pdf). Authors

Download ppt "Data Extraction Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide"

Similar presentations

Ads by Google