Published by Elijah Barber. Modified over 9 years ago.
1
Selection of Data Sources for Observational Comparative Effectiveness Research
Prepared for: Agency for Healthcare Research and Quality (AHRQ)
www.ahrq.gov
2
Outline of Material
This presentation will:
- Propose data source(s) that include data required to address primary and secondary research questions
- Describe details of data source(s) selected for the study
- Describe validation or other quality assessments that have been conducted on the data source that are relevant to the data elements required for the study
- Describe what patient identifiers are necessary for the research purpose, how they will be protected, and the permissions/waivers required
- Provide details on the data-linkage approach and the quality/accuracy of the linkage, if applicable
3
Introduction
The research question dictates the type of data required. The data source must be chosen carefully to ensure it:
- Can address the study question
- Has a sufficient number of observations
- Contains key variables, including potential confounders
- Allows for sufficient length of followup
4
Primary Data
Primary data are collected by the investigator directly from study participants to address a specific question or hypothesis. The observational research designs that often require primary data collection are described below:
- Prospective observational studies: Subjects are selected on the basis of specific characteristics, and their progress is monitored. The investigator does not assign the exposure of interest.
- Registries: Registries use an observational study design to collect data and do not specify treatments or require therapies intended to change patient outcomes. Registry data can be used for public health surveillance, to determine incidence rates, to perform risk assessment, and to monitor progress.
5
Secondary Data (1 of 3)
Secondary data are data collected for other purposes that can be used to answer the research question.
- Electronic medical record data
  - The method of data collection is not standardized nationally, and the intervals between visits vary for every patient.
  - Patients may receive care at different facilities not captured in the electronic medical record.
- Paper-based records
  - The exclusion of sites without electronic records may bias results.
  - These records may be valuable as a source of primary data for validating data available elsewhere (e.g., administrative claims data).
6
Secondary Data (2 of 3)
- Administrative data
  - Typically generated as part of the process of obtaining insurance reimbursement
  - Coding systems: International Classification of Diseases (ICD) and Current Procedural Terminology (CPT)
- Pharmacy data
  - Claims submitted to insurance companies for payment, as well as pharmacy dispensing records
  - National Drug Code (NDC): a unique 10-digit identifier that indicates active ingredient(s), dosage and strength, route of administration, and package size and type
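As a small illustration of working with the NDC mentioned above: the hyphenated 10-digit NDC appears in three segment layouts (4-4-2, 5-3-2, and 5-4-1), while many claims files store an 11-digit 5-4-2 form. The sketch below pads each segment with leading zeros to reach that form. The function name `ndc10_to_ndc11` is hypothetical, and the sketch assumes the input still carries its hyphens, because the layout cannot be recovered reliably once they are removed.

```python
VALID_LAYOUTS = {(4, 4, 2), (5, 3, 2), (5, 4, 1)}


def ndc10_to_ndc11(ndc: str) -> str:
    """Convert a hyphenated 10-digit NDC to the 11-digit 5-4-2 form (no hyphens)."""
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"Expected three hyphen-separated numeric segments: {ndc!r}")
    if tuple(len(p) for p in parts) not in VALID_LAYOUTS:
        raise ValueError(f"Unrecognized 10-digit NDC layout: {ndc!r}")
    labeler, product, package = parts
    # Left-pad each segment with zeros to reach 5, 4, and 2 digits respectively.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


if __name__ == "__main__":
    for code in ("0002-7597-01", "50242-040-62", "12345-6789-1"):
        print(code, "->", ndc10_to_ndc11(code))
```

Normalizing to one form before joining pharmacy claims to a drug reference table avoids silent non-matches caused by mixed layouts.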
7
Secondary Data (3 of 3)
- Regulatory data
  - The U.S. Food and Drug Administration (FDA) has a vast store of data from regulatory approval submissions.
  - The FDA is converting its older datasets into research-appropriate data to be stored in its clinical trial repository.
  - These data, however, typically come from efficacy trials rather than effectiveness trials.
- Repurposed trial data or data from completed observational studies
  - Investigators who conduct clinical research studies funded by the Federal Government are mandated to make their data publicly available.
  - The available data are limited to what was collected.
8
Considerations for Selecting Data
- Required data elements
  - Identify the minimum data requirements a priori
  - Supplement datasets by predicting absent data variables or by linking to available data
- Time period and duration of followup
  - Essential that the data capture the time period of relevance
  - Inadequate length of followup often makes data unusable
  - Followup should take the natural history of the outcomes into account
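The followup consideration above can be checked early when screening a candidate data source. The following is a minimal sketch, assuming each patient record has an index date and an end-of-enrollment date; the field names, the sample records, and the 365-day minimum are all illustrative assumptions, not values from the presentation.

```python
from datetime import date

# Hypothetical study requirement: at least one year of followup after the index date.
MIN_FOLLOWUP_DAYS = 365

# Illustrative records only; a real source would supply enrollment spans.
patients = [
    {"id": "A", "index_date": date(2010, 1, 15), "enrollment_end": date(2012, 6, 30)},
    {"id": "B", "index_date": date(2011, 3, 1), "enrollment_end": date(2011, 9, 1)},
    {"id": "C", "index_date": date(2010, 7, 1), "enrollment_end": date(2011, 7, 1)},
]


def followup_days(patient: dict) -> int:
    """Days of observable followup from the index date to the end of enrollment."""
    return (patient["enrollment_end"] - patient["index_date"]).days


eligible_ids = [p["id"] for p in patients if followup_days(p) >= MIN_FOLLOWUP_DAYS]
proportion_eligible = len(eligible_ids) / len(patients)

if __name__ == "__main__":
    print("Eligible patients:", eligible_ids)
    print(f"Proportion with sufficient followup: {proportion_eligible:.2f}")
```

A low proportion at this stage suggests the source may not support the outcome's natural history, before any analysis effort is spent on it.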
9
Ensuring Quality Data (1 of 2)
- Missing data
  - Can compromise validity of the study, depending on whether there is a pattern in the type and extent of “missingness”
  - Define a priori an acceptable percent of missing data for key data elements required for analysis
  - Use imputation techniques for missing data
- Changes that may alter data availability and consistency over time
  - May be either a single event or a gradual shift over time
  - Evaluate points in time where the database is discontinuous (trend breaks)
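The a priori missingness threshold described above can be applied as a simple screen. This is a sketch under stated assumptions: the 25% threshold, the variable names, and the sample records are hypothetical, and missing values are assumed to be represented as `None`.

```python
# Hypothetical threshold, fixed before looking at the data.
MAX_MISSING_FRACTION = 0.25

KEY_VARIABLES = ["age", "smoking_status", "ldl"]

# Illustrative records only; None marks a missing value.
records = [
    {"age": 64, "smoking_status": "never", "ldl": 130},
    {"age": 71, "smoking_status": None, "ldl": 155},
    {"age": 58, "smoking_status": "current", "ldl": None},
    {"age": 66, "smoking_status": None, "ldl": 142},
    {"age": 49, "smoking_status": "former", "ldl": 118},
]


def missing_fraction(rows: list, field: str) -> float:
    """Fraction of rows in which the field is absent or None."""
    return sum(1 for row in rows if row.get(field) is None) / len(rows)


missing = {field: missing_fraction(records, field) for field in KEY_VARIABLES}
flagged = [field for field in KEY_VARIABLES if missing[field] > MAX_MISSING_FRACTION]

if __name__ == "__main__":
    for field, fraction in missing.items():
        print(f"{field}: {fraction:.0%} missing")
    print("Exceeds threshold:", flagged)
```

The screen reports only the extent of missingness; whether the pattern is related to exposure or outcome still has to be examined separately.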
10
Ensuring Quality Data (2 of 2)
- Validity of key data definitions
  - Assess the general definition of key variables and their reliability and validity in the chosen database.
  - Commonly used data resources may have validated outcomes of health events (e.g., heart attack, mortality).
  - Creating best definitions of key variables often requires involvement of clinicians.
  - Compare study data to a “gold standard,” such as paper-based medical records if available.
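One common way to summarize a comparison against a gold standard is with sensitivity, specificity, and predictive values. The sketch below assumes each patient has a database-derived flag for the outcome and a flag from chart review; the counts are invented for illustration, and the choice of these four metrics is an assumption rather than something the presentation prescribes.

```python
# Each pair is (flag_in_database, flag_in_gold_standard). Illustrative counts only.
pairs = (
    [(True, True)] * 8      # database and chart review agree the event occurred
    + [(True, False)] * 2   # database says yes, chart review says no
    + [(False, True)] * 1   # database missed an event found on chart review
    + [(False, False)] * 9  # both agree no event occurred
)


def validation_metrics(observations: list) -> dict:
    """Compute agreement metrics, treating the second element as the truth."""
    tp = sum(1 for db, gold in observations if db and gold)
    fp = sum(1 for db, gold in observations if db and not gold)
    fn = sum(1 for db, gold in observations if not db and gold)
    tn = sum(1 for db, gold in observations if not db and not gold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


metrics = validation_metrics(pairs)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The function divides by each marginal count, so a validation sample with no true events or no flagged events would need separate handling.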
11
Emerging Issues and Opportunities (1 of 2)
- Data outside the United States
  - It may be easier to obtain complete, long-term data from countries with single-payer systems that have regional or national electronic medical record systems.
  - Consider possible regional variations in exposure, availability, cost, and health care systems.
- Point-of-care data collection and interactive voice response/other technologies
  - The advantages of a new and timely data stream include more detailed data that can be used to spot trends or patterns.
  - Data streams are often specialized.
12
Emerging Issues and Opportunities (2 of 2)
- Data pooling and networking
  - Data pooling: combining data, at the level of the unit of analysis, from several sources into a single cohort
  - Networking: common protocols, data definitions, and programming developed for several data resources
- Personal health records
  - Electronically stored health records initiated by the patient
- Patient-reported outcomes (PROs)
  - Increased availability of PROs in administrative data is of value to comparative effectiveness research
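Pooling at the level of the unit of analysis generally requires mapping each source onto shared data definitions first. The following sketch illustrates that idea with two invented sources; the source names, field names, and mapping table are all hypothetical and stand in for whatever common data definitions a real network would agree on.

```python
# Hypothetical mapping from each source's local field names to common definitions.
SOURCE_FIELD_MAPS = {
    "plan_a": {"member_id": "patient_id", "birth_yr": "birth_year", "dx": "diagnosis_code"},
    "registry_b": {"pid": "patient_id", "yob": "birth_year", "icd": "diagnosis_code"},
}

# Illustrative patient-level records from each source.
source_records = {
    "plan_a": [
        {"member_id": "A-001", "birth_yr": 1950, "dx": "I21.9"},
        {"member_id": "A-002", "birth_yr": 1962, "dx": "E11.9"},
    ],
    "registry_b": [
        {"pid": "B-104", "yob": 1948, "icd": "I21.9"},
    ],
}


def harmonize(record: dict, source: str) -> dict:
    """Rename a record's fields to the common definitions and tag its origin."""
    mapping = SOURCE_FIELD_MAPS[source]
    harmonized = {common: record[local] for local, common in mapping.items()}
    harmonized["source"] = source
    return harmonized


# Stack the harmonized patient-level records into one pooled cohort.
pooled_cohort = [
    harmonize(record, source)
    for source, rows in source_records.items()
    for record in rows
]

if __name__ == "__main__":
    for row in pooled_cohort:
        print(row)
```

Keeping the `source` tag on every pooled record allows later analyses to check for differences between data resources.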
13
Summary Checklist (1 of 2)
Guidance: Propose data source(s) that include data required to address primary and secondary research questions
Key considerations:
- Ensure that the data resource is appropriate for addressing the study question
- Ensure that key variables needed to conduct the study are available in the data source
Guidance: Describe details of data source(s) selected for the study
Key considerations:
- The nature of the data (claims, paper, or electronic medical records; if prospective, how is/was the information collected and from whom)
- Coding system(s) that may be used (e.g., ICD-9 or ICD-10; HCPCS; etc.)
- Population included in the data source (ages, geography, etc.)
- Other features (e.g., health plan membership, retention rate [i.e., average duration of followup for members in the database, proportion of patients with followup sufficiently long for the study purpose])
- Time period covered by the data source(s)
- If non-U.S. data, describe relevant differences in health care and how this will affect results
14
Summary Checklist (2 of 2)
Guidance: Describe validation or other quality assessments that have been conducted on the data source(s) that are relevant to the data elements required for the study
Key considerations:
- If validation/quality assessments have not previously been performed, propose a method to assess data quality
Guidance: Describe what patient identifiers are necessary for the research purpose, how they will be protected, and permissions/waivers required
Guidance: Provide details on the data-linkage approach and the quality/accuracy of linkage, if applicable
Key considerations:
- Provide enough detail to clarify the quality of the linkage approach
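To make the identifier-protection and linkage-quality items above concrete, the sketch below links two sources on a keyed hash of normalized identifiers and reports a match rate. This is one possible approach and an assumption of this example, not a method described in the presentation; the key, the choice of last name plus date of birth as identifiers, and the sample records are hypothetical. A real study would follow its approved data-use agreements and would also need to assess false matches, which a match rate alone does not reveal.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be generated and stored securely.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def linkage_token(last_name: str, date_of_birth: str) -> str:
    """Return a keyed SHA-256 hash of normalized identifiers."""
    normalized = f"{last_name.strip().lower()}|{date_of_birth.strip()}"
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative identifiers only: (last name, date of birth).
claims_people = [("Smith", "1950-02-01"), ("Jones", "1961-11-30"), ("Lee", "1972-05-17")]
registry_people = [(" smith ", "1950-02-01"), ("Lee", "1972-05-17"), ("Patel", "1980-08-08")]

claims_tokens = [linkage_token(name, dob) for name, dob in claims_people]
registry_tokens = {linkage_token(name, dob) for name, dob in registry_people}

# Deterministic linkage: a claims record links when its token appears in the registry.
linked = [token for token in claims_tokens if token in registry_tokens]
match_rate = len(linked) / len(claims_tokens)

if __name__ == "__main__":
    print(f"Linked {len(linked)} of {len(claims_tokens)} claims records ({match_rate:.0%})")
```

Normalizing before hashing matters because a keyed hash turns even a trivial difference, such as stray whitespace or capitalization, into a non-match.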