Reuse of Electronic Medical Records for Research Our architecture Two examples.


1 Reuse of Electronic Medical Records for Research Our architecture Two examples

2 Francis Collins, NEJM 9/16/2009. Vanderbilt BioVU: A clinical laboratory for genomics and pharmacogenomics

3 Vanderbilt BioVU: an Opt-Out DNA Biobank. Extracting DNA from leftover blood samples

4 De-Identification
John Doe: One-way hash: 32ef34a6e88c2… (Research Identifier)
EMR record for John Doe is scrubbed and labeled 32ef34a6e88c2…; DNA extracted from an eligible sample is labeled with the same 32ef34a6e88c2…
1.7 million records; ~135,000 samples (>14,000 children)
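The one-way hash step on this slide can be sketched as below. This is a minimal illustration, not BioVU's actual procedure: the slide only says "one way hash", so the choice of keyed HMAC-SHA256, the `research_id` function name, the example record number, and the secret key are all assumptions.

```python
import hashlib
import hmac


def research_id(record_number: str, secret_key: bytes) -> str:
    """Derive a stable research identifier from a medical record number.

    Assumption: a keyed HMAC-SHA256 stands in for the slide's unspecified
    "one way hash". The same input always yields the same identifier, and
    the identifier cannot be inverted to recover the record number.
    """
    digest = hmac.new(secret_key, record_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and record number, for illustration only.
key = b"hypothetical-secret-key"
id_for_scrubbed_record = research_id("MRN-0001", key)
id_for_dna_sample = research_id("MRN-0001", key)

# Both the scrubbed record and the DNA sample carry the same identifier,
# so they can be linked without exposing who the patient is.
print(id_for_scrubbed_record == id_for_dna_sample)
```

Because the same function is applied on both paths, the de-identified record and the DNA sample stay linked to each other while neither carries the patient's name.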

5 Patient Chart

6 Sample accrual into BioVU. Currently >40 active projects w/ DNA; >100 projects using Synthetic Derivative

7 Platform for EMR-clinical research at VUMC
De-identified DNA: Discarded blood samples
Synthetic Derivative: De-identification of Clinical Notes; WizOrder Orders; Clinical Messaging; StarChart; ICD9, CPT; Test Results

8 The “demonstration project”: Are genotype-phenotype relations replicated in BioVU?
Genotype “high-value” SNPs in the first 10,000 samples accrued.
– 21 established loci (>1 SNP for some)
– in 5 diseases with known associations: Atrial fibrillation, Crohn’s disease, Multiple Sclerosis, Rheumatoid arthritis, Type II Diabetes
Develop “electronic phenotype algorithms” to identify cases and controls

9 Finding cases accurately
Billing codes alone are only 50-80% accurate
Negation terms
– “I don’t think this is MS”
Context clues:
– “FAMILY MEDICAL HISTORY: positive for rheumatoid arthritis.”
Others
– Note titles: “Multiple Sclerosis Clinic Note”
Diagram labels: True cases; Natural Language Processing; Billing codes; Medications & Labs; Genetic association tests
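The negation and context problems on this slide can be illustrated with a small rule-based filter. This is only a sketch under stated assumptions: the negation word list, the section-header pattern, and the `affirmed_mention` function are hypothetical and far simpler than a real clinical NLP system.

```python
import re

# Assumed, deliberately short list of negation cues (not a clinical standard).
NEGATION = re.compile(
    r"\b(no|not|don['’]t|do not|denies|without|ruled out)\b", re.IGNORECASE
)
# Assumed pattern for a family-history section header.
FAMILY_SECTION = re.compile(r"^\s*FAMILY (MEDICAL )?HISTORY\s*:", re.IGNORECASE)


def affirmed_mention(sentence: str, term: str) -> bool:
    """Return True only if `term` appears, is not preceded by a negation cue,
    and the sentence is not a family-history statement."""
    match = re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE)
    if match is None:
        return False  # the disease is not mentioned at all
    if FAMILY_SECTION.search(sentence):
        return False  # the mention is about a relative, not the patient
    if NEGATION.search(sentence[: match.start()]):
        return False  # a negation cue precedes the mention
    return True


print(affirmed_mention("I don't think this is MS", "MS"))
print(affirmed_mention(
    "FAMILY MEDICAL HISTORY: positive for rheumatoid arthritis.",
    "rheumatoid arthritis",
))
print(affirmed_mention("Patient has rheumatoid arthritis.", "rheumatoid arthritis"))
```

The first two sentences are the slide's own examples and are rejected; only the third, an affirmative statement about the patient, counts as a mention.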

10 Rheumatoid Arthritis: Case Definition Evolution
Definition 1: ICD9 codes for RA + Medications (only in problem list). Cases (in first 10k in BioVU): 371. Problem: Found incomplete problem lists
Definition 2: Same as above but searched notes. Cases: 411. Problem: Patients billed as RA but actually other conditions, overlap syndromes, juvenile RA
Definition 3: Above + require “rheumatoid arthritis” and small list of exclusions. Cases: 358. Problem: Overlap syndromes with other autoimmune conditions, conditions in which physicians did not agree
Definition 4: Above + exclusion of other inflammatory arthritides. Cases: 255. Problem: PPV = 97%; a few “possible RA” or family history items remained
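The layered definition above (codes, plus medications, plus a required text mention, minus exclusions) could be expressed roughly as follows. This is a hedged sketch, not the published algorithm: the `Patient` structure, the medication list, and the exclusion terms are illustrative assumptions, and matching any ICD-9 code under 714 is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    icd9_codes: set      # billing codes, e.g. {"714.0"}
    medications: set     # medication names found in notes or problem list
    note_text: str       # concatenated clinical note text


# Illustrative assumptions, not the lists used in the study.
RA_ICD9_PREFIX = "714"  # ICD-9 category containing rheumatoid arthritis
RA_MEDICATIONS = {"methotrexate", "hydroxychloroquine"}
EXCLUSION_TERMS = {
    "psoriatic arthritis",
    "juvenile rheumatoid arthritis",
    "ankylosing spondylitis",
}


def is_definite_ra(patient: Patient) -> bool:
    """Apply the four layers in order: code, medication, mention, exclusions."""
    text = patient.note_text.lower()
    has_code = any(code.startswith(RA_ICD9_PREFIX) for code in patient.icd9_codes)
    has_medication = bool(RA_MEDICATIONS & {m.lower() for m in patient.medications})
    has_mention = "rheumatoid arthritis" in text
    is_excluded = any(term in text for term in EXCLUSION_TERMS)
    return has_code and has_medication and has_mention and not is_excluded


likely_case = Patient({"714.0"}, {"Methotrexate"}, "Rheumatoid arthritis, stable.")
overlap = Patient({"714.0"}, {"Methotrexate"},
                  "Rheumatoid arthritis vs psoriatic arthritis.")
print(is_definite_ra(likely_case), is_definite_ra(overlap))
```

Each added layer trades case count for precision, which mirrors the drop from 411 to 255 cases as the definition was tightened.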

11 Finding cases: Rheumatoid Arthritis
Figure categories: Definite Cases (algorithm-defined); Possible Cases (require manual review); Controls (algorithm-defined); Excluded (algorithm-defined)
Figure counts: 2555071184; 7121
Figure label: Used for analysis

12 Validating EMR phenotype algorithms (using first 10,000 patients in BioVU)
Atrial fibrillation: methods NLP of ECG impressions, ICD9 codes, CPT codes; 168 definite cases; 1695 controls; case PPV 98%; control PPV 100%
Crohn’s Disease: methods ICD9 codes, Medications (NLP); 116 definite cases; 2643 controls; PPV 100%
Type 2 Diabetes: methods ICD9 codes, Medications (NLP), NLP exclusions, Labs; 570 definite cases; 764 controls; PPV 100%
Multiple Sclerosis: methods ICD9 codes or text diagnosis; 66 definite cases; 1857 controls; case PPV 87%; control PPV 100%
Rheumatoid Arthritis: methods ICD9 codes, Medications (NLP), NLP exclusions; 170 definite cases; 701 controls; case PPV 97%; control PPV 100%
NLP = Natural language processing
Common themes: Billing codes – 5/5; NLP – 5/5; Meds – 4/5; Labs – 2/5
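The PPV figures above are positive predictive values: of the patients the algorithm labeled as cases (or controls), the fraction confirmed on review. A minimal sketch of that calculation follows; the function name and the chart-review counts in the example are hypothetical, since the slide does not give the number of charts reviewed.

```python
def positive_predictive_value(confirmed: int, refuted: int) -> float:
    """PPV = true positives / (true positives + false positives)."""
    reviewed = confirmed + refuted
    if reviewed == 0:
        raise ValueError("at least one reviewed chart is required")
    return confirmed / reviewed


# Hypothetical review: 100 algorithm-defined cases, 97 confirmed by a clinician.
example_ppv = positive_predictive_value(confirmed=97, refuted=3)
print(f"{example_ppv:.0%}")
```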

13 Results (Ritchie et al., AJHG 2010)
Figure: observed vs. published odds ratios, by disease, gene / region, and marker. Odds Ratio axis ticks: 0.5, 1.0, 2.0, 5.0
Markers (gene / region): rs2200733 (Chr. 4q25), rs10033464 (Chr. 4q25), rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22), rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA), rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22), rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2)
Diseases: Atrial fibrillation, Crohn's disease, Multiple sclerosis, Rheumatoid arthritis, Type 2 diabetes
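For readers unfamiliar with the quantity plotted here, an odds ratio for one marker can be computed from a 2x2 table of allele counts in cases and controls. The sketch below is a generic illustration, not the analysis used in the cited paper: the function name and counts are made up, and the confidence interval uses the standard log-odds (Woolf) approximation.

```python
import math


def odds_ratio_with_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 allele-count table."""
    counts = (case_alt, case_ref, control_alt, control_ref)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of log(OR) under the Woolf approximation.
    std_err = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(estimate) - z * std_err)
    upper = math.exp(math.log(estimate) + z * std_err)
    return estimate, lower, upper


# Hypothetical allele counts, for illustration only.
estimate, lower, upper = odds_ratio_with_ci(30, 70, 20, 80)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An observed odds ratio whose interval overlaps the published value is the kind of agreement the figure is meant to show.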

14 The eMERGE Network
Goal: to assess utility of DNA collections integrated with electronic medical records (EMRs) as resources for genome science
Outcome: GWAS data in >20,000 subjects with EMRs.
Vanderbilt phenotype: normal variability in QRS duration
Map labels: Coordinating center; 135,000; 20,000; 10,000; 4,000; 3,000

15 Hypothyroidism: An eMERGE network phenotype
Steps: Domain experts define phenotype (VU); Create initial EMR-based algorithm (VU); Evaluate & refine; Share algorithm
Sites: GH, Marsh, Mayo, NW

16 Hypothyroidism algorithm Diagram courtesy Mike Conway (Mayo)

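17 Hypothyroidism Validation: same algorithm, deployed at five sites (Denny et al., AJHG 2011)
Group Health: Case PPV 98%; Control PPV 100%
Marshfield: Case PPV 91%; Control PPV 100%
Mayo Clinic: Case PPV 82%; Control PPV 96%
Northwestern: Case PPV 98%; Control PPV 100%
Vanderbilt: Case PPV 98%; Control PPV 100%
All sites (weighted): Case PPV 92.4%; Control PPV 98.5%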
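The "All sites (weighted)" row is a weighted average of the per-site values. The slide does not state the weights, so the sketch below only shows the mechanics: `weighted_mean` is a hypothetical helper, and the equal weights are an assumption used purely for illustration.

```python
def weighted_mean(values: dict, weights: dict) -> float:
    """Weighted average of per-site values; weights need not sum to 1."""
    total_weight = sum(weights[site] for site in values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(values[site] * weights[site] for site in values) / total_weight


# Per-site case PPVs (%) from the slide.
case_ppv = {
    "Group Health": 98,
    "Marshfield": 91,
    "Mayo Clinic": 82,
    "Northwestern": 98,
    "Vanderbilt": 98,
}

# Assumption: equal weights, because per-site sample sizes are not on the slide.
equal_weights = {site: 1 for site in case_ppv}
unweighted = weighted_mean(case_ppv, equal_weights)
print(round(unweighted, 1))
```

With equal weights the case PPV comes out at 93.4%, not the slide's 92.4%, which is consistent with the sites having been weighted unequally.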

18 Hypothyroidism: “No-Genotyping” GWAS FOXE1 Denny et al., AJHG 2011

19 eMERGE Phenotypes
Group Health: primary phenotype Dementia; secondary phenotypes white blood cell counts
Marshfield: primary phenotype Cataracts; secondary phenotypes diabetic retinopathy
Mayo Clinic: primary phenotype Peripheral Arterial Disease; secondary phenotypes red blood cell counts, ESR levels
Northwestern: primary phenotype Type 2 Diabetes; secondary phenotypes lipids and height
Vanderbilt: primary phenotype Normal cardiac conduction; secondary phenotypes PheWAS
Network Phenotypes: Autoimmune Hypothyroidism, Resistant hypertension
Legend: =novel associations discovered; bold=GWAS completed with significant results

